xen-devel
[Xen-devel] Invalid types between save and restore, Xen 3.1.4
Hi list,
I am currently in charge of implementing save/restore/migrate support
in NetBSD.
So far, my work can save and restore a NetBSD domU, but intermittently
(about one time in ten) I hit issues with page type validation and
pinning when cycling saves and restores.
For unknown reasons, the save operation works, but restore might fail,
with xend reporting:
[2008-12-04 17:24:40 219] INFO (XendCheckpoint:370) Received all pages
(0 races)
[2008-12-04 17:24:40 219] INFO (XendCheckpoint:370) ERROR Internal
error: Failed to pin batch of 21 page tables
[2008-12-04 17:24:40 219] INFO (XendCheckpoint:370) Restore exit with rc=1
This is due to the hypervisor refusing some type validations when
xc_restore issues its xc_mmuext_op():
(XEN) mm.c:1842:d0 Bad type (saw 28000008 != exp e0000000) for mfn 1f16f
(pfn 43e)
(XEN) mm.c:649:d0 Error getting mfn 1f16f (pfn 43e) from L1 entry
1f16f023 for dom13
(XEN) mm.c:916:d0 Failure in alloc_l1_table: entry 768
(XEN) mm.c:1863:d0 Error while validating mfn 1ee38 (pfn 775) for type
20000000: caf=80000003 taf=20000001
(XEN) mm.c:683: get_l2_linear_pagetable() ret: 0 (exp 1)
(XEN) mm.c:1091:d0 Failure in alloc_l2_table: entry 1007
(XEN) mm.c:1863:d0 Error while validating mfn 1efb4 (pfn 5f9) for type
40000000: caf=80000003 taf=40000001
(XEN) mm.c:2132:d0 Error while pinning mfn 1efb4
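As a reading aid for the log above, here is a minimal sketch that decodes the "saw", "exp" and "taf" words. It assumes the x86 type_info layout of the Xen 3.x era as I understand it (page type in bits 29-31, pinned flag in bit 28, validated flag in bit 27); the 16-bit count mask and the helper name decode_type_info are illustrative assumptions, not taken from the Xen sources.

```python
# Assumed layout of Xen's 32-bit type_info word (x86, Xen 3.x era).
PGT_TYPE_SHIFT = 29                     # page type lives in bits 29-31
PGT_TYPE_MASK = 0x7 << PGT_TYPE_SHIFT
PGT_PINNED = 1 << 28                    # guest pinned the page to its type
PGT_VALIDATED = 1 << 27                 # page was validated for its type
PGT_COUNT_MASK = (1 << 16) - 1          # assumption: low 16 bits = type use count

# Index in this list == value of the 3-bit type field.
PGT_TYPE_NAMES = [
    "none", "l1_page_table", "l2_page_table", "l3_page_table",
    "l4_page_table", "gdt_page", "ldt_page", "writable_page",
]


def decode_type_info(word):
    """Split a type_info word into its (assumed) fields."""
    return {
        "type": PGT_TYPE_NAMES[(word & PGT_TYPE_MASK) >> PGT_TYPE_SHIFT],
        "pinned": bool(word & PGT_PINNED),
        "validated": bool(word & PGT_VALIDATED),
        "count": word & PGT_COUNT_MASK,
    }


if __name__ == "__main__":
    # Values copied from the hypervisor messages above.
    for label, word in (("saw", 0x28000008), ("exp", 0xE0000000),
                        ("taf", 0x20000001), ("taf", 0x40000001)):
        print(label, hex(word), decode_type_info(word))
```

Under these assumptions, "saw" decodes as a validated L1 page table with outstanding type references, while "exp" is the writable-page type, which matches the mismatch reported for mfn 1f16f.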
It is erratic and hard to reproduce. I suspect a race inside our VM
code, but as I am not familiar with the inner workings of Xen's MMU
handling, I am having a hard time tracking it down.
The L1 and L2 entries at fault are always the same. The 1007 L2 entry
corresponds to an "alternative" recursive PD in our VM subsystem, and
the L1 768 is the start of our kernel's virtual memory.
This is with Xen 3.1.4. NetBSD does not use writable mappings, and
manipulates the MMU only through the hypercall API. MFN manipulations
are suspended during a save, to avoid any incorrect ones after a restore.
What I would like to know is what kind of operations could result in
such a situation. Considering that the xentools should have an accurate
view of the pfn_types through the p2m table, how is it possible that,
between save and restore, the hypervisor refuses to validate pages,
given that mappings should not change after the call to
HYPERVISOR_suspend()?
For example, why is Xen expecting a writable mapping while the page is
validated as L1?
I was wondering if anyone could shed some light for me. Please correct
me if I am wrong.
Thanking you in advance for your help,
--
Jean-Yves Migeon
jeanyves.migeon@xxxxxxx
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel