Re: [Xen-devel] Error restoring DomU when using GPLPV
Keir Fraser wrote:
> On 15/09/2009 03:25, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
>> Ok, I've been looking at this and figured out what's going on. Annie's
>> problem lies in not remapping the grant frames post-migration; hence the
>> leak: tot_pages goes up every time until migration fails. On Linux, I
>> found that the remap is where the frames created by restore (for xenheap
>> pfns) get freed back to the dom heap. So that's a fix to be made on the
>> win PV driver side.
>
> Although obviously that is a bug, I'm not sure why it would cause this
> particular issue? The domheap pages do not get freed and replaced with
> xenheap pages, but why does that affect the next save/restore cycle?
> After all, xc_domain_save does not distinguish between Xenheap and
> domheap pages?
That xc_domain_save doesn't distinguish is actually the problem:
xc_domain_restore then backs the xenheap pfns (shinfo/gnt frames) with
domheap pages. These domheap pages do get freed and replaced by xenheap
pages on the target host (upon the guest remap in gnttab_map()), in the
following code:
arch_memory_op():

        /* Remove previously mapped page if it was present. */
        prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
        if ( mfn_valid(prev_mfn) )
        {
            .....
            guest_remove_page(d, xatp.gpfn);    <=======
        }
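For reference, here is a minimal sketch of the guest-side remap that drives
the path above. It is modeled on what Linux's gnttab_map() does on resume;
the function name remap_grant_frames and the frame list/count are
placeholder details, not actual driver code:

    /* Sketch: remap the grant-table frames after restore.  Each
     * XENMEM_add_to_physmap call makes Xen unhook whatever page was at
     * that gpfn (freeing the domheap page restore put there, via the
     * guest_remove_page() shown above) and map the xenheap grant frame
     * in its place. */
    static int remap_grant_frames(unsigned long *frame_gpfns,
                                  unsigned int nr_frames)
    {
        struct xen_add_to_physmap xatp;
        unsigned int i;
        int rc;

        for ( i = 0; i < nr_frames; i++ )
        {
            xatp.domid = DOMID_SELF;
            xatp.idx   = i;                      /* grant frame index */
            xatp.space = XENMAPSPACE_grant_table;
            xatp.gpfn  = frame_gpfns[i];         /* guest pfn to map at */
            rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
            if ( rc != 0 )
                return rc;
        }
        return 0;
    }

A driver that skips this loop for the grant frames is exactly what leaves
the stale domheap pages counted against tot_pages.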
E.g. my guest with 128M gets created with tot_pages=0x83eb,
max_pages=0x8400. Now xc_domain_save saves all of it: 0x83eb pages plus
shinfo plus 2 gnt frames, so I see tot_pages on the target go up to
0x83ee. Next, the guest remaps shinfo and the gnt frames, the domheap
pages are returned in guest_remove_page(), and tot_pages goes back to
0x83eb. In Annie's case the driver forgets to remap the 2 gnt frames, so
those domheap pages stay wrongly mapped and tot_pages only drops to
0x83ed. After a few more migrations it reaches 0x83ff, and migration then
fails because the save/restore is not able to create 0x83ff plus shinfo
plus gnt frames temporarily, max_pages being 0x8400.
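To make the arithmetic concrete, here is a toy simulation of the leak (a
standalone sketch; the constants are the ones from the example above,
everything else is illustrative):

    #include <stdio.h>

    int main(void)
    {
        unsigned long tot_pages = 0x83eb;        /* 128M guest */
        const unsigned long max_pages = 0x8400;
        const unsigned int extra = 3;            /* shinfo + 2 gnt frames */
        unsigned int migrations = 0;

        /* Restore must temporarily create tot_pages + extra pages. */
        while ( tot_pages + extra <= max_pages )
        {
            tot_pages += extra;   /* restore backs shinfo/gnt pfns */
            tot_pages -= 1;       /* buggy driver remaps shinfo only */
            migrations++;
            printf("after migration %2u: tot_pages=0x%lx\n",
                   migrations, tot_pages);
        }
        printf("migration %u fails: 0x%lx + %u pages > max_pages 0x%lx\n",
               migrations + 1, tot_pages, extra, max_pages);
        return 0;
    }

It fails on the eleventh migration, with tot_pages stuck at 0x83ff,
matching the numbers above.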
Hope that makes sense.
Keir Fraser replied:
> Yup, that's what I thought, but just wanted to make sure.
>
>> Ok got it, I think the driver change is the way to go. Also,
>> unfortunately, the failure case is not handled properly sometimes. If
>> migration fails after suspend, there is no way to get the guest back. I
>> even noticed the guest disappear totally from both source and target on
>> failure, a couple of times in the several dozen migrations I did.
>
> That shouldn't happen, since there is a mechanism to cancel the
> suspension of a suspended guest. Possibly xend doesn't get it right
> every time, as its error handling is pretty poor in general. I trust
> the underlying mechanisms below xend pretty well, however.
>
>  -- Keir

thanks a lot,
Mukesh