The problem is that every page that is ballooned down by
the balloon driver can be slurped up as a private-
persistent ("preswap") page by tmem. Private-persistent
pages contain indirectly-accessible domain data, are counted
against the domain's tot_pages, and are migrated along with
the domain's directly-accessible pages.
So any temporary mapping of xenheap pages into domheap,
such as occurs during restore/migration, can cause max_pages
to be exceeded.
This isn't a problem for tmem today because tmem only runs
in PV domains, but I suspect the fragility of this
approach will come back and bite us. It reminds me
of the classic "shell game".
Is there a per-domain counter of these special pages
somewhere? If so, a MEMF flag could subtract this
from max_pages in the limit check in assign_pages(),
e.g.:
max = d->max_pages;
if ( memflags & MEMF_no_special )
    max -= d->special_pages;
<snip>
if ( unlikely((d->tot_pages + ...) > max) )
    /* Over-allocation */
(Special_pages counts any xenheap pages
that contain domain-specific data that needs
to be retained across a migration.)
Dan
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, September 17, 2009 12:21 AM
> To: Mukesh Rathor; Dan Magenheimer
> Cc: Annie Li; Joshua West; James Harper; xen-devel; Wayne Gong; Kurt
> Hackel
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>
>
> Yeah, all the PV drivers are having to do is balloon down one
> page for every
> Xenheap page they map. There's no further complexity than
> that, so let's not
> make a mountain out of a molehill. The approach as discussed and now
> implemented should work fine with tmem I think.
>
> -- Keir
>
> On 16/09/2009 21:50, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
>
> > just in case someone missed the thread earlier,
> >
> > 3 = 1 shinfo + 2 gnt frames default.
> >
> > so, tot_pages + shinfo + num gnt frames.
> >
> >
> > Mukesh
> >
> >
> >
> > Dan Magenheimer wrote:
> >> Before we close down this thread, I have a concern:
> >>
> >> According to Mukesh, the fix to this bug is dependent
> >> on the pv drivers tracking tot_pages for a domain
> >> and ballooning to ensure tot_pages+3 does not exceed
> >> max_pages for the domain.
> >>
> >> Well, tmem can affect tot_pages for a domain inside
> >> the hypervisor without any notification to pv drivers
> >> or the balloon driver. And I'd imagine that PoD and
> >> future memory optimization mechanisms such as
> >> swapping and page-sharing may do the same.
> >>
> >> So this solution seems very fragile.
> >>
> >> Dan
> >>
> >>> -----Original Message-----
> >>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >>> Sent: Wednesday, September 16, 2009 6:28 AM
> >>> To: Annie Li
> >>> Cc: Joshua West; Dan Magenheimer; xen-devel; Kurt Hackel;
> >>> James Harper;
> >>> Wayne Gong
> >>> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
> >>>
> >>>
> >>> On 16/09/2009 12:10, "ANNIE LI" <annie.li@xxxxxxxxxx> wrote:
> >>>
> >>>>> I will do more tests to make sure of it and update here.
> >>>> I tried to map 256 grant frames during initialization and balloon
> >>>> down 256+1 (shinfo+gnttab) pages on first driver load. Then I did
> >>>> save/restore 50 times, and live migration 10 times. No error
> >>>> occurred.
> >>> Okay, well I still can't explain why that fixes it, but
> >>> clearly it does. So
> >>> that's good. :-)
> >>>
> >>> -- Keir
> >>>
> >>>
> >>>
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >> http://lists.xensource.com/xen-devel
>
>
>