Very nice!
One thing that might be worth adding to the requirements list or
README is that this approach (or any which depends on ballooning)
will now almost certainly require any participating hvm domain
to have an adequately-sized properly-configured swap disk.
Ballooning is not responsive enough to grow memory as fast as
the rapidly growing memory needs of an active domain require.
Without a swap disk, the consequence is application failures;
even if a swap disk IS configured, the consequence is temporarily
very poor performance.
I'm working on fixing that (at least on pv domains). Watch
this list after the new year.
So this won't work for any domain that does start-of-day
scrubbing with a non-zero value? I suppose that's OK.
Happy holidays to all!
Dan
> -----Original Message-----
> From: George Dunlap [mailto:dunlapg@xxxxxxxxx]
> Sent: Tuesday, December 23, 2008 5:55 AM
> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory
>
>
> This set of patches introduces a set of mechanisms and interfaces to
> implement populate-on-demand memory. The purpose of
> populate-on-demand memory is to allow non-paravirtualized guests (such
> as Windows or Linux HVM) to boot in a ballooned state.
>
> BACKGROUND
>
> When non-PV domains boot, they typically read the e820 maps to
> determine how much memory they have, and then assume that much memory
> thereafter. Memory use can be reduced using a balloon
> driver, but it cannot be increased past this initial value.
> Currently, this means that a non-PV domain must be booted with the
> maximum amount of memory you want that VM ever to be able to use.
>
> Populate-on-demand allows us to "boot ballooned", in the
> following manner:
> * Mark the entire range of memory (memory_static_max aka maxmem) with
> a new p2m type, populate_on_demand, reporting memory_static_max in the
> e820 map. No memory is allocated at this stage.
> * Allocate the "memory_dynamic_max" (aka "target") amount of memory
> for a "PoD cache". This memory is kept on a separate list in the
> domain struct.
> * Boot the guest.
> * Populate the p2m table on-demand as it's accessed with pages from
> the PoD cache.
> * When the balloon driver loads, it inflates the balloon size to
> (maxmem - target), giving the memory back to Xen. When this is
> accomplished, the "populate-on-demand" portion of boot is effectively
> finished.
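
The steps above can be sketched as a toy model. This is only an
illustration in Python, not the Xen implementation: the class
PodDomain and its method names are hypothetical, memory is counted
in whole pages, and superpages, locking and the emergency sweep are
left out.

```python
class PodDomain:
    """Toy model of a guest booted in populate-on-demand (PoD) mode."""

    def __init__(self, maxmem_pages, target_pages):
        # Mark the whole maxmem range PoD; no memory is allocated for it.
        self.pod_gfns = set(range(maxmem_pages))
        self.populated = set()      # gfns backed by a real page
        # Only `target_pages` of real memory go into the PoD cache.
        self.cache = target_pages
        self.freed = 0              # pages handed back to Xen

    def access(self, gfn):
        """Populate a PoD gfn from the cache the first time it is touched."""
        if gfn in self.populated:
            return
        if gfn not in self.pod_gfns:
            raise KeyError("gfn %d has been ballooned out" % gfn)
        if self.cache == 0:
            # The real patches try an emergency sweep here; this sketch does not.
            raise MemoryError("PoD cache exhausted")
        self.cache -= 1
        self.pod_gfns.remove(gfn)
        self.populated.add(gfn)

    def balloon_out(self, gfn):
        """The balloon driver returns one gfn to Xen."""
        if gfn in self.pod_gfns:
            # Never populated: drop the PoD entry, nothing to free.
            self.pod_gfns.remove(gfn)
        elif gfn in self.populated:
            self.populated.remove(gfn)
            if self.cache < len(self.pod_gfns):
                self.cache += 1     # keep the page for the PoD cache
            else:
                self.freed += 1     # cache is sufficient: really free it

    def pod_phase_finished(self):
        """True once every outstanding PoD entry can be backed by the cache."""
        return len(self.pod_gfns) <= self.cache
```

For example, with PodDomain(maxmem_pages=8, target_pages=4), touching
two gfns and then ballooning out four untouched gfns leaves two PoD
entries and two cached pages, at which point pod_phase_finished()
becomes true.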
>
> One complication is that many operating systems have start-of-day page
> scrubbers, which touch all of memory to zero it. This scrubber may
> run before the balloon driver can return memory to Xen. These zeroed
> pages, however, don't contain any information; we can safely replace
> them with PoD entries again. So when we run out of PoD cache, we do
> an "emergency sweep" to look for zero pages we can reclaim for the
> populate-on-demand cache. When we find a page range which is entirely
> zero, we mark the gfn range PoD again, and put the memory back into
> the PoD cache.
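
A minimal sketch of the sweep idea, again in Python and again only
an illustration: emergency_sweep and its arguments are hypothetical
names, guest pages are modelled as a dict of bytes objects, and the
sketch works on single pages rather than page ranges.

```python
PAGE_SIZE = 4096


def emergency_sweep(pages, pod_entries, cache):
    """Reclaim all-zero pages for the PoD cache.

    pages:       dict mapping gfn -> bytes of length PAGE_SIZE (populated pages)
    pod_entries: set of gfns currently marked PoD (modified in place)
    cache:       current number of pages in the PoD cache
    Returns the new cache size.
    """
    for gfn in sorted(pages):
        if not any(pages[gfn]):     # every byte of the page is zero
            del pages[gfn]          # the gfn is no longer backed by a page
            pod_entries.add(gfn)    # ...it becomes a PoD entry again
            cache += 1              # ...and its page returns to the cache
    return cache
```

In this sketch a page scrubbed with a non-zero pattern (0xAA, say)
is never reclaimed, because only pages whose bytes are all zero
qualify.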
>
> NB that this code is designed to work only in conjunction with a
> balloon driver. If the balloon driver is not loaded, eventually all
> pages will be dirtied (non-zero), the emergency sweep will fail, and
> there will be no memory to back outstanding PoD pages. When this
> happens, the domain will crash.
>
> The code works for both shadow mode and HAP mode; it has been tested
> with NPT/RVI and shadow, but not yet with EPT. It also attempts to
> avoid splintering superpages, to allow HAP to function more
> effectively.
>
> To use:
> * ensure that you have a functioning balloon driver in the guest
> (e.g., xen_balloon.ko for Linux HVM guests).
> * Set maxmem/memory_static_max to one value, and
> memory/memory_dynamic_max to another when creating the domain; e.g.:
> # xm create debian-hvm maxmem=512 memory=256
>
> The patches are as follows:
> 01 - Add a p2m_query_type to core gfn_to_mfn*() functions.
>
> 02 - Change some gfn_to_mfn() calls to gfn_to_mfn_query(), which will
> not populate PoD entries. Specifically, since gfn_to_mfn() may grab
> the p2m lock, it must not be called while the shadow lock is held.
>
> 03 - Populate-on-demand core. Introduce new p2m type, PoD cache
> structures, and core functionality. Add PoD checking to audit_p2m().
> Add PoD information to the 'q' debug key.
>
> 04 - Implement p2m_decrease_reservation. As the balloon driver
> returns gfns to Xen, it handles PoD entries properly; it also "steals"
> memory being returned for the PoD cache instead of freeing it, if
> necessary.
>
> 05 - emergency sweep: Implement emergency sweep for zero memory if the
> cache is low. If it finds pages (or page ranges) entirely zero, it
> will replace the entry with a PoD entry again, reclaiming the memory
> for the PoD cache.
>
> 06 - Deal with splintering both PoD pages (to back singleton PoD
> entries) and PoD ranges.
>
> 07 - Xen interface for populate-on-demand functionality: PoD flag for
> populate_physmap, {get,set}_pod_target for interacting with the PoD
> cache. set_pod_target() should be called for any domain that may have
> PoD entries. It will increase the size of the cache if necessary, but
> will never decrease the size of the cache. (This will be done as the
> balloon driver balloons down.)
>
> 08 - libxc interface. Add new libxc functions:
> + xc_hvm_build_target_mem(), which accepts memsize and target. If
> these are equal, PoD functionality is not invoked. Otherwise, memsize
> is marked PoD, and the target MiB is allocated to the PoD cache.
> + xc_[sg]et_pod_target(): get / set PoD target. set_pod_target()
> should be called whenever you change the guest target mem on a domain
> which may have outstanding PoD entries. This may increase the size of
> the PoD cache up to the number of outstanding PoD entries, but will
> not reduce the size of the cache. (The cache may be reduced as the
> balloon driver returns gfn space to Xen.)
>
> 09 - xend integration.
> + Always calls xc_hvm_build_target_mem() with memsize=maxmem and
> target=memory. If these are the same, the internal function will not use
> PoD.
> + Calls xc_set_target_mem() whenever a domain's target is changed.
> Also calls balloon.free(), causing dom0 to balloon down itself if
> there's not enough memory otherwise.
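
The set_pod_target() rule described for patches 07 and 08 (grow the
cache if needed, at most up to the number of outstanding PoD
entries, and never shrink it) might be expressed roughly as below.
This is a sketch in Python; the function name and the assumption
that the request is given directly as a cache size in pages are
illustrative, not taken from the patches.

```python
def pod_cache_after_set_target(cache_pages, requested_pages, outstanding_pod_entries):
    """Return the PoD cache size after a (hypothetical) set-target request."""
    # Never grow the cache beyond what outstanding PoD entries could use.
    wanted = min(requested_pages, outstanding_pod_entries)
    # Never shrink the cache here; shrinking is left to the balloon driver.
    return max(cache_pages, wanted)
```

With a cache of 100 pages and 150 outstanding PoD entries, a request
for 200 pages is capped at 150, while a request for 50 pages leaves
the cache at 100.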
>
> Things still to do:
> * When reduce_reservation() is called with a superpage, keep the
> superpage intact.
> * Create a hypercall continuation for set_pod_target.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>