On Tue, Dec 23, 2008 at 7:06 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
> Very nice!
Thanks!
> One thing that might be worth adding to the requirements list or
> README is that this approach (or any which depends on ballooning)
> will now almost certainly require any participating hvm domain
> to have an adequately-sized properly-configured swap disk.
> Ballooning is insufficiently responsive to grow memory fast
> enough to handle the rapidly growing memory needs of an active domain.
> The consequence with no swap disk is application failures,
> and the consequence even if a swap disk IS configured is temporarily
> very poor performance.
I don't think this is particular to the PoD patches, or even
ballooning per se. A swap disk would be required any time you boot
with a small amount of memory, whether it could be increased or not.
But you're right, in that this differs from a typical operating
system's "demang-paging" mechanism, where the goal is to give a
process only the memory it actually needs, so you can use it for other
processes. You're still allocating a fixed amount of memory to a
guest at start-up. The un-populated memory is not available for use by
other VMs, and allocating more memory is a (relatively) slow process.
I guess a brief note pointing out the difference between "populate on
demand" and "allocate on demand" would be useful.
> So this won't work for any domain that does start-of-day
> scrubbing with a non-zero value? I suppose that's OK.
Not if the scrubber might win the race against the balloon driver. :-)
If this really becomes an issue, it should be straightforward to add
functionality to handle it. It just requires having a simple way of
specifying what "scrubbed" pages look like, an extra p2m type for "PoD
scrubbed" (rather than PoD zero, the default), and how to change from
scrubbed <-> zero.
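
A rough sketch of what I mean, with entirely made-up names (none of
this is in the current series; p2m_populate_on_demand is the only PoD
type the patches actually add):

/* Description of the guest scrubber's pattern, e.g. a repeating byte. */
struct pod_scrub_pattern {
    uint8_t byte;
};

/* Does a page match the scrub pattern?  If so, the emergency sweep
 * could reclaim it just like an all-zero page. */
static int page_matches_scrub_pattern(const uint8_t *va,
                                      const struct pod_scrub_pattern *p)
{
    unsigned int i;

    for ( i = 0; i < PAGE_SIZE; i++ )
        if ( va[i] != p->byte )
            return 0;
    return 1;
}

/* On demand-fault of a "PoD scrubbed" entry, fill the newly allocated
 * page with the pattern instead of zeroing it, so the guest still sees
 * what its scrubber wrote. */
static void pod_fill_scrubbed(uint8_t *va, const struct pod_scrub_pattern *p)
{
    memset(va, p->byte, PAGE_SIZE);
}
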
Did you have a particular system in mind?
-George
>> -----Original Message-----
>> From: George Dunlap [mailto:dunlapg@xxxxxxxxx]
>> Sent: Tuesday, December 23, 2008 5:55 AM
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: [Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory
>>
>>
>> This set of patches introduces a set of mechanisms and interfaces to
>> implement populate-on-demand memory. The purpose of
>> populate-on-demand memory is to allow non-paravirtualized guests (such
>> as Windows or Linux HVM) to boot in a ballooned state.
>>
>> BACKGROUND
>>
>> When a non-PV domain boots, it typically reads the e820 map to
>> determine how much memory it has, and then assumes that much memory
>> thereafter. Memory usage can be reduced using a balloon driver, but it
>> cannot be increased past this initial value. Currently, this means
>> that a non-PV domain must be booted with the maximum amount of memory
>> you want that VM ever to be able to use.
>>
>> Populate-on-demand allows us to "boot ballooned", in the
>> following manner:
>> * Mark the entire range of memory (memory_static_max aka maxmem) with
>> a new p2m type, populate_on_demand, reporting memory_static_max in the
>> e820 map. No memory is allocated at this stage.
>> * Allocate the "memory_dynamic_max" (aka "target") amount of memory
>> for a "PoD cache". This memory is kept on a separate list in the
>> domain struct.
>> * Boot the guest.
>> * Populate the p2m table on-demand as it's accessed with pages from
>> the PoD cache.
>> * When the balloon driver loads, it inflates the balloon size to
>> (maxmem - target), giving the memory back to Xen. When this is
>> accomplished, the "populate-on-demand" portion of boot is effectively
>> finished.
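>>
>> Very roughly, the demand-populate step looks like this (simplified
>> sketch; the helper names here are illustrative, not the actual
>> functions in the patches):
>>
>> static int pod_demand_populate_sketch(struct domain *d, unsigned long gfn)
>> {
>>     struct page_info *pg;
>>
>>     /* Take a pre-allocated page from the per-domain PoD cache... */
>>     pg = pod_cache_get_page(d);
>>     if ( pg == NULL )
>>     {
>>         /* ...and if the cache is empty, try reclaiming zeroed pages
>>          * (the "emergency sweep" described below), then retry. */
>>         pod_emergency_sweep(d);
>>         pg = pod_cache_get_page(d);
>>         if ( pg == NULL )
>>             return -ENOMEM;  /* nothing left to back the entry */
>>     }
>>
>>     /* Replace the populate_on_demand entry with a normal RAM mapping
>>      * backed by the page we just took from the cache. */
>>     pod_set_p2m_ram_entry(d, gfn, page_to_mfn(pg));
>>     return 0;
>> }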
>>
>> One complication is that many operating systems have start-of-day page
>> scrubbers, which touch all of memory to zero it. This scrubber may
>> run before the balloon driver can return memory to Xen. These zeroed
>> pages, however, don't contain any information; we can safely replace
>> them with PoD entries again. So when we run out of PoD cache, we do
>> an "emergency sweep" to look for zero pages we can reclaim for the
>> populate-on-demand cache. When we find a page range which is entirely
>> zero, we mark the gfn range PoD again, and put the memory back into
>> the PoD cache.
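>>
>> The test at the heart of the sweep is simply "is this page all
>> zeroes?"; something along these lines (illustrative only):
>>
>> /* A page that is entirely zero carries no information, so its gfn can
>>  * be pointed back at a populate_on_demand entry and the backing page
>>  * returned to the PoD cache. */
>> static int page_is_zeroed(const unsigned long *va)
>> {
>>     unsigned int i;
>>
>>     for ( i = 0; i < PAGE_SIZE / sizeof(*va); i++ )
>>         if ( va[i] != 0 )
>>             return 0;
>>     return 1;
>> }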
>>
>> NB that this code is designed to work only in conjunction with a
>> balloon driver. If the balloon driver is not loaded, eventually all
>> pages will be dirtied (non-zero), the emergency sweep will fail, and
>> there will be no memory to back outstanding PoD pages. When this
>> happens, the domain will crash.
>>
>> The code works for both shadow mode and HAP mode; it has been tested
>> with NPT/RVI and shadow, but not yet with EPT. It also attempts to
>> avoid splintering superpages, to allow HAP to function more
>> effectively.
>>
>> To use:
>> * ensure that you have a functioning balloon driver in the guest
>> (e.g., xen_balloon.ko for Linux HVM guests).
>> * Set maxmem/memory_static_max to one value, and
>> memory/memory_dynamic_max to another when creating the domain; e.g:
>> # xm create debian-hvm maxmem=512 memory=256
>>
>> The patches are as follows:
>> 01 - Add a p2m_query_type to core gfn_to_mfn*() functions.
>>
>> 02 - Change some gfn_to_mfn() calls to gfn_to_mfn_query(), which will
>> not populate PoD entries. Specifically, since gfn_to_mfn() may grab
>> the p2m lock, it must not be called while the shadow lock is held.
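>>
>> That is, code running under the shadow lock should use the query
>> variant, which never populates a PoD entry; for example (fragment,
>> sketch only):
>>
>>     /* With the shadow lock held: must not populate a PoD entry. */
>>     mfn = gfn_to_mfn_query(d, gfn, &p2mt);
>>
>>     /* Without the shadow lock, where populating is safe. */
>>     mfn = gfn_to_mfn(d, gfn, &p2mt);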
>>
>> 03 - Populate-on-demand core. Introduce new p2m type, PoD cache
>> structures, and core functionality. Add PoD checking to audit_p2m().
>> Add PoD information to the 'q' debug key.
>>
>> 04 - Implement p2m_decrease_reservation. As the balloon driver
>> returns gfns to Xen, it handles PoD entries properly; it also "steals"
>> memory being returned for the PoD cache instead of freeing it, if
>> necessary.
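>>
>> Conceptually the "stealing" works like this (illustrative names only,
>> not the code in the patch):
>>
>> static void pod_decrease_reservation_one(struct domain *d, unsigned long gfn)
>> {
>>     if ( gfn_is_pod_entry(d, gfn) )
>>     {
>>         /* Un-populated entry returned by the balloon driver: just
>>          * clear it; there is no backing page to free. */
>>         pod_clear_entry(d, gfn);
>>     }
>>     else if ( pod_cache_pages(d) < pod_entry_count(d) )
>>     {
>>         /* Cache is short of what's needed to back the remaining PoD
>>          * entries: keep ("steal") this page for the cache. */
>>         pod_cache_add_page(d, pod_gfn_to_page(d, gfn));
>>         pod_clear_entry(d, gfn);
>>     }
>>     else
>>     {
>>         /* Cache already covers all outstanding entries: free normally. */
>>         pod_free_gfn(d, gfn);
>>     }
>> }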
>>
>> 05 - emergency sweep: Implement emergency sweep for zero memory if the
>> cache is low. If it finds pages (or page ranges) entirely zero, it
>> will replace the entry with a PoD entry again, reclaiming the memory
>> for the PoD cache.
>>
>> 06 - Deal with splintering both PoD pages (to back singleton PoD
>> entries) and PoD ranges.
>>
>> 07 - Xen interface for populate-on-demand functionality: PoD flag for
>> populate_physmap, {get,set}_pod_target for interacting with the PoD
>> cache. set_pod_target() should be called for any domain that may have
>> PoD entries. It will increase the size of the cache if necessary, but
>> will never decrease the size of the cache. (This will be done as the
>> balloon driver balloons down.)
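>>
>> In other words, the target handling only ever grows the cache;
>> roughly (illustrative names):
>>
>> static void pod_set_cache_target_sketch(struct domain *d,
>>                                         unsigned long target_pages)
>> {
>>     unsigned long cur = pod_cache_pages(d);
>>
>>     if ( target_pages > cur )
>>         pod_cache_grow(d, target_pages - cur);
>>     /* else do nothing: shrinking only happens as the balloon driver
>>      * returns gfn space and decrease_reservation frees or steals pages. */
>> }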
>>
>> 08 - libxc interface. Add new libxc functions:
>> + xc_hvm_build_target_mem(), which accepts memsize and target. If
>> these are equal, PoD functionality is not invoked. Otherwise, memsize
>> is marked PoD, and the target MiB is allocated to the PoD cache.
>> + xc_[sg]et_pod_target(): get / set PoD target. set_pod_target()
>> should be called whenever you change the guest target mem on a domain
>> which may have outstanding PoD entries. This may increase the size of
>> the PoD cache up to the number of outstanding PoD entries, but will
>> not reduce the size of the cache. (The cache may be reduced as the
>> balloon driver returns gfn space to Xen.)
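>>
>> For example, a toolstack using libxc directly might do something like
>> the following (the function names are the ones added here; the exact
>> signatures shown are only an approximation):
>>
>> #include <stdint.h>
>> #include <xenctrl.h>
>>
>> int build_pod_guest(int xc_handle, uint32_t domid,
>>                     int maxmem_mb, int target_mb, const char *image)
>> {
>>     int rc;
>>
>>     /* memsize = maxmem, target = initial allocation; if they are
>>      * equal, PoD is not used at all. */
>>     rc = xc_hvm_build_target_mem(xc_handle, domid,
>>                                  maxmem_mb, target_mb, image);
>>     if ( rc != 0 )
>>         return rc;
>>
>>     /* Whenever the guest's memory target changes later, update the
>>      * PoD target as well (it may grow the cache, never shrink it). */
>>     rc = xc_set_pod_target(xc_handle, domid,
>>                            (uint64_t)target_mb << 8 /* MiB -> pages */);
>>     return rc;
>> }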
>>
>> 09 - xend integration.
>> + Always calls xc_hvm_build_target_mem() with memsize=maxmem and
>> target=memory. If these are the same, the internal function will not use
>> PoD.
>> + Calls xc_set_target_mem() whenever a domain's target is changed.
>> Also calls balloon.free(), causing dom0 to balloon down itself if
>> there is not otherwise enough free memory.
>>
>> Things still to do:
>> * When reduce_reservation() is called with a superpage, keep the
>> superpage intact.
>> * Create a hypercall continuation for set_pod_target.
>>
>>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel