xen-devel

Re: [Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory

To: "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory
From: "George Dunlap" <dunlapg@xxxxxxxxx>
Date: Wed, 24 Dec 2008 13:55:20 +0000
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 24 Dec 2008 05:55:48 -0800
In-reply-to: <42c6b1aa-198a-412b-ae07-f25a2649914c@default>
References: <de76405a0812230455m51a8bd62ncf1b38dbccb3d442@xxxxxxxxxxxxxx> <42c6b1aa-198a-412b-ae07-f25a2649914c@default>
On Tue, Dec 23, 2008 at 7:06 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
> Very nice!

Thanks!

> One thing that might be worth adding to the requirements list or
> README is that this approach (or any which depends on ballooning)
> will now almost certainly require any participating hvm domain
> to have an adequately-sized, properly-configured swap disk.
> Ballooning is insufficiently responsive to grow memory fast
> enough to handle the rapidly growing memory needs of an active domain.
> The consequence for a domain with no swap disk is application failures,
> and the consequence even if a swap disk IS configured is temporarily
> very poor performance.

I don't think this is particular to the PoD patches, or even
ballooning per se.  A swap disk would be required any time you boot
with a small amount of memory, whether it could be increased or not.

But you're right, in that this differs from a typical operating
system's "demang-paging" mechanism, where the goal is to give a
process only the memory it actually needs, so you can use it for other
processes.  You're still allocating a fixed amount of memory to a
guest at start-up.  The un-populated memory is not available for use by
other VMs, and allocating more memory is a (relatively) slow process.
I guess a brief note pointing out the difference between "populate on
demand" and "allocate on demand" would be useful.

> So this won't work for any domain that does start-of-day
> scrubbing with a non-zero value?  I suppose that's OK.

Not if the scrubber might win the race against the balloon driver. :-)
If this really becomes an issue, it should be straightforward to add
functionality to handle it.  It would just require a simple way of
specifying what "scrubbed" pages look like, an extra p2m type for "PoD
scrubbed" (rather than PoD zero, the default), and a way to convert
between scrubbed and zero entries.
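
To make that concrete: the emergency sweep's check would go from "is
this page all zeroes?" to "does this page match the domain's scrub
pattern?".  A minimal sketch of such a check (the names are mine;
nothing below is in the posted patches):

  /* Hypothetical helper: true if every word in the page equals the
   * guest's known scrub pattern, i.e. the page carries no information
   * and could be replaced by a "PoD scrubbed" entry. */
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  #define PAGE_SIZE 4096

  static bool page_matches_scrub_pattern(const void *page, uint64_t pattern)
  {
      const uint64_t *p = page;
      size_t i;

      for ( i = 0; i < PAGE_SIZE / sizeof(uint64_t); i++ )
          if ( p[i] != pattern )
              return false;
      return true;
  }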

Did you have a particular system in mind?

-George

>> -----Original Message-----
>> From: George Dunlap [mailto:dunlapg@xxxxxxxxx]
>> Sent: Tuesday, December 23, 2008 5:55 AM
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: [Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory
>>
>>
>> This set of patches introduces a set of mechanisms and interfaces to
>> implement populate-on-demand memory.  The purpose of
>> populate-on-demand memory is to allow non-paravirtualized guests (such
>> as Windows or Linux HVM) to boot in a ballooned state.
>>
>> BACKGROUND
>>
>> When a non-PV domain boots, it typically reads the e820 map to
>> determine how much memory it has, and then assumes that much memory
>> thereafter.  Memory requirements can be reduced using a balloon
>> driver, but memory cannot be increased past this initial value.
>> Currently, this means that a non-PV domain must be booted with the
>> maximum amount of memory you ever want that VM to be able to use.
>>
>> Populate-on-demand allows us to "boot ballooned", in the
>> following manner:
>> * Mark the entire range of memory (memory_static_max aka maxmem) with
>> a new p2m type, populate_on_demand, reporting memory_static_max in the
>> e820 map.  No memory is allocated at this stage.
>> * Allocate the "memory_dynamic_max" (aka "target") amount of memory
>> for a "PoD cache".  This memory is kept on a separate list in the
>> domain struct.
>> * Boot the guest.
>> * Populate the p2m table on-demand as it's accessed with pages from
>> the PoD cache.
>> * When the balloon driver loads, it inflates the balloon size to
>> (maxmem - target), giving the memory back to Xen.  When this is
>> accomplished, the "populate-on-demand" portion of boot is effectively
>> finished.
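>>
>> To put those steps together, roughly, from the builder/toolstack
>> point of view (illustrative pseudo-code; the helper names below are
>> not the actual interfaces in this series):
>>
>>   /* "Boot ballooned": maxmem_pages of guest address space, but only
>>    * target_pages of real RAM set aside up front. */
>>   void build_pod_guest(unsigned long maxmem_pages,
>>                        unsigned long target_pages)
>>   {
>>       /* 1. Mark the whole range PoD; e820 reports maxmem, but
>>        *    nothing is allocated yet. */
>>       mark_gfn_range_populate_on_demand(0, maxmem_pages);
>>
>>       /* 2. Pre-allocate target_pages into the per-domain PoD cache. */
>>       fill_pod_cache(target_pages);
>>
>>       /* 3. Boot.  Faults on PoD entries are backed from the cache;
>>        *    once the balloon driver has returned (maxmem - target)
>>        *    pages, the PoD phase of boot is over. */
>>   }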
>>
>> One complication is that many operating systems have start-of-day page
>> scrubbers, which touch all of memory to zero it.  This scrubber may
>> run before the balloon driver can return memory to Xen.  These zeroed
>> pages, however, don't contain any information; we can safely replace
>> them with PoD entries again.  So when we run out of PoD cache, we do
>> an "emergency sweep" to look for zero pages we can reclaim for the
>> populate-on-demand cache.  When we find a page range which is entirely
>> zero, we mark the gfn range PoD again, and put the memory back into
>> the PoD cache.
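>>
>> In outline, the sweep does something like this (again illustrative
>> pseudo-code, with made-up helper names):
>>
>>   /* Walk the populated gfns; any page that is entirely zero carries
>>    * no information, so turn its entry back into PoD and move the
>>    * backing page into the PoD cache. */
>>   for ( gfn = 0; gfn < max_gfn; gfn++ )
>>       if ( gfn_is_populated(gfn) && page_is_all_zeroes(gfn) )
>>           reclaim_gfn_into_pod_cache(gfn);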
>>
>> NB that this code is designed to work only in conjunction with a
>> balloon driver.  If the balloon driver is not loaded, eventually all
>> pages will be dirtied (non-zero), the emergency sweep will fail, and
>> there will be no memory to back outstanding PoD pages.  When this
>> happens, the domain will crash.
>>
>> The code works for both shadow mode and HAP mode; it has been tested
>> with NPT/RVI and shadow, but not yet with EPT.  It also attempts to
>> avoid splintering superpages, to allow HAP to function more
>> effectively.
>>
>> To use:
>> * ensure that you have a functioning balloon driver in the guest
>> (e.g., xen_balloon.ko for Linux HVM guests).
>> * Set maxmem/memory_static_max to one value, and
>> memory/memory_dynamic_max to another when creating the domain; e.g.:
>>  # xm create debian-hvm maxmem=512 memory=256
>>
>> The patches are as follows:
>> 01 - Add a p2m_query_type to core gfn_to_mfn*() functions.
>>
>> 02 - Change some gfn_to_mfn() calls to gfn_to_mfn_query(), which will
>> not populate PoD entries.  Specifically, since gfn_to_mfn() may grab
>> the p2m lock, it must not be called while the shadow lock is held.
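>>
>> For illustration, the distinction is along these lines (the exact
>> enum names and values here are mine, given just as an example):
>>
>>   typedef enum {
>>       p2m_query,   /* look up only: never populate a PoD entry   */
>>       p2m_alloc,   /* normal lookup: may demand-populate PoD     */
>>   } p2m_query_type;
>>
>>   /* gfn_to_mfn() keeps the old (populating) behaviour, while
>>    * gfn_to_mfn_query() is safe to call with the shadow lock held. */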
>>
>> 03 - Populate-on-demand core.  Introduce new p2m type, PoD cache
>> structures, and core functionality.  Add PoD checking to audit_p2m().
>> Add PoD information to the 'q' debug key.
>>
>> 04 - Implement p2m_decrease_reservation.  As the balloon driver
>> returns gfns to Xen, it handles PoD entries properly; it also "steals"
>> memory being returned for the PoD cache instead of freeing it, if
>> necessary.
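>>
>> Roughly, the decision per returned gfn is (sketch only, with
>> hypothetical names):
>>
>>   if ( entry_is_pod(gfn) )
>>       clear_pod_entry(gfn);           /* nothing was ever allocated */
>>   else if ( pod_cache_pages < outstanding_pod_entries )
>>       steal_page_into_pod_cache(gfn); /* keep it to back PoD later  */
>>   else
>>       free_page_to_heap(gfn);         /* normal ballooning path     */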
>>
>> 05 - emergency sweep: Implement emergency sweep for zero memory if the
>> cache is low.  If it finds pages (or page ranges) entirely zero, it
>> will replace the entry with a PoD entry again, reclaiming the memory
>> for the PoD cache.
>>
>> 06 - Deal with splintering both PoD pages (to back singleton PoD
>> entries) and PoD ranges.
>>
>> 07 - Xen interface for populate-on-demand functionality: PoD flag for
>> populate_physmap, {get,set}_pod_target for interacting with the PoD
>> cache.  set_pod_target() should be called for any domain that may have
>> PoD entries.  It will increase the size of the cache if necessary, but
>> will never decrease the size of the cache.  (This will be done as the
>> balloon driver balloons down.)
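>>
>> In other words, the sizing rule amounts to roughly (my notation, not
>> the patch's):
>>
>>   new_cache_size = max(current_cache_size,
>>                        min(target, outstanding_pod_entries));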
>>
>> 08 - libxc interface.  Add new libxc functions:
>> + xc_hvm_build_target_mem(), which accepts memsize and target.  If
>> these are equal, PoD functionality is not invoked.  Otherwise, memsize
>> is marked PoD, and the target MiB is allocated to the PoD cache.
>> + xc_[sg]et_pod_target(): get / set PoD target.  set_pod_target()
>> should be called whenever you change the guest target mem on a domain
>> which may have outstanding PoD entries.  This may increase the size of
>> the PoD cache up to the number of outstanding PoD entries, but will
>> not reduce the size of the cache.  (The cache may be reduced as the
>> balloon driver returns gfn space to Xen.)
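>>
>> A usage sketch (the function names come from above, but the argument
>> lists here are my assumption, not necessarily the real prototypes):
>>
>>   /* Mark 512MiB of guest space PoD, allocate 256MiB to the cache: */
>>   xc_hvm_build_target_mem(xc_handle, domid,
>>                           512 /* memsize, MiB */,
>>                           256 /* target, MiB */, image_name);
>>
>>   /* Later, whenever the guest's memory target changes, keep the PoD
>>    * cache in step (units as the interface defines them): */
>>   xc_set_pod_target(xc_handle, domid, new_target);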
>>
>> 09 - xend integration.
>> + Always calls xc_hvm_build_target_mem() with memsize=maxmem and
>> target=memory.  If these are the same, the internal function will not use
>> PoD.
>> + Calls xc_set_target_mem() whenever a domain's target is changed.
>> Also calls balloon.free(), causing dom0 itself to balloon down if
>> there is not otherwise enough memory.
>>
>> Things still to do:
>> * When reduce_reservation() is called with a superpage, keep the
>> superpage intact.
>> * Create a hypercall continuation for set_pod_target.
>>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel