[Xen-devel] Re: Q about System-wide Memory Management Strategies

To:	Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Subject:	[Xen-devel] Re: Q about System-wide Memory Management Strategies
From:	Joanna Rutkowska <joanna@xxxxxxxxxxxxxxxxxxxxxx>
Date:	Wed, 04 Aug 2010 00:33:17 +0200
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx, qubes-devel@xxxxxxxxxxxxxxxx

On 08/03/10 01:57, Dan Magenheimer wrote:
> Hi Joanna --
> 
> The slides you refer to are over two years old, and there's
> been a lot of progress in this area since then.  I suggest
> you google for "Transcendent Memory" and especially
> my presentation at the most recent Xen Summit North America
> and/or http://oss.oracle.com/projects/tmem 
> 

Thanks Dan. I've been aware of tmem, but I've been skeptical about it
for two reasons: it's complex, and seems rather unportable to other
OSes, specifically Windows, which is a concern for us, as we plan to
support Windows AppVMs in the future in Qubes.

(Hmm, is it really unportable? Perhaps one could create a
pseudo-filesystem driver that would behave like precache, and a
pseudo-disk driver that would behave like preswap?)

From reading the papers on tmem (the hogs were really cute :), I
understand now that the single most important advantage of using tmem
vs. just-ballooning is: no memory inertia for needy VMs, correct? I'm
tempted to think that this might not be such a big deal for the
Qubes-specific types of workload -- after all, if some apps start
slowing down, the user will temporarily stop "operating" them, and let
the system recover within a few seconds, when the balloon will return
some more memory. Or am I wrong here, and the recovery is not so easy in
practice?

> Specifically, I now have "selfballooning" built into
> the guest kernel...

In your latest presentation you mention selfballooning implemented in
the kernel, rather than via a userland daemon -- any significant benefit
of this? I've been thinking of trying selfballooning using the
2.6.34-xenlinux kernel with a usermode balloond...

How should the VMs be provisioned initially under selfballooning, i.e.
how to set memory and maxmem? I'm tempted to set maxmem to the amount of
all physical memory minus the memory reserved for Dom0 and other service
VMs (which would each get a fixed, small amount). The rationale behind
this is that we don't know what type of tasks the user will end up doing
in any given VM, and she might very well end up with something reaaally
memory-hungry (sure, we will not let any other VMs run at the same time
in that case, but we should still be able to handle this I think).
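
As a rough sketch of the provisioning arithmetic described above (the
function name and all the figures are made up for illustration; nothing
here is a Xen API):

```python
def appvm_maxmem_mb(total_mb, dom0_reserved_mb, service_vm_mbs):
    """Upper bound (maxmem) for any single AppVM, in MB.

    All physical memory, minus what is reserved for Dom0, minus the
    fixed allocations of the service VMs.
    """
    maxmem = total_mb - dom0_reserved_mb - sum(service_vm_mbs)
    if maxmem <= 0:
        raise ValueError("reservations exceed physical memory")
    return maxmem


# Hypothetical host: 4096 MB of RAM, 1024 MB reserved for Dom0,
# and two service VMs fixed at 200 MB each.
print(appvm_maxmem_mb(4096, 1024, [200, 200]))
```

With these assumed numbers every AppVM would get the same maxmem, and
only its current target would differ.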

> I don't see direct ballooning as feasible (certainly without other
> guest changes such as cleancache and frontswap).
> 

Why is that? Intuitively it sounds like the most straightforward
solution -- only Dom0 can see the system-wide picture of all the VM
needs (and priorities).

What happens if too many guests request too much memory, i.e.
within their maxmem limits, but such that the overall total exceeds the
total available in the system? I guess then whoever was first and lucky
would get the memory, but the last ones would get nothing, right? While
if we had centrally-managed allocation, we would be able to e.g. scale
down the target memory sizes equally, or tell the user that some VMs
must be closed for smooth operation of the others (or close them
automatically).
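
To illustrate what a centrally-managed policy could look like, here is
a minimal sketch that scales all requests down by the same factor when
they do not fit, and reports the VMs that would fall below a minimum.
It is only one reading of "scale down equally"; the names, the
per-VM minimum and the numbers are assumptions, and actually applying
the targets (e.g. via xm mem-set) is left out:

```python
def scale_targets(requests_mb, available_mb, minimum_mb):
    """Return (targets, starved) for a dict of VM name -> requested MB.

    If the requests fit, they are granted unchanged. Otherwise every
    request is multiplied by the same factor so the total fits, and
    'starved' lists the VMs whose target drops below minimum_mb
    (candidates for closing or for a warning to the user).
    """
    total = sum(requests_mb.values())
    if total <= available_mb:
        return dict(requests_mb), []
    factor = available_mb / total
    targets = {vm: int(req * factor) for vm, req in requests_mb.items()}
    starved = sorted(vm for vm, t in targets.items() if t < minimum_mb)
    return targets, starved


# Hypothetical example: 4000 MB requested, only 3000 MB available.
requests = {"work": 2000, "personal": 1000, "banking": 1000}
targets, starved = scale_targets(requests, 3000, minimum_mb=800)
print(targets)
print(starved)
```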

> Anyway, I have limited availability in the next couple of
> weeks but would love to talk (or email) more about
> this topic after that (but would welcome clarification
> questions in the meantime).
> 

No problem. Hopefully some of the above questions will fall into the
"clarification" category :) And maybe other people will answer the rest :)

Thanks,
joanna.

> Dan
> 
>> -----Original Message-----
>> From: Joanna Rutkowska [mailto:joanna@xxxxxxxxxxxxxxxxxxxxxx]
>> Sent: Monday, August 02, 2010 3:39 PM
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx; Dan Magenheimer
>> Cc: qubes-devel@xxxxxxxxxxxxxxxx
>> Subject: Q about System-wide Memory Management Strategies
>>
>> Dan, Xen.org'ers,
>>
>> I have a few questions regarding strategies for optimal memory
>> assignment among VMs (PV DomU and Dom0, all Linux-based).
>>
>> We've been thinking about implementing a "Direct Ballooning" strategy
>> (as described on slide #20 in Dan's slides [1]), i.e. to write a daemon
>> that would be running in Dom0 and, based on the statistics provided by
>> balloond daemons running in DomUs, to adjust memory assigned to all VMs
>> in the system (via xm mem-set).
>>
>> Rather than trying to maximize the number of VMs we could run at the
>> same time, in Qubes OS we are more interested in optimizing user
>> experience for running a "reasonable number" of VMs (i.e.
>> minimizing/eliminating swapping). In other words, given the number of
>> VMs that the user feels the need to run at the same time (in practice
>> usually between 3 and 6), and given the amount of RAM in the system
>> (4-6 GB in practice today), how to optimally distribute it among the
>> VMs? In our model we assume the disk backend(s) are in Dom0.
>>
>> Some specific questions:
>> 1) What is the best estimator of the "ideal" amount of RAM each VM
>> would like to have? Dan mentions [1] the Committed_AS value from
>> /proc/meminfo, but what about the fs cache? I would expect that we
>> should (ideally) allocate Committed_AS + some_cache amount of RAM, no?
>>
>> 2) What's the best estimator for the "minimal reasonable" amount of RAM
>> for a VM (below which the swapping would kill the performance for
>> good)? The rationale behind this is that if we couldn't allocate the
>> "ideal" amount of RAM (point 1 above), then we would be scaling the
>> available RAM down until this "reasonable minimum" value. Below this,
>> we would display a message to the user that they should close some VMs
>> (or we would close an "inactive" one automatically), and also we would
>> refuse to start any new AppVMs.
>>
>> 3) Assuming we have enough RAM to satisfy all the VMs' "ideal"
>> requests, what should we do with the excess RAM -- options are:
>> a) distribute it among all the VMs (more per-VM RAM means larger FS
>> caches, means faster I/O), or
>> b) assign it to Dom0, where the disk backend is running (a larger FS
>> cache means faster disk backends, means faster I/O in each VM?)
>>
>> Thanks,
>> joanna.
>>
>> [1]
>> http://www.xen.org/files/xensummitboston08/MemoryOvercommit-XenSummit2008.pdf
>>
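
For question 1 in the quoted message, the "Committed_AS + some_cache"
estimate could be computed roughly as follows. This is only a sketch:
taking a fixed fraction of the Cached field as "some_cache" is an
assumption made for illustration, and the helper names are made up.
The parser works on /proc/meminfo-style text so it can be tried on a
sample string:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field -> integer value.

    Values are in kB for the fields that carry a 'kB' suffix.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def ideal_kb(fields, cache_fraction=0.5):
    """Committed_AS plus an assumed fraction of the page cache, in kB."""
    return fields["Committed_AS"] + int(cache_fraction * fields["Cached"])


# Made-up sample; on a real guest the text would come from /proc/meminfo.
sample = """MemTotal:        1024000 kB
Cached:           200000 kB
Committed_AS:     300000 kB
"""
print(ideal_kb(parse_meminfo(sample)))
```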
