This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Subject: [Xen-devel] Re: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux
From: Avi Kivity <avi@xxxxxxxxxx>
Date: Sun, 12 Jul 2009 20:27:49 +0300
Cc: npiggin@xxxxxxx, akpm@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, tmem-devel@xxxxxxxxxxxxxx, kurt.hackel@xxxxxxxxxx, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, jeremy@xxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-mm@xxxxxxxxx, sunil.mushran@xxxxxxxxxx, chris.mason@xxxxxxxxxx, Anthony Liguori <anthony@xxxxxxxxxxxxx>, Schwidefsky <schwidefsky@xxxxxxxxxx>, dave.mccracken@xxxxxxxxxx, Marcelo Tosatti <mtosatti@xxxxxxxxxx>, alan@xxxxxxxxxxxxxxxxxxx, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 12 Jul 2009 10:28:05 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <426e84ca-be31-40ac-a4c1-42cd9677d86c@default>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <426e84ca-be31-40ac-a4c1-42cd9677d86c@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Thunderbird/3.0b2
On 07/12/2009 07:28 PM, Dan Magenheimer wrote:
>> Having no struct pages is also a downside; for example this guest
>> cannot have more than 1GB of anonymous memory without swapping like
>> mad.  Swapping to tmem is fast but still a lot slower than having
>> the memory.

> Yes, true.  Tmem offers little additional advantage for workloads
> with a huge variation in working-set size that is primarily anonymous
> memory.  That larger-scale "memory shaping" is left to ballooning
> and hotplug.

And this is where the policy problems erupt.  When do you balloon in favor of tmem?  Which guest do you balloon?  Do you leave it to the administrator?  There's the host's administrator and the guests' administrators to consider.

CMM2 solves this neatly by providing information to the host. The host can pick the least recently used page (or a better algorithm) and evict it using information from the guest, either dropping it or swapping it. It also provides information back to the guest when it references an evicted page: either the guest needs to recreate the page or it just needs to wait.

>> tmem makes life a lot easier for the hypervisor and for the guest,
>> but also gives up a lot of flexibility.  There's a difference
>> between memory and a very fast synchronous backing store.

> I don't see that it gives up that flexibility.  System administrators
> are still free to size their guests properly.  Tmem's contribution
> is in environments that are highly dynamic, where the only
> alternative is to size memory maximally (and thus waste it for the
> vast majority of the time, when the working set is smaller).

I meant that once a page is converted to tmem, there's a limited set of things you can do with it compared with normal memory.  For example, tmem won't help with a dcache-intensive workload.

> I'm certainly open to identifying compromises and layer modifications
> that help meet the needs of both Xen and KVM (and others).  For
> example, if we can determine that the basic hook placement for
> precache/preswap (or even just precache for KVM) can be built on
> different underlying layers, that would be great!

I'm not sure preswap/precache by itself justifies tmem, since it can be emulated by backing the disk with a cached file.  What I'm missing in tmem is the ability for the hypervisor to take a global view of memory; instead it's forced to look at memory and tmem separately.  That's fine for Xen, since it can't really make any decisions about normal memory (lacking swap); on the other hand, kvm doesn't map well to tmem, since "free memory" is already used by the host pagecache.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
