This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory

To: ngupta@xxxxxxxxxx
Subject: [Xen-devel] RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 21 Dec 2009 15:46:28 -0800 (PST)
Cc: Nick Piggin <npiggin@xxxxxxx>, sunil.mushran@xxxxxxxxxx, jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, tmem-devel@xxxxxxxxxxxxxx, linux-mm <linux-mm@xxxxxxxxx>, linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>, Pavel Machek <pavel@xxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, dave.mccracken@xxxxxxxxxx, Marcelo Tosatti <mtosatti@xxxxxxxxxx>, chris.mason@xxxxxxxxxx, Avi Kivity <avi@xxxxxxxxxx>, Rusty@xxxxxxxxxxxxxxxxxxxx, Schwidefsky <schwidefsky@xxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 22 Dec 2009 01:32:36 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4B2F7C41.9020106@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> From: Nitin Gupta [mailto:ngupta@xxxxxxxxxx]

> Hi Dan,

Hi Nitin --

Thanks for your review!

> (I'm not sure if gmane.org interface sends mail to everyone in CC
> list, so sending again. Sorry if you are getting duplicate mail).

FWIW, I only got this one copy (at least so far)!

> I really like the idea of allocating cache memory from hypervisor
> directly. This is much more flexible than assigning fixed size
> memory to guests.


> I think 'frontswap' part seriously overlaps the functionality 
> provided by 'ramzswap'

Could be, but I suspect there's a subtle difference.
A key part of the tmem frontswap API is that any
"put" at any time can be rejected.  There's no way
for the kernel to know a priori whether a put will
be rejected, so the kernel must be able to react by
writing the page to a "true" swap device, and it must
keep track of which pages were put to tmem frontswap
and which were written to disk.  As a result, tmem
frontswap cannot be configured or used as a true
swap "device".
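
To make that concrete, the swap-out hook would look
roughly like the sketch below.  (This is only a sketch;
the function names are made up for illustration and are
not the actual patch code.)

/* Sketch only -- illustrative names, not the real frontswap patch. */

struct page;                            /* opaque kernel page descriptor */

/* Hypothetical hooks; a tmem put may be rejected at any time. */
extern int tmem_frontswap_put_page(unsigned long offset, struct page *page);
extern int swap_writepage_to_disk(struct page *page, unsigned long offset);
extern void frontswap_set_bit(unsigned long offset);

int swap_out_page(struct page *page, unsigned long offset)
{
        /* Offer the page to tmem first; the hypervisor may say no. */
        if (tmem_frontswap_put_page(offset, page) == 0) {
                /* Accepted: remember this page lives in tmem, not on disk. */
                frontswap_set_bit(offset);
                return 0;
        }

        /* Rejected: fall back to writing the page to the real swap device. */
        return swap_writepage_to_disk(page, offset);
}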

This is critical to achieving the flexibility that
you said above you like.  Only the hypervisor knows
whether a free page is available "now", because it is
flexibly managing tmem requests from multiple guest
kernels.

If my understanding of ramzswap is incorrect or you
have some clever solution that I misunderstood,
please let me know.

> > Cleancache is "ephemeral" so whether a page is kept in cleancache
> > (between the "put" and the "get") is dependent on a number of
> > factors that are invisible to the kernel.
> Just an idea: as an alternate approach, we can create an 'in-memory
> compressed storage' backend for FS-Cache. This way, all filesystems
> modified to use fs-cache can benefit from this backend. To make it
> virtualization friendly like tmem, we can again provide (per-cache?)
> option to allocate from hypervisor i.e. tmem_{put,get}_page() or use
> [compress]+alloc natively.

I looked at FS-Cache and cachefiles and thought I understood
that it is not restricted to clean pages only, thus
not a good match for tmem cleancache.

Again, if I'm wrong (or if it is easy to tell FS-Cache that
pages may "disappear" underneath it), let me know.
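
For comparison, from the filesystem's point of view the
cleancache side would look roughly like the sketch below
(again with made-up names, not the actual patch code): a
"get" may fail at any time, and the fallback is simply
the normal read from disk.

/* Sketch only -- illustrative names, not the cleancache patch itself. */

struct page;
struct inode;

/* Hypothetical hooks; the hypervisor may have dropped the page already. */
extern int tmem_cleancache_get_page(struct inode *inode, unsigned long index,
                                    struct page *page);
extern int read_page_from_disk(struct inode *inode, unsigned long index,
                               struct page *page);

int readpage_with_cleancache(struct inode *inode, unsigned long index,
                             struct page *page)
{
        /*
         * A miss here is perfectly normal: the page was clean, so if it
         * has "disappeared" from cleancache we just reread it from disk.
         */
        if (tmem_cleancache_get_page(inode, index, page) == 0)
                return 0;               /* filled from cleancache */

        return read_page_from_disk(inode, index, page);
}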

BTW, pages put to tmem (both frontswap and cleancache) can
be optionally compressed.

> For guest<-->hypervisor interface, maybe we can use virtio so that all
> hypervisors can benefit? Not quite sure about this one.

I'm not very familiar with virtio, but the existence of "I/O"
in the name concerns me because tmem is entirely synchronous.

Also, tmem is well-layered so very little work needs to be
done on the Linux side for other hypervisors to benefit.
Of course these other hypervisors would need to implement
the hypervisor-side of tmem as well, but there is a well-defined
API to guide other hypervisor-side implementations... and the
opensource tmem code in Xen has a clear split between the
hypervisor-dependent and hypervisor-independent code, which
should simplify implementation for other opensource hypervisors.
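
As a very rough sketch of the shape of that guest-side
API (the names and signatures below are illustrative,
not the published spec), another hypervisor essentially
just has to supply its own backend for a small set of
synchronous operations:

/* Sketch only -- illustrative, not the actual tmem ABI. */

typedef unsigned long pfn_t;            /* guest physical frame number */

struct tmem_ops {
        /* Every call is synchronous and every call may fail. */
        int (*new_pool)(unsigned int flags);           /* returns a pool id */
        int (*put_page)(int pool, unsigned long object,
                        unsigned int index, pfn_t pfn);
        int (*get_page)(int pool, unsigned long object,
                        unsigned int index, pfn_t pfn);
        int (*flush_page)(int pool, unsigned long object, unsigned int index);
        int (*flush_object)(int pool, unsigned long object);
        int (*destroy_pool)(int pool);
};

The Linux-side code would call through one ops table like
this, so porting to another hypervisor would mostly be a
matter of filling it in with that hypervisor's hypercalls.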

I realize in "Take 3" I didn't provide the URL for more information:

Xen-devel mailing list
