[Xen-devel] RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory
To: ngupta@xxxxxxxxxx
Subject: [Xen-devel] RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 21 Dec 2009 15:46:28 -0800 (PST)
Cc: Nick Piggin <npiggin@xxxxxxx>, sunil.mushran@xxxxxxxxxx, jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, tmem-devel@xxxxxxxxxxxxxx, linux-mm <linux-mm@xxxxxxxxx>, linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>, Pavel Machek <pavel@xxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, dave.mccracken@xxxxxxxxxx, Marcelo Tosatti <mtosatti@xxxxxxxxxx>, chris.mason@xxxxxxxxxx, Avi Kivity <avi@xxxxxxxxxx>, Rusty@xxxxxxxxxxxxxxxxxxxx, Schwidefsky <schwidefsky@xxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
In-reply-to: <4B2F7C41.9020106@xxxxxxxxxx>
> From: Nitin Gupta [mailto:ngupta@xxxxxxxxxx]
> Hi Dan,
Hi Nitin --
Thanks for your review!
> (I'm not sure if gmane.org interface sends mail to everyone
> in CC list, so
> sending again. Sorry if you are getting duplicate mail).
FWIW, I only got this one copy (at least so far)!
> I really like the idea of allocating cache memory from
> hypervisor directly. This
> is much more flexible than assigning fixed size memory to guests.
Thanks!
> I think 'frontswap' part seriously overlaps the functionality
> provided by 'ramzswap'
Could be, but I suspect there's a subtle difference.
A key part of the tmem frontswap api is that any
"put" at any time can be rejected. There's no way
for the kernel to know a priori whether the put
will be rejected or not, and the kernel must be able
to react by writing the page to a "true" swap device
and must keep track of which pages were put
to tmem frontswap and which were written to disk.
As a result, tmem frontswap cannot be configured or
used as a true swap "device".
This is critical to achieving the flexibility that you
said you like above. Only the hypervisor
knows if a free page is available "now" because
it is flexibly managing tmem requests from multiple
guest kernels.
If my understanding of ramzswap is incorrect or you
have some clever solution that I misunderstood,
please let me know.
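To make the "any put can be rejected" semantics concrete, here is a
minimal sketch of the fallback path the kernel needs; the function
names below are illustrative placeholders, not the actual frontswap
patch API:

/*
 * Sketch only: frontswap_put_page(), mark_in_frontswap() and
 * write_to_swap_device() are hypothetical names.  The point is the
 * control flow -- a put may be rejected at any time, so the kernel
 * must be ready to fall back to the true swap device and remember
 * where each page ended up so a later "get" looks in the right place.
 */
static int swap_out_page(struct page *page, swp_entry_t entry)
{
	if (frontswap_put_page(entry, page) == 0) {
		/* hypervisor accepted the page "now" */
		mark_in_frontswap(entry);
		return 0;
	}
	/* rejected: no tmem memory available, use the true swap device */
	return write_to_swap_device(page, entry);
}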
> > Cleancache is
> > "ephemeral" so whether a page is kept in cleancache
> (between the "put" and
> > the "get") is dependent on a number of factors that are invisible to
> > the kernel.
>
> Just an idea: as an alternate approach, we can create an
> 'in-memory compressed
> storage' backend for FS-Cache. This way, all filesystems
> modified to use
> fs-cache can benefit from this backend. To make it
> virtualization friendly like
> tmem, we can again provide (per-cache?) option to allocate
> from hypervisor i.e.
> tmem_{put,get}_page() or use [compress]+alloc natively.
I looked at FS-Cache and cachefiles, and my understanding is
that it is not restricted to clean pages, so it is not
a good match for tmem cleancache.
Again, if I'm wrong (or if it is easy to tell FS-Cache that
pages may "disappear" underneath it), let me know.
BTW, pages put to tmem (both frontswap and cleancache) can
be optionally compressed.
> For guest<-->hypervisor interface, maybe we can use virtio so that all
> hypervisors can benefit? Not quite sure about this one.
I'm not very familiar with virtio, but the existence of "I/O"
in the name concerns me because tmem is entirely synchronous.
Also, tmem is well-layered so very little work needs to be
done on the Linux side for other hypervisors to benefit.
Of course these other hypervisors would need to implement
the hypervisor-side of tmem as well, but there is a well-defined
API to guide other hypervisor-side implementations... and the
opensource tmem code in Xen has a clear split between the
hypervisor-dependent and hypervisor-independent code, which
should simplify implementation for other opensource hypervisors.
I realize in "Take 3" I didn't provide the URL for more information:
http://oss.oracle.com/projects/tmem