WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] [linux-2.6.18-xen] transcendent memory ("tmem")l

To: "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] [linux-2.6.18-xen] transcendent memory ("tmem")linux-side changes
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Wed, 10 Jun 2009 08:58:26 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 10 Jun 2009 00:58:08 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <97574cc6-e389-446f-87ea-ed3f980fe5d3@default>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <97574cc6-e389-446f-87ea-ed3f980fe5d3@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 10.06.09 00:17 >>>

A few general remarks:

>--- a/include/linux/swap.h     Mon Jun 08 12:23:24 2009 +0100
>+++ b/include/linux/swap.h     Tue Jun 09 15:37:03 2009 -0600
>@@ -6,6 +6,7 @@
> #include <linux/mmzone.h>
> #include <linux/list.h>
> #include <linux/sched.h>
>+#include <linux/vmalloc.h>
> 
> #include <asm/atomic.h>
> #include <asm/page.h>
>@@ -143,8 +144,69 @@ struct swap_info_struct {
>       unsigned int pages;
>       unsigned int max;
>       unsigned int inuse_pages;
>+      unsigned long *preswap_map;
>+      unsigned int preswap_pages;

Missing #ifdef CONFIG_PRESWAP?

>       int next;                       /* next entry on swap list */
> };
>+
>+#ifdef CONFIG_PRESWAP
>+
>+#include <linux/sysctl.h>
>+extern int preswap_sysctl_handler(struct ctl_table *, int, struct file *,
>+      void __user *, size_t *, loff_t *);
>+extern const unsigned long preswap_zero, preswap_infinity;
>+
>+extern void preswap_shrink(unsigned long);
>+extern int preswap_test(struct swap_info_struct *, unsigned long);
>+extern void preswap_init(unsigned);
>+extern int preswap_put(struct page *);
>+extern int preswap_get(struct page *);
>+extern void preswap_flush(unsigned, unsigned long);
>+extern void preswap_flush_area(unsigned);
>+/* in swapfile.c */
>+extern int try_to_unuse(unsigned int, unsigned int, unsigned long);
>+
>+static inline void *preswap_malloc(unsigned long size)
>+{
>+      return vmalloc(size);

I would view this as problematic on 32-bit, as the vmalloc space is rather
restricted there. Not because the allocation here may fail, but because it
may cause failures elsewhere. Some feedback on the amount of vmalloc
space used and/or still available is going to be needed here I'd think.

>...
>@@ -247,7 +309,7 @@ extern int can_share_swap_page(struct pa
> extern int can_share_swap_page(struct page *);
> extern int remove_exclusive_swap_page(struct page *);
> struct backing_dev_info;
>-
>+extern struct swap_list_t swap_list;
> extern spinlock_t swap_lock;
> 
> /* linux/mm/thrash.c */

That's not good - swap_list being non-static in 2.6.18 is just an oversight
afaict, which is being fixed in later kernels. I think it would be cleaner (also
wrt. a possible pv-ops implementation) to move some or all of the code in
mm/preswap.c into mm/swapfile.c (would likewise eliminate the need to
remove the 'static' from try_to_unuse()).

>--- a/mm/Kconfig       Mon Jun 08 12:23:24 2009 +0100
>+++ b/mm/Kconfig       Tue Jun 09 15:37:03 2009 -0600
>@@ -152,3 +152,34 @@ config RESOURCES_64BIT
>       default 64BIT
>       help
>         This option allows memory and IO resources to be 64 bit.
>+
>+#
>+# support for transcendent memory
>+#
>+config TMEM
>+      bool "Transcendent memory support"
>+      def_bool y
>+      depends on XEN
>+      help
>+        In a virtualized environment, allows unused and underutilized
>+        system physical memory to be made accessible through a narrow
>+        well-defined page-copy-based API.  If unsure, say Y.
>+
>+config PRECACHE
>+      bool "Cache clean pages in transcendent memory"
>+      def_bool y
>+      depends on TMEM
>+      help
>+        Allows the transcendent memory pool to be used to store clean
>+        page-cache pages which, under some circumstances, will greatly
>+        reduce paging and thus improve performance.  If unsure, say Y.
>+
>+config PRESWAP
>+      bool "Swap pages to transcendent memory"
>+      def_bool y
>+      depends on TMEM
>+      help
>+        Allows the transcendent memory pool to be used as a pseudo-swap
>+        device which, under some circumstances, will greatly reduce
>+        swapping and thus improve performance.  If unsure, say Y.
>+

'bool' followed by 'def_bool'? And even more, an experimental (as I would
view it at this stage) feature that defaults to 'yes'?

Also, does have selecting TMEM on its own any meaning? If not (which I
think is the case), then 'select'ing TMEM from the other two options may
be more appropriate (while TMEM then shouldn't have a prompt anymore).

>--- /dev/null  Thu Jan 01 00:00:00 1970 +0000
>+++ b/mm/tmem.c        Tue Jun 09 15:37:03 2009 -0600
>@@ -0,0 +1,41 @@
>+/*
>+ * Xen implementation for transcendent memory (tmem)
>+ *
>+ * Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 2009
>+ */
>+
>+#include <linux/types.h>
>+#include <xen/interface/xen.h>
>+#include <asm/hypervisor.h>

This is Xen-only code sitting in an apparently generic source file. Wouldn't
that better be placed in drivers/xen/core/ if it's architecture neutral?

Also, it doesn't include its own header (but see below for comments on that
one)...

>--- /dev/null  Thu Jan 01 00:00:00 1970 +0000
>+++ b/mm/tmem.h        Tue Jun 09 15:37:03 2009 -0600
>@@ -0,0 +1,111 @@
>+/*
>+ * linux/mm/tmem.h
>+ *
>+ * Interface to transcendent memory, used by mm/precache.c and mm/preswap.c
>+ * Currently implemented on XEN, but may be implemented elsewhere in future.
>+ *
>+ * Copyright (C) 2008,2009 Dan Magenheimer, Oracle Corp.
>+ */
>+
>+
>+#define TMEM_CONTROL               0
>+#define TMEM_NEW_POOL              1
>+#define TMEM_DESTROY_POOL          2
>...

While I partly understand your intention of potentially sharing the 
implementation
with other hypervisors, I don't think Xen interface definitions belong here - 
you're
not using include/xen/interface/tmem.h at all, while that's where the 
definitions
should come from imo. If you really want this to represent an abstract 
interface,
then ensuring consistency with the Xen interface would be a minimal requirement
(by e.g. including the Xen header in the #ifdef CONFIG_XEN block). Otoh I'm
getting the impression that the patch as a whole isn't following this abstract
interface goal, so it should probably be consistent here and not try to do so in
a just single place.

>...
>+static inline int tmem_new_pool(u64 uuid_lo, u64 uuid_hi, u32 flags)
>+{
>+      flags |= (PAGE_SHIFT - 12) << TMEM_POOL_PAGESIZE_SHIFT;

This hard-coded literal 12 seems dubious: Is this part of the tmem hypervisor
interface? If so, there should be a #define in the interface header, plus a
(ideally build-time) check that the value really fits TMEM_POOL_PAGESIZE_MASK.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel