[Xen-devel] [PATCH] scrub pages on guest termination

To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH] scrub pages on guest termination
From: Ben Guthro <bguthro@xxxxxxxxxxxxxxx>
Date: Fri, 23 May 2008 11:00:18 -0400
Cc: Robert Phillips <rphillips@xxxxxxxxxxxxxxx>
Delivery-date: Fri, 23 May 2008 08:00:43 -0700
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.14 (X11/20080501)
This patch solves the following problem. When a large VS (virtual server) terminates,
the node locks up: the page_scrub_kick routine sends a softirq to all processors
instructing them to run the page-scrub code, and there they interfere with each
other as they serialize behind the page_scrub_lock.
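
For illustration, here is a minimal userspace sketch of the pre-patch behavior
(an analogue, not Xen code: the thread count, list size, and names such as
scrub_worker are hypothetical stand-ins for the online cpus, the dirty-page
list, and page_scrub_softirq). Every worker is kicked at once, and all of them
must take the same lock to peel work:

/* Build: cc sketch1.c -o sketch1 -lpthread */
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 8                 /* stand-in for the online cpus */
#define NITEMS   (1L << 20)        /* stand-in for the dirty page list */

static pthread_mutex_t scrub_lock = PTHREAD_MUTEX_INITIALIZER;
static long items_left = NITEMS;

static void *scrub_worker(void *arg)
{
    (void)arg;
    for ( ; ; )
    {
        long batch;

        /* Every worker serializes here, as the real scrubbers do
         * behind page_scrub_lock. */
        pthread_mutex_lock(&scrub_lock);
        if ( items_left == 0 )
        {
            pthread_mutex_unlock(&scrub_lock);
            return NULL;
        }
        batch = items_left < 16 ? items_left : 16;  /* peel up to 16 */
        items_left -= batch;
        pthread_mutex_unlock(&scrub_lock);
        /* "scrub" the batch (elided in this sketch) */
    }
}

int main(void)
{
    pthread_t tid[NWORKERS];
    int i;

    /* The broadcast kick: wake every worker at once. */
    for ( i = 0; i < NWORKERS; i++ )
        pthread_create(&tid[i], NULL, scrub_worker, NULL);
    for ( i = 0; i < NWORKERS; i++ )
        pthread_join(tid[i], NULL);
    puts("list drained");
    return 0;
}

Because the lock is held for every batch, the workers make progress one at a
time; waking more cpus adds contention, not throughput.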

The patch does two things:

(1) In page_scrub_kick, only a single cpu is interrupted. Some cpu other than
the calling cpu is chosen (if available) because we assume the calling cpu
has other, higher-priority work to do.

(2) In page_scrub_softirq, if more than one cpu is online, the first cpu
to start scrubbing designates itself the primary_scrubber. As such, it is
dedicated to scrubbing pages until the list is empty. Other cpus may still
call page_scrub_softirq, but they spend only 1 msec scrubbing before
returning to check for other, higher-priority work. With multiple cpus
online, the node can afford to have one cpu dedicated to scrubbing when
that work needs to be done.
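
As a rough userspace sketch of this policy (again an analogue with
hypothetical names, not Xen code), the first worker to find work claims the
primary role and drains the list, while the others give up after about 1 msec:

/* Build: cc sketch2.c -o sketch2 -lpthread (older glibc may need -lrt) */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NWORKERS 4
#define NITEMS   (1L << 20)

static pthread_mutex_t scrub_lock = PTHREAD_MUTEX_INITIALIZER;
static long items_left = NITEMS;
static int have_primary;           /* protected by scrub_lock */

static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void *scrub_worker(void *arg)
{
    long long start = now_ns();
    int i_am_primary = 0;

    (void)arg;
    do {
        pthread_mutex_lock(&scrub_lock);
        if ( items_left == 0 )
        {
            if ( i_am_primary )
                have_primary = 0;  /* resign once the list is empty */
            pthread_mutex_unlock(&scrub_lock);
            return NULL;
        }
        if ( !have_primary )
        {
            /* First worker in dedicates itself to draining the list. */
            have_primary = 1;
            i_am_primary = 1;
        }
        items_left -= items_left < 16 ? items_left : 16;
        pthread_mutex_unlock(&scrub_lock);
        /* Non-primary workers stop after ~1 msec of scrubbing. */
    } while ( i_am_primary || now_ns() - start < 1000000LL );

    return NULL;
}

int main(void)
{
    pthread_t tid[NWORKERS];
    int i;

    for ( i = 0; i < NWORKERS; i++ )
        pthread_create(&tid[i], NULL, scrub_worker, NULL);
    for ( i = 0; i < NWORKERS; i++ )
        pthread_join(tid[i], NULL);
    printf("items left: %ld\n", items_left);
    return 0;
}

The design point is the same as in the patch: with more than one cpu
available, dedicating one to scrubbing is cheaper than having them all fight
over the lock.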

Signed-off-by: Robert Phillips <rphillips@xxxxxxxxxxxxxxx>
Signed-off-by: Ben Guthro <bguthro@xxxxxxxxxxxxxxx>
diff -r 29dc52031954 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -984,16 +984,23 @@
     void             *p;
     int               i;
     s_time_t          start = NOW();
+    static int        primary_scrubber = -1;
 
-    /* Aim to do 1ms of work every 10ms. */
+    /* Unless SMP, aim to do 1ms of work every 10ms. */
     do {
         spin_lock(&page_scrub_lock);
 
         if ( unlikely((ent = page_scrub_list.next) == &page_scrub_list) )
         {
+            if (primary_scrubber == smp_processor_id())
+                primary_scrubber = -1;
             spin_unlock(&page_scrub_lock);
             return;
         }
+        
+        /* If SMP, dedicate a cpu to scrubbing til the job is done */
+        if (primary_scrubber == -1 && num_online_cpus() > 1)
+            primary_scrubber = smp_processor_id();
         
         /* Peel up to 16 pages from the list. */
         for ( i = 0; i < 16; i++ )
@@ -1020,7 +1027,7 @@
             unmap_domain_page(p);
             free_heap_pages(pfn_dom_zone_type(page_to_mfn(pg)), pg, 0);
         }
-    } while ( (NOW() - start) < MILLISECS(1) );
+    } while ( primary_scrubber == smp_processor_id() || (NOW() - start) < MILLISECS(1) );
 
     set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
 }
diff -r 29dc52031954 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -90,10 +90,21 @@
         if ( !list_empty(&page_scrub_list) )    \
             raise_softirq(PAGE_SCRUB_SOFTIRQ);  \
     } while ( 0 )
-#define page_scrub_kick()                                               \
-    do {                                                                \
-        if ( !list_empty(&page_scrub_list) )                            \
-            cpumask_raise_softirq(cpu_online_map, PAGE_SCRUB_SOFTIRQ);  \
+
+#define page_scrub_kick()                                       \
+    do {                                                        \
+        if ( !list_empty(&page_scrub_list) ) {                  \
+            int cpu;                                            \
+            /* Try to use some other cpu. */                    \
+            for_each_online_cpu(cpu) {                          \
+                if (cpu != smp_processor_id()) {                \
+                    cpu_raise_softirq(cpu, PAGE_SCRUB_SOFTIRQ); \
+                    break;                                      \
+                }                                               \
+            }                                                   \
+            if (cpu >= NR_CPUS)                                 \
+                raise_softirq(PAGE_SCRUB_SOFTIRQ);              \
+        }                                                       \
     } while ( 0 )
 unsigned long avail_scrub_pages(void);
 