WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-changelog

[Xen-changelog] [xen-unstable] Fix hypervisor crash with unpopulated NUM

To: xen-changelog@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-changelog] [xen-unstable] Fix hypervisor crash with unpopulated NUMA nodes
From: Xen patchbot-unstable <patchbot-unstable@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 07 Oct 2009 08:15:11 -0700
Delivery-date: Wed, 07 Oct 2009 08:15:19 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-changelog-request@lists.xensource.com?subject=help>
List-id: BK change log <xen-changelog.lists.xensource.com>
List-post: <mailto:xen-changelog@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=unsubscribe>
Reply-to: xen-devel@xxxxxxxxxxxxxxxxxxx
Sender: xen-changelog-bounces@xxxxxxxxxxxxxxxxxxx
# HG changeset patch
# User Keir Fraser <keir.fraser@xxxxxxxxxx>
# Date 1254927506 -3600
# Node ID 42a53969eb7e1f84c35f48cb685aca31111bb80d
# Parent  e1cac8e4bdeb9c02a6f936e060d625ffa8cb09e3
Fix hypervisor crash with unpopulated NUMA nodes

On NUMA systems with memory-less nodes Xen crashes quite early in the
hypervisor (while initializing the heaps). This is not an issue if
this happens to be the last node, but "inner" nodes trigger this
reliably.  On multi-node processors it is much more likely to leave a
node unequipped.  The attached patch fixes this by enumerating the
node via the node_online_map instead of counting from 0 to num_nodes.

The resulting NUMA setup is still somewhat strange, but at least it
does not crash. In lowlevel/xc/xc.c there is again this enumeration
bug, but I suppose we cannot access the HV's node_online_map from this
context, so the xm info output is not correct (but xm debug-keys H
is).  I plan to rework the handling of memory-less nodes later.

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>
---
 xen/common/page_alloc.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff -r e1cac8e4bdeb -r 42a53969eb7e xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Wed Oct 07 15:56:05 2009 +0100
+++ b/xen/common/page_alloc.c   Wed Oct 07 15:58:26 2009 +0100
@@ -294,7 +294,6 @@ static struct page_info *alloc_heap_page
         node = cpu_to_node(smp_processor_id());
 
     ASSERT(node >= 0);
-    ASSERT(node < num_nodes);
     ASSERT(zone_lo <= zone_hi);
     ASSERT(zone_hi < NR_ZONES);
 
@@ -323,8 +322,9 @@ static struct page_info *alloc_heap_page
         } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
 
         /* Pick next node, wrapping around if needed. */
-        if ( ++node == num_nodes )
-            node = 0;
+        node = next_node(node, node_online_map);
+        if (node == MAX_NUMNODES)
+            node = first_node(node_online_map);
     }
 
     /* Try to free memory from tmem */
@@ -466,7 +466,6 @@ static void free_heap_pages(
 
     ASSERT(order <= MAX_ORDER);
     ASSERT(node >= 0);
-    ASSERT(node < num_online_nodes());
 
     for ( i = 0; i < (1 << order); i++ )
     {
@@ -817,13 +816,13 @@ static unsigned long avail_heap_pages(
 static unsigned long avail_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi, unsigned int node)
 {
-    unsigned int i, zone, num_nodes = num_online_nodes();
+    unsigned int i, zone;
     unsigned long free_pages = 0;
 
     if ( zone_hi >= NR_ZONES )
         zone_hi = NR_ZONES - 1;
 
-    for ( i = 0; i < num_nodes; i++ )
+    for_each_online_node(i)
     {
         if ( !avail[i] )
             continue;

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog

<Prev in Thread] Current Thread [Next in Thread>
  • [Xen-changelog] [xen-unstable] Fix hypervisor crash with unpopulated NUMA nodes, Xen patchbot-unstable <=