xen-devel

[Xen-devel] [PATCH] Correct handling node with CPU populated but no memory populated

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH] Correct handling node with CPU populated but no memory populated
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Fri, 25 Dec 2009 18:42:00 +0800
Accept-language: en-US
Acceptlanguage: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 25 Dec 2009 02:42:38 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcqFTuO2u2OxkVxJRv26PUuSaroeEg==
Thread-topic: [PATCH] Correct handling node with CPU populated but no memory populated
This patch fixes a bug caused by changeset 20599.

BTW, while I was working on this patch, I was really confused by srat_detect_node() 
and init_cpu_to_node().

Firstly, per my understanding of the code, these two functions achieve the 
same purpose, but I can't understand why srat_detect_node() needs to be called 
after the CPU is up. As this was back-ported from the Linux kernel, I checked 
there: in Linux, srat_detect_node() mainly exists for a quirk on AMD platforms, 
but that quirk is not implemented on the Xen side. So is the quirk really 
needed?
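
For reference, as far as I can tell both paths boil down to essentially the 
same lookup. Below is a minimal sketch based on the code touched by this 
patch (not a literal copy of either function):

    /* Sketch: what srat_detect_node() and init_cpu_to_node() effectively
     * do for a single CPU. */
    static void set_node_for_cpu_sketch(unsigned int cpu)
    {
        u32 apicid = x86_cpu_to_apicid[cpu];
        int node = apicid_to_node[apicid];   /* BIOS-provided mapping */

        if ( node == NUMA_NO_NODE || !node_online(node) )
            node = 0;                        /* fall back to the boot node */

        numa_set_node(cpu, node);            /* records cpu_to_node[cpu] */
    }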

Also, is there any special reason that init_cpu_to_node() can't be called 
right after numa_initmem_init()? I think it would be cleaner if we set up all 
the NUMA information in one place. Originally I suspected it was related to 
generic_apic_probe(), but after checking the code I didn't find anything 
related. Did I miss anything?

Thanks
--jyh

# HG changeset patch
# User Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
# Date 1261684998 -28800
# Node ID 986dae93aea54f575202dcbaacbf3f4ff05032d4
# Parent  98c4b2498415751f97091631907d9d173ac0092f

Correct handling node with CPU populated but no memory populated

In changeset 20599, a node that has no memory populated is marked parsed, but 
not online. However, if there are CPUs populated in such a node, the 
corresponding CPU mapping (i.e. cpu_to_node) is still set up to point to the 
offline node, which causes trouble for memory allocation.
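
To illustrate the failure mode, here is a hypothetical sketch (not actual Xen 
code): a NUMA-aware allocation site typically derives its target node from 
cpu_to_node[], and an offline, memory-less node has nothing to hand out.

    /* Hypothetical sketch of the failure mode; alloc_pages_on_node() is a
     * made-up helper standing in for a node-local allocation. */
    static void *node_local_alloc_sketch(unsigned int cpu)
    {
        unsigned int node = cpu_to_node[cpu];  /* may be an offline node */

        /* A node parsed from the SRAT but never onlined has no memory
         * behind it, so a node-local allocation aimed at it cannot
         * succeed. */
        if ( !node_online(node) )
            return NULL;

        return alloc_pages_on_node(node);
    }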

This patch changes init_cpu_to_node() and srat_detect_node() to take into 
account the situation where the node is offline.

Now apicid_to_node is only used to keep the CPU-to-node mapping provided by 
the BIOS, and should not be used for memory allocation anymore.

One thing left is to update the cpu_to_node mapping after memory is populated 
by memory hot-add.
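
A hypothetical sketch of what that follow-up could look like, reusing only 
names already present in this patch (this is not part of the patch):

    /* Once a node gains memory through hot-add, remap the CPUs that the
     * BIOS assigned to it away from the node-0 fallback. */
    static void update_cpu_to_node_on_hotadd(unsigned int node)
    {
        unsigned int cpu;

        for ( cpu = 0; cpu < NR_CPUS; cpu++ )
        {
            u32 apicid = x86_cpu_to_apicid[cpu];

            if ( apicid != BAD_APICID && apicid_to_node[apicid] == node )
                numa_set_node(cpu, node);
        }
    }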

Signed-off-by: Jiang, Yunhong <yunhong.jiang@xxxxxxxxx>

diff -r 98c4b2498415 -r 986dae93aea5 xen/arch/x86/numa.c
--- a/xen/arch/x86/numa.c       Thu Dec 24 15:59:44 2009 +0000
+++ b/xen/arch/x86/numa.c       Fri Dec 25 04:03:18 2009 +0800
@@ -35,6 +35,9 @@ unsigned char cpu_to_node[NR_CPUS] __rea
 unsigned char cpu_to_node[NR_CPUS] __read_mostly = {
        [0 ... NR_CPUS-1] = NUMA_NO_NODE
 };
+/*
+ * Keep BIOS's CPU2node information, should not be used for memory allocation
+ */
 unsigned char apicid_to_node[MAX_LOCAL_APIC] __cpuinitdata = {
        [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
@@ -288,14 +291,16 @@ static __init int numa_setup(char *opt)
  */
 void __devinit init_cpu_to_node(void)
 {
-       int i;
+       int i, node;
        for (i = 0; i < NR_CPUS; i++) {
                u32 apicid = x86_cpu_to_apicid[i];
                if (apicid == BAD_APICID)
                        continue;
-               if (apicid_to_node[apicid] == NUMA_NO_NODE)
-                       continue;
-               numa_set_node(i,apicid_to_node[apicid]);
+               node = apicid_to_node[apicid];
+
+               if ( node == NUMA_NO_NODE || !node_online(node) )
+                       node = 0;
+               numa_set_node(i, node);
        }
 }
 
diff -r 98c4b2498415 -r 986dae93aea5 xen/arch/x86/setup.c
--- a/xen/arch/x86/setup.c      Thu Dec 24 15:59:44 2009 +0000
+++ b/xen/arch/x86/setup.c      Fri Dec 25 04:03:18 2009 +0800
@@ -20,6 +20,7 @@
 #include <xen/rcupdate.h>
 #include <xen/vga.h>
 #include <xen/dmi.h>
+#include <xen/nodemask.h>
 #include <public/version.h>
 #ifdef CONFIG_COMPAT
 #include <compat/platform.h>
@@ -252,7 +253,7 @@ void __devinit srat_detect_node(int cpu)
     u32 apicid = x86_cpu_to_apicid[cpu];
 
     node = apicid_to_node[apicid];
-    if ( node == NUMA_NO_NODE )
+    if ( node == NUMA_NO_NODE || !node_online(node) )
         node = 0;
     numa_set_node(cpu, node);
 
diff -r 98c4b2498415 -r 986dae93aea5 xen/arch/x86/smpboot.c
--- a/xen/arch/x86/smpboot.c    Thu Dec 24 15:59:44 2009 +0000
+++ b/xen/arch/x86/smpboot.c    Fri Dec 25 04:03:18 2009 +0800
@@ -913,7 +913,7 @@ static int __devinit do_boot_cpu(int api
        }
 #else
        if (!per_cpu(compat_arg_xlat, cpu))
-               setup_compat_arg_xlat(cpu, apicid_to_node[apicid]);
+               setup_compat_arg_xlat(cpu, cpu_to_node[cpu]);
 #endif
 
        if (!idt_tables[cpu]) {
diff -r 98c4b2498415 -r 986dae93aea5 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c  Thu Dec 24 15:59:44 2009 +0000
+++ b/xen/arch/x86/x86_64/mm.c  Fri Dec 25 04:03:18 2009 +0800
@@ -997,7 +997,7 @@ void __init subarch_init_memory(void)
     }
 
     if ( setup_compat_arg_xlat(smp_processor_id(),
-                               apicid_to_node[boot_cpu_physical_apicid]) )
+                               cpu_to_node[0]) )
         panic("Could not setup argument translation area");
 }
 


Attachment: node_online.patch
Description: node_online.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel