Re: [Xen-devel] [PATCH] Bind guest with with NUMA ndoe.

To:	"Duan, Ronghui" <ronghui.duan@xxxxxxxxx>
Subject:	Re: [Xen-devel] [PATCH] Bind guest with with NUMA ndoe.
From:	Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Date:	Wed, 27 Feb 2008 09:49:16 +0000
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx

Looks fine to me.

K.

On 27/2/08 08:56, "Duan, Ronghui" <ronghui.duan@xxxxxxxxx> wrote:

Is this the one that you want?
Thanks
Set VCPU affinity to get better performance on NUMA machines.

Signed-off-by: Duan Ronghui <ronghui.duan@xxxxxxxxx>

diff -r 9a890c817922 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py   Wed Feb 27 18:53:08 2008 +0800
+++ b/tools/python/xen/xend/XendDomainInfo.py   Thu Feb 28 01:12:23 2008 +0800
@@ -1961,6 +1961,39 @@ class XendDomainInfo:
             if self.info['cpus'] is not None and len(self.info['cpus']) > 0:
                 for v in range(0, self.info['VCPUs_max']):
                    xc.vcpu_setaffinity(self.domid, v, self.info['cpus'])
+            else:
+                info = xc.physinfo()
+                if info['nr_nodes'] > 1:
+                    node_memory_list = info['node_to_memory']
+                    needmem = self.image.getRequiredAvailableMemory(self.info['memory_dynamic_max']) / 1024
+                    candidate_node_list = []
+                    for i in range(0, info['nr_nodes']):
+                        if node_memory_list[i] >= needmem:
+                            candidate_node_list.append(i)
+                    if candidate_node_list is None or len(candidate_node_list) == 1:
+                        index = node_memory_list.index( max(node_memory_list) )
+                        cpumask = info['node_to_cpu'][index]
+                    else:
+                        nodeload = [0]
+                        nodeload = nodeload * info['nr_nodes']
+                        from xen.xend import XendDomain
+                        doms = XendDomain.instance().list('all')
+                        for dom in doms:
+                            cpuinfo = dom.getVCPUInfo()
+                            for vcpu in sxp.children(cpuinfo, 'vcpu'):
+                                def vinfo(n, t):
+                                    return t(sxp.child_value(vcpu, n))
+                                cpumap = vinfo('cpumap', list)
+                                for i in candidate_node_list:
+                                    node_cpumask = info['node_to_cpu'][i]
+                                    for j in node_cpumask:
+                                        if j in cpumap:
+                                            nodeload[i] += 1
+                                            break
+                        index = nodeload.index( min(nodeload) )
+                        cpumask = info['node_to_cpu'][index]
+                    for v in range(0, self.info['VCPUs_max']):
+                        xc.vcpu_setaffinity(self.domid, v, cpumask)

             # Use architecture- and image-specific calculations to determine
             # the various headrooms necessary, given the raw configured
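
For readers who want to try the selection policy outside xend, the logic in the patch can be sketched as a standalone function. This is only a minimal sketch, not the code that was posted: the name pick_node and its plain-list arguments are hypothetical stand-ins for the xc.physinfo() fields and the per-VCPU cpumaps used in the patch. It also makes two assumptions that differ from the patch: when no node has enough free memory it falls back to the node with the most free memory, and it compares load only among the candidate nodes.

```python
def pick_node(node_to_memory, node_to_cpu, needmem, vcpu_cpumaps):
    """Return the index of the node a new guest should be bound to.

    node_to_memory: free memory per node (same unit as needmem)
    node_to_cpu:    list of physical CPU ids per node
    needmem:        memory the guest needs
    vcpu_cpumaps:   one list of allowed physical CPU ids per existing VCPU
    """
    candidates = [n for n, free in enumerate(node_to_memory) if free >= needmem]
    if not candidates:
        # Assumption: no node can hold the guest, so use the one with most free memory.
        return node_to_memory.index(max(node_to_memory))
    if len(candidates) == 1:
        return candidates[0]

    # Crude load measure: number of existing VCPUs whose affinity touches the node.
    load = {n: 0 for n in candidates}
    for cpumap in vcpu_cpumaps:
        allowed = set(cpumap)
        for n in candidates:
            if allowed.intersection(node_to_cpu[n]):
                load[n] += 1
    # Lowest load wins; ties go to the lowest node index.
    return min(candidates, key=lambda n: (load[n], n))


# Hypothetical two-node machine with two existing VCPUs on node 0 and one on node 1.
node_to_cpu = [[0, 1], [2, 3]]
node = pick_node([4096, 8192], node_to_cpu, 2048, [[0, 1], [0, 1], [2, 3]])
cpumask = node_to_cpu[node]
```

As in the patch, a VCPU with no affinity restriction has a cpumap covering every physical CPU, so it counts once toward the load of every node.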

From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
Sent: Tuesday, February 26, 2008 4:43 PM
To: Duan, Ronghui; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH] Bind guest with with NUMA ndoe.

Option 3, please. A better allocation policy might look for a node with enough memory that has the least load (perhaps measured as crudely as the node with the fewest VCPUs bound to it).

-- Keir

On 26/2/08 03:40, "Duan, Ronghui" <ronghui.duan@xxxxxxxxx> wrote:
Hi Keir,

Currently, with Xen’s scheduler, if users don’t set VCPU affinity, a VCPU can run on any physical CPU in the machine. On a NUMA machine, performance suffers from memory access latency when the CPU and the memory are on different nodes. So I think there may be a need for a mechanism that makes Xen run better on NUMA machines even if users don’t set VCPU affinity. I have thought of the following policies:

1: Don’t make any changes, and only supply per-node free memory info to help set a proper VCPU affinity for the guest, which was implemented in my last patch.

2: When max-vcpu is set during domain build, we can choose a node based on the current policy for choosing a CPU to place a VCPU on, which mainly considers CPU balance. Then set this node’s cpumask as the affinity of all VCPUs to bind the domain to this node. The disadvantage of this method is that, after max-vcpu is set, if the user configures VCPU affinity, the VCPU affinity will be set again. This is done in the first patch attached.

3: We can do this in CP. If the user doesn’t set VCPU affinity, we can choose a VCPU affinity for the guest domain. This needs a new policy to choose which node the guest will run on in a NUMA machine. I think it is reasonable to consider memory usage first. I do this in the second patch. This patch depends on my last patch, which gets the free memory size per node.
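
One possible reading of "consider memory usage first" in option 3 is to rank the nodes that can hold the guest by their free memory. The snippet below is only an illustrative sketch under that assumption; the function name rank_nodes_by_free_memory and its arguments are hypothetical and are not taken from either attached patch.

```python
def rank_nodes_by_free_memory(node_to_memory, needmem):
    """Return indices of nodes that can hold needmem, most free memory first."""
    fitting = [n for n, free in enumerate(node_to_memory) if free >= needmem]
    # Sort by free memory descending; ties keep the lower node index first.
    return sorted(fitting, key=lambda n: (-node_to_memory[n], n))


# Hypothetical three-node machine; the guest needs 2000 units of memory.
ranked = rank_nodes_by_free_memory([1024, 4096, 2048], 2000)
```

The first entry of the returned list would be the preferred node; an empty list means that no single node can hold the guest.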

Which method do you prefer? Comments are welcome. Thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
