xen-devel

Re: [Xen-devel] Xen 3.4.1 NUMA support

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] Xen 3.4.1 NUMA support
From: Andre Przywara <andre.przywara@xxxxxxx>
Date: Mon, 9 Nov 2009 16:02:42 +0100
Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Papagiannis Anastasios <apapag@xxxxxxxxxxxx>
In-reply-to: <bd4f4a54-5269-42d8-b16d-cbdfaeeba361@default>
References: <bd4f4a54-5269-42d8-b16d-cbdfaeeba361@default>
Dan Magenheimer wrote:
> Add Xen boot parameter 'numa=on' to enable NUMA detection. Then it's up to you to,
> for example, pin domains to specific nodes using the 'cpus=...' option in the
> domain config file. See /etc/xen/xmexample1 for an example of its usage.
>
> VMware has the notion of a "cell" where VMs can be scheduled only within a cell,
> not across cells. Cell boundaries are determined by VMware by default, though
> certain settings can override them.
Well, if I got this right, then you are describing the current behaviour of Xen. It has had a similar feature for some time now (since 3.3, I guess). When you launch a domain on a machine booted with numa=on, it will pick the least busy node that can hold the requested memory and restrict the domain to that node (by allowing only that node's CPUs).
This is in XendDomainInfo.py (c/s 17131, 17247, 17709).
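Roughly, the idea is something like the following (just a minimal sketch with made-up names and data structures, not the actual xend code):

# Sketch of the placement idea: pick the least busy node that can still
# hold the domain's memory, then restrict the domain's VCPUs to that
# node's physical CPUs. All names here are made up for illustration;
# the real logic lives in XendDomainInfo.py.
def pick_numa_node(nodes, dom_memory_kb):
    # nodes: list of dicts, e.g.
    #   {'node': 1, 'free_kb': ..., 'cpus': [6, 7, ...], 'busy_vcpus': ...}
    candidates = [n for n in nodes if n['free_kb'] >= dom_memory_kb]
    if not candidates:
        return None          # domain does not fit into a single node
    # "least busy" = fewest VCPUs already placed on the node
    return min(candidates, key=lambda n: n['busy_vcpus'])

def place_domain(config, nodes):
    node = pick_numa_node(nodes, config['memory_kb'])
    if node is not None:
        # same effect as a cpus="..." line in the domain config file
        config['cpus'] = node['cpus']
    return config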
In practice it looks like this:
(kernel xen.gz numa=on dom0_mem=6144M dom0_max_vcpus=6 dom0_vcpus_pin)
# xm create opensuse.hvm
# xm create opensuse2.hvm
# xm vcpu-list
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
001-LTP                              1     0     6   -b-      17.8 6-11
001-LTP                              1     1     7   -b-       6.3 6-11
002-LTP                              2     0    12   -b-      19.0 12-17
002-LTP                              2     1    16   -b-       1.6 12-17
002-LTP                              2     2    17   -b-       1.7 12-17
002-LTP                              2     3    14   -b-       1.6 12-17
002-LTP                              2     4    16   -b-       1.6 12-17
002-LTP                              2     5    15   -b-       1.5 12-17
002-LTP                              2     6    12   -b-       1.3 12-17
002-LTP                              2     7    13   -b-       1.8 12-17
Domain-0                             0     0     0   -b-      12.6 0
Domain-0                             0     1     1   -b-       7.6 1
Domain-0                             0     2     2   -b-       8.0 2
Domain-0                             0     3     3   -b-      14.6 3
Domain-0                             0     4     4   r--       1.4 4
Domain-0                             0     5     5   -b-       0.9 5
# xm debug-keys U
(XEN) Domain 0 (total: 2097152):
(XEN)     Node 0: 2097152
(XEN)     Node 1: 0
(XEN)     Node 2: 0
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0
(XEN) Domain 1 (total: 394219):
(XEN)     Node 0: 0
(XEN)     Node 1: 394219
(XEN)     Node 2: 0
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0
(XEN) Domain 2 (total: 394219):
(XEN)     Node 0: 0
(XEN)     Node 1: 0
(XEN)     Node 2: 394219
(XEN)     Node 3: 0
(XEN)     Node 4: 0
(XEN)     Node 5: 0
(XEN)     Node 6: 0
(XEN)     Node 7: 0

Note that there were no cpus= lines in the config files; Xen did that automatically.
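If you wanted to pin a domain by hand instead, the equivalent would be something like this in the domain config file (cf. /etc/xen/xmexample1; the CPU range is the one node 1 got above, the rest of the config is omitted):

# excerpt from a domain config file: pins the guest's VCPUs to the CPUs
# of NUMA node 1 instead of letting xend pick a node automatically
cpus = "6-11"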

Domains can be localhost-migrated to another node:
# xm migrate --node=4 1 localhost
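After such a migration you can check the placement the same way as above (xm debug-keys U and xm vcpu-list); the domain's memory and VCPU affinity should then show up on the target node.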
The only remaining issue is domains that are larger than a single node.
If someone has a useful use-case, I can start rebasing my old patches for NUMA-aware HVM domains to Xen unstable.

Regards,
Andre.

BTW: Shouldn't we finally set numa=on as the default?

--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712

