This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Xen 3.4.1 NUMA support

To: Andre Przywara <andre.przywara@xxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Xen 3.4.1 NUMA support
From: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>
Date: Fri, 13 Nov 2009 14:29:55 +0000
Accept-language: en-US
Cc: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, Keir Fraser <Keir.Fraser@xxxxxxxxxxxxx>, Papagiannis Anastasios <apapag@xxxxxxxxxxxx>
In-reply-to: <4AFD69D9.4090204@xxxxxxx>
References: <bd4f4a54-5269-42d8-b16d-cbdfaeeba361@default> <4AF82F12.6040400@xxxxxxx> <4AF82FD8.6020409@xxxxxxxxxxxxx> <4AFD69D9.4090204@xxxxxxx>
> Overcommitting the nodes (letting multiple guests use each node) lowered
> the values to about 80% for two guests and 60% for three guests per
> node, but it never got anywhere close to the numa=off values.
> So these results encourage me again to opt for numa=on as the default
> value.
> Keir, I will check if dropping the node containment in the CPU
> overcommitment case is an option, but what would be the right strategy
> in that case?
> Warn the user?
> Don't contain at all?
> Contain to more than one node?

In the case where a VM asks for more vCPUs than there are pCPUs in a node, we 
should contain the guest to multiple nodes. (I presume we favour nodes 
according to the number of vCPUs already committed to them?)

We should turn off automatic node containment of any kind if the total number 
of pCPUs in the system is <= 8 -- on such systems the statistical multiplexing 
gain of having access to more pCPUs likely outweighs the NUMA placement benefit, 
and memory striping will be a better strategy.
I'm inclined to believe that may be true for 2-node systems with <= 16 pCPUs 
too, under many workloads.

I'd really like to see us enumerate pCPUs in a sensible order so that it's 
easier to see the topology.  It should be nodes.sockets.cores{.threads}, 
leaving gaps for execution units missing due to hot plug or non-power-of-two 
core counts.
Right now the enumeration order is inconsistent, depending on how the BIOS 
has set things up. It would be great if someone could volunteer to fix this...

