This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] [RFC] Xen NUMA strategy

To: "Ian Pratt" <Ian.Pratt@xxxxxxxxxxxx>, "Akio Takebe" <takebe_akio@xxxxxxxxxxxxxx>, "Andre Przywara" <andre.przywara@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [RFC] Xen NUMA strategy
From: "Xu, Anthony" <anthony.xu@xxxxxxxxx>
Date: Thu, 20 Sep 2007 09:44:55 +0800
Delivery-date: Wed, 19 Sep 2007 18:45:36 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <8A87A9A84C201449A0C56B728ACF491E260723@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <46EA7906.2010504@xxxxxxx><54C7F9BA4B1341takebe_akio@xxxxxxxxxxxxxx> <51CFAB8CB6883745AE7B93B3E084EBE2011113AE@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <8A87A9A84C201449A0C56B728ACF491E260723@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acf5umQg/TXGkEp9Rja0eZWxrcXKfgAAfFFgAASYjkAAVXpjMA==
Thread-topic: [Xen-devel] [RFC] Xen NUMA strategy

>-----Original Message-----
>From: Ian Pratt [mailto:Ian.Pratt@xxxxxxxxxxxx]
>Sent: Tuesday, September 18, 2007 4:43 PM
>To: Xu, Anthony; Akio Takebe; Andre Przywara;
>Cc: ian.pratt@xxxxxxxxxxxx
>Subject: RE: [Xen-devel] [RFC] Xen NUMA strategy
>> >We may need to write something about guest NUMA in guest
>> file.
>> >For example, in guest configuration file;
>> >vnode = <a number of guest node>
>> >vcpu = [<vcpus# pinned into the node: machine node#>, ...]
>> >memory = [<amount of memory per node: machine node#>, ...]
>> >
>> >e.g.
>> >vnode = 2
>> >vcpu = [0-1:0, 2-3:1]
>> >memory = [128:0, 128:1]
>> >
>> >If we setup vnode=1, old OSes should work fine.
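To make the proposed syntax concrete, here is a small sketch of how those mapping entries might be parsed. Note this is purely illustrative: the `vcpu`/`memory` syntax is only the RFC proposal above, not an implemented config option, and `parse_mapping` is a hypothetical helper name.

```python
def parse_mapping(value):
    """Parse entries like '[0-1:0, 2-3:1]' (vCPU ranges) or
    '[128:0, 128:1]' (MB amounts) into (item, machine_node) pairs."""
    pairs = []
    for entry in value.strip("[]").split(","):
        left, node = entry.strip().split(":")
        if "-" in left:
            # vCPU range, e.g. '0-1' -> vCPUs [0, 1]
            lo, hi = (int(x) for x in left.split("-"))
            pairs.append((list(range(lo, hi + 1)), int(node)))
        else:
            # memory amount in MB, e.g. '128'
            pairs.append((int(left), int(node)))
    return pairs

# The example config from the proposal:
vcpu_map = parse_mapping("[0-1:0, 2-3:1]")   # [([0, 1], 0), ([2, 3], 1)]
mem_map = parse_mapping("[128:0, 128:1]")    # [(128, 0), (128, 1)]
```

With `vnode = 2`, the tools would then build two virtual nodes, pin vCPUs 0-1 to machine node 0 and vCPUs 2-3 to machine node 1, and take 128 MB from each.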
>We need to think carefully about NUMA use cases before implementing a
>bunch of mechanism.

Agreed; that is why we posted this thread. We hope we can gather enough feedback.

>The way I see it, in most situations it will not make sense for guests
>to span NUMA nodes: you'll have a number of guests with relatively small
>numbers of vCPUs, and it probably makes sense to allow the guests to be
>pinned to nodes. What we have in Xen today works pretty well for this
>case, but we could make configuration easier by looking at more
>sophisticated mechanisms for specifying CPU groups rather than just
>pinning. Migration between nodes could be handled with a localhost
>migrate, but we could obviously come up with something more time/space
>efficient (particularly for HVM guests) if required.
>There may be some usage scenarios where having a large SMP guest that
>spans multiple nodes would be desirable. However, there's a bunch of
>scalability work that's required in Xen before this will really make
>sense, and all of this is much higher priority (and more generally
>useful) than figuring out how to expose NUMA topology to guests. I'd
>definitely encourage looking at the guest scalability issues first.

        What you have said may be true: many guests have small numbers
of vCPUs. In that situation we need to pin each guest to a node for good
performance. However, pinning guests to nodes can lead to imbalance
after some guests have been created and destroyed, so we also need to
handle rebalancing. Better host NUMA support is needed.
        Even if we don't have big guests, we may still need to let a
guest span NUMA nodes. For example, when we create a guest with so much
memory that no single NUMA node can satisfy the request, the guest has
to span nodes, and we need to provide the guest with NUMA information.
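A placement heuristic along those lines might look like the following sketch. This is not Xen code; the function name and the numbers are hypothetical, and it only illustrates the fallback from single-node placement to spanning when no node has enough free memory.

```python
def place_guest(mem_needed, free_mem):
    """free_mem: dict of machine node -> free MB (illustrative figures).
    Returns {node: MB allocated} for the new guest."""
    # Prefer the single node with the most free memory.
    best = max(free_mem, key=free_mem.get)
    if free_mem[best] >= mem_needed:
        return {best: mem_needed}
    # No single node is large enough: span nodes, largest free first.
    alloc, remaining = {}, mem_needed
    for node in sorted(free_mem, key=free_mem.get, reverse=True):
        take = min(free_mem[node], remaining)
        if take > 0:
            alloc[node] = take
            remaining -= take
        if remaining == 0:
            break
    return alloc

# A 3 GB guest on two nodes with 2 GB free each must span both:
print(place_guest(3072, {0: 2048, 1: 2048}))  # {0: 2048, 1: 1024}
```

In the spanning case the guest's memory straddles nodes 0 and 1, which is exactly when the guest would need matching virtual-NUMA information to schedule and allocate sensibly.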

        There are also very small NUMA nodes, perhaps with only one CPU
per node. If a guest has two vCPUs on such a system, we need to provide
the guest with NUMA information; otherwise performance will suffer badly.

- Anthony

Xen-devel mailing list