

RE: [Xen-devel] [RFC] Xen NUMA strategy

To: "Xu, Anthony" <anthony.xu@xxxxxxxxx>, "Akio Takebe" <takebe_akio@xxxxxxxxxxxxxx>, "Andre Przywara" <andre.przywara@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [RFC] Xen NUMA strategy
From: "Ian Pratt" <Ian.Pratt@xxxxxxxxxxxx>
Date: Tue, 18 Sep 2007 09:43:24 +0100
> >We may need to write something about guest NUMA in guest file.
> >For example, in the guest configuration file:
> >vnode = <a number of guest node>
> >vcpu = [<vcpus# pinned into the node: machine node#>, ...]
> >memory = [<amount of memory per node: machine node#>, ...]
> >
> >e.g.
> >vnode = 2
> >vcpu = [0-1:0, 2-3:1]
> >memory = [128:0, 128:1]
> >
> >If we set vnode=1, old OSes should work fine.
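
To make the proposed syntax concrete, here is a minimal sketch of how a toolstack might parse it. This is not existing Xen code: the function names are hypothetical, the entries are assumed to be quoted strings (the proposal writes them unquoted), memory is assumed to be in MB, and the i-th vcpu entry and i-th memory entry are assumed to describe the same vnode.

```python
def parse_range(spec):
    """Expand '0-1' to [0, 1] and a single id such as '3' to [3]."""
    first, _, last = spec.partition("-")
    return list(range(int(first), int(last or first) + 1))


def parse_vnode_config(vnode, vcpu, memory):
    """Build one dict per virtual node from the proposed settings.

    vcpu entries look like '0-1:0' (vcpus 0-1 on machine node 0).
    memory entries look like '128:0' (128 MB, assumed unit, from machine node 0).
    """
    if len(vcpu) != vnode or len(memory) != vnode:
        raise ValueError("need one vcpu entry and one memory entry per vnode")
    layout = []
    for vcpu_entry, mem_entry in zip(vcpu, memory):
        vcpus, cpu_node = vcpu_entry.split(":")
        mem_mb, mem_node = mem_entry.split(":")
        layout.append({
            "vcpus": parse_range(vcpus),
            "cpu_node": int(cpu_node),
            "memory_mb": int(mem_mb),
            "mem_node": int(mem_node),
        })
    return layout


# The example from the proposal: two vnodes, 128 MB each.
layout = parse_vnode_config(2, ["0-1:0", "2-3:1"], ["128:0", "128:1"])
```

With vnode=1 the same parser yields a single entry, which matches the "old OSes should work fine" case.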

We need to think carefully about NUMA use cases before implementing a
bunch of mechanism.

The way I see it, in most situations it will not make sense for guests
to span NUMA nodes: you'll have a number of guests with relatively small
numbers of vCPUs, and it probably makes sense to allow the guests to be
pinned to nodes. What we have in Xen today works pretty well for this
case, but we could make configuration easier by looking at more
sophisticated mechanisms for specifying CPU groups rather than just
pinning. Migration between nodes could be handled with a localhost
migrate, but we could obviously come up with something more time/space
efficient (particularly for HVM guests) if required.
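
As a rough illustration of what "CPU groups rather than just pinning" could look like, the sketch below maps a named group to a comma-separated CPU list of the kind the guest configuration's cpus setting takes. The group names, the CPU numbering per node, and the helper itself are all assumptions made up for this example, not an existing Xen interface.

```python
# Hypothetical group table: which physical CPUs belong to which node.
CPU_GROUPS = {
    "node0": [0, 1, 2, 3],
    "node1": [4, 5, 6, 7],
}


def cpus_for_group(name, groups=CPU_GROUPS):
    """Return a comma-separated CPU list for a named group."""
    if name not in groups:
        raise KeyError("unknown CPU group: %s" % name)
    return ",".join(str(cpu) for cpu in sorted(groups[name]))


# A guest config could then say "put me on node1" instead of listing CPUs.
cpus = cpus_for_group("node1")
```

The point of the indirection is that the administrator names a group once and guests refer to it, so the CPU numbering of a particular machine stays out of the guest configuration.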

There may be some usage scenarios where having a large SMP guest that
spans multiple nodes would be desirable. However, there's a bunch of
scalability work that's required in Xen before this will really make
sense, and all of this is much higher priority (and more generally
useful) than figuring out how to expose NUMA topology to guests. I'd
definitely encourage looking at the guest scalability issues first.


> This is something we need to do.
> But if the user forgets to configure guest NUMA in the guest
> configuration, Xen needs to provide optimized guest NUMA information
> based on the current workload on the physical machine.
> We need to provide both; the user configuration can override the
> default configuration.
> >
> >And almost all OSes read the NUMA configuration only at boot and on
> >hotplug.
> >So if Xen migrates a vcpu, Xen has to raise a hotplug event.
> The guest should not know about the vcpu migration, so Xen doesn't
> trigger a hotplug event to the guest.
> Maybe we should not call it vcpu migration; we can call it vnode
> migration.
> Xen (maybe a dom0 application) needs to migrate a vnode (including its
> vcpus and memory) from one physical node to another. The guest NUMA
> topology is not changed, so Xen doesn't need to inform the guest of the
> vnode migration.
> >It's costly. So pinning vcpu to node may be good.
> Agree
> >I think basically pinning a guest to a node is good.
> >If the system becomes imbalanced, and we absolutely want
> >to migrate a guest, then Xen temporarily migrates only the vcpus,
> >and we give up the performance at that time.
> As I mentioned above, it is not a temporary migration. And it will not
> impact performance (it may impact performance only during the
> vnode migration itself).
> And I think imbalance is rare in a VMM if the user doesn't create and
> destroy domains frequently. And the VMs on a VMM are far fewer than
> the applications on a machine.
> - Anthony
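
The "default that the user configuration can override" behaviour described in the quoted text could be sketched as follows. Everything here is an assumption for illustration: the function name is hypothetical, free memory per node stands in for "current workload", and the default is a single vnode on the node with the most free memory.

```python
def choose_layout(user_layout, free_mem_mb, guest_mem_mb):
    """Return the user's layout if given, otherwise pick a default node.

    free_mem_mb maps machine node id -> free memory in MB (a stand-in
    for whatever load metric the hypervisor would really use).
    """
    if user_layout is not None:
        return user_layout  # explicit configuration always wins
    candidates = {n: free for n, free in free_mem_mb.items()
                  if free >= guest_mem_mb}
    if not candidates:
        raise RuntimeError("no single node can hold the guest")
    # Most free memory first; on a tie prefer the lower node id.
    best = max(candidates, key=lambda n: (candidates[n], -n))
    return {"vnode": 1, "machine_node": best}


default = choose_layout(None, {0: 512, 1: 2048}, guest_mem_mb=256)
explicit = choose_layout({"vnode": 2}, {0: 512, 1: 2048}, guest_mem_mb=256)
```

A guest too large for any single node raises an error in this sketch; that is exactly the spanning case discussed above and would need a real policy.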
