

Re: [Xen-devel] [RFC] Xen NUMA strategy

To: aron@xxxxxx
Subject: Re: [Xen-devel] [RFC] Xen NUMA strategy
From: André Przywara <andre@xxxxxxxxx>
Date: Thu, 20 Sep 2007 12:26:05 +0200
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Anthony.Xu@xxxxxxxxx
Delivery-date: Thu, 20 Sep 2007 03:27:00 -0700
Reply-to: André Przywara <andre.przywara@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird (X11/20061206)
Hi Aron,

>> 1.) Guest NUMA support: spread a guest's resources (CPUs and memory)
>> over several nodes and propagate the appropriate topology to the
>> guest. ...
>It seems like you are proposing two things at once here.  Let's call
>these 1a and 1b
>1a. Expose NUMA topology to the guests.  This isn't the topology of
>    dom0, just the topology of the domU, i.e. it is constructed by
>    dom0 when starting the domain.
>1b. Spread the guest over nodes.  I can't tell if you mean to do this
>    automatically or by request when starting the guest.  This seems
>    to be separate from 1a.
From an implementation point of view this is right: if you look at the patches I sent in mid-August, those parts are done in separate patches.
Patch 3/4 takes care of 1b), Patch 4/4 is about 1a).
But neither part makes much sense on its own. If you spread the guest over several nodes and don't tell the guest OS about it, you will get about the same behaviour Xen had before the integration of the basic NUMA patches from Ryan Harper in October 2006.

>>       ***Disadvantages***:
>> - The guest has to support NUMA...
>> - The guest's workload has to fit NUMA...
>IMHO the list of disadvantages is only what we have in xen today.
>Presently no guests can see the NUMA topology, so it's the same as if
>they don't have support in the guest.  Adding NUMA topology
>propagation does not create these disadvantages, it simply exposes the
>weakness of the lesser operating systems.
These were mostly meant as disadvantages compared to solution 2).

>> 2.) Dynamic load balancing and page migration:
>Again, this seems like a two-part proposal.
>2a. Add to xen the ability to run a guest within a node, so that cpus
>    and ram are allocated from within the node instead of randomly
>    across the system.
This is already in Xen, at least if you manually pin the guest to a certain node _before_ creating it (for instance with cpus=0,1 if the first node consists of the first two CPUs). Xen will then try to allocate the guest's memory from within the node the first VCPU is currently scheduled on (at least for HVM guests).
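
To make the manual pinning concrete, here is a minimal sketch of how the cpus value could be derived for one node. Guest config files are read as Python, so the same helper could in principle sit in the config itself. The NODE_TO_CPUS table and the cpus_for_node helper are assumptions for illustration only; the real node layout has to be looked up on the host.

```python
# Hypothetical host topology: node id -> physical CPU ids (assumed values,
# matching the "first node consists of the first two CPUs" example).
NODE_TO_CPUS = {0: [0, 1], 1: [2, 3]}


def cpus_for_node(node, node_to_cpus=NODE_TO_CPUS):
    """Return a cpus string that restricts a guest to all CPUs of one node."""
    return ",".join(str(cpu) for cpu in sorted(node_to_cpus[node]))


# Settings as they might appear in the guest config before creating the guest.
vcpus = 2
cpus = cpus_for_node(0)  # pin to the first node
print(cpus)
```

With the assumed table this gives the cpus=0,1 setting mentioned above; picking node 1 instead would pin the guest to the second pair of CPUs.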

>2b. NUMA balancing.  While this seems like a worthwhile goal, IMHO
>    it's separate from the first part of the proposal.
This is most of the work that has to be done.

> If the mechanics of migrating between NUMA nodes is implemented in the
> hypervisor, then policy and control can be implemented in dom0
> userland, so none of the automatic part of this needs to be in the
> hypervisor.
This may be true. At the very least there should be some means to manually migrate domains between nodes, which must be triggered from Dom0, so automatic behavior could be triggered from there, too.
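
As a rough illustration of that split, a Dom0 userland policy could look something like the sketch below. It only decides which domain to move and where; the actual node-to-node migration would be a hypervisor mechanism that is not modelled here. All names (pick_migration, node_load, domain_node, domain_load) and the load figures are hypothetical, and the heuristic is just one simple choice, not a proposal for the real policy.

```python
def pick_migration(node_load, domain_node, domain_load):
    """Pick one (domain, target_node) move from the busiest to the idlest node.

    node_load:   node id -> total load on that node
    domain_node: domain name -> node it currently runs on
    domain_load: domain name -> load caused by that domain
    Returns None if no move would reduce the imbalance.
    """
    busiest = max(node_load, key=node_load.get)
    idlest = min(node_load, key=node_load.get)
    gap = node_load[busiest] - node_load[idlest]

    # Moving a domain with load L changes the gap to |gap - 2L|, so only
    # domains with L < gap actually improve the balance.
    candidates = [d for d, n in domain_node.items()
                  if n == busiest and domain_load[d] < gap]
    if not candidates:
        return None

    # Prefer the domain whose move leaves the smallest remaining gap.
    best = min(candidates, key=lambda d: abs(gap - 2 * domain_load[d]))
    return best, idlest


decision = pick_migration(
    node_load={0: 3.0, 1: 1.0},
    domain_node={"a": 0, "b": 0, "c": 1},
    domain_load={"a": 2.0, "b": 1.0, "c": 1.0},
)
print(decision)
```

The same function could serve a manual tool (an administrator asks for a suggestion) and an automatic daemon (it applies the suggestion), which is the point of keeping the policy in Dom0.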


Andre Przywara
AMD - Operating System Research Center, Dresden, Germany
