xen-devel

RE: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support

To: "Ryan Harper" <ryanh@xxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Sun, 18 Dec 2005 20:18:29 -0000
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Ryan Grimm <grimm@xxxxxxxxxx>
Delivery-date: Sun, 18 Dec 2005 20:21:00 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcYCxeD+MxNiy1CRT6SXydPWXi+TlAAtmSPQ
Thread-topic: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support
> > Personally, I think we should have separate buddy allocators
> > for each of the zones; much simpler and faster in the common case.
> 
> I'm not sure how having multiple buddy allocators helps one 
> choose memory local to a node.  Do you mean to have a buddy 
> allocator per node?

Absolutely. You try to allocate from the local node, and then fall back
to others.
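
To make the fallback concrete, a minimal sketch of per-node allocation
might look like the code below. All of the names and types are
hypothetical stand-ins, not the actual Xen heap interface: each node
owns its own heap, and the allocator walks the nodes starting with the
one the caller asked for.

    /* Hypothetical per-node allocation with fallback; these names are
     * illustrative stand-ins, not the real Xen heap interface. */
    #include <stdlib.h>

    #define MAX_NUMNODES 64
    #define PAGE_SIZE    4096

    struct node_heap { int unused; };  /* per-node free lists live here */
    static struct node_heap node_heaps[MAX_NUMNODES];
    static unsigned int nr_nodes = 2;

    /* Stand-in for a per-node buddy allocation; a real one would only
     * hand out pages from this node's own free lists. */
    static void *heap_alloc(struct node_heap *h, unsigned int order)
    {
        (void)h;
        return malloc((size_t)PAGE_SIZE << order);
    }

    /* Try the requested node first, then fall back to the others. */
    void *alloc_on_node(unsigned int node, unsigned int order)
    {
        unsigned int i;

        for (i = 0; i < nr_nodes; i++)
        {
            unsigned int n = (node + i) % nr_nodes;
            void *p = heap_alloc(&node_heaps[n], order);
            if (p != NULL)
                return p;  /* i == 0 is the local hit, the rest are remote */
        }
        return NULL;       /* every node exhausted */
    }

A real allocator would presumably walk the fallback nodes in distance
order rather than round-robin, but the shape is the same.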
 
> > This makes sense for 1 vcpu guests, but for multi vcpu guests this
> > needs way more discussion. How do we expose the (potentially dynamic)
> > mapping of vcpus to nodes? How do we expose the different memory zones
> > to guests? How does Linux make use of this information? This is a can
> > of worms, definitely phase 2.
> 
> I believe this makes sense for multi-vcpu guests as currently 
> the vcpu to cpu mapping is known at domain construction time 
> and prior to memory allocation.  The dynamic case requires 
> some thought as we don't want to spread memory around, unplug 
> two or three vcpus and potentially incur a large number of 
> misses because the remaining vcpus are not local to all of the 
> domain's memory.

Fortunately we already have a good mechanism for moving pages between
nodes: save/restore could be adapted to do this. For shadow-translate
guests this is even easier, but of course there are other penalties of
running in shadow translate mode the whole time.
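
For illustration, the per-page work during such a move boils down to
something like the sketch below. Every name is a stand-in, not the real
save/restore or p2m code: allocate a replacement page on the target
node, copy, then repoint the guest-physical-to-machine mapping; that
last step is the part that is trivial when the guest already runs
shadow-translated.

    /* Illustrative only: relocate one guest page onto a chosen node.
     * All names here are hypothetical, not the real Xen interfaces. */
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* Stubs so the sketch stands alone. */
    static void *alloc_on_node(unsigned int node, unsigned int order)
    {
        (void)node;
        return malloc((size_t)PAGE_SIZE << order);
    }
    static void free_page(void *p) { free(p); }
    static void p2m_update(unsigned long gpfn, void *new_page)
    {
        (void)gpfn; (void)new_page;  /* repoint gpfn at the new frame */
    }

    int migrate_page(unsigned long gpfn, void *old_page,
                     unsigned int target_node)
    {
        void *new_page = alloc_on_node(target_node, 0);

        if (new_page == NULL)
            return -1;               /* target node has no free memory */

        memcpy(new_page, old_page, PAGE_SIZE);
        p2m_update(gpfn, new_page);  /* trivial under shadow translate */
        free_page(old_page);
        return 0;
    }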

> The phase two plan is to provide virtual SRAT and SLIT tables 
> to the guests to leverage existing Linux NUMA code.  Lots to 
> discuss here.

The existing mechanisms (in Linux and other OSes) are not intended for a
dynamic situation. I guess that will be phase 3, but it may mean that
using the SRAT is not the best way of communicating this information. 

Best,
Ian

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
