xen-devel

RE: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support

To: "Ryan Harper" <ryanh@xxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Sun, 18 Dec 2005 20:18:29 -0000
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Ryan Grimm <grimm@xxxxxxxxxx>
Delivery-date: Sun, 18 Dec 2005 20:21:00 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcYCxeD+MxNiy1CRT6SXydPWXi+TlAAtmSPQ
Thread-topic: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support
> > Personally, I think we should have separate buddy allocators
> > for each of the zones; much simpler and faster in the common case.
> 
> I'm not sure how having multiple buddy allocators helps one 
> choose memory local to a node.  Do you mean to have a buddy 
> allocator per node?

Absolutely. You try to allocate from the local node, and then fall back
to others.
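
To make the fallback concrete, a minimal sketch of per-node allocation
might look like the code below. All of the names and types are
hypothetical stand-ins, not the actual Xen heap interface: each node
owns its own heap, and the allocator walks the nodes starting with the
one the caller asked for.

    /* Hypothetical per-node allocation with fallback; these names are
     * illustrative stand-ins, not the real Xen heap interface. */
    #include <stdlib.h>

    #define MAX_NUMNODES 64
    #define PAGE_SIZE    4096

    struct node_heap { int unused; };  /* per-node free lists live here */
    static struct node_heap node_heaps[MAX_NUMNODES];
    static unsigned int nr_nodes = 2;

    /* Stand-in for a per-node buddy allocation; a real one would only
     * hand out pages from this node's own free lists. */
    static void *heap_alloc(struct node_heap *h, unsigned int order)
    {
        (void)h;
        return malloc((size_t)PAGE_SIZE << order);
    }

    /* Try the requested node first, then fall back to the others. */
    void *alloc_on_node(unsigned int node, unsigned int order)
    {
        unsigned int i;

        for (i = 0; i < nr_nodes; i++)
        {
            unsigned int n = (node + i) % nr_nodes;
            void *p = heap_alloc(&node_heaps[n], order);
            if (p != NULL)
                return p;  /* i == 0 is the local hit, the rest are remote */
        }
        return NULL;       /* every node exhausted */
    }

A real allocator would presumably walk the fallback nodes in distance
order rather than round-robin, but the shape is the same.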
 
> > This makes sense for 1 vcpu guests, but for multi vcpu guests this
> > needs way more discussion. How do we expose the (potentially dynamic)
> > mapping of vcpus to nodes? How do we expose the different memory zones
> > to guests? How does Linux make use of this information? This is a can
> > of worms, definitely phase 2.
> 
> I believe this makes sense for multi-vcpu guests as currently 
> the vcpu to cpu mapping is known at domain construction time 
> and prior to memory allocation.  The dynamic case requires 
> some thought as we don't want to spread memory around, unplug 
> two or three vcpus and potentially incur a large number of 
> misses because the remaining vcpus are not local to all of the 
> domain's memory.

Fortunately we already have a good mechanism for moving pages between
nodes: save/restore could be adapted to do this. For shadow-translate
guests this is even easier, but of course there are other penalties of
running in shadow translate mode the whole time.
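
For illustration, the per-page work during such a move boils down to
something like the sketch below. Every name is a stand-in, not the real
save/restore or p2m code: allocate a replacement page on the target
node, copy, then repoint the guest-physical-to-machine mapping; that
last step is the part that is trivial when the guest already runs
shadow-translated.

    /* Illustrative only: relocate one guest page onto a chosen node.
     * All names here are hypothetical, not the real Xen interfaces. */
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    /* Stubs so the sketch stands alone. */
    static void *alloc_on_node(unsigned int node, unsigned int order)
    {
        (void)node;
        return malloc((size_t)PAGE_SIZE << order);
    }
    static void free_page(void *p) { free(p); }
    static void p2m_update(unsigned long gpfn, void *new_page)
    {
        (void)gpfn; (void)new_page;  /* repoint gpfn at the new frame */
    }

    int migrate_page(unsigned long gpfn, void *old_page,
                     unsigned int target_node)
    {
        void *new_page = alloc_on_node(target_node, 0);

        if (new_page == NULL)
            return -1;               /* target node has no free memory */

        memcpy(new_page, old_page, PAGE_SIZE);
        p2m_update(gpfn, new_page);  /* trivial under shadow translate */
        free_page(old_page);
        return 0;
    }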

> The phase two plan is to provide virtual SRAT and SLIT tables 
> to the guests to leverage existing Linux NUMA code.  Lots to 
> discuss here.

The existing mechanisms (in Linux and other OSes) are not intended for a
dynamic situation. I guess that will be phase 3, but it may mean that
using the SRAT is not the best way of communicating this information. 

Best,
Ian

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
