[Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] [PATCH] 0/7 xen: Add basic NUMA support
From: Ryan Harper <ryanh@xxxxxxxxxx>
Date: Fri, 16 Dec 2005 17:01:49 -0600
Cc: Ryan Grimm <grimm@xxxxxxxxxx>
Delivery-date: Fri, 16 Dec 2005 23:04:04 +0000
User-agent: Mutt/1.5.6+20040907i
This patchset adds basic NUMA support to Xen (hypervisor only).  We
borrowed Linux's support for NUMA SRAT table parsing, discontiguous
memory tracking (mem chunks), and cpu support (node_to_cpumask etc.).

The hypervisor parses the SRAT tables and constructs mappings for each
node such as node to cpu mappings and memory range to node mappings.
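
To illustrate the kind of mappings described above, here is a minimal
sketch.  It is not the code from the patches: the struct layout, the
table sizes, and the names numa_map, numa_add_chunk, numa_add_cpu,
phys_to_node and cpu_to_node are all hypothetical, and a 64-bit mask
stands in for a real cpumask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHUNKS 8      /* hypothetical table size */
#define MAX_NODES  4      /* hypothetical node limit */
#define NO_NODE    (-1)

/* One contiguous physical range owned by a node (assumed layout). */
struct mem_chunk {
    uint64_t start;       /* inclusive physical address */
    uint64_t end;         /* exclusive physical address */
    int      node;
};

struct numa_map {
    struct mem_chunk chunks[MAX_CHUNKS];
    size_t           nr_chunks;
    uint64_t         node_to_cpumask[MAX_NODES]; /* bit n set: cpu n is on the node */
};

/* Record one memory range for a node; returns 0 on success, -1 on bad input. */
static int numa_add_chunk(struct numa_map *m, uint64_t start, uint64_t end, int node)
{
    assert(m != NULL);
    if (m->nr_chunks >= MAX_CHUNKS || node < 0 || node >= MAX_NODES || start >= end)
        return -1;
    m->chunks[m->nr_chunks++] = (struct mem_chunk){ start, end, node };
    return 0;
}

/* Record that a cpu belongs to a node; returns 0 on success, -1 on bad input. */
static int numa_add_cpu(struct numa_map *m, unsigned int cpu, int node)
{
    assert(m != NULL);
    if (cpu >= 64 || node < 0 || node >= MAX_NODES)
        return -1;
    m->node_to_cpumask[node] |= (uint64_t)1 << cpu;
    return 0;
}

/* Memory range to node lookup: linear scan over the recorded chunks. */
static int phys_to_node(const struct numa_map *m, uint64_t paddr)
{
    assert(m != NULL);
    for (size_t i = 0; i < m->nr_chunks; i++)
        if (paddr >= m->chunks[i].start && paddr < m->chunks[i].end)
            return m->chunks[i].node;
    return NO_NODE;
}

/* Cpu to node lookup: find the node whose mask contains the cpu. */
static int cpu_to_node(const struct numa_map *m, unsigned int cpu)
{
    assert(m != NULL);
    if (cpu >= 64)
        return NO_NODE;
    for (int n = 0; n < MAX_NODES; n++)
        if (m->node_to_cpumask[n] & ((uint64_t)1 << cpu))
            return n;
    return NO_NODE;
}
```

A caller would fill the map once while walking the SRAT entries, then
use phys_to_node() and cpu_to_node() for lookups; addresses that fall
in a hole between chunks return NO_NODE.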

Using this information, we also modified the page allocator to provide a
simple NUMA-aware API.  The modified allocator will attempt to find
pages local to the cpu where possible, but will fall back on using
memory that is of the requested size rather than fragmenting larger
contiguous chunks to find local pages.  We expect to tune this algorithm
in the future after further study.
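
The fallback order described above can be sketched as follows.  This
is a toy model under stated assumptions, not the allocator from the
patches: free lists are reduced to per-node, per-order counters, and
the names heap, take_and_split and numa_alloc are hypothetical.  The
third pass (what to do when no block of the requested size exists on
any node) is our own guess for illustration; the description above
does not specify it.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES  2   /* hypothetical node count */
#define MAX_ORDER 4   /* block orders run 0..MAX_ORDER */

/* Toy buddy heap: number of free blocks per node and per order. */
struct heap {
    unsigned int free[NR_NODES][MAX_ORDER + 1];
};

/* Take one block of from_order on node and split it down to order,
 * returning the unused half at each level to the free counters. */
static void take_and_split(struct heap *h, int node, int from_order, int order)
{
    h->free[node][from_order]--;
    while (from_order > order) {
        from_order--;
        h->free[node][from_order]++;
    }
}

/* Returns the node the block came from, or -1 if nothing is free. */
static int numa_alloc(struct heap *h, int local, int order)
{
    assert(h != NULL);
    assert(local >= 0 && local < NR_NODES);
    assert(order >= 0 && order <= MAX_ORDER);

    /* Pass 1: a block of the requested size on the local node. */
    if (h->free[local][order] > 0) {
        h->free[local][order]--;
        return local;
    }

    /* Pass 2: a block of the requested size on a remote node, in
     * preference to fragmenting a larger local block. */
    for (int n = 0; n < NR_NODES; n++) {
        if (n != local && h->free[n][order] > 0) {
            h->free[n][order]--;
            return n;
        }
    }

    /* Pass 3 (assumption): split the smallest larger block,
     * trying the local node first. */
    for (int i = 0; i < NR_NODES; i++) {
        int n = (local + i) % NR_NODES;
        for (int o = order + 1; o <= MAX_ORDER; o++) {
            if (h->free[n][o] > 0) {
                take_and_split(h, n, o, order);
                return n;
            }
        }
    }
    return -1;
}
```

With one order-3 block free on node 0 and one order-0 block free on
node 1, an order-0 request from a cpu on node 0 is served from node 1
and the large local block stays intact.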

We also modified Xen's increase_reservation memory op to balance memory
distribution across the vcpus in use by a domain.  Relying on previous
patches which have already been committed to xen-unstable, a guest can be
constructed such that its entire memory is contained within a specific
NUMA node.
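
One simple way to picture the balancing is a round-robin over the
domain's vcpus, with each extent allocated from the node of the vcpu
it is assigned to.  The sketch below only counts where extents would
land; it is an illustration under that assumption, not the logic in
the patch, and balance_extents and vcpu_node are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4   /* hypothetical node count */

/* vcpu_node[v] is the node of the cpu that vcpu v runs on.
 * Extent i is charged to the node of vcpu (i % nr_vcpus);
 * per_node[n] receives the number of extents placed on node n. */
static void balance_extents(const int *vcpu_node, size_t nr_vcpus,
                            size_t nr_extents, size_t per_node[NR_NODES])
{
    assert(vcpu_node != NULL && per_node != NULL);
    assert(nr_vcpus > 0);

    for (size_t n = 0; n < NR_NODES; n++)
        per_node[n] = 0;

    for (size_t i = 0; i < nr_extents; i++) {
        int node = vcpu_node[i % nr_vcpus];
        assert(node >= 0 && node < NR_NODES);
        per_node[node]++;
    }
}
```

Under this model, a domain whose vcpus all sit on one node gets every
extent from that node, while a domain with vcpus spread evenly over
two nodes gets its extents split evenly between them.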

We've added a keyhandler for exposing some of the NUMA-related
information and statistics that pertain to the hypervisor.

We export NUMA system information via the physinfo hypercall.  This
information provides cpu/memory topology and configuration information
gleaned from the SRAT tables to userspace applications.  Currently, xend
doesn't leverage any of the information automatically but we intend to
do so in the future.

We've integrated NUMA information into xentrace so we can track various
points such as page allocator hits and misses as well as other
information.  In the process of implementing the trace, we also fixed
some incorrect assumptions about the symmetry of NUMA systems w.r.t. the
sockets_per_node value.  Details are available in a later email with the
patch.

These patches have been tested on several IBM NUMA and non-NUMA systems:

NUMA-aware systems: 
IBM Dual Opteron:  2 Node,  2 CPU,  4GB 
IBM x445        :  4 Node, 32 CPU, 32GB 
IBM x460        :  1 Node,  8 CPU, 16GB
IBM x460        :  2 Node, 32 CPU, 32GB

Non-NUMA-aware systems (i.e., no SRAT tables):
IBM Dual Xeon   :  1 Node,  2 CPU,  2GB 
IBM P4          :  1 Node,  1 CPU,  1GB


We look forward to your review of the patches for acceptance.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@xxxxxxxxxx
