WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Re: [Xen-devel] [PATCH 4/6] xen: export NUMA topology in physinfo hcall

To: Ryan Harper <ryanh@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 4/6] xen: export NUMA topology in physinfo hcall
From: Tristan Gingold <Tristan.Gingold@xxxxxxxx>
Date: Wed, 4 Oct 2006 09:30:57 +0200
Cc: Ryan Harper <ryanh@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, xen-ia64-devel <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 04 Oct 2006 00:26:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <20061003203727.GJ12702@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20060929185849.GE12702@xxxxxxxxxx> <200610031144.40820.Tristan.Gingold@xxxxxxxx> <20061003203727.GJ12702@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.5
On Tuesday 03 October 2006 22:37, Ryan Harper wrote:
> * Tristan Gingold <Tristan.Gingold@xxxxxxxx> [2006-10-03 04:40]:
> > On Friday 29 September 2006 20:58, Ryan Harper wrote:
> > > This patch modifies the physinfo hcall to export NUMA CPU and Memory
> > > topology information.  The new physinfo hcall is integrated into libxc
> > > and xend (xm info specifically).  Included in this patch is a minor
> > > tweak to xm-test's xm info testcase.  The new fields in xm info are:
> > >
> > > nr_nodes               : 4
> > > mem_chunks             : node0:0x0000000000000000-0x0000000190000000
> > >                          node1:0x0000000190000000-0x0000000300000000
> > >                          node2:0x0000000300000000-0x0000000470000000
> > >                          node3:0x0000000470000000-0x0000000640000000
> > > node_to_cpu            : node0:0-7
> > >                          node1:8-15
> > >                          node2:16-23
> > >                          node3:24-31
> >
> > Hi,
> >
> > I have successfully applied this patch on xen-ia64-unstable.  It required
> > a small additional patch to fix a few issues.
>
> Thanks for giving the patches a test.
>
> > I have tested it on a 4-node, 24-CPU system.
> >
> > I have two suggestions for physinfo hcall:
> > * We (Bull) already sell machines with more than 64 CPUs (up to 128).
> > Unfortunately the physinfo interface works with at most 64 CPUs.  May I
> > suggest replacing the node_to_cpu map with a cpu_to_node map?
>
> That is fine.  It shouldn't be too much trouble to pass up an array of
> cpu_to_node and convert to node_to_cpu (I like the brevity of the above
> display; based on number of nodes rather than number of cpus).  Does
> that sound reasonable?
I like the current display, and yes it sounds reasonable.
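To illustrate the conversion discussed above: given a per-CPU node array, the tools can still rebuild the compact node0:0-7 display on their side.  This is only a sketch; the identifiers (example_cpu_to_node, first_cpu_of_node) are made up for the example and are not the real physinfo interface.

```c
/* Illustrative sketch only: the names below are invented for this
 * example, not the actual physinfo fields.  Given a per-CPU node
 * mapping, recover the first CPU of a node, which is enough to print
 * the compact node0:0-7 style display from a cpu_to_node array. */

/* Example topology matching the display quoted above: 8 contiguous
 * CPUs per node, so CPU c belongs to node c/8. */
static int example_cpu_to_node(int cpu)
{
    return cpu / 8;
}

/* Scan the cpu_to_node mapping for the first CPU belonging to `node`;
 * returns -1 if the node has no CPUs. */
static int first_cpu_of_node(int node, int nr_cpus)
{
    int cpu;

    for (cpu = 0; cpu < nr_cpus; cpu++)
        if (example_cpu_to_node(cpu) == node)
            return cpu;
    return -1;
}
```

With a cpu_to_node array the interface scales with the number of CPUs rather than capping the CPU count at the width of a per-node bitmap.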

> > * On ia64, memory can be sparsely populated.  There is no real relation
> > between the number of nodes and the number of memory chunks.  May I
> > suggest adding a new field (nr_mem_chunks) to physinfo?  It should be a
> > read/write field: on output it should return the number of mem chunks
> > (which can be greater than the input value if the buffer was too small).
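The read/write count convention suggested above could look like the following sketch.  The struct and function names are hypothetical, not the real physinfo layout: the caller sets nr_mem_chunks to its buffer capacity, and the callee writes back the true total, so a result larger than the input signals a too-small buffer.

```c
/* Hypothetical sketch of the suggested in/out count field; the names
 * example_physinfo and report_mem_chunks are invented for this example. */
struct example_physinfo {
    unsigned int nr_mem_chunks;   /* in: buffer capacity, out: actual count */
};

/* Pretend the machine really has `actual` chunks: fill at most
 * `capacity` descriptors, then report the real total back through the
 * same field so the caller can detect truncation and retry. */
static unsigned int report_mem_chunks(struct example_physinfo *info,
                                      unsigned int actual)
{
    unsigned int capacity = info->nr_mem_chunks;
    unsigned int filled = actual < capacity ? actual : capacity;

    /* ... copy `filled` chunk descriptors into the caller's buffer ... */

    info->nr_mem_chunks = actual;
    return filled;
}
```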
>
> Even if it is sparsely populated, won't each of the chunks "belong" to a
> particular node?  The above list of 4 entries is not hard-coded, but a
> result of the behavior of the srat table memory affinity parsing.
>
> The current srat code from Linux x86_64 (specifically,
> acpi_numa_memory_affinity_init()) merges each memory entry from
> the srat table based on the entry's proximity value (a.k.a. node
> number).
>
> It will grow the node's memory range either down or up if the new
> entry's start or end is outside the node's current range:
>
>  if (!node_test_and_set(node, nodes_parsed)) {
>         nd->start = start;
>         nd->end = end;
>  } else {
>         if (start < nd->start)
>             nd->start = start;
>         if (nd->end < end)
>             nd->end = end;
>  }
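The merge logic quoted above can be exercised standalone.  This is only a sketch: struct node_range and merge_entry are illustrative names for this example, not the Linux structures, but the grow-down/grow-up behavior is the same.

```c
/* Standalone sketch of the quoted merge logic: each SRAT memory entry
 * either initializes its node's range or widens it to cover the entry.
 * The names here are invented for the example, not the Linux code. */
struct node_range {
    unsigned long start, end;
    int present;   /* stands in for the nodes_parsed test-and-set */
};

static void merge_entry(struct node_range *nd, unsigned long start,
                        unsigned long end)
{
    if (!nd->present) {
        /* First entry for this node: it defines the initial range. */
        nd->present = 1;
        nd->start = start;
        nd->end = end;
    } else {
        /* Later entries only widen the range, never shrink it. */
        if (start < nd->start)
            nd->start = start;
        if (nd->end < end)
            nd->end = end;
    }
}
```

Note that merging two disjoint entries this way also swallows any hole between them, which is why a sparsely populated node still collapses to a single start-end range.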
>
> The end result will be a mapping of any number of memory chunks to the
> number of nodes in the system as each chunk must belong to one node.
>
> One of the goals of the NUMA patches was not to re-invent this parsing
> and these data structures, but to reuse what is available in Linux.
> It may be that the x86_64 srat table parsing in Linux differs from the
> ia64 parsing in Linux.  Is there something that needs fixing here?
On ia64 we have reused the ia64 code from Linux, so we don't share all of
the srat parsing code.

I know that on my 4-node system there are 5 srat entries.  I have to check
whether the entries can be merged.
Stay tuned!

Tristan.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
