Re: [Xen-devel] [PATCH 4/6] xen: export NUMA topology in physinfo hcall
On Tuesday 03 October 2006 22:37, Ryan Harper wrote:
> * Tristan Gingold <Tristan.Gingold@xxxxxxxx> [2006-10-03 04:40]:
> > On Friday 29 September 2006 20:58, Ryan Harper wrote:
> > > This patch modifies the physinfo hcall to export NUMA CPU and Memory
> > > topology information. The new physinfo hcall is integrated into libxc
> > > and xend (xm info specifically). Included in this patch is a minor
> > > tweak to xm-test's xm info testcase. The new fields in xm info are:
> > >
> > > nr_nodes : 4
> > > mem_chunks : node0:0x0000000000000000-0x0000000190000000
> > > node1:0x0000000190000000-0x0000000300000000
> > > node2:0x0000000300000000-0x0000000470000000
> > > node3:0x0000000470000000-0x0000000640000000
> > > node_to_cpu : node0:0-7
> > > node1:8-15
> > > node2:16-23
> > > node3:24-31
> >
> > Hi,
> >
> > I have successfully applied this patch on xen-ia64-unstable. It
> > required a small additional patch to fix a few issues.
>
> Thanks for giving the patches a test.
>
> > I have tested it on a 4-node, 24-cpu system.
> >
> > I have two suggestions for physinfo hcall:
> > * We (Bull) already sell machines with more than 64 cpus (up to 128).
> > Unfortunately the physinfo interface works with at most 64 cpus. May I
> > suggest replacing the node_to_cpu map with a cpu_to_node map?
>
> That is fine. It shouldn't be too much trouble to pass up a
> cpu_to_node array and convert it to node_to_cpu for display (I like
> the brevity of the above display, which scales with the number of
> nodes rather than the number of cpus). Does that sound reasonable?
I like the current display, and yes it sounds reasonable.
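For illustration, here is a minimal sketch of that conversion in C. The
array layout, the sizes, and the 8-cpus-per-node fill are all invented
for illustration; this is not the actual physinfo interface:

/* Sketch only: rebuild the per-node cpu ranges shown by "xm info"
 * from a flat cpu_to_node map. */
#include <stdio.h>

#define NR_CPUS		32
#define NR_NODES	4

int main(void)
{
	int cpu_to_node[NR_CPUS];
	int cpu, node;

	/* Pretend the hypercall filled this in: 8 cpus per node. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_to_node[cpu] = cpu / 8;

	/* Invert the map, printing contiguous cpu ranges per node. */
	for (node = 0; node < NR_NODES; node++) {
		int first = -1, printed = 0;

		printf("node%d:", node);
		for (cpu = 0; cpu <= NR_CPUS; cpu++) {
			int match = cpu < NR_CPUS &&
				    cpu_to_node[cpu] == node;

			if (match && first < 0) {
				first = cpu;
			} else if (!match && first >= 0) {
				printf("%s%d-%d", printed++ ? "," : "",
				       first, cpu - 1);
				first = -1;
			}
		}
		printf("\n");
	}
	return 0;
}

With the invented fill above this prints "node0:0-7" through
"node3:24-31", matching the display format shown earlier.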
> > * On ia64, memory can be sparsely populated. There is no real relation
> > between the number of nodes and the number of memory chunks. May I
> > suggest adding a new field (nr_mem_chunks) to physinfo? It should be a
> > read/write field: on output it should return the number of mem chunks
> > (which can be greater than the input value if the buffer was too small).
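To make the suggestion concrete, here is a sketch of the in/out
convention; all structure and field names are invented, not the actual
physinfo layout:

/* Sketch of a read/write chunk count; names invented for illustration. */
struct mem_chunk {
	unsigned long start, end;
	unsigned int node;
};

struct physinfo_sketch {
	/* IN: capacity of the caller's buffer; OUT: chunks in the system. */
	unsigned int nr_mem_chunks;
	struct mem_chunk *mem_chunks;
};

/* Hypervisor side: copy at most the caller's capacity, but always
 * report the real total so the caller can retry with a bigger buffer. */
void fill_mem_chunks(struct physinfo_sketch *pi,
		     const struct mem_chunk *sys, unsigned int total)
{
	unsigned int i, cap = pi->nr_mem_chunks;

	for (i = 0; i < total && i < cap; i++)
		pi->mem_chunks[i] = sys[i];

	pi->nr_mem_chunks = total;	/* > cap: buffer was too small */
}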
>
> Even if memory is sparsely populated, won't each of the chunks "belong"
> to a particular node? The above list of 4 entries is not hard-coded,
> but a result of the SRAT table memory-affinity parsing.
>
> The current SRAT code from Linux x86_64 (specifically,
> acpi_numa_memory_affinity_init()) merges each memory entry from the
> SRAT table based on the entry's proximity value (a.k.a. node number).
>
> It will grow the node's memory range down or up if the new entry's
> start or end falls outside the node's current range:
>
>	if (!node_test_and_set(node, nodes_parsed)) {
>		nd->start = start;
>		nd->end = end;
>	} else {
>		if (start < nd->start)
>			nd->start = start;
>		if (nd->end < end)
>			nd->end = end;
>	}
>
> The end result is that any number of memory chunks maps down to one
> range per node, since each chunk must belong to exactly one node.
>
> One of the goals of the NUMA patches was not to re-invent this parsing
> and its data structures, but to reuse what is already available in
> Linux. It may be that the x86_64 SRAT table parsing in Linux differs
> from the ia64 parsing. Is there something that needs fixing here?
On ia64 we have reused the ia64 code from Linux, so we don't share all
of the SRAT parsing code with x86_64. I know that on my 4-node system
there are 5 SRAT entries. I have to check whether the entries can be
merged.
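As a quick sanity check, here is a sketch that applies the merge logic
quoted above to invented entries; the addresses are made up, and I
still have to dump my real SRAT values:

/* Sketch: five invented SRAT entries, two sharing a node, collapse
 * into four per-node ranges via the x86_64-style merge. */
#include <stdio.h>

#define NR_NODES 4

struct range { unsigned long start, end; };

int main(void)
{
	/* proximity (node), start, end -- values invented for illustration */
	struct { int node; unsigned long start, end; } srat[] = {
		{ 0, 0x000000000UL, 0x080000000UL },
		{ 0, 0x080000000UL, 0x190000000UL },	/* 2nd chunk, node0 */
		{ 1, 0x190000000UL, 0x300000000UL },
		{ 2, 0x300000000UL, 0x470000000UL },
		{ 3, 0x470000000UL, 0x640000000UL },
	};
	struct range nodes[NR_NODES] = { { 0, 0 } };
	unsigned long parsed = 0;	/* bitmask standing in for nodes_parsed */
	unsigned int i;

	for (i = 0; i < sizeof(srat) / sizeof(srat[0]); i++) {
		int node = srat[i].node;
		struct range *nd = &nodes[node];

		if (!(parsed & (1UL << node))) {
			parsed |= 1UL << node;
			nd->start = srat[i].start;
			nd->end = srat[i].end;
		} else {
			if (srat[i].start < nd->start)
				nd->start = srat[i].start;
			if (nd->end < srat[i].end)
				nd->end = srat[i].end;
		}
	}

	for (i = 0; i < NR_NODES; i++)
		printf("node%u:0x%016lx-0x%016lx\n",
		       i, nodes[i].start, nodes[i].end);
	return 0;
}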
Stay tuned!
Tristan.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel