RE: [Xen-devel] RE: Host Numa information in dom0
It would be good if the discussion includes how guest NUMA
works with (or is exclusive of) migration/save/restore. Also,
the discussion should include the interaction with (or exclusivity
from) the various Xen RAM utilization technologies -- tmem,
page sharing/swapping, and PoD. Obviously it would be great
if Xen could provide both optimal affinity/performance and optimal
flexibility and resource utilization, but I suspect that will
be a VERY difficult combination.
> -----Original Message-----
> From: Ian Pratt [mailto:Ian.Pratt@xxxxxxxxxxxxx]
> Sent: Friday, February 05, 2010 10:39 AM
> To: Kamble, Nitin A; xen-devel@xxxxxxxxxxxxxxxxxxx
> Cc: Ian Pratt
> Subject: [Xen-devel] RE: Host Numa information in dom0
>
> > Attached is the patch which exposes the host numa information to
> > dom0. With the patch, the "xm info" command now also gives the cpu
> > topology & host numa information. This will be later used to build
> > guest numa support.
> >
> > The patch basically changes the physinfo sysctl, adds the
> > topology_info & numa_info sysctls, and also changes the python &
> > libxc code accordingly.
>
>
> It would be good to have a discussion about how we should expose NUMA
> information to guests.
>
> I believe we can control the desired allocation of memory from nodes
> and creation of guest NUMA tables using VCPU affinity masks combined
> with a new boolean option to enable exposure of NUMA information to
> guests.
>
> For each guest VCPU, we should inspect its affinity mask to see which
> nodes the VCPU is able to run on, thus building a set of 'allowed node'
> masks. We should then compare all the 'allowed node' masks to see how
> many unique node masks there are -- this corresponds to the number of
> NUMA nodes that we wish to expose to the guest if this guest has NUMA
> enabled. We would apportion the guest's pseudo-physical memory equally
> between these virtual NUMA nodes.
>
> If guest NUMA is disabled, we just use a single node mask which is the
> union of the per-VCPU node masks.
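
[Editorial note: a minimal Python sketch of the derivation described above. The
cpu_to_node map and per-VCPU affinity sets are assumed inputs (e.g. as a
toolstack could build them from the exposed topology/numa info); the helper
names are illustrative, not an actual toolstack API.]

    # Sketch only: derive virtual NUMA nodes from VCPU affinity masks.
    def allowed_node_mask(affinity, cpu_to_node):
        """Set of physical nodes a VCPU may run on, given its CPU affinity."""
        return frozenset(cpu_to_node[cpu] for cpu in affinity)

    def guest_numa_layout(vcpu_affinities, cpu_to_node, memory_mb,
                          numa_enabled=True):
        """Return a list of (node_mask, vcpus, memory_mb) virtual nodes."""
        masks = [allowed_node_mask(a, cpu_to_node) for a in vcpu_affinities]

        if not numa_enabled:
            # Single virtual node: the union of all per-VCPU node masks.
            union = frozenset().union(*masks)
            return [(union, list(range(len(masks))), memory_mb)]

        # One virtual node per unique allowed-node mask; guest memory is
        # apportioned equally between the virtual nodes.
        unique = sorted(set(masks), key=sorted)
        per_node_mb = memory_mb // len(unique)
        return [(m, [v for v, vm in enumerate(masks) if vm == m], per_node_mb)
                for m in unique]

    # Example: 4 VCPUs on a 2-node host with 2 CPUs per node.
    cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
    affinities = [{0, 1}, {0, 1}, {2, 3}, {2, 3}]
    print(guest_numa_layout(affinities, cpu_to_node, 4096))
    # -> two virtual nodes: VCPUs 0/1 on node {0}, VCPUs 2/3 on node {1},
    #    2048 MB apportioned to each
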
>
> Where allowed node masks span more than one physical node, we should
> allocate memory to the guest's virtual node by pseudo-randomly striping
> memory allocations (in 2MB chunks) across the specified physical
> nodes. [Pseudo-random is probably better than round-robin.]
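
[Editorial note: a small sketch of the striping idea, assuming nothing about
the real allocator: each 2MB chunk of the virtual node's memory is sourced from
a pseudo-randomly chosen physical node in the allowed mask.]

    import random

    CHUNK_MB = 2  # allocate guest memory in 2MB chunks

    def stripe_allocation(node_mask, memory_mb, seed=0):
        """Pseudo-randomly assign each 2MB chunk to one of the physical
        nodes in node_mask.  Returns {node: MB allocated from that node}."""
        rng = random.Random(seed)      # seeded for repeatability in the example
        nodes = sorted(node_mask)
        usage = dict.fromkeys(nodes, 0)
        for _ in range(memory_mb // CHUNK_MB):
            usage[rng.choice(nodes)] += CHUNK_MB
        return usage

    # e.g. a 4GB virtual node striped across physical nodes 0 and 2
    print(stripe_allocation({0, 2}, 4096))
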
>
> Make sense? I can provide some worked examples.
>
> As regards the socket vs node terminology, I agree the variables are
> probably badly named and would perhaps best be called 'node' and
> 'supernode'. The key thing is that the toolstack should allow hierarchy
> to be expressed when specifying CPUs (using a dotted notation) rather
> than having to specify the enumerated CPU number.
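
[Editorial note: a purely hypothetical illustration of the dotted notation; the
'supernode.node.cpu' syntax and the nested topology structure below are
assumptions, not anything the toolstack currently implements.]

    # Resolve a dotted "supernode.node.cpu" spec against a nested topology
    # instead of a flat enumerated CPU id.
    topology = {            # supernode -> node -> list of CPU ids (assumed shape)
        0: {0: [0, 1], 1: [2, 3]},
        1: {0: [4, 5], 1: [6, 7]},
    }

    def resolve_cpu(spec, topo):
        supernode, node, cpu = (int(x) for x in spec.split("."))
        return topo[supernode][node][cpu]

    print(resolve_cpu("1.0.1", topology))   # -> CPU 5
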
>
>
> Best,
> Ian
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel