Hi,
(Sorry for the late reply; the mail had already scrolled out of the
window ;-). I will split the thread up to allow quicker and more focused
responses.)
Cui, Dexuan wrote:
> Dulloor wrote:
> ...
> Hi Dulloor,
> In your patches, the toolstack tries to figure out the "best fit
> nodes" for a PV guest and invokes a hypercall set_domain_numa_layout
> to tell the hypervisor to remember the info, and later the PV guest
> invokes a hypercall get_domain_numa_layout to retrieve the info from
> the hypervisor.
> Can this be changed to: the toolstack writes the guest numa info
> directly into a new field in the start_info (or the share_info) (maybe
> in the standard format of the SRAT/SLIT) and later PV guest reads the
> info and uses acpi_numa_init() to parse the info? I think in this way
> the new hypercalls can be avoided and the pv numa enlightenment code
> in guest kernel can be minimized.
> I'm asking this because this is how the HVM NUMA patches of
> Andre do it (the toolstack passes the info to hvmloader and the latter
> builds SRAT/SLIT for the guest).
I think that is a fundamental difference between PV and HVM: in HVM you
naturally have to inject all the information, whereas a PV guest mostly
queries the information it needs. AFAICS the design of PV Linux is to
remove everything that is not absolutely necessary. I once also tried PV
NUMA support, but gave up when I discovered that both NUMA and ACPI were
turned off in the then-recent PV kernels (read: kudos to Dulloor ;-). I
like the ELF hint trick; it solves a big problem we have with HVM
guests: are they NUMA aware or not? If not, striping is maybe a better
option than insisting on the NUMA layout. For HVM guests, though, it is
almost impossible to know this beforehand.
So as far as this goes, I am OK with PV guests using hypercalls to query
the NUMA information. The only thing I would suggest is to leverage the
already existing guest NUMA code and actually provide the info in ACPI
SRAT/SLIT format.
But one has to consider possible runtime changes to the topology, as
this is something that ACPI currently does not provide.
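
To make the SRAT/SLIT suggestion a bit more concrete, here is a rough
sketch of what a SLIT-style distance table returned by such a query
could look like. This is only an illustration under my own assumptions:
the struct, the node limit and the function names are made up and are
not taken from Dulloor's patches or from the Xen interface. It only
mimics the SLIT body (a locality count followed by an N x N byte
matrix, with 10 as the distance of a node to itself) and leaves out the
ACPI table header.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PV_NUMA_MAX_NODES   8   /* hypothetical limit, for illustration only */
#define SLIT_LOCAL_DISTANCE 10  /* ACPI convention: distance of a node to itself */

/* Hypothetical payload: a SLIT body without the ACPI table header. */
struct pv_numa_slit {
    uint64_t nr_localities;
    uint8_t  distance[PV_NUMA_MAX_NODES * PV_NUMA_MAX_NODES];
};

/*
 * Fill a symmetric table: SLIT_LOCAL_DISTANCE on the diagonal and
 * 'remote' everywhere else. Returns 0 on success, -1 on bad arguments.
 */
static int pv_numa_fill_slit(struct pv_numa_slit *slit,
                             unsigned int nr_nodes, uint8_t remote)
{
    unsigned int i, j;

    if (slit == NULL || nr_nodes == 0 || nr_nodes > PV_NUMA_MAX_NODES ||
        remote <= SLIT_LOCAL_DISTANCE)
        return -1;

    memset(slit, 0, sizeof(*slit));
    slit->nr_localities = nr_nodes;

    /* Row-major N x N matrix, as in the SLIT entry layout. */
    for (i = 0; i < nr_nodes; i++)
        for (j = 0; j < nr_nodes; j++)
            slit->distance[i * nr_nodes + j] =
                (i == j) ? SLIT_LOCAL_DISTANCE : remote;

    return 0;
}

/* Look up the distance between two nodes (no bounds checking here). */
static uint8_t pv_numa_distance(const struct pv_numa_slit *slit,
                                unsigned int from, unsigned int to)
{
    return slit->distance[from * slit->nr_localities + to];
}
```

With a layout like this the guest side could hand the matrix to the
existing SLIT parsing code more or less unchanged. A runtime topology
change would still need some separate notification, as noted above.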
Regards,
Andre.
--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12