On Mon, Nov 9, 2009 at 10:13 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>> Dulloor <dulloor@xxxxxxxxx> 09.11.09 15:18 >>>
>>On Mon, Nov 9, 2009 at 7:57 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>>>> Dulloor <dulloor@xxxxxxxxx> 09.11.09 11:35 >>>
>>>>- dom0 can read the numa tables (same as xen). Also, the memory map
>>>>for dom0 is (currently) set in a way that the numa ranges are
>>>>consistent. I don't see that changing, so I feel the assumption is
>>> Pseudo-consistent at best - there's no reason to believe that the node
>>> a physical page appears to live on (by looking up its address in the SRAT)
>>> has any relationship to the node it really lives on.
>>> And even if that was the case, you could easily end up with many (up to
>>> all but one) nodes appearing unpopulated (due to dom0_mem=).
>>Agreed pseudo-consistent (offset by alloc_spfn). But, even with the
> alloc_spfn (or really the only instance I'm aware of that would matter
> here) is relevant only for the single big blob that contains kernel,
> initial page tables, and such; all other of Dom0's memory can be
> distributed randomly across the address space.
Offset by alloc_spfn (mfn = pfn + alloc_spfn) while setting the vphysmap.
Did you mean when dom0_mem is set?
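To make the offset concrete, here is a minimal sketch. The names
(node_range, frame_to_node, pfn_to_mfn_initial) and the node layout are
made up for illustration and are not the actual Xen code. It only models
the contiguous initial allocation, and shows how a lookup by pfn can name
a different node than a lookup by the backing mfn:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node range in frame numbers, covering [start, end). */
struct node_range {
    unsigned long start;
    unsigned long end;
    int node;
};

/* Return the node whose range contains `frame`, or -1 if none does. */
static int frame_to_node(const struct node_range *r, size_t n,
                         unsigned long frame)
{
    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (frame >= r[i].start && frame < r[i].end)
            return r[i].node;
    return -1;
}

/* Contiguous initial allocation only: mfn = pfn + alloc_spfn. */
static unsigned long pfn_to_mfn_initial(unsigned long pfn,
                                        unsigned long alloc_spfn)
{
    return pfn + alloc_spfn;
}
```

With two nodes covering frames [0, 0x100) and [0x100, 0x200) and
alloc_spfn = 0x80, pfn 0x90 looks like node 0 when looked up by pfn,
but its mfn 0x110 is on node 1.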
>>the numa ranges are silently clipped, so the mappings are still
> Correct - but, as previously said, with certain (possibly all but one)
> nodes having no memory at all (possibly until ballooning). (Have you
> checked that a previously unpopulated node suddenly becoming
> populated is being handled properly in all respects in the kernel's
> memory management subsystem, and can you guarantee this will
> always be the case in the future?)
You mean dom0 starts with little memory (some nodes unpopulated)
and then ballooning adds more? But isn't the memory map (for dom0) set
up to dom0-max-mem? And ballooning can only increase/decrease reservations
within dom0's address space. Maybe I didn't understand your point.
>>>>- XENMEMF flags are indeed meant for xen tools. But, ballooning is
>>>>completely xen specific too ... it is a xen tool, except that it
>>>>resides in domain's kernel/tree.
>>> That doesn't help you with the node ID issue: The tools can make
>>> meaningful use of Xen node IDs; if you want to do this in the kernel
>>> you'll have to establish a kernel<->Xen translation of node IDs.
>>For other guest domains, we will need translation (part of my next patches).
>>But, for dom0, translation is implicit due to shared acpi tables.
> Not really - just check setup_node() in Xen: The node ID is software
> assigned, what comes from SRAT is the pxm value.
But it is done the same way in dom0 and Xen, although I do agree that this
is not guaranteed in the future.
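For reference, a simplified sketch of what "software assigned" means
here. It is not the real setup_node() code; pxm_map, pxm_map_init and
assign_node are hypothetical names, and the first-come numbering is an
assumption made for the example. Node IDs are handed out in the order
the pxm values are first seen, so two parsers agree only as long as
they see the entries in the same order:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PXM 256
#define NO_NODE (-1)

/* Hypothetical pxm -> node table with first-come node numbering. */
struct pxm_map {
    int pxm2node[MAX_PXM];
    int next_node;
};

static void pxm_map_init(struct pxm_map *m)
{
    assert(m != NULL);
    for (int i = 0; i < MAX_PXM; i++)
        m->pxm2node[i] = NO_NODE;
    m->next_node = 0;
}

/* Return the node ID for `pxm`, assigning the next free ID on first use. */
static int assign_node(struct pxm_map *m, unsigned int pxm)
{
    assert(m != NULL);
    if (pxm >= MAX_PXM)
        return NO_NODE;
    if (m->pxm2node[pxm] == NO_NODE)
        m->pxm2node[pxm] = m->next_node++;
    return m->pxm2node[pxm];
}
```

If one side sees pxm 5 before pxm 2 and the other sees them in the
opposite order, pxm 5 becomes node 0 on the first and node 1 on the
second, which is why an explicit translation would be the safer
interface.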
>>I could work on a patch to make mappings fully consistent (by rigging the
>>slit/srat values as seen by dom0), inertia being an interface acceptable to
>>Linux folks. Do we need that ?
In general, I agree there is work to be done (planned for in later patches).
Please do let me know any ideas you have.
But as far as this patch is concerned, it tries to do only one thing: keep
the node distribution of memory the same across ballooning, acknowledging
that mappings can change underneath and making no other assumptions. It
might help in some cases and is a no-op in others.
Whether the initial distribution is consistent or pseudo-consistent is
a matter of more work.
Moreover, this is just best effort, since even if XENMEMF_node(n) is set,
the allocation inside Xen could still come from other nodes' heaps.
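A rough model of that best-effort behaviour, as a sketch only. The
heaps structure, alloc_page_best_effort and the in-order fallback are
hypothetical and do not reflect Xen's actual heap allocator. The
requested node is tried first, and any other node with free pages is
used if it is empty:

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4
#define NO_NODE  (-1)

/* Hypothetical per-node heap: just a count of free pages. */
struct heaps {
    unsigned long free_pages[NR_NODES];
};

/*
 * Take one page, preferring node `want`. If that heap is empty, fall
 * back to the other nodes in index order. Returns the node the page
 * actually came from, or NO_NODE if every heap is empty.
 */
static int alloc_page_best_effort(struct heaps *h, int want)
{
    assert(h != NULL);
    if (want >= 0 && want < NR_NODES && h->free_pages[want] > 0) {
        h->free_pages[want]--;
        return want;
    }
    for (int node = 0; node < NR_NODES; node++) {
        if (h->free_pages[node] > 0) {
            h->free_pages[node]--;
            return node;
        }
    }
    return NO_NODE;
}
```

The caller only learns which node was used from the return value, so
a balloon driver built on a hint like this cannot assume the page it
gets back lives on the node it asked for.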
If you/Jeremy don't find this (incremental) patch useful, we can drop
it for now, and that's fine with me! :)