This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Xen-devel [XEN PATCH] [Linux-PVOPS] ballooning on numa

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] Xen-devel [XEN PATCH] [Linux-PVOPS] ballooning on numa domains
From: Dulloor <dulloor@xxxxxxxxx>
Date: Tue, 10 Nov 2009 01:37:37 -0500
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Mon, 09 Nov 2009 22:38:00 -0800
In-reply-to: <4AF83FBF020000780001E8C0@xxxxxxxxxxxxxxxxxx>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
References: <940bcfd20911082340x25ebdccep3984b4acc2d6ac57@xxxxxxxxxxxxxx> <4AF7E8EC020000780001E68E@xxxxxxxxxxxxxxxxxx> <940bcfd20911090235y704d163el4347a805e5a90b4f@xxxxxxxxxxxxxx> <4AF81FCB020000780001E7AB@xxxxxxxxxxxxxxxxxx> <940bcfd20911090618x4cb8e7deuf6df575f9146a1eb@xxxxxxxxxxxxxx> <4AF83FBF020000780001E8C0@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Mon, Nov 9, 2009 at 10:13 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>> Dulloor <dulloor@xxxxxxxxx> 09.11.09 15:18 >>>
>>On Mon, Nov 9, 2009 at 7:57 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>>>> Dulloor <dulloor@xxxxxxxxx> 09.11.09 11:35 >>>
>>>>-  dom0 can read the numa tables (same as xen). Also, the memory map
>>>>for dom0 is (currently) set in a way that the numa ranges are
>>>>consistent. I don't see that changing, so I feel the assumption is
>>> Pseudo-consistent at best - there's no reason to believe that the node
>>> a physical page appears to live on (by looking up its address in the SRAT)
>>> has any relationship to the node it really lives on.
>>> And even if that was the case, you could easily end up with many (up to
>>> all but one) nodes appearing unpopulated (due to dom0_mem=).
>>Agreed pseudo-consistent (offseted by alloc_spfn). But, even with the
> alloc_spfn (or really the only instance I'm aware of that would matter
> here) is relevant only for the single big blob that contains kernel,
> initial page tables, and such; all other of Dom0's memory can be
> distributed randomly across the address space.

Offset by alloc_spfn (mfn = pfn + alloc_spfn) while setting the vphysmap.
Did you mean when dom0_mem is set?

>>dom0_mem set,
>>the numa ranges are silently clipped, so the mappings are still
> Correct - but, as previously said, with certain (possibly all but one)
> nodes having no memory at all (possibly until ballooning). (Have you
> checked that a previously unpopulated node suddenly becoming
> populated is being handled properly in all respects in the kernel's
> memory management subsystem, and can you guarantee this will
> always be the case in the future?)
You mean dom0 starts with low memory (a few nodes unpopulated)
and then ballooning adds more? But isn't the memory map (for dom0) set
up to dom0-max-mem? And ballooning can only increase/decrease reservations
in dom0's address space. Maybe I didn't understand your point.

>>>>- XENMEMF flags are indeed meant for xen tools. But, ballooning is
>>>>completely xen specific too ... it is a xen tool, except that it
>>>>resides in domain's kernel/tree.
>>> That doesn't help you with the node ID issue: The tools can make
>>> meaningful use of Xen node IDs; if you want to do this in the kernel
>>> you'll have to establish a kernel<->Xen translation of node IDs.
>>For other guest domains, we will need translation (part of my next patches).
>>But, for dom0, translation is implicit due to shared acpi tables.
> Not really - just check setup_node() in Xen: The node ID is software
> assigned, what comes from SRAT is the pxm value.
But it is done the same way in Dom0 and Xen, although I do agree that this
is not guaranteed in the future.

>>I could work on a patch to make mappings fully consistent (by rigging the
>>slit/srat values as seen by dom0), inertia being an interface acceptable to
>>Linux folks. Do we need that?
> Jan

In general, I agree there is work to be done (planned for in later patches).
Please do let me know any ideas you have.

But, as far as this patch is concerned, it tries to ensure only one thing:
that the node distribution of memory remains the same across ballooning,
acknowledging that mappings can change underneath and making no other
assumptions. It might help in some cases and is a no-op in others.
Whether the initial distribution is consistent or pseudo-consistent is
a matter of more work.
Moreover, this is just best effort, since even if XENMEMF_node(n) is set,
the allocation inside Xen could still come from other nodes' heaps.

If you/Jeremy don't find this (incremental) patch useful, we can drop
it for now and that's fine with me! :)

