To: Dulloor <dulloor@xxxxxxxxx>, "Cui, Dexuan" <dexuan.cui@xxxxxxxxx>
Subject: [Xen-devel] NUMA guest: best-fit-nodes algorithm (was Re: [PATCH 00/11] PV NUMA Guests)
From: Andre Przywara <andre.przywara@xxxxxxx>
Date: Fri, 23 Apr 2010 14:45:58 +0200
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>
Delivery-date: Fri, 23 Apr 2010 05:48:47 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.23 (X11/20090820)

Dulloor wrote:
> Cui, Dexuan <dexuan.cui@xxxxxxxxx> wrote:
>> xc_select_best_fit_nodes() decides the "min-set" of host nodes that
>> will be used for the guest. It only considers the current memory
>> usage of the system. Maybe we should also consider the cpu load? And
>> must the number of nodes be 2^n? And how do we handle the case where
>> #vcpu < #vnode?
>> And it looks like your patches only consider the guest's memory
>> requirement -- the guest's vcpu requirement is neglected? E.g., a
>> guest may not need a very large amount of memory while it needs many
>> vcpus. xc_select_best_fit_nodes() should consider this when
>> determining the number of vnodes.
> I agree with you. I was planning to consider vcpu load as the next
> step. Also, I am looking for a good heuristic. I looked at the
> nodeload heuristic (currently in xen), but found it too naive.
> But if you/Andre think it is a good heuristic, I will add the
> support. Actually, I think in the future we should do away with
> strict vcpu-affinities and rely more on a scheduler with the
> necessary NUMA support to complement our placement strategies.
>
> As of now, we don't SPLIT if #vcpu < #vnode; we use STRIPING in that
> case.
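
For reference, a minimal sketch of that SPLIT-vs-STRIPE fallback as I
read it -- the names and structure below are purely illustrative and
not taken from the patch series:

enum numa_strategy { NUMA_SPLIT, NUMA_STRIPE };

static enum numa_strategy pick_strategy(unsigned int nr_vcpus,
                                        unsigned int nr_vnodes)
{
    /* With fewer vcpus than vnodes we cannot give every vnode its own
     * vcpu, so fall back to striping the guest's memory across the
     * chosen host nodes instead of exposing vnodes to the guest. */
    return (nr_vcpus < nr_vnodes) ? NUMA_STRIPE : NUMA_SPLIT;
}
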
Determining the current load of a node is quite hard to do in Xen at
the moment. If guests are pinned to nodes (which I'd consider necessary
with the current credit scheduler), then using this affinity is a good
heuristic for finding suitable nodes, at least the best I can think of.
So until we have a NUMA-aware scheduler, we should go with this
solution. Of course it only measures the theoretical load of a node and
doesn't distinguish between idle and loaded guests. One would need
something like a permanently running xm top to gather statistics about
each guest's load, but that is something for a future patch.
(Or is there a guest load metric already measured in Xen?)
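
To make the idea concrete, here is a minimal, self-contained sketch of
such an affinity-based node-load heuristic. All structures and numbers
are made up for the example; a real implementation would have to pull
the domain and affinity information from the hypervisor via libxc:

/* Sketch only: estimate per-node load from vcpu pinning and pick the
 * least loaded node for a new guest. */
#include <stdio.h>

#define MAX_NODES 8

struct dom_info {
    unsigned int nr_vcpus;
    unsigned int node_affinity;  /* bitmask of nodes the domain is pinned to */
};

/* Charge each domain's vcpus to every node in its affinity mask.  This
 * is the theoretical load mentioned above: it counts where vcpus may
 * run, not how busy they actually are. */
static void node_load(const struct dom_info *doms, int nr_doms,
                      unsigned int load[MAX_NODES])
{
    for (int n = 0; n < MAX_NODES; n++)
        load[n] = 0;
    for (int d = 0; d < nr_doms; d++)
        for (int n = 0; n < MAX_NODES; n++)
            if (doms[d].node_affinity & (1u << n))
                load[n] += doms[d].nr_vcpus;
}

/* Return the index of the least loaded node. */
static int best_node(const unsigned int load[MAX_NODES], int nr_nodes)
{
    int best = 0;
    for (int n = 1; n < nr_nodes; n++)
        if (load[n] < load[best])
            best = n;
    return best;
}

int main(void)
{
    /* Example host: two nodes, three guests pinned by node. */
    struct dom_info doms[] = {
        { .nr_vcpus = 4, .node_affinity = 1u << 0 },
        { .nr_vcpus = 2, .node_affinity = 1u << 0 },
        { .nr_vcpus = 2, .node_affinity = 1u << 1 },
    };
    unsigned int load[MAX_NODES];

    node_load(doms, 3, load);
    printf("least loaded node: %d\n", best_node(load, 2));
    return 0;
}

Replacing the pinned-vcpu counts with actual per-guest CPU statistics
(the xm top idea above) would be the natural way to extend this once
such numbers are collected.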

Regards,
Andre.


--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448-3567-12


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
