WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

[Xen-devel] RE: Power aware credit scheduler

To: "Emmanuel Ackaouy" <ackaouy@xxxxxxxxx>
Subject: [Xen-devel] RE: Power aware credit scheduler
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Thu, 19 Jun 2008 21:32:44 +0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, "Wei, Gang" <gang.wei@xxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Delivery-date: Thu, 19 Jun 2008 06:33:20 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <46C27FF3-24A7-48DA-9ABA-BCCB3E9DD30C@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <D470B4E54465E3469E2ABBC5AFAC390F024D9444@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <46C27FF3-24A7-48DA-9ABA-BCCB3E9DD30C@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcjSDbMZejSzEOCzSqG+TYtNQtVGuAAAENJg
Thread-topic: Power aware credit scheduler
>From: Emmanuel Ackaouy [mailto:ackaouy@xxxxxxxxx] 
>Sent: June 19, 2008 21:09
>
>Hi Kevin.
>
>I'm glad you're looking at this. There are a bunch of interesting
>areas to look at to improve scheduling on large hierarchical
>systems. The idle loop is at the center of most of them.

Agree.

>
>On Jun 19, 2008, at 6:51 , Tian, Kevin wrote:
>> a) when there's more idle cpus than required
>>
>> a.1) csched_cpu_pick
>>      The existing policy is to pick the CPU with the most idle
>> neighbours, to avoid shared-resource contention among cores or
>> threads. However, from a power P.O.V., a package-level C-state saves
>> much more power than a per-core C-state. From this angle, it might be
>> better to keep an idle package continuously idle, and instead pick
>> idle cores/threads whose neighbours are already busy, if
>> csched_private.power is set. The performance/watt ratio improves,
>> though absolute performance takes a small hit.
>
>Regardless of any new knobs, a good default behavior might be
>to only take a package out of C-state when another non-idle
>package has had more than one VCPU active on it over some
>reasonable amount of time.
>
>By default, putting multiple VCPUs on the same physical package
>when other packages are idle is obviously not always going to
>be optimal. Maybe it's not a bad default for VCPUs that are
>related (same VM or qemu)? I think Ian P hinted at this. But it
>frightens me that you would always do this by default for any set
>of VCPUs. Power saving is good, but so is memory bandwidth.

Enabling this feature depends on a control command from the system
administrator, who knows the tradeoff. From an absolute-performance
P.O.V. I agree it's not optimal. However, looking at it from the
performance/watt (i.e. power efficiency) angle, the power saved by
package-level idle may outweigh the performance impact of keeping
activity concentrated on the other package. Of course memory latency
should also be considered on NUMA systems, as you mentioned.

Note that we'll never keep one package idle while another package
has a VCPU pending in its runqueue. Even when this power-aware
feature is configured, it only takes effect when the number of CPUs
is larger than the number of runnable VCPUs.

This is just like the power profiles that prevalent OSes let
users choose... :-)
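To make the pick policy concrete, here is a rough standalone sketch (not
real Xen code -- the real csched_cpu_pick() works on cpumasks and per-CPU
scheduler state; package_of(), the cpu_idle[] array, and the CPU/package
sizes below are all made up for illustration):

```c
/* Illustrative sketch only, not actual Xen code. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS      8
#define CPUS_PER_PKG 4

static bool cpu_idle[NR_CPUS];

/* Hypothetical topology helper: which package a CPU belongs to. */
static int package_of(int cpu) { return cpu / CPUS_PER_PKG; }

static bool package_fully_idle(int pkg)
{
    for (int c = pkg * CPUS_PER_PKG; c < (pkg + 1) * CPUS_PER_PKG; c++)
        if (!cpu_idle[c])
            return false;
    return true;
}

/* Power-aware pick: prefer an idle core whose package already has busy
 * neighbours, so fully idle packages can stay in package C-state.
 * Falls back to a core on an idle package so that a runnable VCPU never
 * waits while any idle CPU exists. */
static int pick_cpu_power_aware(void)
{
    int fallback = -1;
    for (int c = 0; c < NR_CPUS; c++) {
        if (!cpu_idle[c])
            continue;
        if (!package_fully_idle(package_of(c)))
            return c;          /* idle core on a partially busy package */
        if (fallback < 0)
            fallback = c;      /* idle core on a fully idle package */
    }
    return fallback;           /* -1 if no CPU is idle at all */
}
```

The point of the fallback is exactly the guarantee above: package-level
consolidation only changes *which* idle CPU we pick, never *whether* we
pick one.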

>
>
>> a.2) csched_vcpu_wake
>>      Similar to the above: instead of blindly kicking all idle
>> CPUs at once, the kick can be made selective, with the power
>> factor taken into account.
>
>Yeah, you will need to rewrite the idle kick code. This can be
>tricky because a CPU's idle state might change by the time it
>processes a "scheduling IPI" and you need to be careful that
>a runnable VCPU doesn't sit on a runqueue when there is at
>least one idle CPU in the system.
>

I understand the caveats above, but I'm not sure I see exactly how
they relate to the proposed change. Could you elaborate a bit? How
are these concerns handled in the current logic?
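For discussion, the shape I had in mind is something like the sketch
below (again not real Xen code -- kick_cpu() stands in for sending the
scheduling IPI, and the cpu_idle[] array is made up). The race you
describe is why the selective path must keep the old "kick everyone"
behaviour as a fallback:

```c
/* Illustrative sketch only, not actual Xen code. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_idle[NR_CPUS];
static int  kicks_sent;

/* Stand-in for sending the scheduling IPI to one CPU. */
static void kick_cpu(int cpu) { (void)cpu; kicks_sent++; }

/* Selective kick on vcpu wake: kick one chosen idle CPU instead of all
 * of them. If the preferred CPU raced out of idle before we got here,
 * fall back to kicking every idle CPU, so a runnable VCPU is never left
 * sitting on a runqueue while some CPU in the system idles. The IPI
 * handler on the target would still need to recheck its own state. */
static void selective_kick(int preferred)
{
    if (preferred >= 0 && preferred < NR_CPUS && cpu_idle[preferred]) {
        kick_cpu(preferred);
        return;
    }
    for (int c = 0; c < NR_CPUS; c++)   /* fallback: old "kick all" path */
        if (cpu_idle[c])
            kick_cpu(c);
}
```

In the common case this sends one IPI instead of many, which is where
the power saving would come from.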

Thanks,
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel