xen-devel

Re: [Xen-devel] [RFC][PATCH] scheduler: credit scheduler for client virtualization

To: "NISHIGUCHI Naoki" <nisiguti@xxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC][PATCH] scheduler: credit scheduler for client virtualization
From: "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx>
Date: Fri, 5 Dec 2008 11:37:11 +0000
Cc: Ian.Pratt@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, disheng.su@xxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Fri, 05 Dec 2008 03:37:39 -0800
In-reply-to: <4938962B.1060007@xxxxxxxxxxxxxx>
References: <49364960.2060101@xxxxxxxxxxxxxx> <C55BFEE2.1FCA7%keir.fraser@xxxxxxxxxxxxx> <de76405a0812030446m38290b2ex9d624a0f7d788cfc@xxxxxxxxxxxxxx> <49378C16.1040106@xxxxxxxxxxxxxx> <de76405a0812040421i15f9e87dy3bf80c6a590505e0@xxxxxxxxxxxxxx> <4938962B.1060007@xxxxxxxxxxxxxx>
On Fri, Dec 5, 2008 at 2:47 AM, NISHIGUCHI Naoki
<nisiguti@xxxxxxxxxxxxxx> wrote:
> Oh, I misread the word "battery". I understand what "a battery of tests"
> means.
> By the way, what tests do you actually run? I have no idea about these tests.

For basic workload tests, a couple of benchmarks are pretty handy.
vConsolidate is a good test, but pretty hard to set up; I should be
able to manage it with our infrastructure here, though.  Other tests include:
* kernel-build (i.e., time how long it takes to build the Linux
kernel) and/or ddk-build (the Windows equivalent)
* specjbb (a cpu-intensive workload)
* netperf (for networks)

For testing its effect on the network, the paper I mentioned uses
three workloads that it combines in different ways:
* cpu (just busy spinning)
* sustained network (netbench): throughput
* network ping: latency.

> OK.
> We must also consider a sleeping vcpu. That vcpu will be added to the
> queue when it wakes up. So we can set the timer to 2ms only if the next
> waiting vcpu on the queue, or the sleeping vcpu, is also BOOST.
>
> My thought about 2ms is this: the period until the vcpu next runs is
> 2ms. The vcpu's time slice therefore changes according to the number of
> existing vcpus. In other words, we may set the timer to 2ms or less. But
> I don't think the number of vcpus will be very large. Is this assumption
> wrong? And what about a time slice of 2ms or less?

I think I understand you to mean: if we set the timer for 10ms, and in
the meantime another vcpu wakes up and is set to BOOST, then it won't
get a chance to run for another 10ms.  And you're suggesting that we
run the scheduler every 2ms whenever there are any vcpus that *may*
wake up and be at BOOST, just in case; and you don't think this
situation will happen very often.  Is that correct?
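
(To make sure we're talking about the same policy, here is a minimal
sketch in plain C of the timer decision as I understand the proposal.
This is not code from the Xen tree: the types, the next_slice() helper,
and the boosted_sleeper_may_wake flag are all simplified, hypothetical
stand-ins for credit-scheduler internals.)

#include <stdbool.h>
#include <stdint.h>

#define MS(x) ((uint64_t)(x) * 1000000ULL)   /* milliseconds -> nanoseconds */

enum prio { PRI_BOOST, PRI_UNDER, PRI_OVER };

struct vcpu_info {
    enum prio pri;
};

/*
 * Pick the slice for the vcpu we are about to run: drop to a 2ms
 * slice only when another BOOST vcpu is waiting (or may soon wake),
 * otherwise keep the normal 10ms quantum.
 */
uint64_t next_slice(const struct vcpu_info *curr,
                    const struct vcpu_info *next, /* runqueue head, or NULL */
                    bool boosted_sleeper_may_wake)
{
    if ( curr->pri == PRI_BOOST &&
         ((next != NULL && next->pri == PRI_BOOST) ||
          boosted_sleeper_may_wake) )
        return MS(2);    /* round-robin the BOOST vcpus quickly */

    return MS(10);       /* normal credit-scheduler quantum */
}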

Unfortunately, in consolidated server workloads you're pretty likely
to have more vcpus than physical cpus, so I think this case would come
up pretty often.  Furthermore, 2ms is really too short a scheduling
quantum for normal use, especially for HVM domains, which have to take
a vmexit/vmenter cycle to handle every interrupt.  (I did some tests
back when we were using the SEDF scheduler, and the scheduling alone
was a 4-5% overhead for HVM domains.)

But I don't think we actually have a problem here: if a vcpu wakes up
and is promoted to BOOST, won't it "tickle" the runqueues to find
somewhere for it to run?  At the very least the current cpu should be
able to run it; or, if that cpu is already running a BOOST vcpu, it can
set its own timer to 2ms.  In any case, I think handling this corner
case with some extra code is preferable to running a 2ms timer any time
it *might* happen.
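
(Here, too, a minimal sketch in plain C of the wakeup path I have in
mind.  Again, this is not the real hypervisor code: current_pri_on(),
raise_schedule_softirq(), and set_slice_timer_ms() are hypothetical
stubs standing in for the credit scheduler's runqueue-tickling and
timer primitives.)

enum prio { PRI_BOOST, PRI_UNDER, PRI_OVER };

/* Hypothetical stand-ins for hypervisor primitives. */
static enum prio current_pri_on(int cpu)        { (void)cpu; return PRI_UNDER; }
static void raise_schedule_softirq(int cpu)     { (void)cpu; /* force resched */ }
static void set_slice_timer_ms(int cpu, int ms) { (void)cpu; (void)ms; }

/* Called when a vcpu wakes up and is promoted to BOOST, aimed at pcpu `cpu`. */
void boost_wakeup_tickle(int cpu)
{
    if ( current_pri_on(cpu) != PRI_BOOST )
        /* The pcpu is idle or running something lower priority:
         * "tickle" it so the woken vcpu preempts right away. */
        raise_schedule_softirq(cpu);
    else
        /* Already running a BOOST vcpu: shorten only this pcpu's
         * timer so the two BOOST vcpus alternate on a 2ms slice. */
        set_slice_timer_ms(cpu, 2);
}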

> OK.
> I'll separate the individual changes from the current patch and post each one.

Thanks!  I'll take them for a spin today.

 -George
