This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Question about the ability of credit scheduler to handle

To: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Question about the ability of credit scheduler to handle I/O and CPU intensive VMs
From: Yuehai Xu <yuehaixu@xxxxxxxxx>
Date: Mon, 4 Oct 2010 22:52:51 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, yhxu@xxxxxxxxx
Delivery-date: Mon, 04 Oct 2010 19:53:48 -0700
On Thu, Sep 30, 2010 at 9:27 AM, George Dunlap
<George.Dunlap@xxxxxxxxxxxxx> wrote:
> On Thu, Sep 30, 2010 at 1:28 PM, Yuehai Xu <yuehaixu@xxxxxxxxx> wrote:
>>> I agree, letting a VM with an interrupt run for a short period of time
>>> makes sense.  The challenge is to make sure that it can't simply send
>>> itself interrupts every 50us and get to run 100% of the time. :-)
>> I am afraid I don't really understand what the challenge is. In other
>> words, is this method good in principle but hard to implement in
>> practice? As far as I know, the OS should always schedule I/O-related
>> processes once they are in the runnable queue, so as long as we give
>> even a very short period of time to the woken-up guest VM, the I/O
>> process in it should be scheduled at once. In that case, this problem
>> should be solved. Of course, I haven't done any experiments; saying is
>> always much easier than doing.
> What I mean is that you have to be careful when implementing it.  A
> very simple implementation would look like this:
> * Normally, let the VM with the highest credits run.  However, if a VM
> is sent an interrupt, give it priority to run for 50us.
> Now, suppose, however, that a rogue VM sets up a periodic timer to
> send itself an interrupt every 55us.  Then it will get an interrupt,
> get priority for 50us, be preempted for 5us, and then get another
> interrupt, allowing it to run for another 50us.    Thus it runs 90% of
> the time, even though it should only run (for example) 50% of the
> time.
> We need a way to balance interrupt latency (how long after an
> interrupt is raised before a VM can run) and cpu scheduling fairness.
> That means that if we let a VM run for 50us, and then preempt it, and
> it gets an interrupt 5us later, we need a way to know not to schedule
> it until it's been off the cpu for a reasonable amount of time.  It's
> possible, but it will take some experimentation to see what the best
> option is.
>  -George
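
The arithmetic in the rogue-timer example quoted above can be checked with a
small simulation. This is only a minimal sketch of the policy as described,
not Xen code: the function name, its parameters, and the "minimum off-CPU
time" guard are hypothetical, and the model assumes the VM gets no CPU at all
outside its boost windows.

```python
def boosted_cpu_share(boost_us=50, timer_period_us=55,
                      min_off_cpu_us=0, horizon_us=1_000_000):
    """Return the fraction of the horizon a self-interrupting VM spends running.

    Assumptions (hypothetical model, not Xen code):
      * the VM raises an interrupt for itself every timer_period_us;
      * an honoured interrupt lets it run for boost_us, then it is preempted;
      * outside a boost it gets no CPU at all;
      * an interrupt is ignored while the VM is still boosted, or if it
        arrives less than min_off_cpu_us after the last preemption.
    """
    ran_us = 0
    last_preempt_us = None  # time at which the previous boost ended
    for now in range(0, horizon_us, timer_period_us):
        if last_preempt_us is not None:
            if now < last_preempt_us:
                continue  # still running its current boost
            if now - last_preempt_us < min_off_cpu_us:
                continue  # has not been off the CPU long enough
        boost_end = min(now + boost_us, horizon_us)
        ran_us += boost_end - now
        last_preempt_us = boost_end
    return ran_us / horizon_us


naive = boosted_cpu_share()                      # boost on every interrupt
guarded = boosted_cpu_share(min_off_cpu_us=50)   # require 50us off the CPU first
print(f"naive boost:   {naive:.1%} of the CPU")
print(f"guarded boost: {guarded:.1%} of the CPU")
```

With a 50us boost and a 55us timer, the naive policy gives the VM roughly
50/55 of the CPU (about 91%). Requiring 50us off the CPU before the next boost
brings it down to roughly 50/110 (about 45%) in this model. The right guard
value is exactly the thing that would need experimentation.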

I'd like to try to implement this idea in Xen, even though I am not sure
whether I can do it, since I am not an expert. :-D

The first step for me is to write a very simple scheduler that does not
consider CPU fairness, I/O performance, etc. Its mechanism is very
simple: the next VCPU is selected round-robin. The current VCPU is
always inserted at the tail of the list, and the VCPU at the head is
selected to run next. The current test code is based on the credit
scheduler of Xen 4.0.1-rc6-pre, except that I deleted the
credit-calculation component, along with the 10ms and 30ms ticks. The
time slice for the selected VCPU is set to 30ms.
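
The selection rule just described can be sketched as follows. This is a
simplified model in Python rather than the actual scheduler code: the class
and method names are hypothetical, VCPUs are plain strings, and the
still_runnable flag is an assumption standing in for whatever the hypervisor
uses to tell a preempted VCPU from a blocked one.

```python
from collections import deque

TIME_SLICE_NS = 30_000_000  # 30 ms slice, as described above


class RoundRobinRunqueue:
    """Toy per-PCPU run queue: tail insertion, head selection."""

    def __init__(self, vcpus=()):
        self.queue = deque(vcpus)

    def wake(self, vcpu):
        """A blocked VCPU became runnable: queue it at the tail."""
        if vcpu not in self.queue:
            self.queue.append(vcpu)

    def pick_next(self, current=None, still_runnable=True):
        """Return (next_vcpu, time_slice_ns); next_vcpu is None when idle."""
        # The outgoing VCPU goes to the tail only if it can still run.
        if current is not None and still_runnable:
            self.queue.append(current)
        if not self.queue:
            return None, TIME_SLICE_NS
        # The VCPU at the head of the list runs next.
        return self.queue.popleft(), TIME_SLICE_NS


rq = RoundRobinRunqueue(["d1v0", "d2v0"])
current = None
for _ in range(4):
    current, slice_ns = rq.pick_next(current)
    print(current, slice_ns)
```

Note that in this model a slice is only an upper bound: if the running VCPU
blocks, or pick_next is called for any other reason before 30ms have passed,
the switch happens early.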

Here, my assumption is that Dom0 is pinned to PCPU0, while the DomUs
are pinned to PCPU1, for simplicity.

However, some problems puzzle me a lot. When I start two DomUs that
share PCPU1 and run a CPU-intensive program in each of them, the
trace log from xenalyze is as follows (I modified some code, so the
format differs from the original):
<  0.399300204 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000802)
<  0.424239058 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30000582)
<  0.449177708 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000336)
<  0.474116762 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30000827)
<  0.499055641 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 30000596)
<  0.523972987 |x d2v0> (dom: 2) --> (dom: 1) vruntime : 30001301)
<  0.548911095 -x d1v0> (dom: 1) --> (dom: 2) vruntime : 29999684)
I think these results make sense, since each DomU gets almost 30ms of
PCPU1 each time it runs.

However, when I stop the CPU-intensive program in one DomU while
keeping the other running, the results are:
<  0.327815345 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 1542607)
<  0.327906620 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 109521)
<  0.344349033 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 19779544)
<  0.344377129 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 33528)
<  0.344570662 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 232540)
<  0.344643933 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 87857)
<  0.345009170 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 439081)
<  0.345034387 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30059)
<  0.369973183 -x d1v0> (dom: 1) --> (dom: 1) vruntime : 30000506)
<  0.392423279 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 27006658)

Here I am confused. Since my scheduling algorithm is very simple, every
VM should get 30ms of PCPU at a time; however, from the results, the
time each VCPU holds the PCPU is quite unstable. I think schedule()
must be getting invoked frequently from somewhere. From xentop, the VM
running the CPU-intensive program occupies the PCPU at almost 97%.
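
To make the second trace easier to read, the per-domain run times can be
summed with a short script. This is a sketch under two assumptions that are
not stated explicitly above: that vruntime is in nanoseconds (consistent with
the roughly 30000000 values for 30ms slices in the first trace), and that each
line reports the run time of the outgoing domain, i.e. the one on the left of
the arrow. The regular expression matches only the modified log format shown
in this mail, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

TRACE = """\
<  0.327815345 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 1542607)
<  0.327906620 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 109521)
<  0.344349033 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 19779544)
<  0.344377129 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 33528)
<  0.344570662 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 232540)
<  0.344643933 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 87857)
<  0.345009170 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 439081)
<  0.345034387 -x d2v0> (dom: 2) --> (dom: 1) vruntime : 30059)
<  0.369973183 -x d1v0> (dom: 1) --> (dom: 1) vruntime : 30000506)
<  0.392423279 |x d1v0> (dom: 1) --> (dom: 2) vruntime : 27006658)
"""

LINE_RE = re.compile(
    r"<\s*(?P<ts>\d+\.\d+)\s+\S+\s+d\d+v\d+>\s+"
    r"\(dom:\s*(?P<prev>\d+)\)\s*-->\s*\(dom:\s*(?P<next>\d+)\)\s*"
    r"vruntime\s*:\s*(?P<ran>\d+)\)"
)


def runs_per_domain(trace):
    """Map each outgoing domain id to the list of its run times (assumed ns)."""
    runs = defaultdict(list)
    for line in trace.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip anything that is not a switch record
        runs[int(match["prev"])].append(int(match["ran"]))
    return runs


runs = runs_per_domain(TRACE)
for dom, times in sorted(runs.items()):
    print(f"dom {dom}: {len(times)} runs, "
          f"total {sum(times) / 1e6:.3f} ms, "
          f"longest {max(times) / 1e6:.3f} ms")
```

Read this way, every run of dom 2 in the excerpt is far shorter than its 30ms
slice, and dom 1 is switched out well before 30ms in most of its runs.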

