

Re: [Xen-devel] Questions regarding Xen Credit Scheduler

To: Gaurav Dhiman <dimanuec@xxxxxxxxx>
Subject: Re: [Xen-devel] Questions regarding Xen Credit Scheduler
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Fri, 16 Jul 2010 10:13:11 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
On Fri, Jul 16, 2010 at 1:41 AM, Gaurav Dhiman <dimanuec@xxxxxxxxx> wrote:
> Sure, dividing by 2 could be a good middle ground. We can additionally
> not mark them inactive as well?

Think through the implications of your policy if we have the following
setup:
* 2 "burn" VMs, one with weight 100, one with weight 200
* 10 mostly idle VMs, each using 1% of the CPU, each with a weight of 100

Think about what the ideal scheduler would do in this situation.

We want the idle VMs to run whenever they want; together they use 10%,
which leaves 90% for the two "burn" VMs.  We want one "burn" VM to run
30% of the time, and the other to run 60% of the time (because of the
weights).
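
To make that target concrete, here is a minimal Python sketch of the
arithmetic.  The function name ideal_shares is invented for illustration
and is not Xen code; it assumes the idle VMs always get the CPU they ask
for and the "burn" VMs split whatever is left in proportion to weight.

```python
def ideal_shares(idle_demands, burn_weights):
    """Return the CPU fraction each burn VM should get.

    idle_demands: CPU fractions the mostly idle VMs actually use.
    burn_weights: weights of the CPU-bound VMs.
    """
    remaining = 1.0 - sum(idle_demands)   # what the idle VMs leave over
    total_weight = sum(burn_weights)
    return [remaining * w / total_weight for w in burn_weights]


# 10 idle VMs at 1% each; burn VMs with weights 100 and 200.
shares = ideal_shares([0.01] * 10, [100, 200])
print([f"{s:.0%}" for s in shares])
```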

Now, consider what would happen if we use the algorithm you describe.
Credit1 divides all credits by weight among "active" VMs.  With your
modification, we're not marking any VMs "inactive", so we're dividing
it by all VMs.  The total weight is 1300, so each accounting period the
"idle" VMs are each getting about 7.7% of the credit (1/13), the
100-weight "burn" VM is getting 7.7% of the credit, and the 200-weight
"burn" VM is getting 15.4% of the credit (2/13).
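
The 1/13 and 2/13 figures can be checked with a short Python sketch.
The helper credit_fractions is a made-up name, and the model is only
"divide credit in proportion to weight over every VM", not the actual
credit1 accounting code.

```python
def credit_fractions(weights):
    """Split one accounting period's credit in proportion to weight."""
    total = sum(weights)
    return [w / total for w in weights]


# 10 idle VMs (weight 100), then the 100-weight and 200-weight burn VMs.
weights = [100] * 10 + [100, 200]
fractions = credit_fractions(weights)

print(f"each idle VM:    {fractions[0]:.1%}")
print(f"100-weight burn: {fractions[10]:.1%}")
print(f"200-weight burn: {fractions[11]:.1%}")
```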

Now what happens?  The "burn" VMs are guaranteed to burn more than
their credits, so they're continually negative.  The 200-weight VM
only has 7.7% of CPU time more credit added per accounting period than
the 100-weight VM, so even if we sort by credits, it's likely that the
split will be 10% idle VMs / 49% 200-weight / 41% 100-weight (i.e.,
the 200-weight VM gets 7.7% of total CPU time more, rather than twice
as much).  If we don't set a "floor" for credits, then the credit of the
"burn" VMs will continue to go negative into oblivion; if we do set a
floor, the steady state will be for all VMs to be either at the
ceiling (if they're not using their "fair share"), or at the floor (if
they are).
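
A rough Python sketch of where the 49% / 41% estimate comes from.  This
is not a simulation of credit1; it only encodes the assumption stated
above, namely that the two "burn" VMs share the leftover CPU evenly
except that the higher-weight one runs for an extra slice equal to its
extra credit.  The function name and parameters are hypothetical.

```python
def estimated_burn_split(idle_cpu, low_weight, high_weight, total_weight):
    """Estimate CPU fractions for two always-negative burn VMs.

    Assumes the only advantage of the higher-weight VM is the extra
    credit it receives each accounting period.
    """
    extra = (high_weight - low_weight) / total_weight  # 100/1300 here
    leftover = 1.0 - idle_cpu
    low = (leftover - extra) / 2
    high = low + extra
    return low, high


low, high = estimated_burn_split(0.10, 100, 200, 1300)
print(f"100-weight: {low:.1%}, 200-weight: {high:.1%}")
print(f"ratio: {high / low:.2f} (the weights ask for 2.00)")
```

Under these assumptions the split comes out at roughly 41% and 49%, a
ratio well short of the 2:1 the weights call for.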

(I encourage you to work out your algorithm by hand, or set up a
simulator and go over the results with a fine-tooth comb, to
understand why this is the case.  It's a real grind, but it will give
you a really solid foundation for understanding scheduling problems.
I've spent hours and hours doing just that.)

Credit1 solves this by using the "active / inactive" designation.  The
100-weight VM gets 33% of the credits, the 200-weight VM gets 67% of
the credits, and the idle VMs are usually in the "inactive" state,
running at BOOST priority, only occasionally flipping into "active"
for a short time before flipping back to "inactive".
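
As an illustration of the effect of the "active" flag, here is a small
Python sketch.  The function credit_shares and the VM names are
invented for this example, and it models only the idea of dividing
credit by weight among active VMs, not the real credit1 implementation.

```python
def credit_shares(vms):
    """vms: list of (name, weight, active) tuples.

    Credit is divided by weight among active VMs only; inactive VMs
    receive none in this simplified model.
    """
    total = sum(weight for _, weight, active in vms if active)
    return {
        name: (weight / total if active else 0.0)
        for name, weight, active in vms
    }


vms = [("burn-100", 100, True), ("burn-200", 200, True)]
vms += [(f"idle-{i}", 100, False) for i in range(10)]  # usually inactive

shares = credit_shares(vms)
print(f"burn-100: {shares['burn-100']:.1%}")
print(f"burn-200: {shares['burn-200']:.1%}")
```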

It's far from ideal, as you've seen, but it usually works not too
badly.  Changing the credits to divide by 2 (but still mark it
"active") is a patch-up.  But a more fundamental change in the
algorithm needs to be made to avoid this, and that's what credit2 is
for.

BTW, what are you using to do your analysis of the live scheduler?
Xen has a tracing mechanism that I've found indispensable for
understanding what the algorithm was actually doing.  I've got the
basic tool I use to analyze the output here:


I don't have the patches used to analyze the scheduler stuff public
(since they normally go through a lot of churn, and are interesting
almost exclusively to developers), but I'll see if I can dig some of
them up for you.

