On Fri, Jul 16, 2010 at 1:41 AM, Gaurav Dhiman <dimanuec@xxxxxxxxx> wrote:
> Sure, dividing by 2 could be a good middle ground. We can additionally
> not mark them inactive as well?

Think through the implications of your policy if we have the following scenario:
* 2 "burn" VMs (each trying to use as much cpu as it can get), one with weight 100, one with weight 200
* 10 mostly idle VMs, each using 1% of the cpu, each with a weight of 100.
Think about what the ideal scheduler would do in this situation.
You want the idle VMs to run whenever they want; that's 90% left for
the two "burn" VMs. We want one "burn" VM to run 30% of the time, and
the other to run 60% of the time (because of the weights).
Now, consider what would happen if we use the algorithm you describe.
Credit1 divides all credits by weight among "active" VMs. With your
modification, no VM is ever marked "inactive", so we're dividing the
credits among all 13 VMs. That means each accounting period, the
"idle" VMs each get about 7.7% of the credit (1/13), the 100-weight
"burn" VM gets 7.7% of the credit, and the 200-weight "burn" VM gets
15.4% of the credit (2/13).
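
The arithmetic being described (a sketch of the per-period split, not the
actual credit1 code; the VM names are made up for illustration):

```python
# Credits divided by weight across ALL 13 VMs, as in the proposed
# "never mark anything inactive" policy.
weights = [100, 200] + [100] * 10   # 2 "burn" VMs + 10 idle VMs

def credit_share(weights):
    """Fraction of each accounting period's credit each VM receives."""
    total = sum(weights)
    return [w / total for w in weights]

shares = credit_share(weights)
print(f"100-weight burn VM: {shares[0]:.1%}")   # ~7.7% (1/13)
print(f"200-weight burn VM: {shares[1]:.1%}")   # ~15.4% (2/13)
print(f"each idle VM:       {shares[2]:.1%}")   # ~7.7% (1/13)
```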
Now what happens? The "burn" VMs are guaranteed to burn more than
their credits, so they're continually negative. The 200-weight VM only
has 7.7% of cpu time more credit added per accounting period than the
100-weight VM, so even if we sort by credits, it's likely that the
split will be 10% idle VMs / 49% 200-weight / 41% 100-weight (i.e.,
the 200-weight VM gets 7.7% of total cpu time more, rather than twice
as much). If we don't set a "floor" for credits, then the credit of
the "burn" VMs will continue to go negative into oblivion; if we do
set a floor, the steady state will be for all VMs to be either at the
ceiling (if they're not using their "fair share") or at the floor (if
they're using more than it).
(I encourage you to work out your algorithm by hand, or set up a
simulator and go over the results with a fine-tooth comb, to
understand why this is the case. It's a real grind, but it will give
you a really solid foundation for understanding scheduling problems.
I've spent hours and hours doing just that.)
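
In the spirit of that suggestion, here is a minimal toy simulator (my own
sketch, not Xen code) of a simplified version of the scenario: just the two
"burn" VMs on one cpu, ignoring the idle VMs' 10%, with per-tick credit
shares of 1/13 and 2/13 as above, and the runnable VM with the most credit
chosen each tick:

```python
from fractions import Fraction

# Per-tick credit income for the two burn VMs (weights 100 and 200,
# divided across all 13 "active" VMs as in the proposed policy).
shares = [Fraction(1, 13), Fraction(2, 13)]
credits = [Fraction(0), Fraction(0)]
runs = [0, 0]                      # ticks each VM actually ran

for _ in range(13_000):
    for i, s in enumerate(shares):
        credits[i] += s            # credit added each tick
    runner = max(range(2), key=lambda i: credits[i])  # sort by credits
    credits[runner] -= 1           # running costs one tick of credit
    runs[runner] += 1

print(runs)           # [6000, 7000]: ~46% vs ~54%, nowhere near 1:2
print(min(credits))   # -5000: credits dive negative with no floor
```

The split ends up about 7.7 percentage points apart (the difference in
credit income), not in the 1:2 ratio the weights ask for, and both VMs'
credits sink without bound, just as described above.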
Credit1 solves this by using the "active / inactive" designation. The
100-weight VM gets 33% of the credits, the 200-weight VM gets 66% of
the credits, and the idle VMs are usually in the "inactive" state,
running at BOOST priority; only occasionally flipping into "active"
for a short time, before flipping back to "inactive".
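
For comparison, the split the active/inactive mechanism produces, once only
the two burn VMs count as "active" (again just a sketch of the arithmetic,
with made-up VM names):

```python
# Credits divided by weight among the "active" VMs only: the two burn
# VMs. This restores the intended 1:2 ratio between them.
active_weights = {"burn-100": 100, "burn-200": 200}
total_weight = sum(active_weights.values())
shares = {name: w / total_weight for name, w in active_weights.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")   # burn-100: 33.3%, burn-200: 66.7%
```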
It's far from ideal, as you've seen, but it usually works not too
badly. Changing the credits to divide by 2 (but still marking the VM
"active") is a patch-up. A more fundamental change in the algorithm is
needed to avoid this, and that's what credit2 is about.

BTW, what are you using to do your analysis of the live scheduler?
Xen has a tracing mechanism (xentrace) that I've found indispensable
for understanding what the algorithm was actually doing. I've got the
basic tool I use to analyze the output here:
I don't have the patches used to analyze the scheduler stuff public
(since they normally go through a lot of churn, and are interesting
almost exclusively to developers), but I'll see if I can dig some of
them up for you.
Xen-devel mailing list