Gaurav,
I've identified a lot of the problems you mention here (you may want
to see my paper and talk from XenSummit Asia 2009 [1]), but I haven't
done anything to address them in credit1 because I thought it really
just needed to be scrapped and started over.
However, that process is taking a lot longer than I'd hoped, so I
think it may make sense to do some work to patch up the current
scheduler to keep it running until the new one can replace it. Are
you willing to help out with some investigation / testing in this
process?
Regarding specific things:
One thing you didn't catch is that credits before 4.0 are debited
probabilistically, a full 10ms at a time, by the very timer tick that
moves a vcpu from "inactive" to "active"; so when you make the switch
from "inactive" to "active", you don't start out at 0, but at -10ms.
It turns out that's not only bad for latency-sensitive processes, but
it's also a security bug; so there's a patch in 4.0 (not sure whether
it's been backported to 3.4) to do accurate accounting based on RDTSC
reads instead of probabilistic accounting based on timer ticks.
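
To make the difference concrete, here is a minimal sketch (a toy model, not the actual Xen code) comparing tick-sampled debiting with exact accounting. The function names and the two example workloads are made up for illustration; the only numbers taken from the above are the 10ms tick and the full-tick debit.

```python
TICK_MS = 10  # tick period, and the amount debited per tick, as described above


def tick_charge(run_intervals, total_ms):
    """Debit a full TICK_MS whenever the vcpu is running when a tick fires."""
    charged = 0
    for t in range(TICK_MS, total_ms + 1, TICK_MS):
        if any(start <= t < end for start, end in run_intervals):
            charged += TICK_MS
    return charged


def accurate_charge(run_intervals):
    """Debit exactly the time spent running (what timestamp reads would give)."""
    return sum(end - start for start, end in run_intervals)


# Hypothetical vcpu that runs 9ms out of every 10ms but blocks just before the tick.
dodger = [(k * TICK_MS + 1, (k + 1) * TICK_MS) for k in range(10)]
# Hypothetical vcpu that runs 1ms out of every 10ms but is running at every tick.
unlucky = [(k * TICK_MS, k * TICK_MS + 1) for k in range(1, 11)]

print(tick_charge(dodger, 100), accurate_charge(dodger))    # 0 90
print(tick_charge(unlucky, 100), accurate_charge(unlucky))  # 100 10
```

The first workload is the security problem (lots of CPU time, never charged); the second is the latency problem (very little CPU time, charged as if it had run the whole period).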
#1: Setting the credits to 0 is part of the "reset condition" I
mention in my paper. The basic idea is that accumulated credit needs
to be discarded somehow. I have a patch that, instead of setting
it to 0, divides it by 2. This should balance between discarding
credits and not starting too far "behind".
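
As a rough illustration of the options (not the patch itself), the sketch below puts the three policies side by side: the current reset to 0, the divide-by-2 variant, and the ceiling you asked about. The function names are hypothetical, and the sketch only models the credit value, not the active/inactive bookkeeping that goes with it.

```python
CREDIT_CAP = 300  # cap quoted in the original question


def reset_to_zero(credit):
    """Current behaviour as described: over the cap, all credit is discarded."""
    return 0 if credit > CREDIT_CAP else credit


def reset_halve(credit):
    """Divide-by-2 variant: over the cap, keep half of the accumulated credit."""
    return credit // 2 if credit > CREDIT_CAP else credit


def reset_clamp(credit):
    """Ceiling variant from the question: never exceed the cap, never discard."""
    return min(credit, CREDIT_CAP)


for policy in (reset_to_zero, reset_halve, reset_clamp):
    print(policy.__name__, policy(350), policy(200))
```

With a credit of 350 the three policies leave 0, 175 and 300 respectively; a vcpu below the cap is untouched in all three cases.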
#2: AFAICT, the reason for choosing to sort by priority was that it
allowed a simple O(n) sorting algorithm. However, the effect is that
within a given priority, scheduling is round-robin. Round-robin
scheduling is known to discriminate against processes that voluntarily
block in favor of those that use up their entire timeslice. Ongaro et
al. [2] did some experiments with sorting by credit and found that it
helped latency-sensitive workloads.
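
A small sketch of the difference, under some assumptions: the Vcpu tuple, the numeric priority values, and the choice to keep priority as the primary key with credit as the tie-breaker are all mine for illustration, not necessarily what Ongaro et al. or any eventual patch would do.

```python
from collections import namedtuple

# Hypothetical stand-in for a runq entry; not the real Xen structure.
Vcpu = namedtuple("Vcpu", "name prio credit")
BOOST, UNDER, OVER = 0, 1, 2  # lower value runs first


def by_priority(arrivals):
    """Priority only: the sort is stable, so order within a class is arrival order."""
    return sorted(arrivals, key=lambda v: v.prio)


def by_priority_then_credit(arrivals):
    """Priority first, then most remaining credit first within a class."""
    return sorted(arrivals, key=lambda v: (v.prio, -v.credit))


arrivals = [
    Vcpu("batch", UNDER, 20),     # used nearly all of its credit
    Vcpu("latency", UNDER, 250),  # blocks often, so most of its credit is left
    Vcpu("hog", OVER, -50),
]

print([v.name for v in by_priority(arrivals)])              # ['batch', 'latency', 'hog']
print([v.name for v in by_priority_then_credit(arrivals)])  # ['latency', 'batch', 'hog']
```

With priority alone, the vcpu that blocks voluntarily waits behind the batch vcpu simply because it arrived later; with credit as the tie-breaker it goes first.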
So the answer to #3 is:
* The "accurate credit" patch is in 4.0, maybe 3.4. That should help somewhat.
* I have a patch that will change the "reset condition"; I'm
considering submitting it. I'd appreciate testing / feedback. (I'll
send this in a separate e-mail.)
* There is no patch yet that will fix the sort-by-priority, but it
should be simple and straightforward to implement. I'll support
putting it in once I'm reasonably convinced that it helps and doesn't
hurt too much. If you were to help out with the implementation and
testing, that would happen a lot faster. :-)
Peace,
-George
Refs:
[1] http://www.xen.org/xensummit/xensummit_fall_2009.html -- search
for my name under "Topics"
[2] Diego Ongaro, Alan L. Cox, Scott Rixner, "Scheduling I/O in
virtual machine monitors", Proceedings of the fourth ACM SIGPLAN/SIGOPS
international conference on Virtual execution environments, March
05-07, 2008, Seattle, WA, USA
On Thu, Jul 8, 2010 at 10:14 PM, Gaurav Dhiman <dimanuec@xxxxxxxxx> wrote:
> 1) In the sched_acct function, the credit cap is set to 300, enough to
> survive one time slice. But if some VCPU crosses that cap, it is set
> to 0, and marked inactive. Why is there no concept of a ceiling (like
> that of a floor for the VCPUs going over the credit line), i.e. why is
> it not set to 300? Is there some fundamental reason for setting it to
> 0? I believe this is resulting in a lot of times when our latency
> sensitive VCPUs have to wait for maybe a time slice, when they can
> immediately run. This might happen if they run with BOOST priority and
> get interrupted by a timer tick, which takes that priority away.
>
> 2) Why is the runq sorted by just priority (which is very coarse
> grained: BOOST, UNDER and OVER), and not the credit? This can result
> in VCPUs with higher credit getting starved for CPU if we have batch
> and latency sensitive VCPUs in the system.
>
> 3) Is there some patch, which makes the current credit scheduler
> fairer to the latency sensitive VCPUs? I see that the sched_credit2
> scheduler addresses these issues, but right now it has just one global
> runq and no load balancing features.
>
> Any advice/inputs here will be extremely valuable!
>
> Thanks in advance,
> -Gaurav
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>