I suspect yield() was first devised as a simple synchronization mechanism for uni-processor round-robin schedulers.
Then strict priorities were added to make certain tasks (like pagers) run more aggressively than "normal" ones. As long as these high-priority threads don't use the yield() mechanism, things are fine. I believe you are pointing out that, from the perspective of the yield() mechanism, all time-share priorities (UNDER and OVER) should be considered one and the same because they are not strict priorities. This is a good observation and I agree with you (as long as reasonable uses of yield() don't cause fairness to go out the window).
However, before you go and fix yield(), you might want to consider this:
1- It's been proposed before that things like dom0 VCPUs be scheduled with a priority strictly greater than any domU VCPU. If strict priorities are introduced into the Xen scheduler at some point in the future, code that assumes that a yield() from a VCPU will allow all other runnable VCPUs in the system a chance to run ahead of it will break (again).
2- Priorities aside, on an SMP host (i.e. all computers) with distributed run queues, it is non-trivial to guarantee that a VCPU will not be rescheduled until all other runnable VCPUs have had a chance to run first. If you can come up with a simple and scalable way to do it, great. I suspect you will need to approximate this definition of yield(), though, perhaps by using some form of directed yield, targeted at one or more VCPUs, as you have suggested.
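To make the idea concrete, here is a toy model of what a directed yield could mean on a single run queue: the yielder moves the target to the front and itself to the back. This is illustrative C only, not Xen scheduler code; `directed_yield` and `QLEN` are names invented for the sketch.

```c
#include <string.h>

#define QLEN 8  /* arbitrary cap for this toy model */

/* q[] holds runnable VCPU ids in dispatch order; q[0] is the
 * yielder.  A directed yield moves `target` to the front and the
 * yielder to the back.  Returns 0 on success, -1 if the target is
 * not runnable (not in the queue). */
static int directed_yield(int q[], int n, int target)
{
    int found = 0;
    for (int i = 1; i < n; i++)
        if (q[i] == target)
            found = 1;
    if (!found)
        return -1;

    int out[QLEN];
    int k = 0;
    out[k++] = target;                 /* target runs next */
    for (int i = 1; i < n; i++)        /* everyone else keeps order */
        if (q[i] != target)
            out[k++] = q[i];
    out[k++] = q[0];                   /* yielder goes to the back */
    memcpy(q, out, n * sizeof q[0]);
    return 0;
}
```

On a real SMP scheduler with distributed run queues the target may of course live on another physical CPU's queue, which is exactly where this stops being trivial.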
3- Yield really isn't a great model for synchronization in an SMP world. If you're going to para-virtualize your IPI and spinlock paths, as you pointed out in your last mail, you might as well do something that can be directed and that blocks if necessary.
I guess my point is that instead of working really hard to maintain the old yield behavior ("don't run again until all other runnable VCPUs have had a chance to run first") on an SMP scheduler which potentially also has to deal with strict priorities, you'd be better off spending your energy on building and optimizing simpler, more targeted synchronization mechanisms and using those instead. User-level thread libraries may be a good place to look for inspiration if you're really worried about the cost of supervisor-to-hypervisor context switches. I'm not a huge fan of shared pages, but it was popular to write papers about them for user-level thread synchronization back in the 90s.
In the case of IPIs, you're already going into the hypervisor, so you should be able to do something straightforward with a sleeping semaphore. Maybe spin a little before you sleep, though, to give running VCPUs a chance to respond before you give up the rest of your time slice.
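For illustration, a spin-then-sleep event along those lines, written with pthreads rather than hypervisor primitives (the `event_*` names and the `SPIN_LIMIT` value are invented for this sketch):

```c
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_LIMIT 1000  /* arbitrary; tune to a fraction of a time slice */

struct event {
    atomic_int      signalled;
    pthread_mutex_t lock;
    pthread_cond_t  cond;
};

static void event_init(struct event *e)
{
    atomic_store(&e->signalled, 0);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
}

static void event_signal(struct event *e)
{
    pthread_mutex_lock(&e->lock);
    atomic_store(&e->signalled, 1);
    pthread_cond_signal(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Returns 1 if the fast (spinning) path caught the signal,
 * 0 if we had to fall back to a sleeping wait. */
static int event_wait(struct event *e)
{
    for (int i = 0; i < SPIN_LIMIT; i++)
        if (atomic_load(&e->signalled))
            return 1;               /* responder was quick enough */
    pthread_mutex_lock(&e->lock);
    while (!atomic_load(&e->signalled))
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
    return 0;
}
```

The spin loop covers the common case where the remote VCPU is actually running; the condition variable covers the case where it has been preempted and spinning would just burn the time slice.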
For spinlocks, I suspect turning a spinlock into a sleeping lock after
a reasonable number of spins would work well too.
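A minimal sketch of that idea, using a pthread mutex as the sleeping lock underneath (`adaptive_lock` and `MAX_SPINS` are hypothetical names, not an existing API):

```c
#include <pthread.h>
#include <stdbool.h>

#define MAX_SPINS 100  /* arbitrary threshold before giving up and sleeping */

/* Try the lock with a bounded spin; if the owner appears to be
 * preempted (the spins all fail), fall back to a blocking acquire.
 * Returns true if the lock was taken on the spinning fast path. */
static bool adaptive_lock(pthread_mutex_t *m)
{
    for (int i = 0; i < MAX_SPINS; i++)
        if (pthread_mutex_trylock(m) == 0)
            return true;            /* got it while spinning */
    pthread_mutex_lock(m);          /* give up the CPU and sleep */
    return false;
}
```

The threshold is the whole game here: spin long enough to cover a running lock holder's short critical section, but not so long that a preempted holder costs you the rest of your time slice.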
In the long run, it would probably be beneficial to remove most uses
of the generic yield mechanism.
Emmanuel.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel