This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Scheduling of I/O domains

[sorry for the delayed reply]

As Keir pointed out, the problem is in the wakeup function, and in
particular with a BVT hack.

BVT has the notion of a context switch allowance, i.e., the minimum time
a task is allowed to run before it gets preempted, to avoid context
switch thrashing (ctx_allow=5ms in sched_bvt.c). After this time a new
run through the scheduler is performed.

In our BVT implementation we extend this slightly: if there is only one
runnable task we expand the context switch allowance to 10 times the
normal amount, in order to avoid too many runs through the scheduler.

In my opinion the above is not entirely true. The context switch allowance was introduced to stop two runnable tasks switching between each other immediately after overtaking one another in virtual time. This is quite different from immediate dispatch when a task (domain) becomes runnable: in that case the current task should be preempted even if it has run for less than ctx_allow.

Over the last few days I modified the scheduler interface to push runqueue management and most of the wakeup work into the specific schedulers (BVT by default), so that they can decide what to do exactly. Another bug was found in the meantime (though it only manifested when a domain with a large AVT migrated between processors). Now I will try to fix the early dispatch bug. (It seems to me that a simple modification to the wakeup function should do the trick.)

The old (i.e., 1.2) BVT implementation would check, on waking up another
domain, whether the current task had already used up the ctx_allow and,
if it had, would force an immediate run through the scheduler (thereby
ignoring the expanded context switch allowance).

In the -unstable implementation this is no longer the case: the BVT
scheduling function reports back a time value for the next run through
the scheduler, and no hook is provided into specific scheduler
implementations when a domain is unblocking. Therefore, if your CPU hog
is the only runnable task when it is scheduled, it will run for
10*ctx_allow (50ms) irrespective of other tasks becoming runnable during
that time (i.e., your I/O tasks). In the worst case the I/O tasks then
have to wait 50ms rather than 5ms before they get scheduled.

As a quick fix, could you comment out lines 366-371 in sched_bvt.c
(which extend the context switch allowance if there is only one task
running) and try your experiment again?

The proper fix would be a call into the scheduler when a task unblocks,
which shouldn't be too hard to add.


Xen-devel mailing list