So, just to be clear: you're proposing that this mechanism *might* be
useful for a VM with real-time scheduling requirements? Or are you
actually developing real-time operating systems, and suggesting this
in order to support real-time VMs?
I'm not an expert in real-time scheduling, but it doesn't seem to me
like this will really be what a real-time system would want. (Feel
free to contradict me if you know better.) It might work OK if there
were only a single real-time PV guest, but in the face of competition,
you'd have trouble. It seems like an actual real-time Xen scheduler
would want the PV guests to submit deadlines to Xen, and then Xen
could decide which deadlines to drop if it needs to (based on some
policy).
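To make that concrete, here is a rough sketch of what I mean (purely
illustrative -- no such interface exists in Xen today, and the struct
and function names are made up):

    #include <stdint.h>

    typedef int64_t s_time_t;            /* ns, as in Xen */

    struct rt_vcpu {
        s_time_t deadline;               /* absolute deadline from the guest */
        s_time_t runtime_left;           /* work still needed by the deadline */
        struct rt_vcpu *next;
    };

    /* Earliest-deadline-first pick; deadlines that can no longer be
     * met are dropped (one possible policy among several). */
    static struct rt_vcpu *edf_pick(struct rt_vcpu *head, s_time_t now)
    {
        struct rt_vcpu *v, *best = NULL;

        for ( v = head; v != NULL; v = v->next )
        {
            if ( now + v->runtime_left > v->deadline )
                continue;                /* unmeetable: drop this deadline */
            if ( best == NULL || v->deadline < best->deadline )
                best = v;
        }
        return best;                     /* NULL => run a normal vcpu */
    }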
The only workload you've measured is networking; but networking isn't
a "real-time" workload, it's a latency-sensitive workload. And you
haven't measured:
* The effect on network traffic if you have several high-priority VMs competing
* The effect on network traffic of non-prioritized VMs if a
high-priority VM is receiving traffic, or is misbehaving
You also haven't compared how raising a VM's priority within the
current credit framework, such as giving it a very high weight,
affects the numbers. Could you get similar results by giving the
"latency-sensitive" VMs a weight of, say, 10000, and leaving the
other ones at the default of 256?
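For example, using the existing weight option of sched-credit (256 is
the default weight; "latency-vm" is a placeholder name):

    xm sched-credit -d latency-vm -w 10000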
Overall, I don't think fixed priorities like this are a good solution:
I think they will create more problems than they solve, and they make
it harder to predict how a complex system will actually behave (and
thus harder to configure properly).
I think the proper solution (and I'm working on a "credit2" scheduler
that has these properties) is:
1. Fix the credit assignment, so that VMs don't spend very much time
in "over".
2. Give VMs that wake up and are under their credits a fixed "boost"
period (e.g., 1ms); a rough sketch of this follows the list.
3. Allow users to specify a cpu "reservation", so that no matter how
much work there is on the system, a VM can be guaranteed to get a
minimum fixed amount of the cpu if it wants it; e.g., dom0 always gets
50% of one core if it wants it, no matter how many other VMs are on
the system.
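Here's a minimal sketch of what I have in mind for #2 (not the actual
credit2 code; the type and field names are invented for illustration):

    #include <stdint.h>

    typedef int64_t s_time_t;               /* ns, as in Xen */

    #define BOOST_PERIOD 1000000LL          /* the fixed 1ms boost window */

    struct sched_vcpu {                     /* illustrative fields only */
        int      credit;                    /* > 0 means "under" its credits */
        s_time_t boost_expires;             /* demote once now passes this */
    };

    /* On wakeup, a vcpu that is under its credits gets a fixed boost
     * window; a timer (not shown) demotes it when the window expires,
     * so a frequently waking vcpu can't monopolise the cpu. */
    static void boost_on_wake(struct sched_vcpu *svc, s_time_t now)
    {
        if ( svc->credit > 0 )
            svc->boost_expires = now + BOOST_PERIOD;
        else
            svc->boost_expires = 0;         /* no boost; queue normally */
    }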
#1 and #2 have resulted in significant improvements in TCP throughput
in the face of competition. I hope to publish a draft here on the
list sometime soon, but I'm still working out some of the details.
-George Dunlap
2009/3/20 Su, Disheng <disheng.su@xxxxxxxxx>:
> Hi all,
> Attached patches add static priorities to the credit scheduler.
> Currently, the credit scheduler has 4 kinds of priority: BOOST, UNDER,
> OVER and IDLE. The priority of a VM changes dynamically according to
> the VM's credit or I/O events, and the highest-priority VM is chosen to
> be scheduled in for each scheduling period. Because priorities are not
> fixed, which VM will be scheduled in next is essentially unpredictable.
> The I/O latency caused by the scheduler is well analyzed in [1] and
> [2], which provide ways to reduce I/O latency while retaining CPU and
> I/O fairness between VMs to some extent. In some cases, though,
> reducing latency is much preferable to CPU or I/O fairness, such as an
> RTOS guest or a VM with an assigned device (e.g., audio). The
> straightforward way is to give such a VM a static (fixed) highest
> priority, to make sure it is scheduled first each time. The attached
> patches implement this kind of mechanism, similar to
> SCHED_RR/SCHED_FIFO in Linux.
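> For reference, a simplified view of the priority bands involved (the
> non-RT values below are illustrative; see xen/common/sched_credit.c
> for the real ones -- the RT band is what these patches add):
>
>     #define CSCHED_PRI_TS_BOOST   0     /* woken by an I/O event */
>     #define CSCHED_PRI_TS_UNDER  -1     /* credit remaining      */
>     #define CSCHED_PRI_TS_OVER   -2     /* credit exhausted      */
>     #define CSCHED_PRI_IDLE     -64     /* idle vcpu             */
>     /* plus a fixed RT band (1~100) above all of these           */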
>
> How it works:
> -- Users can set an RT priority (between 1 and 100) for a domain; the
> larger the number, the higher the priority. Users can also turn an RT
> domain back into a non-RT domain by setting its priority to a value
> outside 1~100.
> -- The scheduler always chooses the highest-priority RT domain to run;
> nothing changes for non-RT domains. If several RT domains share the
> same priority, the scheduler round-robins among them every 30ms. 30ms
> is the default scheduling period; it can be changed to 2ms or another
> value if needed.
> -- The currently running non-RT vcpu is still accounted every 10ms,
> and all non-RT domains are accounted every 30ms, as the credit
> scheduler did before.
>
> Implementation details:
> -- To minimize modifications to the credit scheduler, one additional
> rt runqueue is added per pcpu, and one rt active-domain list is added
> to csched_private. RT vcpus are added to the rt runqueue of the pcpu
> they run on, and rt domains are added to the rt active-domain list.
> -- The scheduler first picks the highest-priority vcpu from the rt
> runqueue if it is not empty; otherwise it picks from the normal
> runqueue as before.
> -- __runq_insert/__runq_remove are changed to be based on the priority
> of the vcpu.
> -- Vcpu accounting only takes effect on non-RT vcpus, as before.
> Non-RT vcpus proportionally share the rest of the cpu based on their
> weights. The total weight is updated when adding/removing RT domains;
> e.g., when promoting a non-RT domain to an RT domain, the total weight
> is reduced by that domain's weight.
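> A simplified sketch of the pick logic described above (the names are
> illustrative and use Xen's list helpers; this is not the exact patch
> code):
>
>     /* Pick from the per-pcpu rt runqueue first; both runqueues are
>      * kept sorted with the highest-priority vcpu at the head. */
>     static struct csched_vcpu *pick_next(struct list_head *rt_runq,
>                                          struct list_head *runq)
>     {
>         if ( !list_empty(rt_runq) )
>             return list_entry(rt_runq->next, struct csched_vcpu,
>                               runq_elem);
>         /* rt runqueue empty: fall back to the normal credit runqueue */
>         return list_entry(runq->next, struct csched_vcpu, runq_elem);
>     }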
>
> How to use it:
> Set the priority (y) of a VM (x) with: "xm sched-credit -d x -p y"
>
> Test results:
> I did some tests with these patches using the following
> configuration:
> CPU: Intel Core 2 Duo E6850, Xen (1881). 7 VMs are created on one
> physical machine A: three pairs of VMs ping each other, and the
> remaining VM has RT priority. Another physical machine B is connected
> to A directly through a 1G network card. The tests are conducted from
> B to A, e.g., pinging A from B.
> Some test results are uploaded to
> http://wiki.xensource.com/xenwiki/DishengSu, FYI.
>
> Summary:
> These patches minimize scheduling latency at the cost of CPU and I/O
> fairness. They can be used to schedule an RT guest in some cases (such
> as when an RT guest and non-RT guests co-exist). There are still many
> areas left to improve real-time response, such as interrupt latency
> and the Xen I/O model [3].
> Any comments are appreciated. Thanks!
>
> ---------------------
> [1] Scheduling I/O in Virtual Machine Monitors
> [2] Evaluation and Consideration of the Credit Scheduler for Client
> Virtualization
> [3] A step to support real-time in virtual machine
>
> Best Regards,
> Disheng, Su
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel