Re: [Xen-devel] [RFC] Scheduler work, part 1: High-level goals and interface.
George Dunlap wrote:
> 1. Design targets
>
> We have three general use cases in mind: server consolidation, virtual
> desktop providers, and clients (e.g. XenClient).
>
> For servers, our target "sweet spot" for which we will optimize is a
> system with 2 sockets, 4 cores each socket, and SMT (16 logical cpus).
> Ideal performance is expected to be reached at about 80% total system
> cpu utilization; but the system should function reasonably well up to
> a utilization of 800% (e.g., a load of 8).
Is that forward-looking enough? That hardware is currently available;
what's going to be commonplace in 2-3 years?
> For virtual desktop systems, we will have a large number of
> interactive VMs with a lot of shared memory. Most of these will be
> single-vcpu, or at most 2 vcpus.
>
> For client systems, we expect to have 3-4 VMs (including dom0).
> Systems will probably have a single socket with 2 cores and SMT (4
> logical cpus). Many VMs will be using PCI pass-through to access
> network, video, and audio cards. They'll also be running video and
> audio workloads, which are extremely latency-sensitive.
> 2. Design goals
>
> For each of the target systems and workloads above, we have some
> high-level goals for the scheduler:
>
> * Fairness. In this context, we define "fairness" as the ability to
> get cpu time proportional to weight.
>
> We want to try to make this true even for latency-sensitive workloads
> such as networking, where long scheduling latency can reduce the
> throughput, and thus the total amount of time the VM can effectively
> use.
>
> * Good scheduling for latency-sensitive workloads.
>
> To the degree we are able, we want this to be true even for those
> which use a significant amount of cpu power: that is, my audio
> shouldn't break up if I start a cpu-hog process in the VM playing the
> audio.
>
> * HT-aware.
>
> Running on a logical processor with an idle peer thread is not the
> same as running on a logical processor with a busy peer thread. The
> scheduler needs to take this into account when deciding "fairness".
Would it be worth just pair-scheduling HT threads so they're always
running in the same domain?
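
Failing that, the accounting could at least discount time spent
running next to a busy sibling. A minimal sketch, with the discount
factor and all names invented purely for illustration:

    /* Sketch: charge a vcpu less for time run while its HT sibling
     * was busy, since it got less effective work done.  HT_DISCOUNT_PCT
     * is an invented tuning knob, not anything in the current code. */
    #define HT_DISCOUNT_PCT 70   /* assume ~70% effective throughput */

    static unsigned long credits_to_charge(unsigned long ns_run,
                                           int sibling_was_busy)
    {
        return sibling_was_busy ? ns_run * HT_DISCOUNT_PCT / 100
                                : ns_run;
    }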
> * Power-aware.
>
> Using as many sockets / cores as possible can increase the total
> cache size available to VMs, and thus (in the absence of inter-VM
> sharing) increase total computing power; but keeping multiple sockets
> and cores powered up also increases the electrical power used by the
> system. We want a configurable way to balance between maximizing
> processing power vs minimizing electrical power.
I don't remember if there's a proper term for this, but what about
having multiple domains sharing the same scheduling context, so that a
stub domain can be co-scheduled with its main domain, rather than having
them treated separately?
Also, a somewhat related point: some kind of directed schedule, so that
when one vcpu is synchronously waiting on another vcpu, it can directly
hand over its pcpu to avoid any cross-cpu overhead (including the
ability to take advantage of directly using hot cache lines). That
would be useful for intra-domain IPIs, etc., but also for inter-domain
context switches (domain<->stub, frontend<->backend, etc.).
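
Concretely, that might look like a directed-yield operation. A purely
hypothetical sketch (today's interface only has an undirected
SCHEDOP_yield; nothing below actually exists):

    #include <stdint.h>

    /* Hypothetical: yield the current pcpu directly to a named vcpu,
     * so the target runs immediately and can reuse hot cache lines. */
    struct sched_yield_to {
        uint16_t domid;     /* target domain */
        uint16_t vcpu_id;   /* target vcpu within that domain */
    };
    /* e.g. HYPERVISOR_sched_op(SCHEDOP_yield_to, &arg); -- invented */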
> 3. Target interface
>
> The target interface will be similar to credit1:
>
> * The basic unit is the VM "weight". When competing for cpu
> resources, VMs will get a share of the resources proportional to
> their weight. (e.g., two cpu-hog workloads with weights of 256 and
> 512 will get 33% and 67% of the cpu, respectively).
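
To make the arithmetic concrete, a small self-contained sketch (the
function is mine, not from any credit code):

    #include <stdio.h>

    /* share_i = weight_i / sum(weights); with {256, 512} this gives
     * the 33% / 67% split from the example above. */
    static void print_shares(const unsigned *weight, unsigned n)
    {
        unsigned long total = 0;
        for (unsigned i = 0; i < n; i++)
            total += weight[i];
        for (unsigned i = 0; i < n; i++)
            printf("vm%u: %.0f%%\n", i, 100.0 * weight[i] / total);
    }

    int main(void)
    {
        unsigned w[] = { 256, 512 };
        print_shares(w, 2);   /* vm0: 33%, vm1: 67% */
        return 0;
    }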
> * Additionally, we will be introducing a "reservation" or "floor".
> (I'm open to name changes on this one.) This will be a minimum
> amount of cpu time that a VM can get if it wants it.
>
> For example, one could give dom0 a "reservation" of 50%, but leave
> the weight at 256. No matter how many other VMs run with a weight of
> 256, dom0 will be guaranteed to get 50% of one cpu if it wants it.
How does the reservation interact with the credits? Is the reservation
in addition to its credits, or does using the reservation consume them?
* The "cap" functionality of credit1 will be retained.
This is a maximum amount of cpu time that a VM can get: i.e., a VM
with a cap of 50% will only get half of one cpu, even if the rest of
the system is completely idle.
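
Taken together, one plausible reading is that the weighted share
simply gets clamped between floor and cap. A sketch of that reading
(mine, not the proposed accounting, and it leaves the credit question
above open):

    /* Sketch: clamp a VM's weighted fair share between its
     * reservation (floor) and its cap, both in percent of one cpu.
     * If the two conflict, the cap wins here; whether using the
     * reservation consumes credits is exactly the open question. */
    static unsigned share_pct(unsigned weighted_share,
                              unsigned reservation, unsigned cap)
    {
        unsigned s = weighted_share;
        if (s < reservation)
            s = reservation;   /* guaranteed minimum, if wanted */
        if (s > cap)
            s = cap;           /* hard maximum */
        return s;
    }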
> * We will also have an interface to the cpu-vs-electrical-power
> tradeoff. This is yet to be defined. At the hypervisor level, it
> will probably be a number representing the "badness" of powering up
> extra cpus / cores. At the tools level, there will probably be the
> option of either specifying the number, or of using one of 2-3
> pre-defined values {power, balance, green/battery}.
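
As a sketch of how such a number might enter the placement decision
(the cost model and names are all invented):

    /* Sketch: when placing a waking vcpu, weigh the benefit of waking
     * an idle core against the configured power "badness".  The three
     * tool-level presets could map to e.g. 0 / 50 / 100. */
    static int prefer_idle_core(unsigned spread_benefit,
                                unsigned power_badness)
    {
        return spread_benefit > power_badness;
    }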
Is it worth taking into account the power cost of cache misses vs hits?
Do vcpus running on pcpus running at less than 100% speed consume fewer
credits?
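
If they did, one simple rule would scale the charge with effective
frequency; illustrative only:

    /* Illustrative: burn credits in proportion to the speed the pcpu
     * actually ran at.  A real version would need to avoid overflow. */
    static unsigned long charge(unsigned long ns_run,
                                unsigned cur_khz, unsigned max_khz)
    {
        return ns_run * cur_khz / max_khz;
    }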
Is there any explicit interface to cpu power state management, or would
that be decoupled?
J