RE: [Xen-devel] [RFC] Scheduler work, part 1: High-level goals and interface.
>From: George Dunlap
>Sent: 9 April 2009 23:59
>
>For servers, our target "sweet spot" for which we will optimize is a
>system with 2 sockets, 4 cores each socket, and SMT (16 logical cpus).
>Ideal performance is expected to be reached at about 80% total system
>cpu utilization; but the system should function reasonably well up to
>a utilization of 800% (e.g., a load of 8).
How were 80%/800% chosen here?
>
>For virtual desktop systems, we will have a large number of
>interactive VMs with a lot of shared memory. Most of these will be
>single-vcpu, or at most 2 vcpus.
What total number of VMs would you like to support?
>
>* HT-aware.
>
>Running on a logical processor with an idle peer thread is not the
>same as running on a logical processor with a busy peer thread. The
>scheduler needs to take this into account when deciding "fairness".
Do you mean that the same elapsed time in the above two scenarios will
be translated into different credits?
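For instance (just a sketch of one possible accounting; the helper name
and the 70% factor below are made up for illustration, not existing Xen
code), the credit actually charged could depend on whether the sibling
thread was busy:

    /* Rough sketch only: charge a vcpu fewer credits for time spent
     * sharing a core with a busy sibling thread. */
    #define SMT_SHARED_PCT  70   /* % of a core when the peer is busy */

    static unsigned long credit_charge(unsigned long elapsed_us,
                                       int peer_busy)
    {
        if ( peer_busy )
            return elapsed_us * SMT_SHARED_PCT / 100;
        return elapsed_us;   /* idle peer: charge full elapsed time */
    }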
>
>* Power-aware.
>
>Using as many sockets / cores as possible can increase the total cache
>size available to VMs, and thus (in the absence of inter-VM sharing)
>increase total computing power; but by keeping multiple sockets and
>cores powered up, also increases the electrical power used by the
>system. We want a configurable way to balance between maximizing
>processing power vs minimizing electrical power.
Xen 3.4 now supports "sched_smt_power_savings" (both as a boot option
and tunable via xenpm) to change the power/performance preference.
It's a simple implementation that just reverses the span order from the
existing package->core->thread to thread->core->package. More
fine-grained flexibility could be added in the future if a hierarchical
scheduling concept can be constructed more clearly, like the domain
scheduler in Linux.
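As a rough illustration of that span-order reversal (pseudo-C only; the
ordering arrays and the cpu_is_idle() helper are invented for the
example, not real Xen code):

    /* Illustrative only: pick an idle cpu either by spreading across
     * packages first (performance) or by filling sibling threads/cores
     * of an already-active package first (power saving). */
    extern int cpus_spread_order[];  /* package -> core -> thread */
    extern int cpus_pack_order[];    /* thread  -> core -> package */
    extern int nr_cpus;
    extern int cpu_is_idle(int cpu); /* hypothetical helper */

    int pick_cpu(int power_savings)
    {
        int *order = power_savings ? cpus_pack_order : cpus_spread_order;
        int i;

        for ( i = 0; i < nr_cpus; i++ )
            if ( cpu_is_idle(order[i]) )
                return order[i];
        return -1;                   /* nothing idle */
    }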
Another possible 'fairness' point affected by power management could
be to take frequency scaling into consideration, since credit so far is
calculated simply from elapsed time, while the same elapsed time at
different frequencies actually represents a different number of
consumed cycles.
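In other words, something along these lines could be used (scaling by
current/max frequency is just one possible choice, not a worked-out
proposal):

    /* Sketch only: charge credit in "effective cycles" rather than
     * wall time, so running at half the maximum frequency burns
     * roughly half the credit for the same elapsed time. */
    static unsigned long credit_charge_scaled(unsigned long elapsed_us,
                                              unsigned long cur_khz,
                                              unsigned long max_khz)
    {
        return elapsed_us * cur_khz / max_khz;
    }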
>
>3. Target interface:
>
>The target interface will be similar to credit1:
>
>* The basic unit is the VM "weight". When competing for cpu
>resources, VMs will get a share of the resources proportional to their
>weight. (e.g., two cpu-hog workloads with weights of 256 and 512 will
>get 33% and 67% of the cpu, respectively).
IMO, weight does not strictly translate into care for latency. Any
elaboration on that? I remember that Nishiguchi-san previously gave an
idea to boost credit, and Disheng proposed static priority. Maybe you
can make a summary to help people see how latency would actually be
ensured in your proposal.
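(Just to spell out the proportional-share arithmetic behind the
33%/67% example above; nothing beyond what was already stated:)

    /* share_i = weight_i / sum of all weights
     * e.g. 256 / (256 + 512) = 1/3, and 512 / (256 + 512) = 2/3 */
    unsigned int cpu_share_percent(unsigned int weight,
                                   unsigned int total_weight)
    {
        return weight * 100 / total_weight;
    }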
>
>* Additionally, we will be introducing a "reservation" or "floor".
> (I'm open to name changes on this one.) This will be a minimum
> amount of cpu time that a VM can get if it wants it.
This is a good idea.
>
>For example, one could give dom0 a "reservation" of 50%, but leave the
>weight at 256. No matter how many other VMs run with a weight of 256,
>dom0 will be guaranteed to get 50% of one cpu if it wants it.
Shouldn't there be some way to adjust or limit the use of 'reservation'
when multiple vcpus claim reservations that together sum to more than
the cpu's computing power, or that weaken your general
'weight-as-basic-unit' idea?
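One possible answer, purely as a sketch (the proportional scale-down
below is my assumption, not something from the proposal): when the
configured reservations add up to more than the available capacity,
shrink each one proportionally so that they still fit:

    /* Hypothetical only: 'reservation[]' and 'capacity' are in percent
     * of one cpu; if the admin over-commits, every reservation is
     * reduced by the same ratio so the total never exceeds the
     * physical capacity. */
    void clamp_reservations(unsigned int *reservation, int nr_vms,
                            unsigned int capacity)
    {
        unsigned int total = 0;
        int i;

        for ( i = 0; i < nr_vms; i++ )
            total += reservation[i];

        if ( total <= capacity )
            return;              /* fits already, nothing to do */

        for ( i = 0; i < nr_vms; i++ )
            reservation[i] = reservation[i] * capacity / total;
    }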
>
>* The "cap" functionality of credit1 will be retained.
>
>This is a maximum amount of cpu time that a VM can get: i.e., a VM
>with a cap of 50% will only get half of one cpu, even if the rest of
>the system is completely idle.
>
>* We will also have an interface to the cpu-vs-electrical power tradeoff.
>
>This is yet to be defined. At the hypervisor level, it will probably
>be a number representing the "badness" of powering up extra cpus /
>cores. At the tools level, there will probably be the option of
>either specifying the number, or of using one of 2/3 pre-defined
>values {power, balance, green/battery}.
I'm not sure how that number would be defined. Maybe we can follow the
current practice and just add individual name-based options matching
their purpose (such as migration_cost and sched_smt_power_savings...).
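If the single-number route is taken anyway, the tools side could map
the presets onto it, roughly like this (the preset names come from your
mail; the numeric values are completely made up):

    #include <string.h>

    /* Sketch of a tools-level mapping only; the badness values are
     * invented for illustration. */
    static int power_badness_from_policy(const char *policy)
    {
        if ( !strcmp(policy, "power") )    return 0;   /* performance first */
        if ( !strcmp(policy, "balance") )  return 50;
        if ( !strcmp(policy, "green") ||
             !strcmp(policy, "battery") )  return 100; /* save energy first */
        return 50;                                     /* default: balance  */
    }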
Thanks,
Kevin