The following patchset adds processor C state power management
support to Xen on x86.
Processor idle is a fundamental feature of processor power management.
On x86, it is defined as the processor C state in the ACPI spec and is
supported by most x86 processors. Linux has the cpuidle driver to support
this feature ("cpuidle - Do nothing, efficiently. . .", Venkatesh P, OLS
2007). This patchset intends to add this feature to Xen.
Basically, the following things need to be done to support C states:
- Get C state related info from the ACPI DSDT table.
- The hypervisor decides when and which C state to enter, based on the
policy.
Considering that the dom0 kernel already has a well-written ACPI DSDT
table interpreter, the first task is done in the dom0 kernel, and the
related info is passed to the hypervisor via hypercall. The second part is
done in the hypervisor idle domain handler. The algorithm is ported from
the Linux kernel. The ladder governor is implemented in this version, and
the new menu governor implementation is in our plan.
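As a rough illustration of the ladder policy mentioned above, here is a
minimal sketch in C. It is not the code from this patchset: the struct
layout, the names (cx_state, ladder, ladder_select) and the streak
constants are assumptions made for the example. The idea is to step one
C state deeper after several long idle periods in a row, and one state
shallower after a short one.

```c
#include <stdint.h>

/* Hypothetical per-state thresholds; names are illustrative only. */
struct cx_state {
    uint32_t promote_us;  /* idle longer than this counts toward going deeper */
    uint32_t demote_us;   /* idle shorter than this counts toward going shallower */
};

struct ladder {
    const struct cx_state *states;  /* states[0] is the shallowest state */
    int nr_states;
    int max_state;                  /* deepest index allowed (cf. max_cstate) */
    int cur;                        /* index of the state currently selected */
    int promote_cnt;
    int demote_cnt;
};

#define PROMOTE_STREAK 4  /* assumed: long idles needed before promotion */
#define DEMOTE_STREAK  1  /* assumed: short idles needed before demotion */

/* Choose the state for the next idle period from the last residency. */
static int ladder_select(struct ladder *l, uint32_t last_residency_us)
{
    const struct cx_state *s = &l->states[l->cur];
    int deepest = l->nr_states - 1;

    if (l->max_state < deepest)
        deepest = l->max_state;

    if (l->cur < deepest && last_residency_us > s->promote_us) {
        l->demote_cnt = 0;
        if (++l->promote_cnt >= PROMOTE_STREAK) {
            l->cur++;               /* climb one rung deeper */
            l->promote_cnt = 0;
        }
    } else if (l->cur > 0 && last_residency_us < s->demote_us) {
        l->promote_cnt = 0;
        if (++l->demote_cnt >= DEMOTE_STREAK) {
            l->cur--;               /* step back one rung */
            l->demote_cnt = 0;
        }
    }
    return l->cur;
}
```

The idle handler would call ladder_select() once per idle entry with the
measured length of the previous idle period. Capping the search at
max_state keeps the policy from ever choosing a state deeper than allowed.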
This patchset also sets up a code base for the upcoming ACPI P/T state
support, since they share a similar architecture.
This patchset is based on cset xen-linux-524/xen-staging-17501
[PATCH 1/9] [xen] Add basic acpi C-states based cpu idle power mgmt in
xen for x86.
[PATCH 2/9] [dom0] Add basic interface to allow ACPI processor events
revealed to external control logic like VMM.
[PATCH 3/9] [dom0] Notify ACPI processor events to external logic,
including C/P/T and hotplug, etc.
[PATCH 4/9] [dom0] Notify xen about Cx acpi info, such as table returned
by _CST/_CSD methods.
[PATCH 5/9] [xen] Add option "xen_processor_pm" in xen to enable dom0
external Cx control.
[PATCH 6/9] [xen] Port acpi bit register support from Linux.
[PATCH 7/9] [xen] Add acpi C3 support for x86.
[PATCH 8/9] [dom0] Handle dom0_max_vcpus < nr_pcpu cases, e.g. UP dom0.
[PATCH 9/9] [xen] [RFC] Add TSC stop support for Deep C state
Special notes on deep C state support
During deep C states (C3/C4/C5/C6), there are two TSC/lapic timer related
issues that need to be addressed:
- TSC may stop upon deep C state entry: to address this issue, one
approach is to calculate the TSC delta from the platform clock source and
restore the TSC after C state exit. Another approach is to mark the TSC
unstable and notify other components not to use it, which is what the
Linux kernel does. This patchset chooses the first approach.
- Local APIC timer may stop upon deep C state entry: this will cause
delayed ac timers and even system hangs. One approach to fix this issue is
to use a platform clock event (e.g. HPET/PIT) to trigger the timer
interrupt and wake up the CPU. In the SMP case, CPU0 wakes up first and
then sends an IPI to the target CPU. This patch is still at an early stage.
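To make the first TSC approach above concrete, here is a minimal sketch
of the arithmetic, assuming a free-running platform counter of known
width and frequency. The function name and parameters are hypothetical
and are not taken from the patches. It reads the platform counter before
and after the C state, converts the elapsed platform ticks to TSC ticks,
and adds them to the TSC value saved on entry.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the TSC value to restore on C state exit.
 * plt_bits is the width of the platform counter (e.g. 24 or 32), so a
 * single wrap of that counter during the idle period is handled.
 * Assumes (plt_hz - 1) * tsc_hz fits in 64 bits.
 */
static uint64_t tsc_after_idle(uint64_t tsc_before,
                               uint64_t plt_before, uint64_t plt_after,
                               unsigned int plt_bits,
                               uint64_t plt_hz, uint64_t tsc_hz)
{
    uint64_t mask = (plt_bits >= 64) ? UINT64_MAX
                                     : ((1ULL << plt_bits) - 1);
    /* Elapsed platform ticks, modulo the counter width. */
    uint64_t plt_delta = (plt_after - plt_before) & mask;

    /* Scale whole seconds and the remainder separately to limit overflow. */
    uint64_t secs = plt_delta / plt_hz;
    uint64_t rem  = plt_delta % plt_hz;

    return tsc_before + secs * tsc_hz + (rem * tsc_hz) / plt_hz;
}
```

The caller would then write the returned value back to the TSC. The
result is only as precise as the platform counter's resolution, and an
idle period longer than one full wrap of that counter cannot be detected
by this calculation.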
Deep C state is currently disabled by the max_cstate cmdline option.
max_cstate is set to 2 by default, which prevents the processor from
entering C3.