Hi all,
While working on adding power management (PM) support to Xen, we
realized that some enhancements are required to suspend/resume a
domU. Below is some background and our current thoughts. Comments
are welcome. :-)
Currently we use a simple approach (pause/unpause) for domUs when
the whole platform is about to enter a power-saving state such as
S3. Because pause/unpause happens without domU's knowledge, domU
reports a soft lockup when it is unpaused after resuming from S3.
This approach also cannot handle the driver domain case. We tried
"xm save/restore" and the equivalent "xm suspend/resume". They work,
but the overhead is a bit high, since the whole domU memory is saved
to disk and the domain itself is destroyed after suspend. For S3
support, a quick suspend/resume is preferable, because memory
contents are preserved throughout the process.
Basically the changes fall into two areas:
- Lightweight "xm suspend/resume"
- Extend suspend support to driver domU
[Lightweight "xm suspend/resume"]
It is reasonable for the current implementation to save and release
the whole memory of a domU once it is suspended, since that makes
more memory available to other domains. For platform-level S3,
however, this is redundant, because the box itself is physically put
into a suspend state. All we need is to send a suspend notification
to domU and let it go through the __xen_suspend path. domU then
leaves the scheduler by issuing HYPERVISOR_suspend. Nothing else is
required after this. After resume, domU simply continues running
from the suspend point.
Even the __xen_suspend path is a bit heavy for this case, since
domU's resources do not change across the resume. Maybe we can
benefit from the recent checkpoint patch, which makes appropriate
changes to this path.
I'm not familiar with the control panel side, though, and hope
someone can advise me. My gut feeling is to add an option (like -L)
to the existing "xm suspend/resume" rather than a new command. The
required changes may actually look much like what the checkpoint
patch does, except that there is no immediate resume and we need to
disable the memory save logic.
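To make the difference between the two paths concrete, here is a
minimal C sketch of how I picture the control-side decision. It is
only a model under my own assumptions: struct domain, suspend_domain(),
resume_domain() and the "lightweight" flag are hypothetical names, not
existing xm or libxc interfaces, and the guest side is reduced to a
state change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a guest as seen by the control tools. */
enum dom_state { DOM_RUNNING, DOM_SUSPENDED, DOM_DESTROYED };

struct domain {
    enum dom_state state;
    bool memory_resident;   /* guest pages are still held in RAM */
    bool image_on_disk;     /* a save image has been written */
};

/* Stand-in for delivering the suspend request. The guest is assumed to
 * run its suspend path and end up in HYPERVISOR_suspend. */
static void request_guest_suspend(struct domain *d)
{
    assert(d->state == DOM_RUNNING);
    d->state = DOM_SUSPENDED;
}

/* lightweight == true models the proposed option: the guest is told to
 * suspend, but its memory is neither saved nor released. */
static int suspend_domain(struct domain *d, bool lightweight)
{
    if (d->state != DOM_RUNNING)
        return -1;

    request_guest_suspend(d);

    if (!lightweight) {
        /* Existing behaviour: write the image, free memory, destroy. */
        d->image_on_disk = true;
        d->memory_resident = false;
        d->state = DOM_DESTROYED;
    }
    return 0;
}

static int resume_domain(struct domain *d)
{
    if (d->state == DOM_SUSPENDED && d->memory_resident) {
        /* Lightweight case: continue right after the suspend point. */
        d->state = DOM_RUNNING;
        return 0;
    }
    if (d->state == DOM_DESTROYED && d->image_on_disk) {
        /* Full case: models rebuilding the domain from the image. */
        d->memory_resident = true;
        d->image_on_disk = false;
        d->state = DOM_RUNNING;
        return 0;
    }
    return -1;
}
```

Calling suspend_domain(&d, true) followed by resume_domain(&d) leaves
the domain running with its memory untouched, while passing false goes
through the save/destroy/restore round trip.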
[Driver domU suspend]
This has to be added if a device is assigned to a domU and we want
the system to keep working correctly after resume. When a driver
domU receives a suspend request, it should invoke the driver suspend
methods of the physical devices it owns. Before that, one other
necessary step is to freeze all processes, since some may still hold
critical resources. For this we need to borrow some Linux PM code
into the Xen suspend path, something like:
smp_suspend();
freeze_processes();
device_suspend();
device_power_down();
xenbus_suspend();
...
HYPERVISOR_suspend(virt_to_mfn(xen_start_info));
...
xenbus_resume();
device_power_up();
device_resume();
thaw_processes();
smp_resume();
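The sequence above resumes in exactly the reverse order of suspend.
The sketch below models that ordering in plain C, and additionally
assumes that a failing suspend step should unwind the steps already
completed. That failure handling is my own assumption, not something
the current code does. The step names are labels only, and
step_down(), step_up() and suspend_cycle() are hypothetical stand-ins
rather than real kernel calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Labels for the suspend steps, in the order they are taken. */
static const char *const steps[] = {
    "smp", "processes", "devices", "power", "xenbus"
};
#define NSTEPS (sizeof(steps) / sizeof(steps[0]))

/* Human-readable record of what ran, for inspection. */
static char trace[256];

static void note(const char *prefix, const char *name)
{
    strncat(trace, prefix, sizeof(trace) - strlen(trace) - 1);
    strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
    strncat(trace, " ", sizeof(trace) - strlen(trace) - 1);
}

/* Stand-in for one suspend step; fails when i equals fail_at. */
static int step_down(size_t i, int fail_at)
{
    if ((int)i == fail_at)
        return -1;
    note("-", steps[i]);
    return 0;
}

/* Stand-in for the matching resume step. */
static void step_up(size_t i)
{
    note("+", steps[i]);
}

/* Run one suspend/resume cycle. fail_at < 0 means no step fails. */
static int suspend_cycle(int fail_at)
{
    size_t done = 0;
    int rc = 0;

    trace[0] = '\0';
    while (done < NSTEPS) {
        if (step_down(done, fail_at) != 0) {
            rc = -1;
            break;
        }
        done++;
    }

    if (rc == 0)
        note("", "HYPERVISOR_suspend");   /* stand-in for the hypercall */

    /* Resume, or unwind after a failure, in reverse order. */
    while (done > 0)
        step_up(--done);

    return rc;
}
```

After suspend_cycle(-1), trace lists the five steps going down, the
hypercall stand-in, and the same five steps coming back up in reverse.
With suspend_cycle(2), only the first two steps are taken and undone.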
It may be more difficult to support wake-on-LAN when the NIC is
assigned to a domU, since that is more tightly tied to ACPI. So we
will only consider normal device suspend here.
One interesting question is why the current __xen_suspend doesn't
freeze running processes. My rough feeling is that virtual device
state is preserved (in xenstore, the back end (BE), etc.) across a
single domU suspend/resume, and thus almost all front-end drivers
(except TPM) have no suspend method. If no driver suspend methods are
invoked, there is no need to freeze processes, since there is no
contention. Please correct me if the real reason is different. :-)
Thanks,
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel