   
 
 
 
   
 

xen-devel

[Xen-devel] Enhancement to domU suspend/resume

To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Enhancement to domU suspend/resume
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Wed, 17 Jan 2007 17:09:42 +0800
Hi, all,
        While working on adding PM support to Xen, we realized that 
some enhancements to domU suspend/resume are required. Below are 
some background and thoughts; comments are welcome. :-)

        Currently we use a simple approach (pause/unpause) for domU 
when we are ready to put the whole platform into a power-saving 
state, say S3. Because pause/unpause happens without domU's 
knowledge, domU observes a soft lockup when it is unpaused after 
resuming from S3. This approach also cannot handle the driver domain 
case. We tried "xm save/restore" and the equivalent "xm 
suspend/resume"; they work, but the overhead is rather high, since 
the whole domU memory is saved to disk and the domain itself is 
destroyed after suspend. For S3 support, a quick suspend/resume is 
better, because memory contents are preserved throughout the process.

        Basically the changes fall into two areas:
                - Lightweight "xm suspend/resume"
                - Extend suspend support to driver domU

[Lightweight "xm suspend/resume"]
        It's reasonable for the current implementation to save and 
release all of domU's memory after it is suspended, since that makes 
more memory available to other domains. However, for platform-level 
S3 this is redundant, because the box is physically put into a 
suspend state. What we need is just to send a suspend notification 
to domU and let domU enter the __xen_suspend path. domU then exits 
the scheduler by issuing HYPERVISOR_suspend. Nothing else is required 
after this. After resume, domU just continues to run from the 
suspend point.

        Even the __xen_suspend path is a bit heavy, given that in this 
case domU's resources don't change across resume. Maybe we can 
benefit from the recent checkpoint patch, which makes appropriate 
changes to this path.

        But I'm not familiar with the control panel side and hope 
someone can advise me. My gut feeling is to add an option (like -L) 
to the existing "xm suspend/resume" instead of adding a new command. 
The possible changes may actually look like what the checkpoint 
patch does, except that there is no immediate resume and we need to 
disable the memory save logic.
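
To make the intended difference between the two modes concrete, here is 
a minimal sketch in plain C. It is only a toy model, not xm/xend code: 
the struct, the enum and both function names are hypothetical, invented 
for illustration. It assumes the lightweight mode still notifies the 
guest and waits for it to suspend, but skips the memory save and the 
domain teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a domain as seen by the control tools. */
struct domain_model {
    bool suspended;     /* guest has issued its suspend hypercall */
    bool memory_saved;  /* memory image was written to disk */
    bool destroyed;     /* domain was torn down and its memory released */
};

enum suspend_mode { SUSPEND_FULL, SUSPEND_LIGHTWEIGHT };

/* Both modes suspend the guest; only the full mode saves and destroys. */
static void model_suspend(struct domain_model *d, enum suspend_mode mode)
{
    d->suspended = true;
    if (mode == SUSPEND_FULL) {
        d->memory_saved = true;
        d->destroyed = true;
    }
}

/* A destroyed domain must be rebuilt from its saved image on resume;
 * a lightweight-suspended one can simply be allowed to run again. */
static bool model_resume_needs_restore(const struct domain_model *d)
{
    return d->destroyed;
}
```

In this model a domain suspended with SUSPEND_LIGHTWEIGHT keeps its 
memory in place, so resume needs no restore step.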

[Driver domU suspend]
        This has to be added if a device is assigned to a domU and 
we want the system to keep working correctly after resume. When a 
driver domU receives a suspend request, it should invoke the driver 
suspend methods of the physical devices it owns. Before that, one 
other necessary step is to freeze all processes, since some may 
still hold critical resources. In this case, we need to borrow some 
Linux PM code into the Xen suspend path, something like:
        smp_suspend();
        freeze_processes();
        device_suspend();
        device_power_down();
        xenbus_suspend();
        ...
        HYPERVISOR_suspend(virt_to_mfn(xen_start_info));
        ...
        xenbus_resume();
        device_power_up();
        device_resume();
        thaw_processes();
        smp_resume();
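
The ordering above is symmetric: each resume step undoes the matching 
suspend step in reverse order. A minimal user-space sketch of that 
pairing follows. The stages are stubs that only log their names, not 
the real kernel functions, and the helpers (log_call, run_suspend_cycle, 
struct pm_stage) are hypothetical. Unwinding the completed stages when 
one suspend step fails is my assumption; the listing does not cover 
error handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOG 32

/* Call log so the resulting order can be inspected afterwards. */
static const char *call_log[MAX_LOG];
static size_t call_count;

static void log_call(const char *name)
{
    if (call_count < MAX_LOG)
        call_log[call_count++] = name;
}

/* One suspend stage paired with the resume stage that undoes it. */
struct pm_stage {
    const char *down_name;
    const char *up_name;
    int fail;   /* nonzero: simulate this suspend step failing */
};

/* Stub stages named after the calls in the listing, in suspend order. */
static struct pm_stage stages[] = {
    { "smp_suspend",       "smp_resume",      0 },
    { "freeze_processes",  "thaw_processes",  0 },
    { "device_suspend",    "device_resume",   0 },
    { "device_power_down", "device_power_up", 0 },
    { "xenbus_suspend",    "xenbus_resume",   0 },
};
#define NSTAGES (sizeof(stages) / sizeof(stages[0]))

/* Returns 0 if the suspend hypercall point was reached, or -1 if a
 * stage failed. Either way, every stage that completed its suspend
 * step gets its resume step, in reverse order. */
static int run_suspend_cycle(void)
{
    size_t done = 0;
    int ret = 0;

    call_count = 0;
    while (done < NSTAGES) {
        log_call(stages[done].down_name);
        if (stages[done].fail) {
            ret = -1;   /* the failed stage itself is not unwound */
            break;
        }
        done++;
    }
    if (ret == 0)
        log_call("HYPERVISOR_suspend");
    while (done > 0) {
        done--;
        log_call(stages[done].up_name);
    }
    return ret;
}
```

Keeping each suspend/resume pair in one table entry means the resume 
order cannot drift out of sync with the suspend order when a stage is 
added.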

        It may be more difficult if we want to support wake-on-LAN when 
the NIC is assigned to a domU, since that is more tightly tied to 
ACPI. So we will consider only normal device suspend here.

        One interesting question is: why doesn't the current 
__xen_suspend freeze running processes? My rough guess is that 
virtual device state is still kept (in xenstore, the BE, etc.) across 
a single domU suspend/resume, and thus almost all front-end drivers 
(except TPM) have no suspend method. If no driver suspend methods are 
invoked, there is no need to freeze processes, since there is no 
contention. Please correct me if the real reason is different. :-)

Thanks,
Kevin
