This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: [PATCH] Make explicit message when guest failed to suspe

To: Frank Pan <frankpzh@xxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] Make explicit message when guest failed to suspend
From: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
Date: Thu, 3 Mar 2011 10:56:21 +0000
Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jeremy Fitzhardinge <Jeremy.Fitzhardinge@xxxxxxxxxx>, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
Delivery-date: Thu, 03 Mar 2011 02:57:27 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTi=Jg2eWzCMR6Zksdpf=5sDUeu8L1Q7WE354dR-h@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Citrix Systems, Inc.
References: <AANLkTi=Jg2eWzCMR6Zksdpf=5sDUeu8L1Q7WE354dR-h@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Thu, 2011-03-03 at 10:13 +0000, Frank Pan wrote:
> Recent xen uses xenbus to suspend PV-on-HVM guest domain. The related
> code in the xen-unstable tree will fall into infinite loop when guest
> domain failed on suspending one or more devices.

This is a bug in Xend; it should never enter an infinite loop, regardless
of guest behaviour. The code certainly appears to expect to be managing
a timeout. Perhaps it has bitrotted?
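
To make the expected behaviour concrete, here is a minimal sketch of a
bounded wait. It is not Xend's actual code: wait_for_suspend,
SuspendTimeout and the has_suspended callback are hypothetical names, and
the 60 second default only mirrors the 1 minute mentioned in the patch
description. It also uses time.monotonic, which needs a newer Python than
Xend targets.

```python
import time


class SuspendTimeout(Exception):
    """Hypothetical error raised when the guest never finishes suspending."""


def wait_for_suspend(has_suspended, timeout=60.0, interval=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll has_suspended() until it returns True or `timeout` seconds pass.

    `clock` and `sleep` are injectable so the loop can be exercised
    without really waiting.
    """
    deadline = clock() + timeout
    while True:
        if has_suspended():
            return
        # The loop always terminates: either the guest suspends or the
        # deadline passes, whatever the guest does.
        if clock() >= deadline:
            raise SuspendTimeout(
                "guest did not suspend within %.0f seconds" % timeout)
        sleep(interval)
```

The point is only that the exit condition depends on the toolstack's own
clock, so a misbehaving guest cannot keep the loop alive.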

> The patch attached changes the logic, and raises an XendError after 1
> minute waiting. The patch also makes use of "control/shutdown" entry
> in xenstore, allows guest kernel report the failure of the suspending.

I don't think you can change the xenbus protocol in this way without
further rationale regarding its correctness.

In particular you need to consider and explain how it remains compatible
with a new kernel running on older toolstacks and vice versa.

> Any suggestions?

Perhaps a separate control/shutdown-error key? The message written
should be more verbose than just "failed" if/when more specific
information is available to the kernel.
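
As a rough illustration of why a separate key is easier to reason about
for compatibility, the sketch below models xenstore as a plain dict. The
key name control/shutdown-error is only the suggestion above, not an
existing protocol entry, and report_suspend_failure and
read_suspend_error are hypothetical helpers rather than real xenbus or
Xend calls.

```python
SHUTDOWN_KEY = "control/shutdown"
ERROR_KEY = "control/shutdown-error"  # hypothetical, not an existing key


def report_suspend_failure(store, reason):
    """Guest side (sketch): record why the suspend failed.

    Only the new key is written; control/shutdown keeps its existing
    meaning, so a toolstack that does not know the new key sees nothing
    unusual.
    """
    store[ERROR_KEY] = reason


def read_suspend_error(store):
    """Toolstack side (sketch): return the guest's message, or None.

    A kernel that predates the new key never writes it, so the toolstack
    sees None and has to rely on its own timeout instead.
    """
    msg = store.get(ERROR_KEY)
    return msg if msg else None


# Older kernel: the key is absent, so no error is reported.
old_guest = {SHUTDOWN_KEY: "suspend"}
no_error = read_suspend_error(old_guest)

# Newer kernel: a descriptive message rather than just "failed".
new_guest = {SHUTDOWN_KEY: "suspend"}
report_suspend_failure(new_guest, "device vif/0 failed to suspend")
error = read_suspend_error(new_guest)
```

The message text in the last example is made up; the real content would
be whatever specific information the kernel has available.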

> @@ -165,6 +165,8 @@ out_destroy_sm:
>       stop_machine_destroy();
>  out:
> +     if (cancelled)
> +             xenbus_write(XBT_NIL, "control", "shutdown", "failed");

cancelled does not necessarily imply failure; it can also mean that the
suspend resulted in a checkpoint rather than a full suspend.

