This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [PATCH] xc_save: ignore the first suspend event channel

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] xc_save: ignore the first suspend event channel notification
From: Brendan Cully <brendan@xxxxxxxxx>
Date: Mon, 8 Sep 2008 15:36:07 -0700
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 08 Sep 2008 15:36:30 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C4EABA98.26D60%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Mail-followup-to: keir.fraser@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx
References: <20080907022833.GA36704@xxxxxxxxxxxx> <C4EABA98.26D60%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-08-26)
On Monday, 08 September 2008 at 11:18, Keir Fraser wrote:
> On 7/9/08 03:28, "Brendan Cully" <brendan@xxxxxxxxx> wrote:
> > I could certainly code this up as well (it'd need a static flag in
> > evtchn_suspend as well to avoid resignalling the domain, I think). But
> > generally without clearing the event channel before signalling the
> > guest, the first suspend attempt will always return early. I'm not
> > really clear on the scenario that results in the domain not being
> > suspended after *suspend has succesfully returned. Could you clarify?
> I checked in your patch as is. One question: do we need the
> wait-one-second-for-shutdown loop in suspend_and_state() at all? My reading
> of (*suspend)() is that it should be sure the domain is suspended when it
> returns, and hence should suspend_and_state() not simply raise an error if
> it finds that domaininfo does not indicate the guest is shut down? The retry
> loop may simply be allowing bugs of the sort you've just fixed to linger.

I agree that that retry loop is a bit dubious. It appears to come from
changeset 2147:949f21fc9e77 (Fix migrate to cope with domains that are
paused.) This was long before device migration appeared in xend
(changeset 9657:1fe63743a147), when the only thing that mattered was
that the domain be suspended before the final round.

Removing the poll in suspend_and_state undoes 2147. If we want to keep
that logic, we could probably just hoist it up into *suspend.

There's still no protection that I know of between concurrent
invocation of xc_save on the same domain, but that's a whole other
kettle of fish :)

Xen-devel mailing list