This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [PATCH] xc_save: ignore the first suspend event channel

To: Brendan Cully <brendan@xxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] xc_save: ignore the first suspend event channel notification
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Sat, 06 Sep 2008 09:19:25 +0100
Delivery-date: Sat, 06 Sep 2008 01:20:02 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <8b8456290da7eeb8b40d.1220642198@xxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AckP+UXahG1ywXvsEd29dwAWy6hiGQ==
Thread-topic: [Xen-devel] [PATCH] xc_save: ignore the first suspend event channel notification
User-agent: Microsoft-Entourage/
On 5/9/08 20:16, "Brendan Cully" <brendan@xxxxxxxxx> wrote:

> I've noticed that the suspend event channel becomes pending as soon as
> it is bound. I'm not sure why or whether this is intentional, but it
> means that the suspend function will return before the domain has
> completed suspending unless the first notification is cleared. Without
> this patch, xc_domain_save may find that the guest has not suspended
> and sleep in 10ms chunks until it does. Typically this is several
> milliseconds of wasted time.

This patch looks pretty dodgy to me.

First of all, you removed the non-error-case return statement from
suspend_evtchn_init(). Hence it appears you release the local port before
returning, and you return -1 (which you'd never notice since the caller
erroneously does not check the return code - please fix this).

Hence I think actually you are ending up using the old slow shutdown method
since suspend_evtchn will be -1 and hence you'll end up in compat_suspend().
That should result in a significant slowdown rather than speedup, so were
your measurements actually taken with this exact patch?

Even with this fixed, I don't think that ignoring an event wakeup is a great
idea. You can easily make the not-suspended-yet retry loop always wait on
the evtchn rather than sleeping 10ms. By which I mean -- there is really no
reason for that to be a pure poll loop when you have an evtchn to sleep on.

To do this, call (*suspend) from within the retry loop: the evtchn case can
do what it always does (basically sleep on the evtchn device until its
evtchn of interest appears); the compat case should change behaviour after
its first invocation so that it sleeps 10ms (stash a static variable in the
function or in suspendinfo for this purpose, to remember whether it was
already invoked).

 -- Keir

Xen-devel mailing list