

[Xen-devel] [PATCH] xen: explicitly create/destroy stop_machine workqueu

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] [PATCH] xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region.
From: Ian Campbell <ian.campbell@xxxxxxxxxx>
Date: Tue, 1 Dec 2009 11:47:15 +0000
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>
Delivery-date: Tue, 01 Dec 2009 03:48:21 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1259668035-8552-1-git-send-email-ian.campbell@xxxxxxxxxx>
In-reply-to: <1259158328.7590.539.camel@xxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1259668035-8552-1-git-send-email-ian.campbell@xxxxxxxxxx>
References: <1259158328.7590.539.camel@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
I have observed cases where the implicit stop_machine_destroy() done by
stop_machine() hangs while destroying the workqueues, specifically in
kthread_stop(). This seems to be because timer ticks are not restarted
until after stop_machine() returns.

Fortunately, stop_machine provides a facility to pre-create/post-destroy the
workqueues, so use this to ensure that the workqueues are only destroyed after
everything is really up and running again.
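
To illustrate the idea, here is a minimal userspace sketch, not kernel code.
It assumes that stop_machine_create()/stop_machine_destroy() are reference
counted, so that an outer create keeps the workqueues alive across the
implicit destroy inside stop_machine(). Every function and variable below is
a hypothetical stand-in that only borrows the kernel names, and
do_suspend_model() is an invented helper, not the real do_suspend().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins modelling the stop_machine workqueue lifetime. */
static int sm_refcount;             /* current holders of the workqueues */
static int sm_destroy_count;        /* times the workqueues were really torn down */
static int ticks_running = 1;       /* 1 while timer ticks are being delivered */
static int destroyed_without_ticks; /* teardowns attempted with ticks stopped */

static int stop_machine_create(void)
{
    sm_refcount++;      /* the first holder would create the workqueues */
    return 0;
}

static void stop_machine_destroy(void)
{
    assert(sm_refcount > 0);
    if (--sm_refcount > 0)
        return;         /* another holder remains: nothing is torn down */
    sm_destroy_count++;
    if (!ticks_running)
        destroyed_without_ticks++;  /* the situation described as hanging */
}

/* Model of stop_machine(): implicit create/destroy around the callback. */
static int stop_machine(int (*fn)(void *), void *data)
{
    int err = stop_machine_create();

    if (err)
        return err;
    ticks_running = 0;          /* ticks are off while the machine is stopped */
    err = fn(data);
    stop_machine_destroy();     /* implicit destroy, ticks still off */
    return err;
}

static int xen_suspend_stub(void *data)
{
    (void)data;
    return 0;
}

/* precreate != 0 models the patched flow, 0 the unpatched one. */
static int do_suspend_model(int precreate)
{
    int err;

    if (precreate) {
        err = stop_machine_create();
        if (err)
            return err;
    }
    err = stop_machine(xen_suspend_stub, NULL);
    ticks_running = 1;          /* ticks restart only after stop_machine() returns */
    if (precreate)
        stop_machine_destroy(); /* real teardown happens here, ticks running */
    return err;
}
```

In this model, do_suspend_model(0) tears the workqueues down while ticks are
still stopped, whereas do_suspend_model(1) defers the teardown until after
ticks have been restarted.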

I only actually observed this failure with 2.6.30. It seems that newer kernels
are somehow more robust against doing kthread_stop() without timer interrupts
(I tried backporting some likely-looking candidates but did not track down
the commit which added this robustness). However, this change seems like a
reasonable belt-and-braces thing to do.

Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
 drivers/xen/manage.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 2fb7d39..c499793 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -79,6 +79,12 @@ static void do_suspend(void)
        shutting_down = SHUTDOWN_SUSPEND;
+       err = stop_machine_create();
+       if (err) {
+               printk(KERN_ERR "xen suspend: failed to setup stop_machine %d\n", err);
+               goto out;
+       }
        /* If the kernel is preemptible, we need to freeze all the processes
           to prevent them from being in the middle of a pagetable update
@@ -86,7 +92,7 @@ static void do_suspend(void)
        err = freeze_processes();
        if (err) {
                printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
-               goto out;
+               goto out_destroy_sm;
@@ -129,7 +135,11 @@ out_resume:
+       stop_machine_destroy();
        shutting_down = SHUTDOWN_INVALID;
