This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] BUG: soft lockup detected on CPU#0! on 3.0.2-2

To: Luke Crawford <lsc@xxxxxxxxx>, Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] BUG: soft lockup detected on CPU#0! on 3.0.2-2
From: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Date: Sat, 16 Sep 2006 17:23:14 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Sat, 16 Sep 2006 09:33:48 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <Pine.NEB.4.64.0609151824170.14422@xxxxxxxxxxxxxxxxxx>
On 16/9/06 2:34 am, "Luke Crawford" <lsc@xxxxxxxxx> wrote:

> completely unpingable.  console was also dead, nobody tried the xen
> console.  (I just set up a better reboot procedure for my hosting company;
> I need to set up something similar here so that we don't lose the data we
> need to figure this out.)
> Where should I start looking to find out exactly what "bug: soft lockup on
> cpu0" means?  Linux source/docs?  Or Xen source/docs?

The watchdog code runs a kernel thread on every CPU. This is supposed to
wake up every second and update a per-CPU counter. A hook from the timer
interrupt checks the per-CPU counter and prints a softlockup warning if the
counter is not updated for 10 seconds.
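
To make the mechanism above concrete, here is a rough simulation of it in Python. It is only a sketch of the description in this mail, not the kernel code: the class and function names, the use of whole seconds as the clock, and the choice to warn once per stall are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Seconds without an update before a warning is printed (figure taken from
# the description above; the exact comparison used here is an assumption).
SOFTLOCKUP_THRESHOLD = 10


@dataclass
class CpuWatchdog:
    """Hypothetical model of the per-CPU watchdog state."""
    cpu: int
    touch_ts: int = 0    # per-CPU counter: time of the last update
    warned_ts: int = -1  # touch_ts value that has already been warned about

    def watchdog_thread_tick(self, now: int) -> None:
        # What the per-CPU kernel thread does whenever it actually gets to run
        # (nominally once a second).
        self.touch_ts = now

    def timer_interrupt_hook(self, now: int):
        # Called from the timer interrupt: warn if the counter is stale.
        stale = now - self.touch_ts >= SOFTLOCKUP_THRESHOLD
        if stale and self.warned_ts != self.touch_ts:
            self.warned_ts = self.touch_ts  # assumption: warn once per stall
            return f"BUG: soft lockup detected on CPU#{self.cpu}!"
        return None


def simulate(seconds: int, starved_from: int, cpu: int = 0):
    """Run the model; the watchdog thread stops being scheduled at starved_from."""
    wd = CpuWatchdog(cpu)
    warnings = []
    for now in range(seconds + 1):
        if now < starved_from:
            wd.watchdog_thread_tick(now)
        msg = wd.timer_interrupt_hook(now)
        if msg is not None:
            warnings.append((now, msg))
    return warnings
```

In this model, simulate(30, starved_from=5) stops running the watchdog thread after its update at second 4, and the timer hook reports the lockup once the counter has been stale for 10 seconds. The point to take away is that the warning means the thread was not scheduled on that CPU, whatever the underlying reason.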

3.0.2-2 is known to be susceptible to softlockups because the Xen scheduler
will starve domains to run domain0. It's not clear if that's what is
happening here, but you need to repro on tip of xen-3.0-testing to find out
one way or the other. Because of the number of bug fixes since 3.0.2-3 we
don't recommend running any old releases of 3.0.2.

 -- Keir
