xen-devel

Re: [Xen-devel] [PATCH] BUG() on soft lockup upon suspend/resume

To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] BUG() on soft lockup upon suspend/resume
From: Glauber de Oliveira Costa <gcosta@xxxxxxxxxx>
Date: Mon, 9 Oct 2006 21:29:10 -0300
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C1508A1A.2405%Keir.Fraser@xxxxxxxxxxxx>
References: <20061009212242.GB28540@xxxxxxxxxx> <C1508A1A.2405%Keir.Fraser@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.11
> 
> > In systems with vcpu > 1, a BUG due to a detected soft lockup seems to be
> > triggered after system resume/suspend. This is probably due to the lack of
> > seqlocking around the region that does the local time processing.
> 
> We do SMP save/restore tests regularly and do not see this issue. It ought
> to be avoided by the fact that, when we bring up a CPU, we
> touch_softlockup_watchdog() in cpu_bringup(), before enabling interrupts.
> For CPU0 on resume, the touch is done in time_resume() in
> arch/i386/kernel/time-xen.c.
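
For reference, the seqlocking mentioned in the first quoted paragraph follows a generic pattern that can be sketched as below. This is only an illustration of the technique, not the code in time-xen.c: the struct, its fields, and the model_* function names are all hypothetical, and C11 atomics stand in for the kernel's own primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical time record protected by a sequence counter. */
struct model_seqlock_time {
    atomic_uint seq;              /* odd while a writer is mid-update */
    _Atomic uint64_t system_time; /* illustrative payload field */
    _Atomic uint64_t tsc_stamp;   /* illustrative payload field */
};

/* Writer: bump the counter to odd, update the payload, bump it back to even. */
static void model_write(struct model_seqlock_time *t, uint64_t sys, uint64_t tsc)
{
    atomic_fetch_add(&t->seq, 1);
    t->system_time = sys;
    t->tsc_stamp = tsc;
    atomic_fetch_add(&t->seq, 1);
}

/* Reader: retry until a snapshot is taken with an even, unchanged counter. */
static void model_read(struct model_seqlock_time *t, uint64_t *sys, uint64_t *tsc)
{
    unsigned start;

    do {
        start = atomic_load(&t->seq);
        *sys = t->system_time;
        *tsc = t->tsc_stamp;
    } while ((start & 1u) || atomic_load(&t->seq) != start);
}
```

The point of the pattern is that a reader never acts on a half-updated pair of values: if a writer ran during the read, the counter differs and the reader simply tries again.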

This does not happen only once, when the system comes back. It happens
a lot afterwards as well. So even if the first touch is right, I suspect
this issue is more related to a situation in which we have already been
resumed for a long time, with everything set up.
> 
> I think we need to understand the issue you are hitting a bit more before
> deciding on the right fix.

Right, here is more info:

I'm on an 8-way x86_64 machine, and this is the sort of output I see
repeatedly:

BUG: soft lockup detected on CPU#1!

Call Trace:
 <IRQ>  [<ffffffff802ace9d>] softlockup_tick+0xf8/0x113
 [<ffffffff8026d591>] timer_interrupt+0x38a/0x3d8
 [<ffffffff80210e87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802ad1e6>] __do_IRQ+0xa5/0x107
 [<ffffffff8028be7a>] _local_bh_enable+0x61/0xc5
 [<ffffffff8026b4c9>] do_IRQ+0xe7/0xf5
 [<ffffffff8039386e>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025e2a2>] do_hypervisor_callback+0x1e/0x2c
 <EOI>  [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff8026cb13>] raw_safe_halt+0x84/0xa8
 [<ffffffff8026a121>] xen_idle+0x38/0x4a
 [<ffffffff80248e66>] cpu_idle+0x97/0xba

It obviously never happens on CPU#0, but I see it on all the others (vcpus=4).
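
To make the mechanism behind this message concrete, here is a minimal sketch of how a soft lockup check of this kind works in principle: each CPU has a "last touched" timestamp, and the timer tick complains if that timestamp is too old. It is a simplified model and not the kernel's implementation; the model_* names, the tick-based time unit, and the threshold value are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4          /* assumption: matches vcpus=4 above */
#define MODEL_THRESHOLD_TICKS 10 /* assumption: arbitrary illustrative limit */

/* Last tick at which each CPU was recorded as making progress. */
static uint64_t model_last_touch[MODEL_NR_CPUS];

/* Model of touching the watchdog: record "this CPU is alive now". */
static void model_touch(int cpu, uint64_t now)
{
    model_last_touch[cpu] = now;
}

/*
 * Model of the check made from the timer tick: report a lockup when the
 * CPU has gone longer than the threshold without being touched.
 * Assumes now is never earlier than the last touch.
 */
static bool model_tick_detects_lockup(int cpu, uint64_t now)
{
    return now - model_last_touch[cpu] > MODEL_THRESHOLD_TICKS;
}
```

In this model, a CPU touched at tick 100 is fine when checked at tick 105 but is reported when checked at tick 200. Note that the check cannot tell a genuinely stuck CPU apart from a time base that jumped forward without a matching touch, which is why the timestamps seen by the non-boot vcpus are worth looking at.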

If you have any other ideas about what else may be causing this, they are
very welcome. I'll keep investigating.


-- 
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
