WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-users

Re: [Xen-users] Dom0 Locked up for 4 hours "BUG: soft lockup - CPU#3 stuck for 61s!"

To: Todd Deshane <todd.deshane@xxxxxxx>
Subject: Re: [Xen-users] Dom0 Locked up for 4 hours "BUG: soft lockup - CPU#3 stuck for 61s!"
From: Pasi Kärkkäinen <pasik@xxxxxx>
Date: Wed, 23 Mar 2011 18:00:19 +0200
Cc: Javier Frias <jfrias@xxxxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 23 Mar 2011 09:06:03 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTi=svmMSC7k4Zwt+0ycDN_p1iMG5MbZdy3vuZVKz@xxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTik-v5eQeyWC4npJaq2_=5N0owb4KBEtZgFfKJyp@xxxxxxxxxxxxxx> <AANLkTi=svmMSC7k4Zwt+0ycDN_p1iMG5MbZdy3vuZVKz@xxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
On Sat, Feb 26, 2011 at 07:50:37PM +0000, Todd Deshane wrote:
> On Sat, Feb 26, 2011 at 12:11 AM, Javier Frias <jfrias@xxxxxxxxx> wrote:
> > I posted a bug about this, but figured I'd ask the mailing list to see
> > if someone had seen this.
> > Bugzilla: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1746
> >
> > Basically, I had a dom0 that, after 57 days without issues, locked up
> > for 4 hours, completely unresponsive, and then recovered. The domUs
> > were unaffected except that I could not shut them down (since dom0
> > was unresponsive). I was still able to gain access via xapi/xencenter,
> > though, so I at least had some access (console, status, etc. all
> > worked via xapi).
> >
> 
> Could you clarify this explanation a bit? What access was not
> available for 4 hours?
> 
> You say you could get access via xapi/xencenter; was this after the
> 4 hours or during?
> 
> Did you happen to look at the guest performance during that time?
> Was one of the guests doing a lot of disk I/O? Could you give some more
> information as to how the guests access their virtual disks (local,
> NFS, iSCSI, etc.) and any other information about your setup that
> could give us hints as to what might have caused this?
> 

Hey,

Did you get more info about this issue? 

Did the system become responsive after some process was killed in dom0?
(due to oom killer, perhaps?)
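
One way to check this is to scan the dom0 kernel log for OOM-killer activity around the time of the lockup. The following is a minimal sketch, not a definitive tool. The soft-lockup pattern is taken from the trace quoted in this thread; the OOM-killer wording ("oom-killer", "Out of memory") is an assumption, since the exact text varies between kernel versions. The function name scan_log is hypothetical.

```python
import re

# Pattern based on the message quoted in this thread:
#   BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)

# Assumption: OOM-killer lines contain one of these phrases.
# The exact wording differs between kernel versions.
OOM = re.compile(r"oom-killer|Out of memory", re.IGNORECASE)


def scan_log(lines):
    """Return (lockups, oom_lines) found in an iterable of log lines."""
    lockups = []
    oom_lines = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            lockups.append(
                {
                    "cpu": int(match.group(1)),
                    "seconds": int(match.group(2)),
                    "comm": match.group(3),
                    "pid": int(match.group(4)),
                }
            )
        elif OOM.search(line):
            oom_lines.append(line.rstrip("\n"))
    return lockups, oom_lines
```

To use it, open whichever file holds the dom0 kernel messages on your system (the location depends on the syslog configuration) and pass the file object to scan_log. If OOM lines show up shortly before dom0 became responsive again, that would support the idea that a killed process freed things up.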

-- Pasi

> Thanks,
> Todd
> 
> > It seemed to have done a lot of paging and swapping, and there were
> > faults reported on CPU 3. The load on dom0 was also extremely high.
> >
> > Has anyone seen this?  I could only find one other posting with a
> > similar issue 
> > http://www.gossamer-threads.com/lists/xen/api/197429?do=post_view_threaded
> >
> > I'm running XCP release 1.0.0-38754c (xcp). Any help is greatly appreciated.
> >
> >
> > BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
> > Modules linked in: nls_utf8 hfsplus bonding tun lockd sunrpc bridge
> > stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
> > nf_conntrack xt_tcpudp x_tables binfmt_misc dm_mirror video output sbs
> > sbshc fan container battery ac parport_pc lp parport nvram joydev
> > sr_mod cdrom evdev usb_storage usb_libusual usbhid sg thermal button
> > processor thermal_sys bnx2 serio_raw 8250_pnp rtc_cmos 8250
> > serial_core rtc_core tpm_tis rtc_lib tpm tpm_bios pcspkr
> > dm_region_hash dm_log dm_mod ide_gd_mod megaraid_sas sd_mod scsi_mod
> > ext3 jbd uhci_hcd ohci_hcd ehci_hcd usbcore fbcon font tileblit
> > bitblit softcursor [last unloaded: ip_tables]
> >
> > Pid: 0, comm: swapper Not tainted
> > (2.6.32.12-0.7.1.xs1.0.0.298.170582xen #1) PowerEdge R710
> > EIP: 0061:[<c01013a7>] EFLAGS: 00000246 CPU: 3
> > EIP is at 0xc01013a7
> > EAX: 00000000 EBX: 00000001 ECX: 00000000 EDX: ee853f78
> > ESI: 00117f58 EDI: 00000003 EBP: ee853f90 ESP: ee853f74
> >  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> > CR0: 8005003b CR2: b7736000 CR3: 0d9f4000 CR4: 00002660
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff0ff0 DR7: 00000400
> > Call Trace:
> >  [<c0107035>] ? xen_safe_halt+0xb5/0x150
> >  [<c010ac7e>] xen_idle+0x1e/0x50
> >  [<c0102a7b>] cpu_idle+0x3b/0x60
> >  [<c037b00d>] cpu_bringup_and_idle+0xd/0x10
> >
> > _______________________________________________
> > Xen-users mailing list
> > Xen-users@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-users
> >
> 

