To: Giuseppe Sacco <giuseppe@xxxxxxxxxxxxxxxxxxxxxxxxx>, 638172@xxxxxxxxxxxxxxx
Subject: [Xen-devel] Re: Bug#638172: BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
From: Ian Campbell <ijc@xxxxxxxxxxxxxx>
Date: Mon, 22 Aug 2011 10:00:11 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ben Hutchings <ben@xxxxxxxxxxxxxxx>
In-reply-to: <1313577856.13030.17.camel@scarafaggio>
References: <1313577856.13030.17.camel@scarafaggio>
@xen-devel:

Does this look familiar to anyone? This is (I expect, hopefully Giuseppe
will confirm) from Debian Squeeze, which has Xen 4.0.x with a PVops
dom0 kernel based on xen.git from last summer (e73f4955a821) with more
recent upstream longterm kernels (up to and including 2.6.32.41) merged
in. While it does seem to have the switch from level to edge triggered
interrupts, the Debian kernel doesn't appear to have the switch to fasteoi
for pirqs (0672fb44a111 plus a few followups). Could that be related
to this? (I'm not sure whether that was a cleanup or a fix.)

Might the tsc unstable message be relevant?
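
In case it helps to line the two messages up, here is a rough Python
sketch that pulls the kernel uptime stamps and the key fields out of the
two syslog lines quoted below (rejoined by hand after mail wrapping).
The function name summarise and the regular expressions are my own
illustration, not part of any Xen or Debian tool, and they are only
known to match these two lines.

```python
import re

# Two lines from the quoted syslog, rejoined after mail wrapping.
LOG = """\
Aug 17 12:35:45 centrum kernel: [ 1424.037532] Clocksource tsc unstable (delta = -103103328 ns)
Aug 17 12:35:45 centrum kernel: [ 1456.620463] BUG: soft lockup - CPU#0 stuck for 61s! [qemu-dm:3205]
"""

# Kernel uptime stamp such as "[ 1424.037532]" followed by the message text.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")
TSC_RE = re.compile(r"Clocksource tsc unstable \(delta = (-?\d+) ns\)")
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^:\]]+):(\d+)\]"
)


def summarise(log):
    """Return (uptime, kind, details) tuples for the two message types."""
    events = []
    for raw in log.splitlines():
        m = LINE_RE.search(raw)
        if not m:
            continue  # not a kernel line with an uptime stamp
        uptime, text = float(m.group(1)), m.group(2)
        tsc = TSC_RE.search(text)
        if tsc:
            events.append((uptime, "tsc_unstable",
                           {"delta_ns": int(tsc.group(1))}))
            continue
        lockup = LOCKUP_RE.search(text)
        if lockup:
            cpu, secs, comm, pid = lockup.groups()
            events.append((uptime, "soft_lockup",
                           {"cpu": int(cpu), "stuck_s": int(secs),
                            "comm": comm, "pid": int(pid)}))
    return events


if __name__ == "__main__":
    for uptime, kind, details in summarise(LOG):
        print(f"{uptime:12.6f}  {kind:13s} {details}")
```

On these two lines the tsc message is stamped roughly 32.6 seconds of
kernel uptime before the lockup report. With an unstable clocksource
those stamps may not be trustworthy, so treat that gap as a hint only.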

@Giuseppe:

Can you please confirm the versions of the xen and qemu-dm packages
which you have installed?

Also, I think it would be useful to see the guest configuration file and
details of what backs the guest's storage (filesystems, SCSI controllers
etc.) as you have it configured.

Full history of this report can be found at
http://bugs.debian.org/638172

Ian.

On Wed, 2011-08-17 at 12:44 +0200, Giuseppe Sacco wrote:
> Package: linux-image-2.6.32-5-xen-686
> Version: 2.6.32-35
> Severity: important
> 
> Hi,
> I am experiencing a few outages on a XEN server. Often I have to
> poweroff the server, but last time I found some information in syslog.
> Here it is:
> 
> Aug 17 12:35:45 centrum kernel: [ 1424.037532] Clocksource tsc unstable 
> (delta = -103103328 ns)
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] BUG: soft lockup - CPU#0 stuck 
> for 61s! [qemu-dm:3205]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Modules linked in: xt_state
> xt_physdev iptable_filter tun cpufreq_userspace cpufreq_powersave
> cpufreq_conservative cpufreq_stats dummy bridge stp xen_evtchn xenfs xt_tcpudp
> iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables
> x_tables xfs exportfs loop snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec
> radeon snd_hwdep ttm snd_pcm snd_timer drm_kms_helper drm snd soundcore
> snd_page_alloc i2c_algo_bit shpchp i2c_piix4 pcspkr k8temp pci_hotplug i2c_core
> evdev button ext3 jbd mbcache dm_mod aacraid 3w_9xxx 3w_xxxx raid10 raid456
> async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1
> raid0 md_mod sata_nv sata_sil sata_via sd_mod crc_t10dif ata_generic ahci
> pata_atiixp ohci_hcd libata processor ehci_hcd r8169 mii scsi_mod thermal
> usbcore nls_base thermal_sys acpi_processor [last unloaded: dummy]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] 
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Pid: 3205, comm: qemu-dm 
> Tainted: G        W  (2.6.32-5-xen-686 #1) MS-7368
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EIP: 0061:[<c1002227>] EFLAGS: 
> 00200246 CPU: 0
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EIP is at 
> hypercall_page+0x227/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] EAX: 00040000 EBX: 00000000 
> ECX: 00000000 EDX: ec8fa828
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] ESI: ec8fa800 EDI: c24d9600 
> EBP: c27d4800 ESP: e4207d64
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  DS: 007b ES: 007b FS: 00d8 
> GS: 00e0 SS: 0069
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] CR0: 8005003b CR2: b7712200 
> CR3: 241f0000 CR4: 00000660
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR0: 00000000 DR1: 00000000 
> DR2: 00000000 DR3: 00000000
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] DR6: ffff0ff0 DR7: 00000400
> Aug 17 12:35:45 centrum kernel: [ 1456.620463] Call Trace:
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006034>] ? 
> xen_force_evtchn_callback+0xc/0x10
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006764>] ? 
> check_events+0x8/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1006723>] ? 
> xen_irq_enable_direct_end+0x0/0x1
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f457>] ? 
> scsi_request_fn+0x440/0x47a [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1132541>] ? 
> __blk_run_queue+0x2e/0x5a
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c11325f3>] ? 
> blk_run_queue+0x18/0x27
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93eaca>] ? 
> scsi_run_queue+0x281/0x308 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93f639>] ? 
> scsi_next_command+0x25/0x2f [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93ffa1>] ? 
> scsi_io_completion+0x383/0x3a4 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<ed93a723>] ? 
> scsi_finish_command+0xaa/0xc2 [scsi_mod]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1135bb3>] ? 
> blk_done_softirq+0x53/0x5f
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c8ea>] ? 
> __do_softirq+0xaa/0x156
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103c9c7>] ? 
> do_softirq+0x31/0x3c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c103caa1>] ? 
> irq_exit+0x26/0x58
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1199be6>] ? 
> xen_evtchn_do_upcall+0x22/0x2c
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1009b3f>] ? 
> xen_do_upcall+0x7/0xc
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1002407>] ? 
> hypercall_page+0x407/0x1001
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10015>] ? 
> HYPERVISOR_event_channel_op+0x15/0x4c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10937c6>] ? 
> __alloc_pages_nodemask+0xf3/0x4d9
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10484>] ? 
> evtchn_ioctl+0x22e/0x28c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda10256>] ? 
> evtchn_ioctl+0x0/0x28c [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6520>] ? 
> vfs_ioctl+0x1c/0x5f
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6ab4>] ? 
> do_vfs_ioctl+0x4aa/0x4e5
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10bb8e6>] ? 
> fsnotify_modify+0x5a/0x61
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<eda104e2>] ? 
> evtchn_write+0x0/0xda [xen_evtchn]
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10bc4c4>] ? 
> vfs_write+0x9e/0xd6
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c10c6b30>] ? 
> sys_ioctl+0x41/0x58
> Aug 17 12:35:45 centrum kernel: [ 1456.620463]  [<c1008f7c>] ? 
> syscall_call+0x7/0xb
> 
> Bye,
> Giuseppe
> 
> 
> 
> 

-- 
Ian Campbell
Current Noise: Converge - For You (Live)

(null cookie; hope that's ok)

