WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-users

RE: [Xen-users] Re: CPU soft lockup XEN 4.1rc (Solved)

To: "Matthias Bannach" <matthias@xxxxxxxxxxx>, <mbrown@xxxxxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-users] Re: CPU soft lockup XEN 4.1rc (Solved)
From: "Ian Tobin" <itobin@xxxxxxxxxxxxx>
Date: Fri, 2 Sep 2011 10:57:57 +0100
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 02 Sep 2011 03:04:26 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <4E5E8089.40801@xxxxxxxxxxxxxxxxxxxxxxxxx> <4E602D73.4020407@xxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcxpUlwduuY/7Gd+Rc+GhbNHRT7ZcgABFVuQ
Thread-topic: [Xen-users] Re: CPU soft lockup XEN 4.1rc (Solved)
Hi,

Are you saying this one worked?

# in /etc/xen/*.conf
extra="clocksource=jiffies"

We have the same issue with one of our DomUs (CentOS).

Thanks,

Ian



-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Matthias
Bannach
Sent: 02 September 2011 02:12
To: mbrown@xxxxxxxxxxxxxxxxxxxxxxxxx
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Re: CPU soft lockup XEN 4.1rc (Solved)

All,

Ha - finally - solved. I guess Google is not the answer; searching the
mailing list is. After much frustration I found the following:

http://wiki.debian.org/Xen#A.27clocksource.2BAC8-0.3ATimewentbackwards.27

based on a post by Marco Marongiu

http://my.opera.com/marcomarongiu/blog/2010/08/18/debugging-ntp-again-part-4-and-last

For me lockup solution #2 worked:

# DomU and Dom0
# in /etc/sysctl.conf
clocksource=jiffies
independent_wallclock=0
# then sysctl -p

# in /etc/xen/*.conf
extra="clocksource=jiffies"

And voila - no more lockups. It had nothing to do with the motherboards
(which I did not think were the cause, given the success with non-Xen
configurations).
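
To check that the setting was actually picked up after restarting the DomU, the active clock source can be read back from sysfs. This is only a minimal sketch, assuming a Linux guest that exposes the usual clocksource sysfs directory; the function name show_clocksource and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print the active and the available kernel clock sources.
# The optional first argument overrides the sysfs directory
# (hypothetical convenience, mainly useful for testing).
show_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    if [ ! -r "$dir/current_clocksource" ]; then
        echo "clocksource info not found in $dir" >&2
        return 1
    fi
    echo "current:   $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}
```

Run inside the DomU with no argument, show_clocksource should report jiffies as the current source if the extra= line took effect.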

Not sure if this is a kernel or XEN problem though.

Hope this helps others

On 8/31/2011 2:42 PM, Mark Brown wrote:
> Hello,
> 
> Similar to others, I have freeze-ups on the system, consistent 
> with high IO load. If the system runs (even with multiple) XenU it 
> does not happen. But I can consistently force the situation to occur.
> 
> Running 4 dd processes dumping 20GB each on an LVM/mdadm soft RAID5 
> volume, it consistently crashes in a DomU. Running without Xen I do not 
> see the problem at all (e.g. after about 3TB of read/write, nothing 
> happened).
> 
> Any suggestion would be very welcome.
> 
> Marc
> 
> [ .. more .. ]
> It is very unpredictable when it actually occurs; here 
> are a few examples. Kind of odd that on Aug 29th it always happened on 
> the same second ;-{.
> 
>> syslog.2:Aug 29 17:35:47 nwsc-xen-Q45 kernel: [ 2698.560009] BUG: soft lockup - CPU#0 stuck for 146s! [events/0:9]
>> syslog.2:Aug 29 17:35:47 nwsc-xen-Q45 kernel: [ 2698.561016] BUG: soft lockup - CPU#1 stuck for 146s! [rsyslogd:2024]
>> syslog.2:Aug 29 22:57:27 nwsc-xen-Q45 kernel: [ 4198.404353] BUG: soft lockup - CPU#0 stuck for 122s! [md1_raid5:1243]
>> syslog.2:Aug 29 23:07:27 nwsc-xen-Q45 kernel: [ 4798.336110] BUG: soft lockup - CPU#0 stuck for 101s! [xend:2583]
>> syslog.2:Aug 29 23:07:27 nwsc-xen-Q45 kernel: [ 4798.337007] BUG: soft lockup - CPU#1 stuck for 101s! [bdi-default:19]
>> syslog.2:Aug 29 23:12:27 nwsc-xen-Q45 kernel: [ 5098.304013] BUG: soft lockup - CPU#0 stuck for 136s! [blkback.5.xvdd1:7226]
>> syslog.2:Aug 29 23:12:27 nwsc-xen-Q45 kernel: [ 5098.305010] BUG: soft lockup - CPU#1 stuck for 136s! [sh:7262]
>> syslog.6:Aug 17 12:07:08 nwsc-xen-Q45 kernel: [ 2998.596016] BUG: soft lockup - CPU#0 stuck for 73s! [xend:2506]
>> syslog.6:Aug 17 12:07:08 nwsc-xen-Q45 kernel: [ 2998.597555] BUG: soft lockup - CPU#1 stuck for 73s! [md0_raid5:598]
>> syslog.6:Aug 17 12:17:08 nwsc-xen-Q45 kernel: [ 3598.534068] BUG: soft lockup - CPU#1 stuck for 150s! [xend:2506]
> 
> It does not appear to relate to a specific process. (Those above are 
> from Xen 4.0.1 with Debian 2.6.32-5-xen-amd64).
> 
> This one is with Xen 4.1.2-rc2-pre/Debian 2.6.32-5-xen-amd64. Both are
> on Intel DQ45CB board with 4GB RAM.
> 
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348062] BUG: soft lockup - CPU#0 stuck for 79s! [xend:2767]
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348073] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables ext4 jbd2 crc16 sata_sil24 hid_apple sky2 via_velocity crc_ccitt usb_storage raid456 md_mod async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx dm_mod ext3 jbd mbcache firewire_sbp2 loop sr_mod cdrom sg xenfs xen_evtchn bridge stp 3w_9xxx usbhid hid sd_mod crc_t10dif snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device firewire_ohci psmouse i2c_i801 video firewire_core uhci_hcd ata_piix snd crc_itu_t output serio_raw evdev ahci pcspkr ehci_hcd i2c_core usbcore nls_base e1000e button ata_generic soundcore snd_page_alloc libata thermal scsi_mod processor thermal_sys acpi_processor
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348219] CPU 0:
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348222] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables ext4 jbd2 crc16 sata_sil24 hid_apple sky2 via_velocity crc_ccitt usb_storage raid456 md_mod async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx dm_mod ext3 jbd mbcache firewire_sbp2 loop sr_mod cdrom sg xenfs xen_evtchn bridge stp 3w_9xxx usbhid hid sd_mod crc_t10dif snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device firewire_ohci psmouse i2c_i801 video firewire_core uhci_hcd ata_piix snd crc_itu_t output serio_raw evdev ahci pcspkr ehci_hcd i2c_core usbcore nls_base e1000e button ata_generic soundcore snd_page_alloc libata thermal scsi_mod processor thermal_sys acpi_processor
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348318] Pid: 2767, comm: xend Not tainted 2.6.32-5-xen-amd64 #1
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348322] RIP: e033:[<00007fa4064c0289>]  [<00007fa4064c0289>] 0x7fa4064c0289
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348330] RSP: e02b:00007fa402ee54a0  EFLAGS: 00000206
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348334] RAX: 0000000001c3a320 RBX: 0000000001f8ace0 RCX: 00007fa40650f844
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348338] RDX: ffffffffffffffe0 RSI: 0000000000000000 RDI: 00007fa4067a9e40
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348341] RBP: 0000000000000000 R08: 0000000000000008 R09: 0000000000000001
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348345] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fa4067a9e40
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348349] R13: 00007fa402ee555c R14: 00007fa402ee5548 R15: 00000000ffffffff
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348356] FS: 00007fa402ee6700(0000) GS:ffff880002995000(0000) knlGS:0000000000000000
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348360] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348363] CR2: 00007fb2ed832e28 CR3: 00000000bba8e000 CR4: 0000000000002660
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348367] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Aug 31 13:05:41 nwsc-xen-Q45 kernel: [ 4039.348375] Call Trace:
>>
>> Aug 31 13:07:51 nwsc-xen-Q45 init: Id "T1" respawning too fast: disabled for 5 minutes
> 


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


