WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-users

Re: [Xen-users] BUG: soft lockup

To: Dana Rawding <dana@xxxxxxxxxxx>
Subject: Re: [Xen-users] BUG: soft lockup
From: alex <alex.faq8@xxxxxxxxx>
Date: Tue, 2 Feb 2010 23:58:07 +0300
Cc: Xen List <xen-users@xxxxxxxxxxxxxxxxxxx>
I have this problem too.
My setup is Xen 3.3.1 on Debian Lenny.
The load average on the server climbs to 10-15, all domUs freeze, and I can't do anything.
Please test this: I fixed the problem with xm sched-credit -d 0 -w 512 .

[787717.425090] BUG: soft lockup - CPU#0 stuck for 61s! [watchdog/0:5]
[787717.425090] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.432148] CPU 0:
[787717.432148] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.436173] Pid: 5, comm: watchdog/0 Not tainted 2.6.26-1-xen-amd64 #1
[787717.436173] RIP: e030:[<ffffffff8025ed13>]  [<ffffffff8025ed13>] watchdog+0xbe/0x1cf
[787717.436173] RSP: e02b:ffff880bce0d9ef0  EFLAGS: 00000207
[787717.436173] RAX: 0000000000000001 RBX: ffff880bcb4e5400 RCX: 0002cc64939f91fe
[787717.436173] RDX: ffff880081656000 RSI: ffffffff804fe460 RDI: ffffffff8053a000
[787717.436173] RBP: ffff880bcb4e5400 R08: ffff880001be3040 R09: ffff880bce0d9e30
[787717.436173] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000399
[787717.436173] R13: 00000000000b3192 R14: 0000000000000000 R15: 0000000000000000
[787717.436173] FS:  00007f0cfbb3e6e0(0000) GS:ffffffff80539000(0000) knlGS:0000000000000000
[787717.436173] CS:  e033 DS: 0000 ES: 0000
[787717.436173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[787717.436173] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[787717.436173]
[787717.436173] Call Trace:
[787717.436173]  [<ffffffff8025ec55>] ? watchdog+0x0/0x1cf
[787717.436173]  [<ffffffff8023f56b>] ? kthread+0x47/0x74
[787717.436173]  [<ffffffff8022839f>] ? schedule_tail+0x27/0x5c
[787717.436173]  [<ffffffff8020be28>] ? child_rip+0xa/0x12
[787717.436173]  [<ffffffff8023f524>] ? kthread+0x0/0x74
[787717.436173]  [<ffffffff8020be1e>] ? child_rip+0x0/0x12
[787717.436173]



I fixed this problem with xm sched-credit -d 0 -w 512 .
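
For context on what that command changes: in the Xen credit scheduler each domain has a weight (256 by default), and on a contended host a domain with weight 512 gets about twice the CPU time of one with weight 256. The option -d 0 selects dom0, so the command doubles dom0's weight relative to guests left at the default. Below is a minimal sketch of that arithmetic, assuming every domain is busy and ignoring caps, VCPU counts, and idle time. The guest names and the helper contended_shares are made up for illustration and are not part of Xen.

```python
# Simplified model of Xen credit-scheduler weights under full CPU contention.
# Assumptions: every domain is runnable all the time, no caps are set, and
# VCPU counts are ignored. Guest names below are hypothetical.

DEFAULT_WEIGHT = 256  # default credit-scheduler weight for a domain


def contended_shares(weights):
    """Return each domain's fraction of CPU time when all domains are busy."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


guests = ["domU-a", "domU-b", "domU-c"]  # hypothetical guests

# All domains at the default weight.
before = contended_shares(
    {"Domain-0": DEFAULT_WEIGHT, **{g: DEFAULT_WEIGHT for g in guests}}
)

# After: xm sched-credit -d 0 -w 512
after = contended_shares(
    {"Domain-0": 512, **{g: DEFAULT_WEIGHT for g in guests}}
)

for name in before:
    print(f"{name}: {before[name]:.1%} -> {after[name]:.1%}")
```

In this toy model with three busy guests, dom0's share under contention rises from 25% to 40%. That may explain why dom0 stays responsive enough to service guest I/O, but the real scheduler behaviour depends on details left out here.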


2010/1/31 Dana Rawding <dana@xxxxxxxxxxx>
Hi all,

I've been experiencing a rash of CPU lockups on a number of domUs recently.  It's been happening on two different servers.  About a year ago I had this problem every once in a while, but it was not frequent.  I was running Ubuntu with Xen 3.1 and kernel 2.6.24-18 back then.  I'm now running Xen 3.3 and kernel 2.6.24-26.

What I have noticed is that just prior to the lockups the domUs had high CPU loads.  The domU that I have the most problems with is a Zimbra server.  My guess is that a rash of spam comes through and CPU load gets high, then the CPUs lock up.  Originally I had it running with 1 CPU but have since upped it to 2 and then 3 CPUs.

I have been collecting the lockup messages and have posted a few below.  Any ideas?  Recommendations?

Thanks,
Dana


[138077.172283]  =======================
[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
[138075.147411]
[138075.147419] Pid: 97, comm: kswapd0 Tainted: G      D (2.6.24-26-xen #1)
[138075.147426] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 0
[138075.147441] EIP is at _spin_lock+0x7/0x10
[138075.147447] EAX: c1da48ec EBX: 00000000 ECX: 220c7000 EDX: 00000000
[138075.147453] ESI: 8b804067 EDI: c1da48ec EBP: 00000f28 ESP: ed707dec
[138075.147459]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[138075.147471] CR0: 8005003b CR2: 080f0010 CR3: 2213b000 CR4: 00000660
[138075.147482] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138075.147488] DR6: ffff0ff0 DR7: 00000400
[138075.147495]  [<c01773cb>] page_check_address+0x1cb/0x3c0
[138075.147514]  [<c0119868>] xen_invlpg_mask+0x38/0x40
[138075.147529]  [<c017762e>] page_referenced_one+0x6e/0x190
[138075.147541]  [<c017875c>] page_referenced+0xec/0x130
[138075.147552]  [<c01671cf>] shrink_active_list+0x18f/0x5c0
[138075.147567]  [<c016826d>] shrink_zone+0xdd/0x100
[138075.147578]  [<c01688cc>] kswapd+0x44c/0x490
[138075.147589]  [<c013bb00>] autoremove_wake_function+0x0/0x40
[138075.147603]  [<c011e270>] complete+0x40/0x60
[138075.147614]  [<c0168480>] kswapd+0x0/0x490
[138075.147625]  [<c013b842>] kthread+0x42/0x70
[138075.147635]  [<c013b800>] kthread+0x0/0x70
[138075.147646]  [<c0105bb7>] kernel_thread_helper+0x7/0x10
[138075.147658]  =======================
[138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]
[138088.987841]
[138088.987846] Pid: 23215, comm: java Tainted: G      D (2.6.24-26-xen #1)
[138088.987850] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[138088.987862] EIP is at _spin_lock+0x7/0x10
[138088.987866] EAX: c1da48ec EBX: 00000000 ECX: c1da48e0 EDX: 00000ca8
[138088.987870] ESI: 8b804067 EDI: 00000000 EBP: e20c7ca8 ESP: e226be04
[138088.987873]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[138088.987883] CR0: 80050033 CR2: 940ef020 CR3: 2211f000 CR4: 00000660
[138088.987891] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138088.987896] DR6: ffff0ff0 DR7: 00000400
[138088.987901]  [<c016d88d>] unmap_vmas+0x43d/0xae0
[138088.987922]  [<c011959c>] kmap_atomic+0x1c/0x30
[138088.987941]  [<c01192fd>] kunmap_atomic+0x3d/0x60
[138088.987957]  [<c0173ee8>] vma_adjust+0x1c8/0x440
[138088.987967]  [<c0173765>] unmap_region+0x95/0x120
[138088.987975]  [<c0174387>] do_munmap+0x147/0x1f0
[138088.987983]  [<c0174c90>] mmap_region+0x70/0x450
[138088.987991]  [<c01db3b7>] security_file_mmap+0x27/0x30
[138088.988001]  [<c0175472>] do_mmap_pgoff+0x312/0x330
[138088.988008]  [<c010a02b>] sys_mmap2+0xbb/0xd0
[138088.988016]  [<c0105832>] syscall_call+0x7/0xb
[138088.988023]  [<c0320000>] svc_accept+0x150/0x410
[138088.988032]  =======================


[66916.451144] BUG: soft lockup - CPU#0 stuck for 11s! [java:2758]
[66928.193453] BUG: soft lockup - CPU#1 stuck for 11s! [java:3419]


[336990.703192] BUG: soft lockup - CPU#1 stuck for 11s! [ps:32586]
[336990.703206]
[336990.703214] Pid: 32586, comm: ps Tainted: G      D (2.6.24-26-xen #1)
[336990.703221] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[336990.703235] EIP is at _spin_lock+0x7/0x10
[336990.703241] EAX: c1dbc72c EBX: 00000000 ECX: c1dbc720 EDX: 00000007
[336990.703247] ESI: 57b51067 EDI: 00000001 EBP: e2cb93c8 ESP: e2033e4c
[336990.703253]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[336990.703266] CR0: 80050033 CR2: 08079004 CR3: 23651000 CR4: 00000660
[336990.703275] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.703282] DR6: ffff0ff0 DR7: 00000400
[336990.703288]  [<c0171646>] handle_mm_fault+0xae6/0x1360
[336990.703307]  [<c020e057>] rb_insert_color+0x77/0xe0
[336990.703325]  [<c032a27e>] do_page_fault+0x35e/0xe70
[336990.703337]  [<c01745d4>] vma_merge+0x144/0x1d0
[336990.703349]  [<c0174b75>] do_brk+0x195/0x240
[336990.703362]  [<c0175126>] sys_brk+0xb6/0xf0
[336990.703374]  [<c0329f20>] do_page_fault+0x0/0xe70
[336990.703384]  [<c0328bc5>] error_code+0x35/0x40
[336990.703396]  =======================
[337005.938292] BUG: soft lockup - CPU#2 stuck for 11s! [zmlocalconfig:11371]
[337005.938306]
[337005.938312] Pid: 11371, comm: zmlocalconfig Tainted: G      D (2.6.24-26-xen #1)
[337005.938318] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 2
[337005.938330] EIP is at _spin_lock+0x7/0x10
[337005.938335] EAX: ec64a870 EBX: ec64a870 ECX: 00000002 EDX: ec64a871
[337005.938339] ESI: 00000000 EDI: c03fe800 EBP: c1261e38 ESP: c1261d7c
[337005.938343]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[337005.938357] CR0: 8005003b CR2: 08128000 CR3: 25d8e000 CR4: 00000660
[337005.938364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[337005.938370] DR6: ffff0ff0 DR7: 00000400
[337005.938376]  [<c01771f0>] page_lock_anon_vma+0x20/0x30
[337005.938391]  [<c01786fd>] page_referenced+0x8d/0x130
[337005.938401]  [<c01671cf>] shrink_active_list+0x18f/0x5c0
[337005.938411]  [<c0164286>] get_dirty_limits+0x16/0x200
[337005.938421]  [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[337005.938435]  [<c016826d>] shrink_zone+0xdd/0x100
[337005.938444]  [<c0168d72>] try_to_free_pages+0x152/0x250
[337005.938453]  [<c0162fcb>] __alloc_pages+0x14b/0x390
[337005.938463]  [<c01855c5>] do_sync_read+0xd5/0x120
[337005.938475]  [<c0163247>] __get_free_pages+0x37/0x50
[337005.938483]  [<c0124496>] copy_process+0xa6/0x1210
[337005.938493]  [<c0197c34>] d_alloc+0x114/0x1a0
[337005.938503]  [<c0125830>] do_fork+0x40/0x260
[337005.938511]  [<c0210f00>] copy_to_user+0x30/0x60
[337005.938523]  [<c0103226>] sys_clone+0x36/0x40
[337005.938530]  [<c0105832>] syscall_call+0x7/0xb
[337005.938542]  =======================
[336990.803889] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:103]
[336990.803907]
[336990.803915] Pid: 103, comm: kswapd0 Tainted: G      D (2.6.24-26-xen #1)
[336990.803922] EIP: 0061:[<c03286ea>] EFLAGS: 00000286 CPU: 0
[336990.803940] EIP is at _spin_lock+0xa/0x10
[336990.803948] EAX: c1dbc86c EBX: 00000000 ECX: 22cc3000 EDX: 00000000
[336990.803955] ESI: 57b47067 EDI: c1dbc86c EBP: 00000ff0 ESP: ed725dec
[336990.803961]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[336990.803976] CR0: 8005003b CR2: b791e6d9 CR3: 23e3b000 CR4: 00000660
[336990.803986] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.803992] DR6: ffff0ff0 DR7: 00000400
[336990.804001]  [<c01773cb>] page_check_address+0x1cb/0x3c0
[336990.804026]  [<c017762e>] page_referenced_one+0x6e/0x190
[336990.804039]  [<c017875c>] page_referenced+0xec/0x130
[336990.804049]  [<c01671cf>] shrink_active_list+0x18f/0x5c0
[336990.804064]  [<c0210556>] memmove+0x36/0x40
[336990.804079]  [<c0164286>] get_dirty_limits+0x16/0x200
[336990.804089]  [<c0139857>] call_rcu+0x97/0xa0
[336990.804102]  [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[336990.804120]  [<c016826d>] shrink_zone+0xdd/0x100
[336990.804132]  [<c01688cc>] kswapd+0x44c/0x490
[336990.804145]  [<c013bb00>] autoremove_wake_function+0x0/0x40
[336990.804160]  [<c011e270>] complete+0x40/0x60
[336990.804172]  [<c0168480>] kswapd+0x0/0x490
[336990.804183]  [<c013b842>] kthread+0x42/0x70
[336990.804194]  [<c013b800>] kthread+0x0/0x70
[336990.804206]  [<c0105bb7>] kernel_thread_helper+0x7/0x10
[336990.804218]  =======================
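
When collecting messages like the ones above, it can help to tally which processes and CPUs appear in the "BUG: soft lockup" header lines. The following is a minimal sketch, assuming the log format seen in this thread. The function name parse_lockups and the sample list are illustrative only, and the sample lines are copied from the messages above.

```python
import re
from collections import Counter

# Matches header lines such as:
# [138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def parse_lockups(lines):
    """Yield one dict per soft-lockup header line found in `lines`."""
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            yield {
                "cpu": int(match.group("cpu")),
                "secs": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            }


# Sample header lines taken from the messages in this thread.
sample = [
    "[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]",
    "[138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]",
    "[66916.451144] BUG: soft lockup - CPU#0 stuck for 11s! [java:2758]",
    "[66928.193453] BUG: soft lockup - CPU#1 stuck for 11s! [java:3419]",
    "[336990.803889] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:103]",
]

events = list(parse_lockups(sample))
by_comm = Counter(event["comm"] for event in events)
by_cpu = Counter(event["cpu"] for event in events)

print("lockups per command:", dict(by_comm))
print("lockups per CPU:", dict(by_cpu))
```

To run it against a real log, pass an open file object to parse_lockups in place of the sample list; lines that are not lockup headers are skipped.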
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users



--
Best Regards,
alex.faq8@xxxxxxxxx

