xen-users
Re: [Xen-users] BUG: soft lockup
On Sun, Jan 31, 2010 at 11:23:36AM -0500, Dana Rawding wrote:
> Hi all,
>
> I've been experiencing a rash of CPU lockups on a number of domU's recently.
> It's been happening on two different servers. About a year ago I had this
> problem every once in a while but it was not frequent. I was running Ubuntu
> with Xen 3.1 and 2.6.24-18 back then. I'm now running Xen 3.3 and 2.6.24-26.
>
>
> What I have noticed is that just prior to the lockups the domU's had high cpu
> loads. The domU that I have the most problems with is a Zimbra server. My
> guess is that a rash of spam comes through and cpu loads get high, then the
> cpu's lock up. Originally I had it running with 1 cpu but have since upped
> it to 2 and then 3 cpu's.
>
> I have been collecting the lockup messages and have posted a few below. Any
> ideas? Recommendations?
>
Please check this wiki page:
http://wiki.xensource.com/xenwiki/XenBestPractices
Are all those OK on your setup?
After those I'd upgrade the dom0 kernel, since Ubuntu's 2.6.24 is known to be
buggy.
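
To compare the collected messages at a glance, a small script can tally which
tasks were stuck and where the EIP was. This is only a minimal sketch in
Python, assuming the messages are saved as plain text in the format quoted
below (with or without the "> " quote prefix). The names parse_lockups and
summarize are made up for illustration and are not part of any Xen or kernel
tool.

```python
import re
from collections import Counter

# Matches e.g. "[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]"
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
# Matches e.g. "EIP is at _spin_lock+0x7/0x10"
EIP_RE = re.compile(r"EIP is at (?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_lockups(text):
    """Return one dict per 'BUG: soft lockup' line found in text.

    The 'eip' field is filled from the first 'EIP is at' line that follows
    the lockup line, and stays None if the register dump was not captured.
    """
    events = []
    for line in text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "ts": float(m.group("ts")),
                "cpu": int(m.group("cpu")),
                "secs": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "eip": None,
            })
            continue
        m = EIP_RE.search(line)
        if m and events and events[-1]["eip"] is None:
            events[-1]["eip"] = m.group("func")
    return events


def summarize(events):
    """Count lockups per task name and per EIP function."""
    by_comm = Counter(e["comm"] for e in events)
    by_eip = Counter(e["eip"] for e in events if e["eip"])
    return by_comm, by_eip
```

Feeding it the saved log, for example summarize(parse_lockups(open("lockups.txt").read()))
with a hypothetical file name, gives the per-task and per-function counts. In
the full traces quoted below, every EIP is in _spin_lock.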
-- Pasi
> Thanks,
> Dana
>
>
> [138077.172283] =======================
> [138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
> [138075.147411]
> [138075.147419] Pid: 97, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
> [138075.147426] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 0
> [138075.147441] EIP is at _spin_lock+0x7/0x10
> [138075.147447] EAX: c1da48ec EBX: 00000000 ECX: 220c7000 EDX: 00000000
> [138075.147453] ESI: 8b804067 EDI: c1da48ec EBP: 00000f28 ESP: ed707dec
> [138075.147459] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> [138075.147471] CR0: 8005003b CR2: 080f0010 CR3: 2213b000 CR4: 00000660
> [138075.147482] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [138075.147488] DR6: ffff0ff0 DR7: 00000400
> [138075.147495] [<c01773cb>] page_check_address+0x1cb/0x3c0
> [138075.147514] [<c0119868>] xen_invlpg_mask+0x38/0x40
> [138075.147529] [<c017762e>] page_referenced_one+0x6e/0x190
> [138075.147541] [<c017875c>] page_referenced+0xec/0x130
> [138075.147552] [<c01671cf>] shrink_active_list+0x18f/0x5c0
> [138075.147567] [<c016826d>] shrink_zone+0xdd/0x100
> [138075.147578] [<c01688cc>] kswapd+0x44c/0x490
> [138075.147589] [<c013bb00>] autoremove_wake_function+0x0/0x40
> [138075.147603] [<c011e270>] complete+0x40/0x60
> [138075.147614] [<c0168480>] kswapd+0x0/0x490
> [138075.147625] [<c013b842>] kthread+0x42/0x70
> [138075.147635] [<c013b800>] kthread+0x0/0x70
> [138075.147646] [<c0105bb7>] kernel_thread_helper+0x7/0x10
> [138075.147658] =======================
> [138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]
> [138088.987841]
> [138088.987846] Pid: 23215, comm: java Tainted: G D (2.6.24-26-xen #1)
> [138088.987850] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
> [138088.987862] EIP is at _spin_lock+0x7/0x10
> [138088.987866] EAX: c1da48ec EBX: 00000000 ECX: c1da48e0 EDX: 00000ca8
> [138088.987870] ESI: 8b804067 EDI: 00000000 EBP: e20c7ca8 ESP: e226be04
> [138088.987873] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [138088.987883] CR0: 80050033 CR2: 940ef020 CR3: 2211f000 CR4: 00000660
> [138088.987891] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [138088.987896] DR6: ffff0ff0 DR7: 00000400
> [138088.987901] [<c016d88d>] unmap_vmas+0x43d/0xae0
> [138088.987922] [<c011959c>] kmap_atomic+0x1c/0x30
> [138088.987941] [<c01192fd>] kunmap_atomic+0x3d/0x60
> [138088.987957] [<c0173ee8>] vma_adjust+0x1c8/0x440
> [138088.987967] [<c0173765>] unmap_region+0x95/0x120
> [138088.987975] [<c0174387>] do_munmap+0x147/0x1f0
> [138088.987983] [<c0174c90>] mmap_region+0x70/0x450
> [138088.987991] [<c01db3b7>] security_file_mmap+0x27/0x30
> [138088.988001] [<c0175472>] do_mmap_pgoff+0x312/0x330
> [138088.988008] [<c010a02b>] sys_mmap2+0xbb/0xd0
> [138088.988016] [<c0105832>] syscall_call+0x7/0xb
> [138088.988023] [<c0320000>] svc_accept+0x150/0x410
> [138088.988032] =======================
>
>
> [66916.451144] BUG: soft lockup - CPU#0 stuck for 11s! [java:2758]
> [66928.193453] BUG: soft lockup - CPU#1 stuck for 11s! [java:3419]
>
>
> [336990.703192] BUG: soft lockup - CPU#1 stuck for 11s! [ps:32586]
> [336990.703206]
> [336990.703214] Pid: 32586, comm: ps Tainted: G D (2.6.24-26-xen #1)
> [336990.703221] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
> [336990.703235] EIP is at _spin_lock+0x7/0x10
> [336990.703241] EAX: c1dbc72c EBX: 00000000 ECX: c1dbc720 EDX: 00000007
> [336990.703247] ESI: 57b51067 EDI: 00000001 EBP: e2cb93c8 ESP: e2033e4c
> [336990.703253] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [336990.703266] CR0: 80050033 CR2: 08079004 CR3: 23651000 CR4: 00000660
> [336990.703275] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [336990.703282] DR6: ffff0ff0 DR7: 00000400
> [336990.703288] [<c0171646>] handle_mm_fault+0xae6/0x1360
> [336990.703307] [<c020e057>] rb_insert_color+0x77/0xe0
> [336990.703325] [<c032a27e>] do_page_fault+0x35e/0xe70
> [336990.703337] [<c01745d4>] vma_merge+0x144/0x1d0
> [336990.703349] [<c0174b75>] do_brk+0x195/0x240
> [336990.703362] [<c0175126>] sys_brk+0xb6/0xf0
> [336990.703374] [<c0329f20>] do_page_fault+0x0/0xe70
> [336990.703384] [<c0328bc5>] error_code+0x35/0x40
> [336990.703396] =======================
> [337005.938292] BUG: soft lockup - CPU#2 stuck for 11s! [zmlocalconfig:11371]
> [337005.938306]
> [337005.938312] Pid: 11371, comm: zmlocalconfig Tainted: G D (2.6.24-26-xen #1)
> [337005.938318] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 2
> [337005.938330] EIP is at _spin_lock+0x7/0x10
> [337005.938335] EAX: ec64a870 EBX: ec64a870 ECX: 00000002 EDX: ec64a871
> [337005.938339] ESI: 00000000 EDI: c03fe800 EBP: c1261e38 ESP: c1261d7c
> [337005.938343] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [337005.938357] CR0: 8005003b CR2: 08128000 CR3: 25d8e000 CR4: 00000660
> [337005.938364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [337005.938370] DR6: ffff0ff0 DR7: 00000400
> [337005.938376] [<c01771f0>] page_lock_anon_vma+0x20/0x30
> [337005.938391] [<c01786fd>] page_referenced+0x8d/0x130
> [337005.938401] [<c01671cf>] shrink_active_list+0x18f/0x5c0
> [337005.938411] [<c0164286>] get_dirty_limits+0x16/0x200
> [337005.938421] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
> [337005.938435] [<c016826d>] shrink_zone+0xdd/0x100
> [337005.938444] [<c0168d72>] try_to_free_pages+0x152/0x250
> [337005.938453] [<c0162fcb>] __alloc_pages+0x14b/0x390
> [337005.938463] [<c01855c5>] do_sync_read+0xd5/0x120
> [337005.938475] [<c0163247>] __get_free_pages+0x37/0x50
> [337005.938483] [<c0124496>] copy_process+0xa6/0x1210
> [337005.938493] [<c0197c34>] d_alloc+0x114/0x1a0
> [337005.938503] [<c0125830>] do_fork+0x40/0x260
> [337005.938511] [<c0210f00>] copy_to_user+0x30/0x60
> [337005.938523] [<c0103226>] sys_clone+0x36/0x40
> [337005.938530] [<c0105832>] syscall_call+0x7/0xb
> [337005.938542] =======================
> [336990.803889] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:103]
> [336990.803907]
> [336990.803915] Pid: 103, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
> [336990.803922] EIP: 0061:[<c03286ea>] EFLAGS: 00000286 CPU: 0
> [336990.803940] EIP is at _spin_lock+0xa/0x10
> [336990.803948] EAX: c1dbc86c EBX: 00000000 ECX: 22cc3000 EDX: 00000000
> [336990.803955] ESI: 57b47067 EDI: c1dbc86c EBP: 00000ff0 ESP: ed725dec
> [336990.803961] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> [336990.803976] CR0: 8005003b CR2: b791e6d9 CR3: 23e3b000 CR4: 00000660
> [336990.803986] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [336990.803992] DR6: ffff0ff0 DR7: 00000400
> [336990.804001] [<c01773cb>] page_check_address+0x1cb/0x3c0
> [336990.804026] [<c017762e>] page_referenced_one+0x6e/0x190
> [336990.804039] [<c017875c>] page_referenced+0xec/0x130
> [336990.804049] [<c01671cf>] shrink_active_list+0x18f/0x5c0
> [336990.804064] [<c0210556>] memmove+0x36/0x40
> [336990.804079] [<c0164286>] get_dirty_limits+0x16/0x200
> [336990.804089] [<c0139857>] call_rcu+0x97/0xa0
> [336990.804102] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
> [336990.804120] [<c016826d>] shrink_zone+0xdd/0x100
> [336990.804132] [<c01688cc>] kswapd+0x44c/0x490
> [336990.804145] [<c013bb00>] autoremove_wake_function+0x0/0x40
> [336990.804160] [<c011e270>] complete+0x40/0x60
> [336990.804172] [<c0168480>] kswapd+0x0/0x490
> [336990.804183] [<c013b842>] kthread+0x42/0x70
> [336990.804194] [<c013b800>] kthread+0x0/0x70
> [336990.804206] [<c0105bb7>] kernel_thread_helper+0x7/0x10
> [336990.804218] =======================
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users