xen-users

Re: [Xen-users] Soft lockup with kernel 2.6.24

To: xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>, "Christopher Isip" <cmisip@xxxxxxxxx>
Subject: Re: [Xen-users] Soft lockup with kernel 2.6.24
From: "Jean-Michel Bonnefond" <pompon2@xxxxxxxxx>
Date: Wed, 25 Jun 2008 09:54:44 +0200

Thanks Christopher.

I finally downgraded my dom0 to Linux 2.6.18, which is no longer provided with Ubuntu 8.04 :-( , and it seems to be more stable.
I found a few people online with the same problem, but no clues or answers. It seems to be a kernel bug linked to some specific hardware...

However, if someone is interested, I could reproduce the bug and provide more information.

Jean-Michel.
 

2008/6/24 Christopher Isip <cmisip@xxxxxxxxx>:


On Tue, Jun 24, 2008 at 5:03 AM, Jean-Michel Bonnefond <pompon2@xxxxxxxxx> wrote:
Hello,

I'm facing a soft lockup kernel bug on a dom0 that results in the server freezing.
The soft lockup consistently appears between 10 and 20 hours after the server reboots.

I'm using Xen 3.2.0 with kernel 2.6.24-18-xen on Ubuntu Server 8.04.
The underlying server is an HP ProLiant DL385 G2 with a dual-core AMD Opteron 2214 HE.

Here are some examples of the console logs I get when it happens:

[51694.282459] BUG: soft lockup - CPU#0 stuck for 11s! [nrpe:8707]
[51694.282469]
[51694.282470] Pid: 8707, comm: nrpe Not tainted (2.6.24-18-xen #1)
[51694.282472] EIP: 0061:[ipv6:_spin_lock+0xa/0x10] EFLAGS: 00200282 CPU: 0
[51694.282476] EIP is at _spin_lock+0xa/0x10
[51694.282477] EAX: c1b6286c EBX: 00000000 ECX: c1b62860 EDX: 00000598
[51694.282501] ESI: 17743067 EDI: 00000000 EBP: c0477158 ESP: e7fb1ddc
[51694.282503]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[51694.282506] CR0: 80050033 CR2: b7cb3e70 CR3: 2c1ed000 CR4: 00000660
[51694.282508] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[51694.282511] DR6: ffff0ff0 DR7: 00000400
[51694.282512]  [__do_fault+0x3b8/0x6b0] __do_fault+0x3b8/0x6b0
[51694.282591]  [handle_mm_fault+0x249/0x1350] handle_mm_fault+0x249/0x1350
[51694.282629]  [sock_aio_read+0x120/0x130] sock_aio_read+0x120/0x130
[51694.282740]  [do_page_fault+0x366/0xe90] do_page_fault+0x366/0xe90
[51694.282755]  [__do_softirq+0x92/0x130] __do_softirq+0x92/0x130
[51694.282787]  [vfs_read+0x11c/0x170] vfs_read+0x11c/0x170
[51694.282802]  [sys_read+0x41/0x70] sys_read+0x41/0x70
[51694.282813]  [do_page_fault+0x0/0xe90] do_page_fault+0x0/0xe90
[51694.282819]  [error_code+0x35/0x40] error_code+0x35/0x40
[51694.282856]  =======================

[99821.430516] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:237]
[99821.430527]
[99821.430531] Pid: 237, comm: kswapd0 Tainted: G      D (2.6.24-18-xen #1)
[99821.430534] EIP: 0061:[<c032767a>] EFLAGS: 00000286 CPU: 0
[99821.430563] EIP is at _spin_lock+0xa/0x10
[99821.430565] EAX: c1af8ecc EBX: 00000000 ECX: 24a76000 EDX: 00000000
[99821.430569] ESI: 1aa76067 EDI: c1af8ecc EBP: 000002c0 ESP: ed617dec
[99821.430572]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[99821.430578] CR0: 8005003b CR2: b7fc5000 CR3: 2c44a000 CR4: 00000660
[99821.430582] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[99821.430585] DR6: ffff0ff0 DR7: 00000400
[99821.430588]  [<c017723b>] page_check_address+0x1cb/0x3c0
[99821.430644]  [<c0119858>] xen_invlpg_mask+0x38/0x40
[99821.430682]  [<c017749e>] page_referenced_one+0x6e/0x190
[99821.430732]  [<c01785cc>] page_referenced+0xec/0x130
[99821.430773]  [<c016713f>] shrink_active_list+0x18f/0x5c0
[99821.430941]  [<c01681dd>] shrink_zone+0xdd/0x100
[99821.430980]  [<c016883c>] kswapd+0x44c/0x490
[99821.431068]  [<c013bb90>] autoremove_wake_function+0x0/0x40
[99821.431105]  [<c011e260>] complete+0x40/0x60
[99821.431137]  [<c01683f0>] kswapd+0x0/0x490
[99821.431144]  [<c013b8d2>] kthread+0x42/0x70
[99821.431148]  [<c013b890>] kthread+0x0/0x70
[99821.431177]  [<c0105bb7>] kernel_thread_helper+0x7/0x10
[99821.431215]  =======================
[99926.949316] BUG: soft lockup - CPU#1 stuck for 11s! [ps:8318]
[99926.949322]
[99926.949324] Pid: 8318, comm: ps Tainted: G      D (2.6.24-18-xen #1)
[99926.949327] EIP: 0061:[<c0327677>] EFLAGS: 00000286 CPU: 1
[99926.949331] EIP is at _spin_lock+0x7/0x10
[99926.949334] EAX: c1af8ecc EBX: 00000000 ECX: c1af8ec0 EDX: 00000248
[99926.949337] ESI: 1aa76067 EDI: 00000000 EBP: c0477158 ESP: e4b15ddc
[99926.949356]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[99926.949360] CR0: 80050033 CR2: 080492e0 CR3: 24941000 CR4: 00000660
[99926.949364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[99926.949368] DR6: ffff0ff0 DR7: 00000400
[99926.949370]  [<c016bd18>] __do_fault+0x3b8/0x6b0
[99926.949420]  [<c017885d>] anon_vma_prepare+0x1d/0xe0
[99926.949445]  [<c0170c69>] handle_mm_fault+0x249/0x1350
[99926.949510]  [<c0162456>] __pagevec_free+0x26/0x30
[99926.949561]  [<c0329216>] do_page_fault+0x366/0xe90
[99926.949583]  [<c01165fb>] check_pgt_cache+0x1b/0x20
[99926.949597]  [<c0173667>] unmap_region+0x107/0x120
[99926.949622]  [<c0174250>] do_munmap+0x180/0x1f0
[99926.949653]  [<c0328eb0>] do_page_fault+0x0/0xe90
[99926.949660]  [<c0327b55>] error_code+0x35/0x40
[99926.949702]  =======================
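
For anyone comparing traces like the ones above, a small script can pull out the header line of each soft-lockup report. This is only a sketch: the regular expression is an assumption based on the three "BUG: soft lockup" lines in this message, the function name `parse_soft_lockups` is made up for illustration, and the bracketed timestamp is treated as seconds since boot.

```python
import re

# Matches header lines such as:
#   [51694.282459] BUG: soft lockup - CPU#0 stuck for 11s! [nrpe:8707]
# The pattern is inferred from the logs in this thread, not from a kernel spec.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft-lockup header line found in log_text."""
    events = []
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "uptime_s": float(m.group("ts")),  # assumed: seconds since boot
                "cpu": int(m.group("cpu")),
                "stuck_s": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


# Header lines copied from the console output quoted in this message.
SAMPLE = """\
[51694.282459] BUG: soft lockup - CPU#0 stuck for 11s! [nrpe:8707]
[99821.430516] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:237]
[99926.949316] BUG: soft lockup - CPU#1 stuck for 11s! [ps:8318]
"""

if __name__ == "__main__":
    for ev in parse_soft_lockups(SAMPLE):
        hours = ev["uptime_s"] / 3600
        print(f"{hours:5.1f} h  CPU#{ev['cpu']}  {ev['comm']} (pid {ev['pid']})")
```

Feeding it a saved console log instead of `SAMPLE` would list which process and CPU each report names, which may help when comparing notes with other affected users.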


Have you already heard of such a problem?

Thanks,
Jean-Michel.


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

I saw this error in one of my Ubuntu 8.04 domUs the other day when the free memory in dom0 approached a very low value.  I rebooted the system and it seems to be fine.  But then again, I might see the problem again in a week's time.  I also use nrpe.

Chris
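
Since the reply above mentions the error showing up when free memory in dom0 was very low, one way to check for that correlation is to log `MemFree` from `/proc/meminfo` in dom0 at regular intervals. The following is a rough sketch only: the helper names and the 32 MB threshold are arbitrary assumptions for illustration, and nothing in this thread confirms that low memory actually causes the lockup.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value_in_kB} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def is_memory_low(info, threshold_kb=32 * 1024):
    """True when MemFree is below threshold_kb (hypothetical 32 MB default)."""
    return info.get("MemFree", 0) < threshold_kb


# Made-up sample used when /proc/meminfo is not available (non-Linux hosts).
SAMPLE = "MemTotal:  524288 kB\nMemFree:  12000 kB\n"

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    info = parse_meminfo(text)
    print("MemFree kB:", info.get("MemFree"), "low:", is_memory_low(info))
```

Run from cron every few minutes with the output appended to a file, this would show whether free memory was near the threshold shortly before a lockup was reported.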

