To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] xen 4.0.2rc3/kernel 2.6.32.36: BUG: unable to handle kernel paging request
From: Gerd Jakobovitsch <gerd@xxxxxxxxxxx>
Date: Fri, 15 Apr 2011 11:21:40 -0300
Delivery-date: Fri, 15 Apr 2011 07:30:41 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110414131543.GE5548@xxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DA60F55.4000604@xxxxxxxxxxx> <20110414131543.GE5548@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.14) Gecko/20110223 Lightning/1.0b2 Thunderbird/3.1.8

I am reporting a new bug that appeared during stress tests. The scenario is the same as the one reported below, with the patches applied:

On 04/14/2011 10:15 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2011 at 06:02:13PM -0300, Gerd Jakobovitsch wrote:
>> I'm trying to run several VMs (linux hvm, with tapdisk:aio disks at
>> a storage over nfs) on a CentOS system, using the up-to-date version
>> of xen 4.0 / kernel pvops 2.6.32.x stable. With a configuration
>> without (most of) debug activated, I can start several instances -
>> I'm running 7 of them - but shortly afterwards the system stops
>> responding. I can't find any information on this.
> First time I see it.
>> Activating several debug configuration items, among them
>> DEBUG_PAGEALLOC, I get an exception as soon as I try to start up a
>> VM. The system reboots.

With the debug options still enabled, I am running 42 VMs, a mix of Linux (several distros) and Windows, most of them running CPU and disk benchmarks. After roughly 15 hours, a BUG message appeared in dmesg. It broke the xm commands, apparently because of one specific VM, but the xl commands still work. The VMs are still running.

# xm list
Error: (5, 'Input/output error, while reading /local/domain/33/console/vnc-port')
Usage: xm list [options] [Domain, ...]

After killing the VM that reported the error, the xm commands work again.

The BUG message in dmesg:

[66007.135552] BUG: unable to handle kernel paging request at ffff8800004ca458
[66007.135567] IP: [<ffffffff8100d4ae>] xen_set_pte+0x3e/0x4b
[66007.135580] PGD 1002067 PUD 1006067 PMD 2d78067 PTE 100000004ca025
[66007.135675] Oops: 0003 [#1] SMP DEBUG_PAGEALLOC
[66007.135686] last sysfs file: /sys/class/net/virtbr/bridge/topology_change_detected
[66007.135693] CPU 4
[66007.135698] Modules linked in: arptable_filter arp_tables bridge stp bonding bnx2i libiscsi scsi_transport_iscsi cnic uio bnx2 megaraid_sas
[66007.135729] Pid: 683, comm: pageattr-test Not tainted 2.6.32.36 #7 PowerEdge M610
[66007.135735] RIP: e030:[<ffffffff8100d4ae>] [<ffffffff8100d4ae>] xen_set_pte+0x3e/0x4b
[66007.135746] RSP: e02b:ffff88007c8edbb0  EFLAGS: 00010202
[66007.135751] RAX: 0000000000e32cb6 RBX: 0000000000e32cb6 RCX: 0000000000000001
[66007.135757] RDX: 0000000000000000 RSI: 8010000800569267 RDI: ffff8800004ca458
[66007.135764] RBP: ffff88007c8edbd0 R08: 0000000000000001 R09: 0000000000000000
[66007.135770] R10: ffffffff818385f8 R11: ffffffff818385e0 R12: 8010000800569267
[66007.135776] R13: ffff8800004ca458 R14: 8010000416569067 R15: 8010000800569267
[66007.135786] FS: 00007f0eeede66e0(0000) GS:ffff88002813f000(0000) knlGS:0000000000000000
[66007.135792] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[66007.135797] CR2: ffff8800004ca458 CR3: 000000007b663000 CR4: 0000000000002660
[66007.135804] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[66007.135810] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[66007.135816] Process pageattr-test (pid: 683, threadinfo ffff88007c8ec000, task ffff88007e4ce480)
[66007.135822] Stack:
[66007.135825] 0000000000000000 8010000004569067 0000000000004569 ffff88007c8edd20
[66007.135835] <0> ffff88007c8edbe0 ffffffff81034740 ffff88007c8edbf0 ffffffff8103474d
[66007.135848] <0> ffff88007c8edcf0 ffffffff81034e77 000000017c8edc40 ffffffff818385e0
[66007.135860] Call Trace:
[66007.135868]  [<ffffffff81034740>] set_pte+0x17/0x1b
[66007.135875]  [<ffffffff8103474d>] set_pte_atomic+0x9/0xb
[66007.135882]  [<ffffffff81034e77>] __change_page_attr_set_clr+0x186/0x82d
[66007.135936]  [<ffffffff8124f4a0>] ? _raw_spin_unlock+0xab/0xb1
[66007.135951]  [<ffffffff8157641f>] ? _spin_unlock+0x26/0x2a
[66007.135961]  [<ffffffff810e587d>] ? vm_unmap_aliases+0x151/0x160
[66007.135969]  [<ffffffff81035695>] change_page_attr_set_clr+0x177/0x360
[66007.135976]  [<ffffffff8103597a>] change_page_attr_set+0x27/0x29
[66007.135983]  [<ffffffff810348e2>] ? pte_flags+0x9/0x18
[66007.135990]  [<ffffffff81035c01>] do_pageattr_test+0x285/0x4b1
[66007.135998]  [<ffffffff8103597c>] ? do_pageattr_test+0x0/0x4b1
[66007.136097]  [<ffffffff8106a9c3>] kthread+0x69/0x71
[66007.136105]  [<ffffffff81013daa>] child_rip+0xa/0x20
[66007.136112]  [<ffffffff81012ee6>] ? int_ret_from_sys_call+0x7/0x1b
[66007.136119]  [<ffffffff81013726>] ? retint_restore_args+0x5/0x6
[66007.136127]  [<ffffffff81013da0>] ? child_rip+0x0/0x20
[66007.136131] Code: e8 3c ff ff ff ff 05 b6 5c 94 00 e8 31 ff ff ff 8b 1d b3 5c 94 00 e8 a2 23 02 00 ff c8 0f 94 c0 0f b6 c0 01 d8 89 05 9e 5c 94 00 <4d> 89 65 00 41 59 5b 41 5c 41 5d c9 c3 55 48 89 e5 53 89 fb 48
[66007.136273] RIP  [<ffffffff8100d4ae>] xen_set_pte+0x3e/0x4b
[66007.136281]  RSP <ffff88007c8edbb0>
[66007.136285] CR2: ffff8800004ca458
[66007.136574] ---[ end trace 4e200a271895cc90 ]---
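
For anyone reading the trace above: the value after "Oops:" is the x86 page-fault error code, and the low bits of the PTE value on the "PGD ... PTE ..." line are the page flags. The following is a minimal Python sketch for decoding both. The helper names decode_error_code and decode_pte_flags are hypothetical, and the tables cover only the commonly documented low bits of the x86-64 definitions.

```python
# Decode the x86 page-fault error code from an "Oops: NNNN" line and the
# low flag bits of a PTE value from the "PGD ... PTE ..." line.
# Helper names are illustrative only; bit meanings follow x86-64.

# bit -> (meaning when set, meaning when clear or None to stay silent)
ERROR_CODE_BITS = {
    0: ("protection violation (page present)", "page not present"),
    1: ("write access", "read access"),
    2: ("user mode", "kernel mode"),
    3: ("reserved bit set in page table", None),
    4: ("instruction fetch", None),
}

# bit -> flag name for the low bits of a page table entry
PTE_FLAG_BITS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    7: "PSE/PAT",
    8: "GLOBAL",
}


def decode_error_code(code):
    """Return a list of human-readable meanings for a page-fault error code."""
    out = []
    for bit, (set_text, clear_text) in ERROR_CODE_BITS.items():
        if code & (1 << bit):
            out.append(set_text)
        elif clear_text is not None:
            out.append(clear_text)
    return out


def decode_pte_flags(pte):
    """Return the names of the low flag bits that are set in a PTE value."""
    return [name for bit, name in PTE_FLAG_BITS.items() if pte & (1 << bit)]


if __name__ == "__main__":
    # Values copied from the trace: "Oops: 0003" and "PTE 100000004ca025".
    print(decode_error_code(0x0003))
    print(decode_pte_flags(0x100000004CA025))
```

For the values in this trace, the error code decodes to a kernel-mode write to a present page, and the PTE flags do not include RW. That is consistent with a write to a read-only mapping, but it is only a reading of the dump, not a diagnosis of the cause.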

Attached are the errors logged in xm dmesg and xend.log.

Attachment: bug_paging_xend-log.txt
Description: Text document

Attachment: bug_paging_xm_dmesg.txt
Description: Text document

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel