WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel


To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] Tapdisk failures / kernel general protection fault at xen 4.0.2rc3 / kernel pvops 2.6.32.36
From: Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Date: Thu, 14 Apr 2011 09:38:09 -0700
Cc: Gerd Jakobovitsch <gerd@xxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 14 Apr 2011 09:38:49 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110414131543.GE5548@xxxxxxxxxxxx>
References: <4DA60F55.4000604@xxxxxxxxxxx> <20110414131543.GE5548@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Thu, 2011-04-14 at 09:15 -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2011 at 06:02:13PM -0300, Gerd Jakobovitsch wrote:
> > I'm trying to run several VMs (linux hvm, with tapdisk:aio disks at
> > a storage over nfs) on a CentOS system, using the up-to-date version
> > of xen 4.0 / kernel pvops 2.6.32.x stable. With a configuration
> > without (most of) debug activated, I can start several instances -
> > I'm running 7 of them - but shortly afterwards the system stops
> > responding. I can't find any information on this.
> 
> This is the first time I've seen it.
> > 
> > Activating several debug configuration items, among them
> > DEBUG_PAGEALLOC, I get an exception as soon as I try to start up a
> > VM. The system reboots.
> 
> Oooh, and is the log below from that situation?
> 
> Daniel, any thoughts?

---
          Unmap pages from the kernel linear mapping after free_pages().
          This results in a large slowdown, but helps to find certain types
          of memory corruption.

Stunning. Our I/O page allocator is a sort of twisted mempool. Unless
the allocation is explicitly modified in sysfs/, everything should stay
pinned. We might just be tripping over the debug code alone, but I
haven't figured it out yet.

Daniel
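
One detail in the quoted oops may help narrow this down. RAX holds
6b6b6b6b6b6b6b6b, and 0x6b is the POISON_FREE byte that the slab debug
code writes into freed objects (include/linux/poison.h). The bytes marked
<48> 8b b8 90 04 00 00 in the Code: line appear to decode to
mov 0x490(%rax),%rdi, preceded by mov 0x40(%r12),%rax, so the fault looks
like a pointer loaded from an already freed object and then dereferenced.
That reading is only a hypothesis. The Python sketch below checks the two
facts it rests on: that RAX is a repeated poison byte, and that RAX + 0x490
is a non-canonical address (which raises a general protection fault rather
than a page fault). The helper names are made up for illustration, and the
canonical check assumes 48-bit virtual addressing.

```python
import re

# Poison bytes used by the Linux slab debug code (include/linux/poison.h).
POISON_BYTES = {
    0x6B: "POISON_FREE (slab object already freed)",
    0x5A: "POISON_INUSE (slab object allocated but not initialised)",
}

# General-purpose registers copied from the quoted oops.
OOPS_REGISTERS = """
RAX: 6b6b6b6b6b6b6b6b RBX: ffff88006a6fc000 RCX: ffff88006a7f7c38
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006a5c3500
RBP: ffff88006a7f7cf8 R08: ffffffff818383c0 R09: ffff88006a7f7c38
R10: 0000000000000000 R11: ffff88007b697b18 R12: ffff88007b697b18
R13: ffff88006a5c3360 R14: 0000000000000000 R15: ffff88006a5c3370
"""


def parse_registers(text):
    """Return {register name: integer value} for 'RAX: <16 hex digits>' pairs."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def poison_kind(value):
    """Describe the poison pattern if all 8 bytes are the same known poison byte."""
    raw = value.to_bytes(8, "little")
    if len(set(raw)) == 1:
        return POISON_BYTES.get(raw[0])
    return None


def is_canonical(address):
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = (address & 0xFFFFFFFFFFFFFFFF) >> 47
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    regs = parse_registers(OOPS_REGISTERS)
    for name, value in regs.items():
        kind = poison_kind(value)
        if kind:
            print(f"{name} = {value:016x}: {kind}")
    # Displacement 0x490 taken from the faulting instruction bytes.
    target = regs["RAX"] + 0x490
    print(f"faulting address {target:016x} canonical: {is_canonical(target)}")
```

If the pattern holds, the object whose field at offset 0x40 is read through
R12 would be the first thing to look at in blktap_device_end_request().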

> > 
> > Below the log from /var/log/messages:
> > 
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/control device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/blktap0 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Created
> > /dev/xen/blktap-2/tapdev0 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: new interface: ring:
> > 251, device: 253, minor: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: I/O queue driver: lio
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: block-aio 
> > open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda')
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: 
> > open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda)
> > with O_DIRECT
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: Image size:       pre
> > sector_shift  [134217728]   post sector_shift [262144]
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: opened image
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda (1
> > users, state: 0x00000001, type: 0)
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]: VBD CHAIN:
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[4988]:
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/hda: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.158549] block tda:
> > sector-size: 512 capacity: 262144
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200514] general
> > protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200703] last sysfs
> > file: /sys/block/tda/removable
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200761] CPU 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.200847] Modules linked
> > in: bridge stp bonding bnx2i libiscsi scsi_transport_iscsi cnic uio
> > bnx2 megaraid_sas
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201363] Pid: 4988,
> > comm: tapdisk2 Not tainted 2.6.32.36 #3 PowerEdge M610
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201424] RIP:
> > e030:[<ffffffff812b9c24>]  [<ffffffff812b9c24>]
> > blktap_device_end_request+0x49/0x5e
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201543] RSP:
> > e02b:ffff88006a7f7cd8  EFLAGS: 00010046
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201600] RAX:
> > 6b6b6b6b6b6b6b6b RBX: ffff88006a6fc000 RCX: ffff88006a7f7c38
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201662] RDX:
> > 0000000000000000 RSI: 0000000000000000 RDI: ffff88006a5c3500
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201723] RBP:
> > ffff88006a7f7cf8 R08: ffffffff818383c0 R09: ffff88006a7f7c38
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201784] R10:
> > 0000000000000000 R11: ffff88007b697b18 R12: ffff88007b697b18
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201845] R13:
> > ffff88006a5c3360 R14: 0000000000000000 R15: ffff88006a5c3370
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201910] FS:
> > 00007f50a9445730(0000) GS:ffff8800280c7000(0000)
> > knlGS:0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.201974] CS:  e033 DS:
> > 0000 ES: 0000 CR0: 000000008005003b
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202032] CR2:
> > 00007fb35d12e6e8 CR3: 000000006a4ce000 CR4: 0000000000002660
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202093] DR0:
> > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202154] DR3:
> > 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202436] Process
> > tapdisk2 (pid: 4988, threadinfo ffff88006a7f6000, task
> > ffff88006b5a0000)
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.202941] Stack:
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.203206]
> > ffff88006b5a0000 0000000000000000 0000000000000000 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.203609] <0>
> > ffff88006a7f7e88 ffffffff812b9416 ffff88006a6c80f8 0000000100000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.204310] <0>
> > 00000000ffffffff ffff88006a5c3360 000000017edd7ab0 0000000000000000
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205284] Call Trace:
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205553]
> > [<ffffffff812b9416>] blktap_ring_ioctl+0x183/0x2d8
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.205838]
> > [<ffffffff81209a64>] ? inode_has_perm+0xa1/0xb3
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206120]
> > [<ffffffff8157641f>] ? _spin_unlock+0x26/0x2a
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206400]
> > [<ffffffff81126ff9>] ? aio_read_evt+0x56/0xe0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206678]
> > [<ffffffff81127071>] ? aio_read_evt+0xce/0xe0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.206957]
> > [<ffffffff8124f5c1>] ? _raw_spin_lock+0x77/0x12d
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207236]
> > [<ffffffff81209bf8>] ? file_has_perm+0xb4/0xc6
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207516]
> > [<ffffffff8110464e>] vfs_ioctl+0x5e/0x77
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.207793]
> > [<ffffffff81104b63>] do_vfs_ioctl+0x484/0x4d5
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208069]
> > [<ffffffff81104c0b>] sys_ioctl+0x57/0x7a
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208346]
> > [<ffffffff81012cc2>] system_call_fastpath+0x16/0x1b
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.208621] Code: 89 de 4c
> > 89 ef e8 60 f4 ff ff 49 8b 44 24 40 48 8b b8 90 04 00 00 e8 41 c9 2b
> > 00 44 89 f6 4c 89 e7 e8 39 fc ff ff 49 8b 44 24 40 <48> 8b b8 90 04
> > 00 00 e8 66 c7 2b 00 5b 41 5c 41 5d 41 5e c9 c3
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.211986] RIP
> > [<ffffffff812b9c24>] blktap_device_end_request+0x49/0x5e
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.212306]  RSP <ffff88006a7f7cd8>
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.212579] ---[ end trace
> > b97070122f44735d ]---
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Created
> > /dev/xen/blktap-2/blktap1 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Created
> > /dev/xen/blktap-2/tapdev1 device
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: new interface: ring:
> > 251, device: 253, minor: 1
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: I/O queue driver: lio
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: block-aio 
> > open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda')
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: 
> > open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda)
> > with O_DIRECT
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: Image size:       pre
> > sector_shift  [10737418240]         post sector_shift [20971520]
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: opened image
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda (1
> > users, state: 0x00000001, type: 0)
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]: VBD CHAIN:
> > Apr 13 17:47:23 r2b16ch2x28p2 tapdisk2[5009]:
> > /storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7253/xvda: 0
> > Apr 13 17:47:23 r2b16ch2x28p2 kernel: [  179.317931] block tdb:
> > sector-size: 512 capacity: 20971520
> > 
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
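
As a sanity check on the tapdisk2 "Image size" lines in the log above: the
"post sector_shift" value should be the byte size shifted right by 9, which
is a division by the 512-byte sector size the kernel reports for tda and
tdb. The short Python sketch below confirms that both images are sized
consistently, so the capacities passed to the block layer look correct.
The function name is only illustrative and is not taken from the tapdisk
source.

```python
SECTOR_SHIFT = 9  # 2**9 = 512 bytes, matching "sector-size: 512" in the log


def bytes_to_sectors(size_bytes, shift=SECTOR_SHIFT):
    """Convert an image size in bytes to a count of 512-byte sectors."""
    return size_bytes >> shift


# (pre sector_shift, post sector_shift) pairs copied from the log.
IMAGES = {
    "hda": (134217728, 262144),
    "xvda": (10737418240, 20971520),
}

if __name__ == "__main__":
    for name, (size_bytes, logged_sectors) in IMAGES.items():
        computed = bytes_to_sectors(size_bytes)
        status = "ok" if computed == logged_sectors else "MISMATCH"
        print(f"{name}: {size_bytes} bytes -> {computed} sectors ({status})")
```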


