Re: [Xen-devel] Kernel Panic in xen-blkfront.c:blkif_queue_request under 2.6.28

To: Greg Harris <greg.harris@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Kernel Panic in xen-blkfront.c:blkif_queue_request under 2.6.28
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Sun, 01 Feb 2009 22:19:56 -0800
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jens Axboe <jens.axboe@xxxxxxxxxx>
Delivery-date: Sun, 01 Feb 2009 22:20:32 -0800
In-reply-to: <2411544.7918341233327005412.JavaMail.root@ouachita>
References: <2411544.7918341233327005412.JavaMail.root@ouachita>
User-agent: Thunderbird 2.0.0.19 (X11/20090105)
Greg Harris wrote:
> Hi,
>
> I've run into several panics in the Xen block frontend driver in 2.6.28.
> It appears that when the kernel issues block queue requests in
> blkif_queue_request, the number of segments exceeds
> BLKIF_MAX_SEGMENTS_PER_REQUEST, triggering the panic despite the call to
> set the maximum number of segments during queue initialization
> (xlvbd_init_blk_queue calls blk_queue_max_phys_segments and
> blk_queue_max_hw_segments with BLKIF_MAX_SEGMENTS_PER_REQUEST as parameters).
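
(For reference, the initialization Greg describes looks roughly like the
sketch below in the 2.6.28 frontend; this is reconstructed from memory, not
quoted verbatim. BLKIF_MAX_SEGMENTS_PER_REQUEST is 11, the most segments a
single blkif ring request can describe.)

    /* Sketch (from memory, not verbatim) of the limits that
     * xlvbd_init_blk_queue advertises to the block layer.  Everything is
     * sized so that one struct request always fits in one ring slot. */
    rq = blk_init_queue(do_blkif_request, &blkif_io_lock);

    /* Each segment is one granted page, so it must not cross a page. */
    blk_queue_segment_boundary(rq, PAGE_SIZE - 1);
    blk_queue_max_segment_size(rq, PAGE_SIZE);

    /* These two calls are what should make the BUG_ON() unreachable: the
     * block layer promises never to hand the driver a request with more
     * than BLKIF_MAX_SEGMENTS_PER_REQUEST segments. */
    blk_queue_max_phys_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST);
    blk_queue_max_hw_segments(rq, BLKIF_MAX_SEGMENTS_PER_REQUEST);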

I've got a few reports of this, but I haven't managed to reproduce it myself. From this description, though, it sounds like the problem is in the upper layers presenting too many segments, rather than a bug in the block driver itself. Jens, does that sound likely?
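
To make "too many segments" concrete: when the block layer merges bios or
requests, it is supposed to enforce the driver's limit with a check along
the following lines (paraphrased from 2.6.28's block/blk-merge.c, not quoted
verbatim). If that nr_phys_segments accounting ever disagrees with the
bio_vec walk blkif_queue_request performs later, the driver's BUG_ON() is
where the mismatch first surfaces.

    /* Paraphrase of the merge-time check in block/blk-merge.c.  The
     * frontend set q->max_phys_segments to BLKIF_MAX_SEGMENTS_PER_REQUEST,
     * so any merge that would exceed it must be refused here. */
    static int ll_merge_requests_fn(struct request_queue *q,
                                    struct request *req,
                                    struct request *next)
    {
            int total_phys_segments =
                    req->nr_phys_segments + next->nr_phys_segments;

            /* The two segments meeting at the merge point may be
             * physically contiguous and collapse into one. */
            if (blk_phys_contig_segment(q, req->biotail, next->bio))
                    total_phys_segments--;

            if (total_phys_segments > q->max_phys_segments)
                    return 0;       /* refuse the merge */

            /* ... hardware-segment and size checks omitted ... */
            return 1;
    }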

Thanks,
   J

> Attached are two panics:
>
> kernel BUG at drivers/block/xen-blkfront.c:243!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/xvda/dev
> CPU 0
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.28-metacarta-appliance-1 #2
> RIP: e030:[<ffffffff804077c0>]  [<ffffffff804077c0>]
> do_blkif_request+0x2f0/0x380
> RSP: e02b:ffffffff80865dd8  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff880366ee33c0 RCX: ffff880366ee33c0
> RDX: ffff880366f15d90 RSI: 000000000000000a RDI: 0000000000000303
> RBP: ffff88039d78b190 R08: 0000000000001818 R09: ffff88038fb7a9e0
> R10: 0000000000000004 R11: 000000000000001a R12: 0000000000000303
> R13: 0000000000000001 R14: ffff880366f15da0 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff807a1980(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000f7f54444 CR3: 00000003977e5000 CR4: 0000000000002620
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff807a6000, task ffffffff806f0360)
> Stack:
>  000000000000004c ffff88038fb7a9e0 ffff88039a5f4000 ffff880366f123b8
>  0000000680298eec 000000000000000f ffff88039a5f4000 0000000066edc808
>  ffff880366ee33c0 ffff88038fb7aa00 ffffffff00000001 ffff88038fb7a9e0
> Call Trace:
>  <IRQ> <0> [<ffffffff8036fa45>] ? blk_invoke_request_fn+0xa5/0x110
>  [<ffffffff80407868>] ? kick_pending_request_queues+0x18/0x30
>  [<ffffffff80407a17>] ? blkif_interrupt+0x197/0x1e0
>  [<ffffffff8026cc59>] ? handle_IRQ_event+0x39/0x80
>  [<ffffffff8026f016>] ? handle_level_irq+0x96/0x120
>  [<ffffffff802140d5>] ? do_IRQ+0x85/0x110
>  [<ffffffff803c8315>] ? xen_evtchn_do_upcall+0xe5/0x130
>  [<ffffffff802461f7>] ? __do_softirq+0xe7/0x180
>  [<ffffffff8059f3ee>] ? xen_do_hypervisor_callback+0x1e/0x30
>  <EOI> <0> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>  [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>  [<ffffffff8020de8c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff8020c1fa>] ? xen_idle+0x2a/0x50
>  [<ffffffff80210041>] ? cpu_idle+0x41/0x70
> Code: fa d0 00 00 00 48 8d bc 07 88 00 00 00 e8 b9 dd f7 ff 8b 7c 24 54 e8 90 fb
> fb ff ff 44 24 24 e9 3b fd ff ff 0f 0b eb fe 66 66 90 <0f> 0b eb fe 48 8b 7c 24
> 30 48 8b 54 24 30 b9 0b 00 00 00 48 c7
> RIP  [<ffffffff804077c0>] do_blkif_request+0x2f0/0x380
>  RSP <ffffffff80865dd8>
> Kernel panic - not syncing: Fatal exception in interrupt
>
> kernel BUG at drivers/block/xen-blkfront.c:243!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/xvda/dev
> CPU 0
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.28-metacarta-appliance-1 #2
> RIP: e030:[<ffffffff804077c0>]  [<ffffffff804077c0>]
> do_blkif_request+0x2f0/0x380
> RSP: e02b:ffffffff80865dd8  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff880366f2a9c0 RCX: ffff880366f2a9c0
> RDX: ffff880366f233b0 RSI: 000000000000000a RDI: 0000000000000168
> RBP: ffff88039d895cf0 R08: 0000000000000b40 R09: ffff88038fb029e0
> R10: 000000000000000f R11: 000000000000001a R12: 0000000000000168
> R13: 0000000000000001 R14: ffff880366f233c0 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffffffff807a1980(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000f7f9c444 CR3: 000000039e7ea000 CR4: 0000000000002620
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff807a6000, task ffffffff806f0360)
> Stack:
>  000000000000004c ffff88038fb029e0 ffff88039a5d8000 ffff880366f11938
>  0000000980298eec 000000000000001d ffff88039a5d8000 000004008036f3da
>  ffff880366f2a9c0 ffff88038fb02a00 ffffffff00000001 ffff88038fb029e0
> Call Trace:
>  <IRQ> <0> [<ffffffff8036fa45>] ? blk_invoke_request_fn+0xa5/0x110
>  [<ffffffff80407868>] ? kick_pending_request_queues+0x18/0x30
>  [<ffffffff80407a17>] ? blkif_interrupt+0x197/0x1e0
>  [<ffffffff8026cc59>] ? handle_IRQ_event+0x39/0x80
>  [<ffffffff8026f016>] ? handle_level_irq+0x96/0x120
>  [<ffffffff802140d5>] ? do_IRQ+0x85/0x110
>  [<ffffffff803c8315>] ? xen_evtchn_do_upcall+0xe5/0x130
>  [<ffffffff802461f7>] ? __do_softirq+0xe7/0x180
>  [<ffffffff8059f3ee>] ? xen_do_hypervisor_callback+0x1e/0x30
>  <EOI> <0> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>  [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>  [<ffffffff8020de8c>] ? xen_safe_halt+0xc/0x20
>  [<ffffffff8020c1fa>] ? xen_idle+0x2a/0x50
>  [<ffffffff80210041>] ? cpu_idle+0x41/0x70
> Code: fa d0 00 00 00 48 8d bc 07 88 00 00 00 e8 b9 dd f7 ff 8b 7c 24 54 e8 90 fb
> fb ff ff 44 24 24 e9 3b fd ff ff 0f 0b eb fe 66 66 90 <0f> 0b eb
> fe 48 8b 7c 24 30 48 8b 54 24 30 b9 0b 00 00 00 48 c7
> RIP  [<ffffffff804077c0>] do_blkif_request+0x2f0/0x380
>  RSP <ffffffff80865dd8>
> Kernel panic - not syncing: Fatal exception in interrupt

> We've encountered a similar panic using Xen 3.2.1 (debian-backports,
> 2.6.18-6-xen-amd64 kernel) and Xen 3.2.0 (Ubuntu Hardy, 2.6.24-23-xen kernel)
> running in para-virtual mode.  The source around the line referenced in the
> panic is:

> rq_for_each_segment(bvec, req, iter) {
>         BUG_ON(ring_req->nr_segments == BLKIF_MAX_SEGMENTS_PER_REQUEST);
>         ...
>         /* handle the segment */
>         ...
>         ring_req->nr_segments++;
> }

> I'm able to reliably reproduce this panic with a certain workload (usually
> while creating filesystems) if anyone would like me to do further debugging.

> Thanks,
> ---
>
> Greg Harris
> System Administrator
> MetaCarta, Inc.
>
> (O) +1 (617) 301-5530
> (M) +1 (781) 258-4474
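
Since the panic is reliably reproducible, one low-risk way to collect more
data would be to soften the BUG_ON() into a warning that reports how far the
two segment counts disagree. A hypothetical debugging patch along these
lines (illustrative only, not in any tree):

    /* Hypothetical debugging aid: walk the request exactly the way
     * blkif_queue_request does and compare the result with the block
     * layer's own accounting, warning instead of panicking. */
    static int blkif_count_segments(struct request *req)
    {
            struct req_iterator iter;
            struct bio_vec *bvec;
            int nseg = 0;

            rq_for_each_segment(bvec, req, iter)
                    nseg++;

            return nseg;
    }

    /* ... then, near the top of blkif_queue_request(): */
    int nseg = blkif_count_segments(req);

    if (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)
            printk(KERN_WARNING
                   "blkfront: request %p walks %d segments, but "
                   "nr_phys_segments says %d (limit %d)\n",
                   req, nseg, (int)req->nr_phys_segments,
                   BLKIF_MAX_SEGMENTS_PER_REQUEST);

That would keep the machine alive and show whether the block layer's
accounting or the bio list itself is off.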



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel