http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1371
Summary: On heavy IO load in domU, dom0 shows a kernel bug and
domU hangs on 100% io wait
Product: Xen
Version: unstable
Platform: x86-64
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Unspecified
AssignedTo: xen-bugs@xxxxxxxxxxxxxxxxxxx
ReportedBy: michael.wyraz@xxxxxxxxxxx
System: Gentoo
Xen: 3.3.0
Kernel: 2.6.21
The virtual disks are LVM on Raid5. I tried both, tap:aio and phy: to access
the disks. The system worked well for a few weeks. but now i have the following
error every nicht. During our nightly backup of domU the dom0 shows errors in
the logfile. "top" in domU shows 100% io wait. I need to kill the domU and
reboot the host to fix it. It happend with different domUs.
>From the log:
Oct 17 00:02:20 office-host ------------[ cut here ]------------
Oct 17 00:02:20 office-host kernel BUG at fs/direct-io.c:1113!
Oct 17 00:02:20 office-host invalid opcode: 0000 [1] SMP
Oct 17 00:02:20 office-host CPU 0
Oct 17 00:02:20 office-host Modules linked in: nfs lockd nfs_acl sunrpc bridge
w83627ehf hwmon_vid hwmon i2c_isa autofs4 skge sk98lin i2c_i801 iTCO_wdt
i2c_core iTCO_vendor_support floppy ehci_hcd pcspkr uhci_hcd
Oct 17 00:02:20 office-host Pid: 7887, comm: tapdisk Not tainted 2.6.21-xen #5
Oct 17 00:02:20 office-host RIP: e030:[<ffffffff802b20e7>]
[<ffffffff802b20e7>] __blockdev_direct_IO+0xa57/0xb70
Oct 17 00:02:20 office-host RSP: e02b:ffff880035475cd8 EFLAGS: 00010202
Oct 17 00:02:20 office-host RAX: 0000000000000001 RBX: ffff8800e64b8000 RCX:
0000000000000000
Oct 17 00:02:20 office-host RDX: 0000000000000001 RSI: 0000000000000000 RDI:
0000000000000000
Oct 17 00:02:20 office-host RBP: ffff8800e64b8324 R08: ffff880000ddc838 R09:
ffff880035f014c0
Oct 17 00:02:20 office-host R10: ffff880001d4b910 R11: ffffffff80432840 R12:
0000000000000001
Oct 17 00:02:20 office-host R13: fffffffffffffdef R14: ffff8800e64b8088 R15:
0000000000000000
Oct 17 00:02:20 office-host FS: 00002b96dda23bb0(0000)
GS:ffffffff8072d000(0000) knlGS:0000000000000000
Oct 17 00:02:20 office-host CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 17 00:02:20 office-host CR2: 0000000000000000 CR3: 0000000035476000 CR4:
0000000000002620
Oct 17 00:02:20 office-host Process tapdisk (pid: 7887, threadinfo
ffff880035474000, task ffff8800cb2e8080)
Oct 17 00:02:20 office-host Stack: 0000000000001000 0000005d5a262000
ffff8801ed00be38 ffff8801ee7c8a48
Oct 17 00:02:20 office-host 0000000c00000000 0000000000000000 ffff8801ed00be38
ffff8800099f6708
Oct 17 00:02:20 office-host ffff880000000001 ffff8801edd4f3c0 0000000cffffffff
0000000000000001
Oct 17 00:02:20 office-host Call Trace:
Oct 17 00:02:20 office-host [<ffffffff802b0b95>] blkdev_direct_IO+0x45/0x50
Oct 17 00:02:20 office-host [<ffffffff802b0a90>] blkdev_get_blocks+0x0/0xc0
Oct 17 00:02:20 office-host [<ffffffff8025fb9f>]
generic_file_direct_IO+0xef/0x150
Oct 17 00:02:20 office-host [<ffffffff802612fb>]
generic_file_aio_read+0xcb/0x1b0
Oct 17 00:02:20 office-host [<ffffffff80261230>]
generic_file_aio_read+0x0/0x1b0
Oct 17 00:02:20 office-host [<ffffffff802a168f>] aio_rw_vect_retry+0x7f/0x170
Oct 17 00:02:20 office-host [<ffffffff802a1610>] aio_rw_vect_retry+0x0/0x170
Oct 17 00:02:20 office-host [<ffffffff802a21d7>] aio_run_iocb+0xc7/0x160
Oct 17 00:02:20 office-host [<ffffffff802a2ef4>] io_submit_one+0x344/0x3e0
Oct 17 00:02:20 office-host [<ffffffff802a3555>] sys_io_submit+0xb5/0x130
Oct 17 00:02:20 office-host [<ffffffff80209ae6>] system_call+0x86/0x8b
Oct 17 00:02:20 office-host [<ffffffff80209a60>] system_call+0x0/0x8b
Oct 17 00:02:20 office-host
Oct 17 00:02:20 office-host
Oct 17 00:02:20 office-host Code: 0f 0b eb fe 4d 85 e4 75 1d 48 8b 74 24 08 44
89 ea 48 89 df
Oct 17 00:02:20 office-host RIP [<ffffffff802b20e7>]
__blockdev_direct_IO+0xa57/0xb70
Oct 17 00:02:20 office-host RSP <ffff880035475cd8>
...
Oct 17 03:07:36 office-host eth0: port 4(tap5.0) entering disabled state
Oct 17 03:07:36 office-host device tap5.0 left promiscuous mode
Oct 17 03:07:36 office-host audit(1224205656.682:20): dev=tap5.0 prom=0
old_prom=256 auid=4294967295
Oct 17 03:07:36 office-host eth0: port 4(tap5.0) entering disabled state
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup:
XENBUS_PATH=backend/vfb/5/0
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup:
XENBUS_PATH=backend/console/5/0
Oct 17 03:07:36 office-host eth0: port 5(vif5.0) entering disabled state
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove
XENBUS_PATH=backend/tap/5/768
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/vif-bridge: offline
XENBUS_PATH=backend/vif/5/0
Oct 17 03:07:36 office-host device vif5.0 left promiscuous mode
Oct 17 03:07:36 office-host audit(1224205656.898:21): dev=vif5.0 prom=0
old_prom=256 auid=4294967295
Oct 17 03:07:36 office-host eth0: port 5(vif5.0) entering disabled state
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove
XENBUS_PATH=backend/tap/5/768
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove
XENBUS_PATH=backend/tap/5/832
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/blktap: remove
XENBUS_PATH=backend/tap/5/832
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup:
XENBUS_PATH=backend/tap/5/768
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:691: Unmapping vbd
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:312: Write_msg
called: CTLMSG_CLOSE
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:240: DEL_DISKTYPE:
Freeing entry
Oct 17 03:07:36 office-host TAPDISK[5249]: tapdisk.c:389: Tapdisk: Received
msg, len 16, type 7, UID 2
Oct 17 03:07:36 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup:
XENBUS_PATH=backend/tap/5/832
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:691: Unmapping vbd
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:312: Write_msg
called: CTLMSG_CLOSE
Oct 17 03:07:36 office-host BLKTAPCTRL[4287]: blktapctrl.c:240: DEL_DISKTYPE:
Freeing entry
Oct 17 03:07:36 office-host TAPDISK[5292]: tapdisk.c:389: Tapdisk: Received
msg, len 16, type 7, UID 3
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:609: Received a poll
for a new vbd
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:56: Sent domid
12 and be_id 768
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:23:
/dev/xen/blktap3 device already exists
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:75: Received
device id 3 and major 251
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:167: Detected
handle: [aio]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:562: tapdisk process
does not exist:
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:580: Launching
process, CMDLINE [tapdisk /var/run/tap/tapctrlwrite3 /var/run/tap/tapctrlread3]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:327: Write_msg
called: CTLMSG_PID
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:426: Received
CTLMSG_PID_RSP
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:432: PID:
[15534]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:270: Write_msg
called: CTLMSG_PARAMS, sending [aio:/dev/VolData1/TermSrv2,
/dev/VolData1/TermSrv2]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:284: Generated
cookie, 22
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:400: Received
CTLMSG_IMG: 167772160, 512, 0
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:674: Received a poll
for a new devmap
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:293: Write_msg
called: CTLMSG_NEWDEV
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:411: Received
CTLMSG_NEWDEV_RSP
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:684: Exiting
map_new_blktapctrl
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received
msg, len 16, type 9, UID 22
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received
msg, len 39, type 1, UID 22
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:398: Received
CTLMSG_PARAMS: [/dev/VolData1/TermSrv2]
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:406: Loaded driver: name
[tapdisk_aio], type [0]
Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:127: block-aio
open('/dev/VolData1/TermSrv2')
Oct 17 03:07:37 office-host TAPDISK[15534]: tapaio.c:237: AIO state initialised
Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:147:
open(/dev/VolData1/TermSrv2) with O_DIRECT
Oct 17 03:07:37 office-host TAPDISK[15534]: block-aio.c:84: Image size:
pre sector_shift [85899345920] post sector_shift [167772160]
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:159: Adding fd_list_entry
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:421: Entered cookie 22
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:389: Tapdisk: Received
msg, len 20, type 4, UID 22
Oct 17 03:07:37 office-host TAPDISK[15534]: tapdisk.c:447: Retrieving state,
cookie 22.....[OK]
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: brctl delif
eth0 vif5.0 failed
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add
XENBUS_PATH=backend/tap/12/832
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:609: Received a poll
for a new vbd
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:56: Sent domid
12 and be_id 832
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:23:
/dev/xen/blktap4 device already exists
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl_linux.c:75: Received
device id 4 and major 251
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:167: Detected
handle: [aio]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:562: tapdisk process
does not exist:
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:580: Launching
process, CMDLINE [tapdisk /var/run/tap/tapctrlwrite4 /var/run/tap/tapctrlread4]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:327: Write_msg
called: CTLMSG_PID
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received
msg, len 16, type 9, UID 23
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:426: Received
CTLMSG_PID_RSP
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:432: PID:
[15569]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:270: Write_msg
called: CTLMSG_PARAMS, sending [aio:/dev/VolStore1/TermSrv2Swap,
/dev/VolStore1/TermSrv2Swap]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:284: Generated
cookie, 23
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received
msg, len 44, type 1, UID 23
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:398: Received
CTLMSG_PARAMS: [/dev/VolStore1/TermSrv2Swap]
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:406: Loaded driver: name
[tapdisk_aio], type [0]
Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:127: block-aio
open('/dev/VolStore1/TermSrv2Swap')
Oct 17 03:07:37 office-host TAPDISK[15569]: tapaio.c:237: AIO state initialised
Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:147:
open(/dev/VolStore1/TermSrv2Swap) with O_DIRECT
Oct 17 03:07:37 office-host TAPDISK[15569]: block-aio.c:84: Image size:
pre sector_shift [2147483648] post sector_shift [4194304]
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:159: Adding fd_list_entry
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:421: Entered cookie 23
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:400: Received
CTLMSG_IMG: 4194304, 512, 0
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:674: Received a poll
for a new devmap
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:293: Write_msg
called: CTLMSG_NEWDEV
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:389: Tapdisk: Received
msg, len 20, type 4, UID 23
Oct 17 03:07:37 office-host TAPDISK[15569]: tapdisk.c:447: Retrieving state,
cookie 23.....[OK]
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:411: Received
CTLMSG_NEWDEV_RSP
Oct 17 03:07:37 office-host BLKTAPCTRL[4287]: blktapctrl.c:684: Exiting
map_new_blktapctrl
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: ifconfig
vif5.0 down failed
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add
XENBUS_PATH=backend/tap/12/768
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Successful
vif-bridge offline for vif5.0, bridge eth0.
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add
XENBUS_PATH=backend/tap/12/832
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: add
XENBUS_PATH=backend/tap/12/768
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/xen-hotplug-cleanup:
XENBUS_PATH=backend/vif/5/0
Oct 17 03:07:37 office-host device tap12.0 entered promiscuous mode
Oct 17 03:07:37 office-host audit(1224205657.330:22): dev=tap12.0 prom=256
old_prom=0 auid=4294967295
Oct 17 03:07:37 office-host eth0: port 4(tap12.0) entering learning state
Oct 17 03:07:37 office-host eth0: topology change detected, propagating
Oct 17 03:07:37 office-host eth0: port 4(tap12.0) entering forwarding state
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: online
XENBUS_PATH=backend/vif/12/0
Oct 17 03:07:37 office-host device vif12.0 entered promiscuous mode
Oct 17 03:07:37 office-host audit(1224205657.383:23): dev=vif12.0 prom=256
old_prom=0 auid=4294967295
Oct 17 03:07:37 office-host eth0: port 5(vif12.0) entering learning state
Oct 17 03:07:37 office-host eth0: topology change detected, propagating
Oct 17 03:07:37 office-host eth0: port 5(vif12.0) entering forwarding state
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Successful
vif-bridge online for vif12.0, bridge eth0.
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/vif-bridge: Writing
backend/vif/12/0/hotplug-status connected to xenstore.
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: Writing
backend/tap/12/832/hotplug-status connected to xenstore.
Oct 17 03:07:37 office-host logger: /etc/xen/scripts/blktap: Writing
backend/tap/12/768/hotplug-status connected to xenstore.
Oct 17 03:08:14 office-host blktap: ring-ref 2047, event-channel 5, protocol 1
(unspecified, assuming native)
Oct 17 03:08:14 office-host blk_tap: unknown operation [255]
Oct 17 03:08:14 office-host blk_tap: Bad number of segments in request (0)
Oct 17 03:08:14 office-host blktap: ring-ref 1758, event-channel 6, protocol 1
(unspecified, assuming native)
Oct 17 03:08:14 office-host blk_tap: unknown operation [255]
Oct 17 03:08:14 office-host blk_tap: Bad number of segments in request (0)
...
--
Configure bugmail:
http://bugzilla.xensource.com/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
_______________________________________________
Xen-bugs mailing list
Xen-bugs@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-bugs
|