xen-users

[Xen-users] Xen Dom0 crash doing some I/O with "Out of SW-IOMMU space"

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Xen Dom0 crash doing some I/O with "Out of SW-IOMMU space"
From: Ulrich Hochholdinger <uhochholdinger@xxxxxxxxxxxxx>
Date: Mon, 15 Nov 2010 18:52:05 +0100
Hi,
My dom0 crashes while doing I/O on the local hard drive.
* System is a Dell PowerEdge R710 with a "PERC H200" controller (driver
"mpt2sas") / 96GB RAM / 2x Xeon X5650
* Hard drives are configured as raid1.
* OS is Debian Squeeze with
  * Xen version 4.0.1 (Debian 4.0.1-1) - amd64 (Xen option: dom0_mem=512M)
  * Dom0 kernel (distribution kernel): 2.6.32-5-xen-686 (no special options)
* After some moderate I/O on the local raid1 with "dd if=/dev/zero
of=bigfile bs=1024 count=100000", the system crashes.
* Strange: if the raid1 is degraded, the system doesn't crash, even when doing
I/O over the complete hard drive.
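
To put "moderate I/O" in numbers, here is a minimal sketch of the arithmetic behind the dd command above. It assumes every block is written at full size; the function name is only illustrative.

```python
# Size of the write issued by:
#   dd if=/dev/zero of=bigfile bs=1024 count=100000
BLOCK_SIZE_BYTES = 1024   # bs=1024
BLOCK_COUNT = 100000      # count=100000


def dd_total_bytes(block_size: int, count: int) -> int:
    """Total bytes dd writes, assuming every block is full-sized."""
    return block_size * count


total = dd_total_bytes(BLOCK_SIZE_BYTES, BLOCK_COUNT)
print(f"{total} bytes = {total / 2**20:.2f} MiB")
```

So the write that triggers the crash is a bit under 100 MiB in total.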

Does anyone have an idea how to fix or work around this "bug"? In the meantime
I have tested different settings without any success:
- VT-d enabled / disabled (in the BIOS and with iommu=1)
- different dom0_mem values (my default is dom0_mem=512M)
- modified swiotlb settings (also without success)
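
For anyone who wants to experiment with the swiotlb size, a rough sketch of the slab arithmetic follows. It ASSUMES 2 KiB slabs (IO_TLB_SHIFT = 11, as in the mainline Linux swiotlb code) and that the swiotlb= boot parameter is given as a slab count; the exact values tried on this system are not stated here, and the helper names are hypothetical.

```python
# ASSUMPTION: swiotlb slabs are 2 KiB each (IO_TLB_SHIFT = 11 in mainline
# Linux). Check the source of the kernel actually running in dom0.
SLAB_BYTES = 1 << 11


def bytes_to_slabs(nbytes: int) -> int:
    """Slabs needed to bounce a mapping of nbytes (rounded up)."""
    return -(-nbytes // SLAB_BYTES)


def mib_to_slabs(mib: int) -> int:
    """Slab count corresponding to a bounce buffer of `mib` MiB."""
    return (mib * 1024 * 1024) // SLAB_BYTES


failed_request = 65536  # bytes, from the "Out of SW-IOMMU space" message
print(bytes_to_slabs(failed_request))  # slabs the failed mapping needed
print(mib_to_slabs(64))                # slab count for a 64 MiB buffer
print(mib_to_slabs(128))               # slab count for a 128 MiB buffer
```

Under these assumptions, a single 64 KiB mapping needs 32 slabs, which is small compared with a buffer of tens of MiB.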

These are the last lines the kernel reports:
[ 5822.499666] mpt2sas 0000:03:00.0: DMA: Out of SW-IOMMU space for 65536 bytes.
[ 5822.499743] BUG: unable to handle kernel NULL pointer dereference at 00000008
[ 5822.499919] IP: [<e09a10a4>] _scsih_qcmd+0x412/0x4d0 [mpt2sas]
[ 5822.500024] *pdpt = 0000000001466007 *pde = 0000000000000000 
[ 5822.500147] Oops: 0000 [#1] SMP 
[ 5822.500269] last sysfs file: /sys/devices/virtual/block/md0/md/mismatch_cnt
[ 5822.500330] Modules linked in: netconsole configfs xen_evtchn xenfs fuse 8021q garp bridge stp reiserfs loop snd_pcm snd_timer ioatdma snd soundcore snd_page_alloc psmouse dca dcdbas serio_raw evdev processor button power_meter pcspkr joydev acpi_processor ext3 jbd mbcache dm_mod raid1 md_mod sg sr_mod sd_mod cdrom crc_t10dif usbhid hid usb_storage uhci_hcd mpt2sas ehci_hcd scsi_transport_sas usbcore nls_base scsi_mod bnx2 thermal thermal_sys [last unloaded: netconsole]
[ 5822.502221] 
[ 5822.502272] Pid: 442, comm: md0_raid1 Not tainted (2.6.32-5-xen-686 #1) PowerEdge R710
[ 5822.502348] EIP: 0061:[<e09a10a4>] EFLAGS: 00010002 CPU: 1
[ 5822.502406] EIP is at _scsih_qcmd+0x412/0x4d0 [mpt2sas]
[ 5822.502462] EAX: dd9ba344 EBX: 00000009 ECX: e099b05d EDX: 14000000
[ 5822.502520] ESI: 00000000 EDI: dd145b30 EBP: 0000000f ESP: dd5efd64
[ 5822.502615]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[ 5822.502679] Process md0_raid1 (pid: 442, ti=dd5ee000 task=c1f4f2c0 task.ti=dd5ee000)
[ 5822.502754] Stack:
[ 5822.502804]  000000b6 dd9ba344 c1dde400 d5000000 94000000 fffffff1 bf145b00 00000000
[ 5822.503086] <0> dd105b00 14000000 dd145b00 c1dde000 dada6240 dd9ba000 dd9b0228 e096597b
[ 5822.503442] <0> dd0a0f90 c1dde000 de50f560 dd9ba000 e096a33c dd0a0f90 c1dde0b0 dada6240
[ 5822.503844] Call Trace:
[ 5822.503907]  [<e096597b>] ? scsi_dispatch_cmd+0x179/0x1e5 [scsi_mod]
[ 5822.503971]  [<e096a33c>] ? scsi_request_fn+0x343/0x47a [scsi_mod]
[ 5822.504032]  [<c1131da3>] ? __generic_unplug_device+0x23/0x25
[ 5822.504091]  [<c11323a4>] ? __make_request+0x364/0x3d9
[ 5822.505487]  [<c107655b>] ? rcu_process_callbacks+0x33/0x39
[ 5822.505546]  [<c103c4f6>] ? __do_softirq+0x128/0x151
[ 5822.505605]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.505663]  [<c1130f81>] ? generic_make_request+0x266/0x2b4
[ 5822.505723]  [<e08f3d12>] ? flush_pending_writes+0x58/0x74 [raid1]
[ 5822.505783]  [<e08f3df3>] ? raid1d+0x61/0xccc [raid1]
[ 5822.505842]  [<c1007c85>] ? __switch_to+0x124/0x141
[ 5822.505900]  [<c1032342>] ? finish_task_switch+0x3c/0x95
[ 5822.505958]  [<c128d196>] ? schedule+0x78f/0x7dc
[ 5822.506015]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.506074]  [<c10066d3>] ? xen_restore_fl_direct_end+0x0/0x1
[ 5822.506133]  [<c128e2f9>] ? _spin_unlock_irqrestore+0xd/0xf
[ 5822.506192]  [<c104241a>] ? try_to_del_timer_sync+0x4f/0x56
[ 5822.506251]  [<c104242b>] ? del_timer_sync+0xa/0x14
[ 5822.506308]  [<c128d512>] ? schedule_timeout+0x89/0xb0
[ 5822.506365]  [<c10424d3>] ? process_timeout+0x0/0x5
[ 5822.506424]  [<c1005fb4>] ? xen_force_evtchn_callback+0xc/0x10
[ 5822.506483]  [<c10066dc>] ? check_events+0x8/0xc
[ 5822.506542]  [<e0acd050>] ? md_thread+0xe1/0xf8 [md_mod]
[ 5822.506601]  [<c104b0ea>] ? autoremove_wake_function+0x0/0x2d
[ 5822.506661]  [<e0accf6f>] ? md_thread+0x0/0xf8 [md_mod]
[ 5822.506718]  [<c104aeb8>] ? kthread+0x61/0x66
[ 5822.506774]  [<c104ae57>] ? kthread+0x0/0x66
[ 5822.506830]  [<c1009a67>] ? kernel_thread_helper+0x7/0x10
[ 5822.506886] Code: 08 89 eb 8b 7c 24 28 eb 48 8b 7c 24 28 e9 a9 00 00 00 8b 44 24 04 83 fb 01 8b 88 14 02 00 00 75 06 8b 54 24 10 eb 04 8b 54 24 24 <0b> 56 08 89 f8 ff 76 10 4b ff 76 0c ff d1 58 89 f0 5a e8 7d 4c
[ 5822.509030] EIP: [<e09a10a4>] _scsih_qcmd+0x412/0x4d0 [mpt2sas] SS:ESP 0069:dd5efd64
[ 5822.509183] CR2: 0000000000000008
[ 5822.509238] ---[ end trace 3c25d9a65cc7a879 ]---
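
For quick triage of traces like the one above, a minimal sketch that pulls out the failed DMA mapping size and the faulting function may help. The regular expressions and the summarize() helper are illustrative only and were written against the three lines quoted in the snippet; other kernel versions may format these messages differently.

```python
import re

# Three lines copied from the kernel output above.
LOG = """\
[ 5822.499666] mpt2sas 0000:03:00.0: DMA: Out of SW-IOMMU space for 65536 bytes.
[ 5822.499743] BUG: unable to handle kernel NULL pointer dereference at 00000008
[ 5822.499919] IP: [<e09a10a4>] _scsih_qcmd+0x412/0x4d0 [mpt2sas]
"""

# Hypothetical patterns, matched only against the format shown above.
SWIOTLB_RE = re.compile(
    r"\] (\S+) (\S+): DMA: Out of SW-IOMMU space for (\d+) bytes")
IP_RE = re.compile(
    r"\] IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+) \[(\w+)\]")


def summarize(log: str) -> dict:
    """Extract driver, PCI device, failed mapping size and fault location."""
    summary = {}
    m = SWIOTLB_RE.search(log)
    if m:
        summary["driver"] = m.group(1)
        summary["pci_device"] = m.group(2)
        summary["failed_bytes"] = int(m.group(3))
    m = IP_RE.search(log)
    if m:
        summary["fault_function"] = m.group(2)
        summary["fault_offset"] = int(m.group(3), 16)
        summary["fault_module"] = m.group(5)
    return summary


for key, value in summarize(LOG).items():
    print(f"{key}: {value}")
```

The summary makes the sequence easier to see: the 65536-byte DMA mapping fails first, and the NULL pointer dereference in _scsih_qcmd (mpt2sas) follows immediately afterwards.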


Regards,
        Ulli
