xen-devel

[Xen-devel] mptscsih gets SCSI I/O errors in HVM with VT-d

To: "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] mptscsih gets SCSI I/O errors in HVM with VT-d
From: "Nadolski, Ed" <Ed.Nadolski@xxxxxxx>
Date: Fri, 19 Mar 2010 12:41:28 -0600
Accept-language: en-US
Cc:
Delivery-date: Fri, 19 Mar 2010 11:42:31 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcrHk8mAafXPxuPDQyuMfFvar28zIg==
Thread-topic: mptscsih gets SCSI I/O errors in HVM with VT-d
Hi,

I am running Xen 4.0.0-rc6 on a Dell T7500 quad-core Xeon with Fedora 12 as
dom0.  I have an LSI FC949E quad-port Fibre Channel HBA that works fine when I
run it from either dom0 or bare metal, but when I assign this HBA to an HVM
guest using VT-d, the mptscsih driver in the guest reports a series of SCSI
abort/reset errors whenever I run disk I/O through the HBA.  The HVM guest OS
is off-the-shelf Fedora 12.

Here are the mpt driver error messages from the HVM:

> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa500)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa500)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fb400)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa600)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa600)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fb400)
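
As a reading aid for the log above: each CDB shown is a 10-byte SCSI READ(10) command, so the requests being aborted can be decoded into a starting LBA and a block count. The Python sketch below is not part of the original report, and the function name decode_read10 is purely illustrative.

```python
import struct

def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Returns (starting LBA, transfer length in blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Layout: opcode, flags, 4-byte big-endian LBA, group number,
    # 2-byte big-endian transfer length, control byte.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return lba, blocks

# CDBs copied from the [sdd] entries in the log above.
for cdb in ("28 00 03 7a d3 20 00 00 c0 00",
            "28 00 03 7a d3 e0 00 01 00 00"):
    lba, blocks = decode_read10(cdb)
    print(f"LBA 0x{lba:08x}, {blocks} blocks")
```

For [sdd], the two commands decode to LBA 0x037ad320 for 192 blocks and LBA 0x037ad3e0 for 256 blocks, which are back-to-back sequential reads. The [sdc] pair follows the same pattern.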


Any thoughts on what could cause something like this under HVM but not on
bare metal? I will try to instrument the mptscsih driver in the HVM to get a
better idea of what kind of I/O errors are occurring.

Interestingly, this Dell T7500 also has an onboard LSI 1068 SAS controller, 
which works fine when assigned to the HVM. So I wonder if this could have 
something to do with PCI bridging?

FWIW I've also included below the lspci -vvvxxx output for the HBA, both on
bare metal and in the HVM, though I don't see anything obvious there.

Thanks,
Ed


### /etc/grub.conf entry for passthru

title Fedora-12 Xen 4.0.0-rc6 (2.6.31.12) iommu=1 xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
        root (hd0,0)
        kernel /xen-4.0.0-rc6.gz iommu=1 acpi_skip_timer_override loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1
        module /vmlinuz-2.6.31.12 ro root=UUID=edbcbc29-f3e4-4985-80c1-3c3b0ce24d17 LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us console=hvc0 earlyprintk=xen xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
        module /initramfs-2.6.31.12.img





### lspci for HBA device on baremetal Fedora 12:
# lspci
...
24:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
24:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 25:00.1
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
        Subsystem: LSI Logic / Symbios Logic Device 1070
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 61
        Region 0: I/O ports at dc00 [size=256]
        Region 1: Memory at dfadc000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at dfaf0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at dc100000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003000
        Capabilities: [100] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 14, GenCap+ CGenEn- ChkCap+ ChkEn-
        Kernel driver in use: mptfc
        Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 10 00 80 00
10: 01 dc 00 00 04 c0 ad df 00 00 00 00 04 00 af df
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 b0 df 50 00 00 00 00 00 00 00 0a 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
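
As a cross-check between the hex dump and the decoded text above, the PCIe Device Control register can be read straight out of the raw bytes: the Express capability sits at offset 0x68, and Device Control is the 16-bit little-endian word 8 bytes into it, at 0x70. The Python sketch below is an illustration only; the helper name decode_devctl and the dictionary keys are made up for this example.

```python
PCIE_CAP = 0x68                  # from "Capabilities: [68] Express" above
DEVCTL_OFFSET = PCIE_CAP + 0x08  # Device Control register within the capability

# Row "70:" of the bare-metal dump above.
ROW_70 = "36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00"

def decode_devctl(word):
    """Decode selected fields of the PCIe Device Control register."""
    return {
        "correctable": bool(word & 0x0001),       # bit 0
        "non_fatal": bool(word & 0x0002),         # bit 1
        "fatal": bool(word & 0x0004),             # bit 2
        "unsupported": bool(word & 0x0008),       # bit 3
        "relaxed_ordering": bool(word & 0x0010),  # bit 4
        "max_payload": 128 << ((word >> 5) & 0x7),     # bits 7:5
        "ext_tag": bool(word & 0x0100),           # bit 8
        "no_snoop": bool(word & 0x0800),          # bit 11
        "max_read_req": 128 << ((word >> 12) & 0x7),   # bits 14:12
    }

row = bytes.fromhex(ROW_70)
start = DEVCTL_OFFSET - 0x70
devctl = int.from_bytes(row[start:start + 2], "little")
fields = decode_devctl(devctl)

print(f"DevCtl = 0x{devctl:04x}")
for name, value in fields.items():
    print(f"  {name}: {value}")
```

The decoded word is 0x2836, which agrees with the lspci text for this device: Non-Fatal+ and Fatal+ error reporting, RlxdOrd+, NoSnoop+, MaxPayload 256 bytes, and MaxReadReq 512 bytes.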



### lspci for HBA device on HVM Fedora 12:
# lspci
...
00:04.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:05.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:06.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 00:07.0
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
        Subsystem: LSI Logic / Symbios Logic Device 1070
        Physical Slot: 7
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 128
        Interrupt: pin B routed to IRQ 45
        Region 0: I/O ports at c400 [size=256]
        Region 1: Memory at f344c000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at f3430000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at f3300000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003000
        Kernel driver in use: mptfc
        Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 00 80 80 00
10: 01 c4 00 00 04 c0 44 f3 00 00 00 00 04 00 43 f3
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 30 f3 50 00 00 00 00 00 00 00 05 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 08 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
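
Since the two 256-byte config-space dumps are hard to compare by eye, a byte-by-byte diff may help. The Python sketch below is an illustration and not part of the original report; it embeds both dumps exactly as posted, and the names parse_dump and diff_dumps are hypothetical helpers.

```python
BAREMETAL = """\
00: 00 10 46 06 07 00 10 00 02 00 04 0c 10 00 80 00
10: 01 dc 00 00 04 c0 ad df 00 00 00 00 04 00 af df
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 b0 df 50 00 00 00 00 00 00 00 0a 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""

HVM = """\
00: 00 10 46 06 07 00 10 00 02 00 04 0c 00 80 80 00
10: 01 c4 00 00 04 c0 44 f3 00 00 00 00 04 00 43 f3
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 30 f3 50 00 00 00 00 00 00 00 05 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 08 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""

def parse_dump(text):
    """Map config-space offset -> byte value from 'lspci -xxx' style rows."""
    cfg = {}
    for line in text.strip().splitlines():
        offset, _, data = line.partition(":")
        for i, byte in enumerate(bytes.fromhex(data)):
            cfg[int(offset, 16) + i] = byte
    return cfg

def diff_dumps(a, b):
    """Return (offset, byte_in_a, byte_in_b) for every offset that differs."""
    return [(off, a[off], b[off]) for off in sorted(a) if a[off] != b.get(off)]

diffs = diff_dumps(parse_dump(BAREMETAL), parse_dump(HVM))
for off, host, guest in diffs:
    print(f"0x{off:02x}: baremetal={host:02x} hvm={guest:02x}")
```

The differing offsets fall into two groups. Offsets 0x0c, 0x0d, 0x11, 0x16-0x17, 0x1e-0x1f, 0x32-0x33 and 0x3c are the cache line size, latency timer, BARs, expansion ROM address and interrupt line. The remaining three are 0x54 (power management control/status, matching NoSoftRst- vs NoSoftRst+), 0x70 (PCIe Device Control, matching the error-reporting bits and MaxPayload 256 vs 128 bytes) and 0x78 (Link Control, matching CommClk+ vs CommClk-). The dump covers only the first 256 bytes, so the Advanced Error Reporting capability at 0x100, which lspci lists only for bare metal, is not part of this comparison.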




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
