Re: [Xen-devel] iommuu/vt-d issues with LSI MegaSAS (PERC5i)

To: "M. Nunberg" <mnunberg@xxxxxxxxxxxx>, xiantao.zhang@xxxxxxxxx, allen.m.kay@xxxxxxxxx, weidong.han@xxxxxxxxx
Subject: Re: [Xen-devel] iommuu/vt-d issues with LSI MegaSAS (PERC5i)
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Tue, 1 Jun 2010 11:25:33 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 01 Jun 2010 08:27:21 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1275244385.6240.1.camel@debmed>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1275143477.15573.19.camel@debmed> <20100529152010.GW17817@xxxxxxxxxxx> <1275244385.6240.1.camel@debmed>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.19 (2009-01-05)
.. snip ..
> (XEN) [VT-D]iommu.c:1332: d0:PCI: map bdf = 0:1f.2
> (XEN) [VT-D]iommu.c:1332: d0:PCI: map bdf = 0:1f.3
> (XEN) [VT-D]iommu.c:1325: d0:PCIe: map bdf = 1:0.0
> (XEN) [VT-D]iommu.c:1325: d0:PCIe: map bdf = 3:0.0
> (XEN) [VT-D]iommu.c:1332: d0:PCI: map bdf = 5:e.0

Good, the entry is there ..
> (XEN) PCI add device 05:0e.0

.. and the kernel notifies Xen that it is going to use it. 


But then when it tries to do DMA operations:
> Already setup the GSI :18
> megaraid_sas 0000:05:0e.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
> megasas: FW now in Ready state
> calling  e1000_init_module+0x0/0(XEN) [VT-D]iommu.c:821:
> iommu_fault_status: Primary Pending Fault
> x70 [e1000e] @ 6(XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2d0842000, iommu reg = ffff82c3fff57000
> (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> 76
> (XEN) print_vtd_entries: iommu = ffff83013ff7b130 bdf = 5:8.0 gmfn =
> 2d0842
> scsi6 : LSI SAS (XEN)     root_entry = ffff83013ff37000
> based MegaRAID d(XEN)     root_entry[5] = 13cf13001
> river
> (XEN)     context = ffff83013cf13000
> initcall megasas(XEN)     context[40] = 0_0
> _init+0x0/0x16f (XEN)     ctxt_entry[40] not present
> [megaraid_sas] r(XEN) [VT-D]iommu.c:821: iommu_fault_status: Primary
> Pending Fault
> eturned 0 after (XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2d0842000, iommu reg = ffff82c3fff57000
> (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> 73201 usecs
> (XEN) print_vtd_entries: iommu = ffff83013ff7b130 bdf = 5:8.0 gmfn =
> 2d0842
> scsi scan: INQUI(XEN)     root_entry = ffff83013ff37000
> RY result too sh(XEN)     root_entry[5] = 13cf13001
> ort (5), using 3(XEN)     context = ffff83013cf13000
> 6
> (XEN)     context[40] = 0_0
> scsi 6:0:0:0: Di(XEN)     ctxt_entry[40] not present
> rect-Access                                    PQ: 0 ANSI: 0
> scsi scan: INQUIRY result too short (5), using 36
> scsi 6:0:1:0: Direct-Access                                    PQ: 0
> ANSI: 0
> scsi scan: INQUIRY result too short (5), using 36
> scsi 6:0:2:0: Direct-Access                                    PQ: 0
> ANSI: 0
> scsi scan: INQUIRY result too short (5), using 36
> scsi 6:0:3:0: Direct-Access                                    PQ: 0
> ANSI: 0
> scsi 6:2:0:0: Direct-Access     DELL     PERC 5/i         1.03 PQ: 0
> ANSI: 5
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> e1000e: Copyright (c) 1999-2008 Intel Corporation.
> e1000e 0000:07:00.0: Disabling L1 ASPM
> xen: registering gsi 16 triggering 0 polarity 1
> xen_allocate_pirq: returning irq 16 for gsi 16
> xen: --> irq=16
> Already setup the GSI :16
> e1000e 0000:07:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> e1000e 0000:07:00.0: setting latency timer to 64
>   alloc irq_desc for 4260 on node 0
>   alloc kstat_irqs on node 0
>   alloc irq_desc for 4259 on node 0
>   alloc kstat_irqs on node 0
>   alloc irq_desc for 4258 on node 0
> sd 6:2:0:0: Attached scsi generic sg2 type 0
> scsi_scan_6 used(XEN) [VT-D]iommu.c:821: iommu_fault_status: Primary
> Pending Fault
>  greatest stack (XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2cfa6e000, iommu reg = ffff82c3fff57000
> (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> depth: 4088 byte(XEN) print_vtd_entries: iommu = ffff83013ff7b130 bdf =
> 5:8.0 gmfn = 2cfa6e
> s left
> (XEN)     root_entry = ffff83013ff37000
> sd 6:2:0:0: [sdb(XEN)     root_entry[5] = 13cf13001
> ] 211550208 512-(XEN)     context = ffff83013cf13000
> byte logical blo(XEN)     context[40] = 0_0
> cks: (108 GB/100(XEN)     ctxt_entry[40] not present
>  GiB)
> sd 6:2:0:0: [sdb] Write Protect is off
> sd 6:2:0:0: [sdb] Mode Sense: 1f 00 00 08
> sd 6:2:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't
> support DPO or FUA
>  sdb: unknown partition table
>   alloc kstat_irqs on node 0
> sd 6:2:0:0: [sdb] Attached SCSI disk

And here the driver sends the SCSI INQUIRY commands and the results come
back truncated. That probably means the page where the data was expected
never received any data (which is exactly what the VT-d chipset would
cause, since it does not have an entry for that page and cancelled the
DMA operation).

Xen prints the entries out and confirms this: we get "ctxt_entry[40] not
present", which means that no context entry has been set up for it.
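
To make the numbers in those fault records easier to follow, here is a small sketch of how they relate to each other. It assumes the usual PCI encoding (devfn = device << 3 | function), 4 KiB pages, and that Xen prints the bdf fields in hex; the function names are made up for illustration and are not Xen code.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def parse_bdf(bdf):
    """Split 'bus:dev.fn' (hex fields, as printed in the log) into integers."""
    bus, rest = bdf.split(":")
    dev, fn = rest.split(".")
    return int(bus, 16), int(dev, 16), int(fn, 16)


def context_index(bdf):
    """Return (root table index, context table index) for a requester.

    The root table is indexed by bus number and the context table by
    devfn, where devfn = (device << 3) | function.
    """
    bus, dev, fn = parse_bdf(bdf)
    return bus, (dev << 3) | fn


def gmfn(fault_addr):
    """Frame number of the page containing the faulting address."""
    return fault_addr >> PAGE_SHIFT


root, ctx = context_index("5:8.0")
print(f"root_entry[{root:x}] context[{ctx:x}] gmfn = {gmfn(0x2d0842000):x}")
```

This reproduces the root_entry[5], context[40] and gmfn = 2d0842 values from the print_vtd_entries output. For comparison, the same calculation for the 5:e.0 entry in the map list gives context index 70, so the faulting 05:08.0 requests are looked up in a different slot than the one for 05:0e.0.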

And root_entry[5] is set to 13cf13001, which is definitely above the 4GB
mark. The Xen hypervisor looks to be setting up a 1:1 mapping for every
page up to 'max_page' (which would be ~2937039), so it would include the
4GB mark (see the iommu_set_dom0_mapping code in
rivers/passthrough/vtd/x86/vtd.c). And 13cf13001 is within the E820
entry for RAM, so it _ought_ to have an entry.
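
A quick sanity check of that arithmetic, as a sketch only. It assumes 4 KiB pages, that the max_page figure is decimal, and the VT-d root entry layout where bit 0 is the present bit and the upper bits hold the context table address; the variable names are mine.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB pages
FOUR_GIB = 4 << 30

max_page = 2937039          # approximate value quoted above, taken as decimal
root_entry_5 = 0x13cf13001  # value printed by print_vtd_entries


def pages_to_gib(pages):
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / (1 << 30)


first_pfn_above_4g = FOUR_GIB // PAGE_SIZE

# Bit 0 of a root entry is the present bit; masking off the low 12 bits
# leaves the address of the context table.
present = bool(root_entry_5 & 1)
context_table = root_entry_5 & ~0xFFF

print(f"max_page covers about {pages_to_gib(max_page):.1f} GiB")
print("4GB mark inside the 1:1 range:", first_pfn_above_4g < max_page)
print("root_entry[5] above 4GB:", root_entry_5 > FOUR_GIB)
print(f"present = {present}, context table at {context_table:x}")
```

The context table address that comes out (13cf13000) matches the low bits of the "context = ffff83013cf13000" line in the log.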

I can think of four things:
1). The driver is buggy. It is using the wrong DMA address. This can be
verified if you do two things: compile your pv_ops kernel with
CONFIG_DMAR and boot it bare metal.

2). The mapping in the Xen VT-d IOMMU code for addresses above 4GB is
somehow busted. You can check that by including two options on your
Linux kernel line: "iommu=soft swiotlb=force".
Or you can give Dom0 a limited amount of memory by setting this on
your Xen line: 'dom0_mem=max:4GB', which will limit Dom0 to only 4GB.

3). Maybe the hardware is busted?

4). CC-ing the Intel folks. They might have some ideas here too. Also my
analysis could be flawed - the hardware I have is busted so I only know
of non-working conditions :-(
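
If you end up comparing console logs from several boots (bare metal, swiotlb forced, Dom0 limited to 4GB), the fault records are easier to compare once they are pulled out of the interleaved output. A minimal sketch; the regex is fitted only to the lines quoted above and the function name is made up, so it may need adjusting for other Xen versions.

```python
import re
from collections import Counter

# Matches records such as:
#   DMAR:[DMA Write] Request device
#   > [05:08.0] fault addr 2d0842000, ...
# allowing for the line wrap and the "> " quote prefix seen in this mail.
FAULT_RE = re.compile(
    r"DMAR:\[(DMA \w+)\] Request device\s*(?:>\s*)?"
    r"\[([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\] fault addr ([0-9a-f]+)"
)


def summarize_faults(log_text):
    """Count faults per (device, access type, fault address)."""
    return Counter(
        (dev, access, int(addr, 16))
        for access, dev, addr in FAULT_RE.findall(log_text)
    )


sample = """\
> x70 [e1000e] @ 6(XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2d0842000, iommu reg = ffff82c3fff57000
> eturned 0 after (XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2d0842000, iommu reg = ffff82c3fff57000
>  greatest stack (XEN) [VT-D]iommu.c:796: DMAR:[DMA Write] Request device
> [05:08.0] fault addr 2cfa6e000, iommu reg = ffff82c3fff57000
"""

for (dev, access, addr), count in sorted(summarize_faults(sample).items()):
    print(f"{dev} {access} {addr:#x} x{count}")
```

Run against the three records quoted earlier, this reports one device (05:08.0) faulting on DMA writes to two distinct addresses.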

