Hi list,
I'm having some problems trying to pass through a Mellanox ConnectX HCA
to a domU.
This is on Xen 4.0.1, with the latest Debian Testing packages:
ii xen-hypervisor-4.0-amd64 4.0.1-2
ii linux-image-2.6.32-5-xen-amd64 2.6.32-30
The hardware is Supermicro H8DGT-HIBQF, BIOS revision 1.0c (date 10/29/10).
It has two AMD Opteron 6128 CPUs, for a total of 16 cores. The machine has
32GiB of RAM. The Mellanox adapter looks like this in the dom0:
02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
Subsystem: Super Micro Computer Inc Device 0048
Flags: fast devsel, IRQ 19
Memory at fea00000 (64-bit, non-prefetchable) [size=1M]
Memory at fc800000 (64-bit, prefetchable) [size=8M]
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable- Count=256 Masked-
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: pciback
I've attached the output of xm dmesg (xm.dmesg.txt).
I have the following in the domU config files:
pci = ['0000:02:00.0']
I've attached the boot log from trying to boot the same kernel as an HVM guest
(testsqueezehvm.bootlog.txt). Doing so generates these four lines of output
in xm dmesg:
(XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c000
(XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c080
(XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c040
(XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c0c0
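For what it's worth, the device id in these messages appears to be the usual 16-bit PCI requester ID (bus in the upper 8 bits, device in the next 5, function in the low 3), so 0x200 would correspond to 02:00.0, i.e. the HCA itself. A quick sketch of that decoding in Python (the helper name is made up for illustration):

```python
def decode_device_id(device_id: int) -> str:
    """Decode a 16-bit PCI requester ID into bus:device.function notation."""
    bus = (device_id >> 8) & 0xFF   # upper 8 bits: bus number
    dev = (device_id >> 3) & 0x1F   # next 5 bits: device number
    func = device_id & 0x07         # low 3 bits: function number
    return f"{bus:02x}:{dev:02x}.{func:x}"

# Device id taken from the IO_PAGE_FALT messages above
print(decode_device_id(0x200))
```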
The mlx4_core driver in the domU is not happy:
[ 0.411867] mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
[ 0.411879] mlx4_core: Initializing 0000:00:00.0
[ 0.412027] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
[ 0.412027] mlx4_core 0000:00:00.0: Xen PCI enabling IRQ: 19
[ 1.417477] mlx4_core 0000:00:00.0: Installed FW has unsupported command interface revision 0.
[ 1.417509] mlx4_core 0000:00:00.0: (Installed FW version is 0.0.000)
[ 1.417527] mlx4_core 0000:00:00.0: This driver version supports only revisions 2 to 3.
[ 1.417549] mlx4_core 0000:00:00.0: QUERY_FW command failed, aborting.
When trying to boot a PV domU with kernel options iommu=soft and
swiotlb=force, the output is slightly different. The full bootlog is attached
(testsqueeze.bootlog.txt). Here's the relevant excerpt:
[ 0.441684] mlx4_core: Mellanox ConnectX core driver v1.0-ofed1.5.2 (August 4, 2010)
[ 0.441696] mlx4_core: Initializing 0000:00:00.0
[ 0.442044] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
[ 0.442741] mlx4_core 0000:00:00.0: Xen PCI enabling IRQ: 19
[ 2.752125] mlx4_core 0000:00:00.0: NOP command failed to generate MSI-X interrupt IRQ 54).
[ 2.752158] mlx4_core 0000:00:00.0: Trying again without MSI-X.
[ 2.884105] mlx4_core 0000:00:00.0: NOP command failed to generate interrupt (IRQ 54), aborting.
[ 2.884138] mlx4_core 0000:00:00.0: BIOS or ACPI interrupt routing problem?
[ 2.916920] mlx4_core: probe of 0000:00:00.0 failed with error -16
And xm dmesg quickly fills up with many, many lines like this:
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43000
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43020
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43040
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43060
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43080
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430a0
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430c0
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430e0
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43100
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43120
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43140
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43160
...
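Since the output is too long to read by eye, something like the following could summarize the faults per domain and device. This is only a sketch: it assumes the xm dmesg output has been captured into a string, the regular expression is written against the message format shown above (allowing for a line break between "fault" and "address" in case the output has been wrapped), and the function name is made up.

```python
import re
from collections import defaultdict

# Written against the fault messages quoted above; \s+ also matches a newline,
# so wrapped lines are handled as well as unwrapped ones.
FAULT_RE = re.compile(
    r"AMD_IOV: IO_PAGE_FALT: domain:(\d+), device id:(0x[0-9a-fA-F]+), "
    r"fault\s+address:(0x[0-9a-fA-F]+)"
)

def summarize_faults(log_text):
    """Group fault addresses by (domain, device id); report count and range."""
    groups = defaultdict(list)
    for dom, dev, addr in FAULT_RE.findall(log_text):
        groups[(int(dom), int(dev, 16))].append(int(addr, 16))
    return {
        key: {"count": len(addrs), "min": min(addrs), "max": max(addrs)}
        for key, addrs in groups.items()
    }

# Two of the messages from above, used as sample input
sample = """(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault
address:0x70a4309170a43000
(XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault
address:0x70a4309170a43020
"""

for (dom, dev), info in summarize_faults(sample).items():
    print(f"domain {dom} device {dev:#x}: {info['count']} faults, "
          f"{info['min']:#x}..{info['max']:#x}")
```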
Booting a PV domU with only the swiotlb=force option makes the output much
more like the HVM output.
Any thoughts on what could be going on here?
Thanks,
Ward.
xm.dmesg.txt
Description: Text document
testsqueeze.bootlog.txt
Description: Text document
testsqueezehvm.bootlog.txt
Description: Text document