

Re: [Xen-devel] Re: Comments on Xen bug 1732

On Mon, 2011-01-31 at 13:18 +0000, Jan Beulich wrote:
> >>> On 31.01.11 at 05:54, Haitao Shan <maillists.shan@xxxxxxxxx> wrote:
> After taking a closer look:
> > As you may already notice the bug 1732, (
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732), the culprit is
> > c/s 22182.
> The warnings are a result of the c/s, but if there are functionality
> problems, they shouldn't be caused by this: The MSI-X table's base
> address was always determined from the value passed from Dom0
> (the raw address found in the BAR) plus the table offset as found
> in the MSI-X capability structure.

Actually, I have some functionality problems which coincide with these
warnings.

> > I see the following attached code in your patch. It is pointless to check
> > msi->table_base against the value read from physical device if this function
> > is a virtual function of SR-IOV device. VFs are required to have BARs zeroed
> > by specifications. And for VFs, unless you can read these values from
> > corresponding PF, you will have to trust the "table_base" passed from dom0
> > via hypercall. Actually, this parameter is specifically introduced for
> > enabling SR-IOV.
> One important question then is whether there's a way for Xen to
> determine the PF for the VF and the correct BAR to use without
> additional help from Dom0. If that's not possible, passing down the
> BAR contents needed for the PBA base address calculation on a
> VF would be necessary, which would require a new sub-hypercall.

In my case (HVM) it looks like qemu has figured out the correct base
address for the PBA.

> The only exception to this would be if both use the same BAR (and
> really if that's a common case, a simple initial fix could be to use
> the passed down table_base value also for pba_paddr if the two
> BIRs match).
> In any case I am of the opinion that all of the warnings make
> sense currently, with the sole exception of the VF case of the
> msi->table_base != read_pci_mem_bar() one (avoiding this
> would require Xen to at least have a way to recognize a given
> <bus>:<dev>.<func> is a VF).

I see

> > BTW: I vaguely recall that MSI-X table base might not be the first page of
> > the corresponding BAR register.
> Indeed - that's what is being accounted for using table_offset (read
> from MSI-X capability structure + msix_table_offset_reg()).

In my case the device is ixgbe and yes, it seems to follow the 8KB
alignment recommendation.
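
For reference, the address calculation discussed above (raw BAR value plus
the offset from the MSI-X capability) can be sketched roughly as follows.
This is only an illustration: the helper names are hypothetical and are not
Xen's or qemu's actual functions, and it assumes a memory BAR whose low four
bits are type/prefetch flags and the standard Table/PBA Offset register
layout (BIR in bits 2:0, offset in bits 31:3).

```c
#include <stdint.h>

/* Assumed layout of the MSI-X Table Offset/BIR and PBA Offset/BIR registers. */
#define MSIX_BIR_MASK    0x7u      /* bits 2:0: which BAR holds the structure */
#define MSIX_OFFSET_MASK (~0x7u)   /* bits 31:3: offset into that BAR */
#define BAR_MEM_FLAGS    0xfULL    /* bits 3:0 of a memory BAR: type/prefetch */

/* Hypothetical helper: extract the BAR indicator (BIR) from the register. */
static unsigned int msix_bir(uint32_t reg)
{
    return reg & MSIX_BIR_MASK;
}

/* Hypothetical helper: extract the offset into the BAR from the register. */
static uint32_t msix_offset(uint32_t reg)
{
    return reg & MSIX_OFFSET_MASK;
}

/* Physical address of the table or PBA: BAR base (flags masked) + offset. */
static uint64_t msix_region_paddr(uint64_t raw_bar, uint32_t reg)
{
    return (raw_bar & ~BAR_MEM_FLAGS) + msix_offset(reg);
}

/*
 * The "same BAR" shortcut: if the table and PBA registers name the same
 * BIR, the table's BAR value could also be used to locate the PBA.
 */
static int msix_pba_shares_table_bar(uint32_t table_reg, uint32_t pba_reg)
{
    return msix_bir(table_reg) == msix_bir(pba_reg);
}
```

Taking base_addr=0xdd700004 from the qemu log below as the raw BAR value and
table_off = 0, msix_region_paddr() gives 0xdd700000, assuming that region is
the one the table's BIR selects.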

The actual symptom I am seeing is a lot of messages like these in the guest
with the VF passed through:

ixgbevf: eth: ixgbevf_reset: PF still resetting
ixgbevf: eth: ixgbevf_open: Unable to start - perhaps the PFDriver isn't up yet
ixgbevf: eth: ixgbevf_check_tx_hang: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <0>, <1>
  next_to_use          <1>
  next_to_clean        <0>
  time_stamp           <fffd2f6b>
  jiffies              <fffd3db4>
ixgbevf: eth: ixgbevf_clean_tx_irq: tx hang 3 detected, resetting adapter
ixgbevf: eth: ixgbevf_watchdog_task: NIC Link is Up 10 Gbps

And correspondingly no Tx or Rx traffic at all. It all seems very much
like a lack of interrupts, but /proc/interrupts shows good numbers:

201:        146       PCI-MSI-X  eth-rx-0
209:        140       PCI-MSI-X  eth-tx-0
217:          8       PCI-MSI-X  eth:mbx

Furthermore, this used to work on Xen 3.4 but fails on 4.1, so it seems to
be a regression. One other notable change is the assignment of the
MSI-X vectors that I see when hitting the Q debug key:

On 3.4:
(XEN) 04:10.0 - dom 1   - MSIs < 66 74 82 >

On 4.1:
(XEN) 04:10.1 - dom 0   - MSIs < 117 118 119 >

However qemu seems happy with it all in either case:

Mar 15 18:00:30 localhost qemu.1[10344]: pt_register_regions: IO region registered (size=0x00004000 base_addr=0xdd700004)
Mar 15 18:00:30 localhost qemu.1[10344]: pt_register_regions: IO region registered (size=0x00004000 base_addr=0xdd800004)
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: get MSI-X table bar base
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: table_off = 0, total_entries = 3
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: errno = 2
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: mapping physical MSI-X table to b5d91000
Mar 15 18:00:30 localhost qemu.1[10344]: register_real_device: Real physical device 04:10.1 registered successfuly! IRQ type = INTx
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 0 with pirq 56 gvec b9
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 1 with pirq 55 gvec c1
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 2 with pirq 54 gvec c9

Any ideas?

