This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [RFC PATCH linux-2.6.18-xen] pciback: clean up (MSI-X v

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC PATCH linux-2.6.18-xen] pciback: clean up (MSI-X vec, entrynr) list when resetting PCI device
From: Laszlo Ersek <lersek@xxxxxxxxxx>
Date: Wed, 01 Jun 2011 16:12:19 +0200
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>, Drew Jones <drjones@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Jun 2011 07:12:03 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DE653290200007800044DE4@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DE60EF8.5060902@xxxxxxxxxx> <4DE653290200007800044DE4@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.10
On 06/01/11 14:56, Jan Beulich wrote:
On 01.06.11 at 12:05, Laszlo Ersek<lersek@xxxxxxxxxx>  wrote:

domU: igbvf_shutdown()
                     dev, PCI_COMMAND,
                     orig&  ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)
dom0:                   pciback_do_op()
                               command_write() [via PCI_COMMAND funcptr]
                                     dev->msix_enabled = 0;

The final assignment above precludes c/s 1070 from doing the job.

That seems to be a problem in the PCI subsystem, in that
disable_msi_mode() is being used from pci_disable_device() (instead
of pci_disable_msi{,x}()), and does not seem to be done in a similarly
bad way in newer kernels.

But does linux-2.6.18-xen avoid the problem? I think its pci_disable_device() still calls disable_msi_mode(), and the latter also only read/writes config words.

So I wonder whether you shouldn't fix
pci_disable_device() instead, or alternatively move the vector
de-allocation (and then for MSI and MSI-X) into disable_msi_mode().

Yes, I did think of that, however IMO that would introduce exactly the problem that you describe below. If either pci_disable_device() or disable_msi_mode() frees the vector, then re-enabling the device can face yet another stumbling block.

I liken this a bit to the UNIX(R) signals -- the allocation/mapping of the MSI-X vectors is the "signal disposition" (= signal action, signal delivery), while the PCI dev's configuration -- whether it's allowed to generate such interrupts -- is almost like the "signal mask". You can have the vectors allocated / mapped and the device can still be told not to generate those interrupts.

In that sense dev->msix_enabled has a split personality (mixed responsibilities).

While the approach you take covers the guest shutdown/restart
case, what if the guest called its pci_disable_device() at runtime,
expecting to be able to call pci_enable_{device,msi,msix}() on it
again later on? Your newly added function would also be called here,

May I ask how? The function is only called from pciback_reset_device(), which in turn is called from drivers/xen/pciback/pci_stub.c,

  pcistub_put_pci_dev         EXTERN
  pcistub_device_get_pci_dev  EXTERN
  pcistub_remove              DRIVEROP
  permissive_add              EXTERN DRIVER_ATTR(permissive)
  pciback_init                MODULE_INIT
  pcistub_probe               DRIVEROP
  pcistub_get_pci_dev_by_slot EXTERN
  pcistub_get_pci_dev         EXTERN

I did a cursory search for '\<pcistub_', and it seems to me all these functions are only called from functions like

  pciback_release_pci_dev (passthrough.c, slot.c and vpci.c)
  pciback_release_devices (passthrough.c, slot.c and vpci.c)

In my understanding we have no problems reported about the insertion/removal of the pciback module, or device (de)assignment to / from it.

but in this situation you would have to call into the hypervisor (as
the domain is still alive), i.e. you could as well call

I considered taking c/s 1070 and simply removing the ifs around the pci_disable_msi[x]() calls. However, msi_get_dev_owner(), called by msi_unmap_pirq(), could find no owner domain for the device and return DOMID_SELF. (This is different in upstream I think, see c/s 680.) Then msi_unmap_pirq() would try to unmap a PIRQ for dom0.

Additionally, while covering the MSI-X case, I don't see how the
similar already-mapped case would be cleanly handled for MSI.

The target is "cleanup after shutdown in such a way that it doesn't hurt in other cases either", so:

- it is assumed the hypervisor takes care of the mappings when the domain goes away,

- dom0 has no filtering list for MSI. msi_capability_init() "simply" asks the hypervisor for a vector, while msix_capability_init() "gets in the way".

The sole purpose of the patch is to trim the MSI-X "filtering" list (msi_dev_head) after the domain is gone. I'm not bold enough to yank out the filtering altogether, even though I think it only does damage wrt. reboots -- it must have been originally meant to catch the case when the *same* guest tries to ask for a set of MSI-X entries for a device, in such a way that the requested set is not distinct from entries requested previously for the same device.

The fact that the domid of the device-owning domain is not saved anywhere in the data structure (... at least in RHEL-5) seems to confirm this -- the current implementation has no way to detect that the owning domain has changed; I think it's even oblivious to this possiblity.

I'm looking for the best (... least wrong) location to place the cleanup at; c/s 1070 suggested pciback_reset_device().


Xen-devel mailing list

<Prev in Thread] Current Thread [Next in Thread>