
Re: Dealing with SIOV/IMS


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Demi Marie Obenour <demiobenour@xxxxxxxxx>
  • Date: Thu, 1 Jan 2026 20:01:51 -0500
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 02 Jan 2026 01:02:21 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 1/1/26 17:56, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> I've got yet another report[1] of device failing because (I assume) the
> drivers reads MSI/MSI-X values (thinking it sees values actually set in
> the HW) and then pass them to the device via some alternative means.
> IIUC this is what IMS does.
> 
> I'm interested in two things:
> 1. Some plan for a long term solution - it was briefly discussed on
> XenDevel matrix room in September, Roger said:
> 
>> urg, that's the spec that also defines IMS IIRC?  I think the only way
>> to support anything like that is using vfio/mdev and re-using the
>> drivers from Linux.  There's too much device-specific magic to
>> implement any of this in Xen, or do our own Xen-specific drivers.
> 
> 2. A short term workaround for few specific devices. If you look at the
> linked threads, users resort to patching the domU driver and then
> copying MSI values from dom0's lspci output manually... I think we can
> do better than this short term, via some quirks in QEMU. Either let the
> domU see the real HW values, or translate IMS writes at QEMU level
> (assuming they can be identified). Disclaimer - I haven't looked yet at
> this specific driver, nor the SIOV/IMS spec, so I'm not sure if that's
> viable approach...
> 
> [1] 
> https://forum.qubes-os.org/t/solved-qualcomm-qcnfa765-ath11k-wcn6855-wifi-working-on-thinkpad-p14s-gen4-amd/38192

Disclaimer: I have not read the Intel or AMD IOMMU specs, do not have
access to the PCI spec, and know very little about ath11k.  All of
this is based on various mailing list threads and Matrix messages.
It might be wrong.  Please correct me if it is.

First, some background on IMS.  All of this comes from [2] and its thread.

IMS is a result of wanting to store interrupt messages in host memory.
This avoids needing to keep them in expensive on-die SRAM or including
DRAM on the card.  However, on-die SRAM is still used to cache the
interrupts, so the device has to be told whenever one changes.  This
means that interrupts must be managed via command queues.

This causes problems for Linux and for any other OS that expects
to be able to change IRQs without a command/response operation.
The only workarounds I saw in that thread are:

1. Redesign the OS so it never needs to change interrupts from a
   context where device commands are impossible.
2. Modify the command queue code so it can run in interrupt context.
3. Rely on the IOMMU to remap interrupts.

ath10k and friends use an even worse hack, which is to pin everything
to a fixed CPU so that the problems mentioned above (which relate to
moving interrupts between CPUs) don't arise.

Now, the part that is relevant to Xen:

IMS *also* causes problems for hypervisors.  Hypervisors present guests
with a virtualized MSI range rather than exposing the actual one.
My understanding is that virtualization serves two purposes:

4. It turns non-remappable MSIs into remappable ones, so that they
   are translated by the IOMMU instead of being rejected.
5. It fixes some information in the MSI (target CPU?) so that the
   interrupt correctly reaches the guest.
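
To make "some information in the MSI" concrete, here is a minimal sketch
of what an x86 MSI message encodes.  It assumes the non-remapped
(compatibility) format, where the address selects the destination APIC
ID and the data carries the vector.  The names `MsiMessage` and
`decode_msi` are mine, not from Xen or Linux.

```python
from typing import NamedTuple

# x86 MSI writes target the 0xFEExxxxx address window.
MSI_ADDR_BASE = 0xFEE00000


class MsiMessage(NamedTuple):
    dest_id: int        # destination APIC ID, address bits 19:12
    vector: int         # interrupt vector, data bits 7:0
    delivery_mode: int  # delivery mode, data bits 10:8


def decode_msi(address: int, data: int) -> MsiMessage:
    """Decode a compatibility-format x86 MSI address/data pair."""
    if address & 0xFFF00000 != MSI_ADDR_BASE:
        raise ValueError("not an x86 MSI address")
    return MsiMessage(
        dest_id=(address >> 12) & 0xFF,
        vector=data & 0xFF,
        delivery_mode=(data >> 8) & 0x7,
    )


msg = decode_msi(0xFEE01000, 0x4041)
print(msg)
```

Both the destination and the vector are host-side resources, so the
values a guest picks cannot simply be written to the hardware as-is.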

With IMS, virtualizing the guest's interrupts is no longer possible.
That would require virtualizing the command queues, which are
device-specific and in any case (probably) too complex for the
hypervisor to handle.
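
The failure mode can be sketched with a toy model.  This is not how Xen
or QEMU are structured; `ToyDevice`, `ToyHypervisor`, the
`side_channel` field, and all of the address/data values are made up for
illustration.  The only assumption taken from the discussion is that the
guest sees virtualized MSI values while the device can also be handed an
interrupt message through a device-specific path that is not trapped.

```python
from dataclasses import dataclass

# Hypothetical values: what the guest chose vs. what the host really needs.
GUEST_MSI = (0xFEE00000, 0x0031)
HOST_MSI = (0xFEE02000, 0x0051)


@dataclass
class ToyDevice:
    msi: tuple = (0, 0)           # real MSI capability registers
    side_channel: tuple = (0, 0)  # device-specific interrupt storage


@dataclass
class ToyHypervisor:
    device: ToyDevice
    guest_view: tuple = (0, 0)

    def guest_write_msi(self, addr: int, data: int) -> None:
        # Trapped write: remember the guest's values, program host values.
        self.guest_view = (addr, data)
        self.device.msi = HOST_MSI

    def guest_read_msi(self) -> tuple:
        # Trapped read: the guest sees what it wrote, not the hardware.
        return self.guest_view


dev = ToyDevice()
hyp = ToyHypervisor(dev)

# The guest kernel programs MSI as usual; the hypervisor translates it.
hyp.guest_write_msi(*GUEST_MSI)

# The driver reads the values back and passes them to the device through
# a path the hypervisor does not trap (command queue, MMIO register, ...).
dev.side_channel = hyp.guest_read_msi()

print("MSI registers:", [hex(v) for v in dev.msi])
print("side channel: ", [hex(v) for v in dev.side_channel])
```

In this model the device ends up holding the guest's values in the side
channel, which mean nothing to the host, while the correctly translated
values sit unused in the MSI registers.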

The only way I know of to make IMS work under Xen is option (3) above:
give the guest access to the real MSI configuration space, and rely on
the IOMMU to translate the guest's interrupts to whatever it needs.
This is possible on AMD but not on Intel.  See [4] and the related
Matrix messages.

For Intel, the only solution I know of is to patch ath11k and friends
to get the real interrupt from Xen and/or QEMU so they can program the
hardware accordingly.  I *think*
ath11k and friends are the only IMS devices consumers are likely
to run into.  I suspect the others are likely enterprise devices
with VFIO/MDEV support.  Supporting all devices could be done via a
paravirtualized interface, perhaps as part of paravirtualized IOMMU
support.  On Intel, the IOMMU must play a role in MSI assignment
anyway, so a PV IOMMU could coordinate with a hypervisor to avoid
this kind of problem.

A lot of this information is taken from the thread in [3].

[2]: https://lore.kernel.org/lkml/20200821201705.GA2811871@xxxxxxxxxx/
[3]: https://lore.kernel.org/xen-devel/20250226211125.43625-1-jason.andryuk@xxxxxxx/t/#m590f8a0de6fecde893345a6836828dc84eaccd5d
[4]: https://matrix.to/#/!XcEgmbCouiNWHlGdHk:matrix.org/$laXuwPmDLINXAYnwoDsVCUvByPS6-5IjB_1OCAl9zgQ
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)



 

