Re: Dealing with SIOV/IMS
On 1/1/26 17:56, Marek Marczykowski-Górecki wrote:
> Hi,
>
> I've got yet another report[1] of device failing because (I assume) the
> drivers reads MSI/MSI-X values (thinking it sees values actually set in
> the HW) and then pass them to the device via some alternative means.
> IIUC this is what IMS does.
>
> I'm interested in two things:
> 1. Some plan for a long term solution - it was briefly discussed on
> XenDevel matrix room in September, Roger said:
>
>> urg, that's the spec that also defines IMS IIRC? I think the only way
>> to support anything like that is using vfio/mdev and re-using the
>> drivers from Linux. There's too much device-specific magic to
>> implement any of this in Xen, or do our own Xen-specific drivers.
>
> 2. A short term workaround for few specific devices. If you look at the
> linked threads, users resort to patching the domU driver and then
> copying MSI values from dom0's lspci output manually... I think we can
> do better than this short term, via some quirks in QEMU. Either let the
> domU see the real HW values, or translate IMS writes at QEMU level
> (assuming they can be identified). Disclaimer - I haven't looked yet at
> this specific driver, nor the SIOV/IMS spec, so I'm not sure if that's
> viable approach...
>
> [1]
> https://forum.qubes-os.org/t/solved-qualcomm-qcnfa765-ath11k-wcn6855-wifi-working-on-thinkpad-p14s-gen4-amd/38192

Disclaimer: I have not read the Intel or AMD IOMMU specs, do not have
access to the PCI spec, and know very little about ath11k. All of this
is based on various mailing list threads and Matrix messages. It might
be wrong. Please correct me if it is.

First, some background on IMS. All of this comes from [2] and its
thread.

IMS is a result of wanting to store interrupts in host memory. This
avoids needing to have them in expensive on-die SRAM or including DRAM
in the card. However, on-die SRAM is used to cache the interrupts. This
means that interrupts must be managed via command queues.
This causes problems for Linux and for any other OS that expects to be
able to change IRQs without a command/response operation. The only
workarounds I saw in that thread are:

1. Redesign the OS so it never needs to change interrupts from a
   context where device commands are impossible.
2. Modify the command queue code so it can run in interrupt context.
3. Rely on the IOMMU to remap interrupts.

ath10k and friends use an even worse hack, which is to pin everything
to a fixed CPU so that the problems mentioned above (which relate to
moving interrupts between CPUs) don't arise.

Now, the part that is relevant to Xen: IMS *also* causes problems for
hypervisors. Hypervisors present guests with a virtualized MSI range
rather than exposing the actual one. My understanding is that
virtualization serves two purposes:

4. It turns non-remappable MSIs into remappable ones, so that they are
   translated by the IOMMU instead of being rejected.
5. It fixes some information in the MSI (target CPU?) so that the
   interrupt correctly reaches the guest.

With IMS, virtualizing the guest's interrupts is no longer possible.
That would require virtualizing the command queues, which are
device-specific and in any case (probably) too complex for the
hypervisor to handle.

The only way I know of to make IMS work under Xen is option (3) above:
give the guest access to the real MSI configuration space, and rely on
the IOMMU to translate the guest's interrupts to whatever it needs.
This is possible on AMD but not on Intel. See [4] and the related
Matrix messages.

For Intel, the only solution I know of is to patch ath11k and friends
to get the real interrupt from Xen and/or QEMU so they can program the
hardware accordingly. This will require a driver patch.

I *think* ath11k and friends are the only IMS devices consumers are
likely to run into. I suspect the others are likely enterprise devices
with VFIO/MDEV support.
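To make the failure mode above concrete, here is a toy model in Python. It is only a sketch under loud assumptions: `ToyHypervisor`, `MsiMessage`, the three driver functions, and all numeric values are hypothetical and do not correspond to any real Xen, QEMU, or ath11k interface. It only illustrates why copying the guest-visible MSI value to the device over a side channel fails, while using the real hardware value works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MsiMessage:
    address: int
    data: int


class ToyHypervisor:
    """Hypothetical stand-in for the MSI virtualization layer."""

    def __init__(self) -> None:
        self.guest_view: Optional[MsiMessage] = None  # what the guest reads back
        self.hw_value: Optional[MsiMessage] = None    # what the hardware really holds

    def guest_write_msi(self, guest_msg: MsiMessage) -> None:
        # The guest's write is trapped: its value is remembered, and a
        # different host-chosen value is programmed into the hardware.
        self.guest_view = guest_msg
        # Arbitrary transformation, only there to make the two values differ.
        self.hw_value = MsiMessage(guest_msg.address, guest_msg.data ^ 0x8000)

    def guest_read_msi(self) -> MsiMessage:
        # The guest sees its own value again, not the hardware's.
        assert self.guest_view is not None
        return self.guest_view

    def delivers(self, msg: MsiMessage) -> bool:
        # Only the message the host actually programmed reaches the guest.
        return msg == self.hw_value


GUEST_MSG = MsiMessage(address=0xFEE00000, data=0x0041)  # arbitrary example


def conventional_driver(hv: ToyHypervisor) -> bool:
    hv.guest_write_msi(GUEST_MSG)
    # The device signals using its own MSI registers, i.e. the real value.
    return hv.delivers(hv.hw_value)


def ims_style_driver(hv: ToyHypervisor) -> bool:
    hv.guest_write_msi(GUEST_MSG)
    # The driver reads the value back and hands it to the device through
    # some device-specific channel; the device then signals with it.
    copied = hv.guest_read_msi()
    return hv.delivers(copied)


def patched_driver(hv: ToyHypervisor) -> bool:
    hv.guest_write_msi(GUEST_MSG)
    # Hypothetical workaround: the driver learns the real value from the
    # hypervisor and passes that to the device instead.
    real = hv.hw_value
    return hv.delivers(real)


if __name__ == "__main__":
    for driver in (conventional_driver, ims_style_driver, patched_driver):
        print(driver.__name__, driver(ToyHypervisor()))
```

In this model the IOMMU-based approach (option 3) would correspond to `delivers` also accepting the guest's value and translating it, which is the part that reportedly works on AMD but not on Intel.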
Supporting all devices could be done via a paravirtualized interface,
perhaps as part of paravirtualized IOMMU support. On Intel, the IOMMU
must play a role in MSI assignment anyway, so a PV IOMMU could
coordinate with a hypervisor to avoid this kind of problem.

A lot of this information is taken from the thread in [3].

[2]: https://lore.kernel.org/lkml/20200821201705.GA2811871@xxxxxxxxxx/
[3]: https://lore.kernel.org/xen-devel/20250226211125.43625-1-jason.andryuk@xxxxxxx/t/#m590f8a0de6fecde893345a6836828dc84eaccd5d
[4]: https://matrix.to/#/!XcEgmbCouiNWHlGdHk:matrix.org/$laXuwPmDLINXAYnwoDsVCUvByPS6-5IjB_1OCAl9zgQ
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

Attachment: OpenPGP_0xB288B55FFF9C22C1.asc
Attachment: OpenPGP_signature.asc