Re: PCI passthrough of XHCI on Framework AMD crashes the host
On 23/07/2025 14:35, Marek Marczykowski-Górecki wrote:
> Hi,
>
> There is yet another issue affecting Framework AMD... When I start a
> domU with XHCI controller attached (PCI passthrough), the whole host
> resets if there was a USB device plugged into it. I don't get any panic
> message (neither on XHCI console - which is connected to a different
> XHCI controller, nor on VGA), and the reboot reason register shows
> 0x08000800 ("an uncorrected error caused a data fabric sync flood
> event") according to [1].
>
> This is Framework AMD with AMD Ryzen 5 7640U.
>
> The crash itself happens quite early on domU startup - specifically when
> SeaBIOS tries to initialize XHCI. I tracked it down to the second
> readl() in xhci_controller_setup() [2]. Interestingly, it's specifically
> the second readl(), regardless of which of those comes first. I tried
> swapping their order, or even repeating the read from the same register -
> always the second call triggers the crash. The first one succeeds and
> returns some value (for example 0x1200020 for HCCPARAMS).
>
> If I start the domU when no USB devices are connected, it doesn't crash.
>
> If I manually unbind the device from the dom0 driver
> (echo 0000:c3:00.4 > /sys/bus/pci/drivers/xhci_hcd/unbind), it doesn't
> crash. Note I have seize=1 in domU config, so the
> `xl pci-assignable-add` call is implicit.
>
> If the system doesn't crash (either by not having any USB devices
> connected initially, or by the manual unbind), the USB controller in
> domU works fine. I can later connect devices and they appear inside
> domU.
>
> This system has a couple of XHCI controllers, and the same behavior is
> observed on at least two of them.
>
> The controller works just fine when used in dom0.
>
> If I pass through another PCI device instead (tried wifi card and audio
> card), it doesn't crash.
>
> The value read from HCCPARAMS (BAR + 0x10) differs between good and bad
> case:
> - 0x01200020 when it crashes
> - 0x0110ffc5 when it works
>
> It's weird to have this many differences here, given most bits in this
> register are about device capabilities[3], not its runtime state...
>
> In this system my main debugging tool is the XHCI console. But I tried
> also without enabling XHCI console, and it still crashes, so it looks
> like it isn't caused by the XHCI console.
>
> I tried also disabling XHCI initialization in SeaBIOS, and then it
> proceeds to booting domU's kernel. But as soon as Linux gets into
> initializing that USB controller, it crashes the same way. So, it isn't
> just SeaBIOS doing something weird (or at least not just that).
>
> With PVH dom0, the behavior is a bit different:
> 1. Initially, the controller works fine in dom0.
> 2. When starting domU, instead of clean unbind this happens:
>
> [ 11.248760] xhci_hcd 0000:c3:00.4: Controller not ready at resume -19
> [ 11.248765] xhci_hcd 0000:c3:00.4: PCI post-resume error -19!
> [ 11.248767] xhci_hcd 0000:c3:00.4: HC died; cleaning up
> [ 11.249010] xhci_hcd 0000:c3:00.4: remove, state 4
> [ 11.249013] usb usb8: USB disconnect, device number 1
> [ 11.249437] xhci_hcd 0000:c3:00.4: USB bus 8 deregistered
> [ 11.249832] xhci_hcd 0000:c3:00.4: remove, state 4
> [ 11.249835] usb usb7: USB disconnect, device number 1
> [ 11.250074] xhci_hcd 0000:c3:00.4: Host halt failed, -19
> [ 11.250076] xhci_hcd 0000:c3:00.4: Host not accessible, reset failed.
> [ 11.250389] xhci_hcd 0000:c3:00.4: USB bus 7 deregistered
> [ 11.251011] pciback 0000:c3:00.4: xen_pciback: seizing device
> [ 11.335120] pciback 0000:c3:00.4: xen_pciback: vpci: assign to virtual slot 0
> [ 11.335544] pciback 0000:c3:00.4: registering for 1
>
> 3. Reading from BAR in domU (in SeaBIOS, and later Linux) returns
> 0xffffffff.
> 4. Does not crash the host.
>
> Any ideas?
>
> I don't have any other system with Zen4 to try on. The hw11 gitlab
> runner is Ryzen 7 7735HS, and it doesn't have this issue. It's also
> possible this is something related to Framework's firmware, but given all
> the observations above, I find it less likely.
>
> [1] https://docs.kernel.org/arch/x86/amd-debugging.html#random-reboot-issues
> [2] https://github.com/coreboot/seabios/blob/master/src/hw/usb-xhci.c#L553
> [3] https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/extensible-host-controler-interface-usb-xhci.pdf
> (page 385)

I had a similar problem with a Beelink mini PC with the Ryzen 5800U
after a recent Qubes upgrade. If the USB controller is passed through to
sys-usb, the system simply resets without warning.

Ngoc Tu Dinh | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
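
To make the difference between the two HCCPARAMS values quoted above easier to see, they can be split into fields with a short script. This is only a sketch: it assumes the HCCPARAMS1 bit layout from the xHCI specification (capability flags in bits 0-11, MaxPSASize in bits 12-15, the extended capabilities pointer in bits 16-31), and the helper name decode_hccparams1 is made up for illustration, not taken from SeaBIOS or Linux.

```python
# Assumed HCCPARAMS1 layout (xHCI spec): flag bits 0-11, then two bit fields.
FLAG_BITS = {
    0: "AC64", 1: "BNC", 2: "CSZ", 3: "PPC", 4: "PIND", 5: "LHRC",
    6: "LTC", 7: "NSS", 8: "PAE", 9: "SPC", 10: "SEC", 11: "CFC",
}


def decode_hccparams1(value):
    """Split a raw 32-bit HCCPARAMS1 value into its assumed fields."""
    flags = [name for bit, name in sorted(FLAG_BITS.items())
             if value & (1 << bit)]
    return {
        "flags": flags,
        "max_psa_size": (value >> 12) & 0xF,     # bits 15:12
        "xecp_dwords": (value >> 16) & 0xFFFF,   # bits 31:16, in 32-bit words
    }


# The two values reported in the quoted message.
for label, raw in (("crashes", 0x01200020), ("works", 0x0110FFC5)):
    print(label, hex(raw), decode_hccparams1(raw))
```

Under that assumed layout, the two reads differ in almost every capability flag as well as in the extended capabilities pointer, which matches the observation that the register looks implausibly different between the two cases.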