> - The Intel IOMMU fault handler prints quite a lot of info in interrupt
> context, making it easier to livelock. Still I think the general
> problem applies on AMD too.
Someone at Intel looked into implementing rate-limited printing in the VT-d fault
handler, but he ran into some complications. As I remember, it had to do with
rate-limited printing not being enabled by default (?). For now, I think having it
print only in the debug case sounds simple enough. I will submit a patch for
> - Domain destruction re-assigns passed-through cards to dom0, but the
> cards don't seem to get reset. So there's nothing to stop a card
> battering away at DMA in the meantime. That seems like a problem
> independent of livelock, actually.
From reading the code in libxl, it seems libxl__device_pci_reset() is called
by both libxl__device_pci_add() and do_pci_remove(). Isn't do_pci_remove()
called when the pass through device is reassigned to dom0 during a domain
From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx]
Sent: Thursday, June 16, 2011 2:25 AM
To: Kay, Allen M; Wei Wang
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Jean Guyader
Subject: IOMMU faults
Hi, IOMMU maintainers,
What should Xen do when an IOMMU fault happens? As far as I can
see both the AMD and Intel code clear the error in the IOMMU and
carries on, but I suspect some more vigorous action is appropriate.
I've seen traces from an Intel machine that seemed to be livelocked on
IOMMU faults from a passed-through VGA card, until it was killed by the
watchdog. I think I can see two things that contribute to that:
- The Intel IOMMU fault handler prints quite a lot of info in interrupt
context, making it easier to livelock. Still I think the general
problem applies on AMD too.
- Domain destruction re-assigns passed-through cards to dom0, but the
cards don't seem to get reset. So there's nothing to stop a card
battering away at DMA in the meantime. That seems like a problem
independent of livelock, actually.
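On the first point, one way to bound the cost of printing in interrupt
context is to rate-limit the fault messages. The following is only a rough
sketch of the idea, not code from Xen: struct fault_ratelimit and
fault_print_allowed() are made-up names, and the caller is assumed to supply
a millisecond timestamp from whatever clock is available.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rate limiter state; not an existing Xen structure. */
struct fault_ratelimit {
    uint64_t window_start_ms; /* start of the current window */
    uint64_t window_len_ms;   /* length of one window */
    unsigned int burst;       /* messages allowed per window */
    unsigned int printed;     /* messages printed in this window */
    unsigned int suppressed;  /* messages dropped in this window */
};

/* Returns true if the caller may print a fault message at time now_ms. */
static bool fault_print_allowed(struct fault_ratelimit *rl, uint64_t now_ms)
{
    /* Open a fresh window once the current one has expired. */
    if (now_ms - rl->window_start_ms >= rl->window_len_ms) {
        rl->window_start_ms = now_ms;
        rl->printed = 0;
        rl->suppressed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    /* Over budget: count the drop so it can be reported later. */
    rl->suppressed++;
    return false;
}
```

The fault handler would call fault_print_allowed() before each message and
skip the print when it returns false. The suppressed count could be printed
once per window so that dropped faults are not silently lost.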
In any case, it seems like it would be a good idea to stop a
broken/malicious/deassigned card from flooding Xen with IOMMU faults.
I was considering just writing 0 to the faulting card's PCI command
register, but I'm told that's not always enough to properly deactivate
a card, and it might be a little over-zealous to do it on the first
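
To make the command-register idea concrete, here is a minimal sketch that
clears the register only after a device has faulted repeatedly. Everything
in it is an assumption for illustration: struct faulting_dev,
note_iommu_fault(), cfg_write16() and the FAULT_THRESHOLD value are invented,
and config space is simulated with an array instead of real PCI accesses.
Only the command register offset (0x04) comes from the PCI specification.
As noted above, writing 0 may not be enough to properly deactivate every card.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND     0x04  /* offset of the PCI command register */
#define FAULT_THRESHOLD 8     /* assumed value, chosen for illustration */

/* Hypothetical per-device state; cfg stands in for real config space. */
struct faulting_dev {
    uint16_t cfg[32];     /* simulated first 64 bytes, as 16-bit words */
    unsigned int faults;  /* IOMMU faults attributed to this device */
    bool disabled;        /* set once the command register is cleared */
};

/* Simulated 16-bit config write; a real version would go to the hardware. */
static void cfg_write16(struct faulting_dev *d, unsigned int off, uint16_t val)
{
    d->cfg[off / 2] = val;
}

/*
 * Called once per IOMMU fault attributed to d.
 * Returns true if this call disabled the device.
 */
static bool note_iommu_fault(struct faulting_dev *d)
{
    if (d->disabled)
        return false;

    if (++d->faults < FAULT_THRESHOLD)
        return false;

    /* Clearing the whole register also clears the bus master enable bit. */
    cfg_write16(d, PCI_COMMAND, 0);
    d->disabled = true;
    return true;
}
```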
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)