
[Xen-devel] RE: IOMMU faults



> - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>   context, making it easier to livelock.  Still I think the general
>   problem applies on AMD too.

Someone at Intel looked into implementing rate-limited printing in the VT-d 
fault handler but ran into some complications.  As I remember, it had to do 
with rate-limited printing not being enabled by default (?).  For now, I think 
printing only in the debug case sounds simple enough.  I will submit a patch 
for it.
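
For illustration only, the measured-rate (rate-limited) printing mentioned 
above could look roughly like the sketch below.  It is a fixed-window limiter 
with a caller-supplied millisecond clock; the struct, the function name and 
the window/burst fields are all hypothetical and are not Xen's actual 
interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window rate limiter; names are illustrative, not Xen's. */
struct fault_ratelimit {
    uint64_t window_start_ms;  /* start of the current window */
    uint64_t window_ms;        /* window length in milliseconds */
    unsigned int burst;        /* messages allowed per window */
    unsigned int printed;      /* messages printed in this window */
    unsigned int suppressed;   /* total messages dropped so far */
};

/* Return true if the fault handler may print at time now_ms. */
static bool fault_print_allowed(struct fault_ratelimit *rl, uint64_t now_ms)
{
    /* Open a fresh window once the current one has expired. */
    if (now_ms - rl->window_start_ms >= rl->window_ms) {
        rl->window_start_ms = now_ms;
        rl->printed = 0;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }

    /* Over budget: count the drop so it can be reported later. */
    rl->suppressed++;
    return false;
}
```

The handler would call fault_print_allowed() before each message and skip 
the printing when it returns false, so a fault storm costs a counter 
increment rather than a console write.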

> - Domain destruction re-assigns passed through cards to dom0, but the
>   cards don't seem to get reset.  So there's nothing to stop a card
>   battering away at DMA in the meantime.  That seems like a problem
>   independent of livelock, actually.

From reading the code in libxl, it seems libxl__device_pci_reset() is called 
by both libxl__device_pci_add() and do_pci_remove().  Isn't do_pci_remove() 
called when the pass through device is reassigned to dom0 during a domain 
teardown?

Allen

-----Original Message-----
From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx] 
Sent: Thursday, June 16, 2011 2:25 AM
To: Kay, Allen M; Wei Wang
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Jean Guyader
Subject: IOMMU faults

Hi, IOMMU maintainers,

What should Xen do when an IOMMU fault happens?  As far as I can
see both the AMD and Intel code clear the error in the IOMMU and
carries on, but I suspect some more vigorous action is appropriate.
I've seen traces from an Intel machine that seemed to be livelocked on
IOMMU faults from a passed-through VGA card, until it was killed by the
watchdog.  I think I can see two things that contribute to that:

 - The Intel IOMMU fault handler prints quite a lot of info in interrupt
   context, making it easier to livelock.  Still I think the general
   problem applies on AMD too.
 - Domain destruction re-assigns passed through cards to dom0, but the
   cards don't seem to get reset.  So there's nothing to stop a card
   battering away at DMA in the meantime.  That seems like a problem
   independent of livelock, actually.

In any case, it seems like it would be a good idea to stop a
broken/malicious/deassigned card from flooding Xen with IOMMU faults.

I was considering just writing 0 to the faulting card's PCI command
register, but I'm told that's not always enough to properly deactivate
a card, and it might be a little over-zealous to do it on the first
offence. 
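
A rough sketch of the "not on the first offence" idea, assuming a per-device 
fault counter and a threshold: once the count reaches the threshold, 0 is 
written to the PCI command register (offset 0x04 in config space), which 
clears the I/O, memory and bus-master enable bits.  Everything named here 
(struct faulting_dev, note_iommu_fault(), the write16 callback and the 
recording stub) is hypothetical and not taken from Xen; as noted above, 
clearing the command register may not be enough to quiesce every card.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND 0x04  /* standard config-space offset of the command register */

/* Hypothetical per-device fault bookkeeping. */
struct faulting_dev {
    uint16_t bdf;          /* bus/device/function of the card */
    unsigned int faults;   /* IOMMU faults seen so far */
    bool disabled;         /* command register already cleared */
};

/* Hypothetical 16-bit config-space write hook. */
typedef void (*cfg_write16_fn)(uint16_t bdf, uint8_t reg, uint16_t val);

/* Record one fault; return true if this call disabled the device. */
static bool note_iommu_fault(struct faulting_dev *dev, unsigned int threshold,
                             cfg_write16_fn write16)
{
    if (dev->disabled)
        return false;
    if (++dev->faults < threshold)
        return false;

    /* Threshold reached: stop the card decoding and mastering. */
    write16(dev->bdf, PCI_COMMAND, 0);
    dev->disabled = true;
    return true;
}

/* Stand-in for real config-space access: just records the last write. */
static unsigned int writes_seen;
static uint16_t last_bdf, last_val;
static uint8_t last_reg;

static void record_write16(uint16_t bdf, uint8_t reg, uint16_t val)
{
    writes_seen++;
    last_bdf = bdf;
    last_reg = reg;
    last_val = val;
}
```

A real policy would probably also age the counter over time so that rare, 
isolated faults never add up to a disable.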

Ideas?

Tim.

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

