This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] RE: Crashdump and IOMMU problems

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: Crashdump and IOMMU problems
From: "Kay, Allen M" <allen.m.kay@xxxxxxxxx>
Date: Wed, 11 May 2011 15:11:33 -0700
Accept-language: en-US
Cc: Wei Wang <wei.wang2@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>
Delivery-date: Wed, 11 May 2011 15:12:45 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DCA8532.3070508@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DCA8532.3070508@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcwP2Yb5YEx63iGpS/uA79hJxS/bugATRM2w
Thread-topic: Crashdump and IOMMU problems
I believe Jan was involved with adding kexec support in iommu code.

As for a systematic way to disable active DMA, isn't this similar to the OS
shutdown case, when all the drivers are unloaded?  Does kexec unload all of the
device drivers?  Once all the drivers are unloaded, there shouldn't be any DMA
transactions going on.

I don't know much about the kexec flow, but it sounds like the high-level flow
is: 1) the dom0 kernel shuts down all of the device drivers and then 2) calls
iommu_ops->suspend() or crash_shutdown() to disable all of the IOMMU hardware.
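
To make the ordering above concrete, here is a rough sketch in C. It is a toy
model, not Xen or Linux code: struct model_dev, struct model_iommu and all of
the function names are hypothetical. It also assumes that "quiescing" a device
means clearing the Bus Master Enable bit in its PCI command register, which is
only one possible way of stopping a device from issuing DMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PCI command register, bit 2: Bus Master Enable. */
#define PCI_COMMAND_MASTER 0x0004u

/* Hypothetical model of a device: only its PCI command register. */
struct model_dev {
    uint16_t pci_command;
};

/* Hypothetical model of the IOMMU: a single "remapping enabled" flag. */
struct model_iommu {
    bool enabled;
};

/* Step 1: stop devices from issuing DMA by clearing bus mastering. */
static void quiesce_devices(struct model_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].pci_command &= (uint16_t)~PCI_COMMAND_MASTER;
}

/* True if at least one device is still allowed to master the bus. */
static bool any_dma_capable(const struct model_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].pci_command & PCI_COMMAND_MASTER)
            return true;
    return false;
}

/*
 * Step 2: disable the IOMMU, but only once no device can still issue DMA.
 * Returns false (and leaves the IOMMU enabled) if step 1 was skipped.
 */
static bool disable_iommu(struct model_iommu *iommu,
                          const struct model_dev *devs, size_t n)
{
    assert(iommu != NULL);
    if (any_dma_capable(devs, n))
        return false;
    iommu->enabled = false;
    return true;
}
```

In this model, calling disable_iommu() before quiesce_devices() is refused,
which is the ordering constraint the two-step flow is meant to express.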


-----Original Message-----
From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx] 
Sent: Wednesday, May 11, 2011 5:47 AM
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Cc: Kay, Allen M; Wei Wang
Subject: Crashdump and IOMMU problems


I have been debugging kexec interaction problems with XenServer and 
found that the problem lies in how Xen tears down the computer in a crash.

The Xen kexec path does not touch the IOMMU at all, which leaves the kexec
native kernel with interrupt remapping enabled without realizing it.  
This leads to the kexec kernel failing to understand why its interrupts 
aren't working.

As a debugging measure, I have put iommu_ops->suspend() and 
iommu_disable_IR() on the kexec path and this 'fixes' the problem, 
although it is far from safe.

From a correctness point of view, Xen really does need to shut down all
IOMMU remapping before it jumps to the crash kernel.  I know that kdump 
is a "seat of the pants best effort" in the best case, but there is more 
which Xen needs to do to help it along.  I was considering adding a 
crash_shutdown function to iommu_ops which goes and twiddles the 
relevant disable bits, without saving state.
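
A minimal sketch of how such a hook might sit next to suspend(), assuming a
much simplified ops table. Everything here is hypothetical: the struct layout,
the model_* functions and the saved_states counter are illustrative only and
are not taken from Xen's actual struct iommu_ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one IOMMU unit: just its two enable bits. */
struct iommu_unit {
    bool dma_remapping;
    bool intr_remapping;
};

/* Hypothetical ops table containing only the two hooks discussed here. */
struct iommu_ops {
    void (*suspend)(struct iommu_unit *units, size_t n);
    void (*crash_shutdown)(struct iommu_unit *units, size_t n);
};

/* Counts units whose state was saved; stands in for real save logic. */
static int saved_states;

/* Normal suspend: save state for a later resume, then disable. */
static void model_suspend(struct iommu_unit *units, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        saved_states++;
        units[i].dma_remapping = false;
        units[i].intr_remapping = false;
    }
}

/* Crash path: only clear the enable bits, saving nothing. */
static void model_crash_shutdown(struct iommu_unit *units, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        units[i].dma_remapping = false;
        units[i].intr_remapping = false;
    }
}

static const struct iommu_ops model_ops = {
    .suspend = model_suspend,
    .crash_shutdown = model_crash_shutdown,
};

/* Called just before jumping to the crash kernel. */
static void crash_path(const struct iommu_ops *ops,
                       struct iommu_unit *units, size_t n)
{
    assert(ops != NULL);
    if (ops->crash_shutdown)        /* the hook is optional in this sketch */
        ops->crash_shutdown(units, n);
}
```

The point of keeping the hook separate from suspend() is that the crash path
does as little work as possible: no state is saved, since nothing will resume.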

However, disabling DMA remapping while transfers are still ongoing is 
likely asking for trouble.  Seeing as people on here are likely to know 
far more than me on this subject:

1) Is there a systematic way to find and disable active DMA transfers, 
or indeed a systematic way to shut down PCI (etc) devices which is safe 
for the kexec path?
2) Are there any other PC subsystems which could do with being shut down 
in a sensible manner to make life easier for the kdump kernel?

Thanks in advance,


Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
