On Wed, 2011-07-06 at 19:42 +0100, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 06, 2011 at 01:39:12PM +0100, Andrew Cooper wrote:
> > In the case of a crash, IOMMU DMA remapping gets turned off so that
> > the kdump kernel may boot. However, this is warned as being dangerous
> > in the VTD specification if a DMA transaction is in progress.
> >
> > Also, in the case of a crash, DMA transactions and interrupts from
> > peripheral devices such as network cards are likely to keep coming in.
> > Without DMA remapping enabled, the transactions will be writing over
> > low memory, corrupting the crash state, and perhaps even the kdump
> > reserved memory.
> >
> > Therefore, on the crash path, we can disconnect all PCI devices from
> > their respective buses so that they are no longer able to be DMA
> > busmasters. This reduces the risk of DMA transactions corrupting
> > state (and will also reduce spurious interrupts arriving to the kdump
> > kernel) until the kdump kernel and properly reset the PCI devices.
> >
> > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >
> > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/arch/x86/crash.c
> > --- a/xen/arch/x86/crash.c Mon Jun 27 17:37:12 2011 +0100
> > +++ b/xen/arch/x86/crash.c Wed Jul 06 13:37:44 2011 +0100
> > @@ -28,6 +28,7 @@
> > #include <asm/apic.h>
> > #include <asm/io_apic.h>
> > #include <xen/iommu.h>
> > +#include <xen/pci.h>
> >
> > static atomic_t waiting_for_crash_ipi;
> > static unsigned int crashing_cpu;
> > @@ -78,6 +79,8 @@ static void nmi_shootdown_cpus(void)
> > msecs--;
> > }
> >
> > + disconnect_pci_devices();
> > +
> > /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
> > * happy when booting if interrupt/dma remapping is still enabled */
> > iommu_crash_shutdown();
> > diff -r 2f63562df1c4 -r 7ea606c5ce8c xen/drivers/passthrough/pci.c
> > --- a/xen/drivers/passthrough/pci.c Mon Jun 27 17:37:12 2011 +0100
> > +++ b/xen/drivers/passthrough/pci.c Wed Jul 06 13:37:44 2011 +0100
> > @@ -462,6 +462,32 @@ int __init scan_pci_devices(void)
> > return 0;
> > }
> >
> > +/* Disconnect a PCI device from the PCI bus. From the PCI spec:
> > + * "When a 0 is written to [the COMMAND] register, the device is
> > + * logically disconnected from the PCI bus for all accesses except
> > + * configuration accesses. All devices are required to support
> > + * this base level of functionality."
> > + */
> > +void disconnect_pci_device(struct pci_dev *pdev)
> > +{
> > + pci_conf_write16(pdev->bus, PCI_SLOT(pdev->devfn),
> > + PCI_FUNC(pdev->devfn), PCI_COMMAND, 0);
>
> So if you have a PCI serial card (or Intel AMT) and you are using that for
> serial output on the hypervisor line, this will turn it off. There should
> be some whitelist capability to not do it for PCI serial devices that are
> owned (used) by the hypervisor.

That would be useful for debugging the kexec process itself, but in the
general case there won't be any further output from the hypervisor, and
if the kexec'd kernel wants to use the device it is going to have to set
it up again anyway.

On the other hand, if the hypervisor is driving a device, it presumably
knows (or could know) how to turn off interrupts and switch to polled
mode, so we might as well do that?
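
To make the alternative concrete, here is a rough, untested-in-Xen sketch
of what "disconnect everything except devices the hypervisor is driving"
could look like. It is a standalone simulation, not Xen code: struct
mock_dev, the owned_by_hv flag and both function names are hypothetical,
and the COMMAND register is just a field in memory. The bit values are the
standard PCI COMMAND register bits (the INTx disable bit exists only on
PCI 2.3 and later devices).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard bits in the PCI COMMAND register. */
#define CMD_IO           0x0001u  /* respond to I/O space accesses */
#define CMD_MEMORY       0x0002u  /* respond to memory space accesses */
#define CMD_MASTER       0x0004u  /* device may act as bus master (DMA) */
#define CMD_INTX_DISABLE 0x0400u  /* INTx assertion disabled (PCI 2.3+) */

/* Hypothetical stand-in for a device; this is NOT Xen's struct pci_dev. */
struct mock_dev {
    uint16_t command;      /* simulated COMMAND register contents */
    bool     owned_by_hv;  /* hypothetical: e.g. the hypervisor's console */
};

/*
 * Compute the COMMAND value to write on the crash path.
 * Ordinary devices get 0 (logically disconnected, as in the patch).
 * Hypervisor-owned devices keep their I/O and memory decode so polled
 * access still works, but lose bus mastering and INTx delivery.
 */
static uint16_t crash_command_value(const struct mock_dev *dev)
{
    assert(dev != NULL);
    if (!dev->owned_by_hv)
        return 0;
    return (uint16_t)((dev->command & ~CMD_MASTER) | CMD_INTX_DISABLE);
}

/* Apply the crash-path policy to every device in the array. */
static void quiesce_devices(struct mock_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].command = crash_command_value(&devs[i]);
}
```

In the real patch the write would go through pci_conf_write16() as above,
and the ownership test would have to come from whatever the console driver
already knows about its device. MSI delivery is not covered by this sketch.
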
Ian.