[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping




On 14/06/11 10:02, Jan Beulich wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> Experimental evidence shows that Extended Interrupt Mode remains in
>> effect even after Interrupt Remapping is disabled in each DMAR Global
>> Command Register.  A consiquence of this is that when we switch from
>> x2apic mode back to xapic mode, and disable interrupt remapping for
>> the kdump kernel, interrupts passing through the IO APICs are in
>> x2apic format as opposed xapic.  This causes a triple fault in the
>> kexec kernel.
>>
>> As EIM is explicitly set up each time Interrup Remapping is enabled,
>> it is safe for us to clobber this when taring down.
>>
>> Also, change the header definition of IRTA_REG_EIME_SHIFT.  It caused
>> verbose and error-prone code, and was only used in 1 place before.  We
>> now have IRTA_EIME which is the specific bit in the register.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c
>> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100
>> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu,
>>  
>>  #ifdef CONFIG_X86
>>      /* set extended interrupt mode bit */
>> -    ir_ctrl->iremap_maddr |=
>> -            eim ? (1 << IRTA_REG_EIME_SHIFT) : 0;
>> +    ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0;
>>  #endif
>>      spin_lock_irqsave(&iommu->register_lock, flags);
>>  
>> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm
>>      if ( !ecap_intr_remap(iommu->ecap) )
>>          return;
>>  
>> +    /* If we are disabling Interrupt Remapping, make sure we dont stay in
>> +     * Extended Interrupt Mode, as this is unaffected by the Interrupt 
>> +     * Remapping flag in each DMAR Global Control Register.
>> +     * Specifically, local apics in xapic mode do not like interrupts 
>> delivered
>> +     * in x2apic mode.  Any code turning interrupt remapping back on will 
>> set
>> +     * EIME back correctly.
>> +     */
>> +    if ( iommu_supports_eim() )
>> +    {
>> +        u64 irta;
>> +        irta = dmar_readl(iommu->reg, DMAR_IRTA_REG);
>> +        dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME);
>> +        IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl,
>> +                      !(irta & IRTA_EIME), irta);
> Hmm, this of course can be problematic in the context of a crash:
> I realize that you want it cleared, but does system state guarantee
> that this WAIT_OP will always be able to complete? You'd get a
> hung system instead if it won't. Admittedly it's not clear which one
> is better, but a best effort approach would probably skip the wait
> loop. Let's see what our friends at Intel think...
>
> Jan
>From what I can see, IOMMU_WAIT_OP just polls memory mapped registers
until they change.  As we are doing multiple dmar_read/writes and the
spec warns that we must wait.

I cant see anything inherently dangerous doing this in the crash
context, because if the wait times out, then the chances are that the
hardware is broken, at which point you have bigger problems that a kexec
not happening.

~Andrew
>> +    }
>> +
>>      spin_lock_irqsave(&iommu->register_lock, flags);
>>      sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
>>      if ( !(sts & DMA_GSTS_IRES) )
>> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h
>> --- a/xen/drivers/passthrough/vtd/iommu.h    Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/drivers/passthrough/vtd/iommu.h    Mon Jun 13 17:45:43 2011 +0100
>> @@ -471,7 +471,7 @@ struct qinval_entry {
>>  
>>  #define IEC_GLOBAL_INVL         0
>>  #define IEC_INDEX_INVL          1
>> -#define IRTA_REG_EIME_SHIFT     11
>> +#define IRTA_EIME               (1 << 11)
>>  
>>  /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */
>>  #define IRTA_REG_TABLE_SIZE     ( IREMAP_PAGE_ORDER + 7 )
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx 
>> http://lists.xensource.com/xen-devel 
>
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.