This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue

To: "Wei Wang2" <wei.wang2@xxxxxxx>
Subject: [Xen-devel] Re: [PATCH] AMD IOMMU: Fix an interrupt remapping issue
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Fri, 08 Apr 2011 15:39:13 +0100
Cc: Boris Ostrovsky <Boris.Ostrovsky@xxxxxxx>, Wei Huang2 <Wei.Huang2@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 08 Apr 2011 07:39:54 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201104081626.44096.wei.wang2@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <201104081335.36718.wei.wang2@xxxxxxx> <4D9F2D3D020000780003A9AD@xxxxxxxxxxxxxxxxxx> <201104081626.44096.wei.wang2@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@xxxxxxx> wrote:

> On Friday 08 April 2011 15:43:57 Jan Beulich wrote:
>> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@xxxxxxx> wrote:
>> >
>> > Some device could generate bogus interrupts if an IO-APIC RTE and an
>> > iommu interrupt remapping entry are not consistent during 2 adjacent
>> > 64bits IO-APIC RTE updates. For example, if the 2nd operation updates
>> > destination bits in RTE for SATA device and unmask it, in some case, SATA
>> > device will assert ioapic pin to generate interrupt immediately using new
>> > destination but iommu could still translate it into the old destination,
>> > then dom0 would be confused. To fix that, we sync up interrupt remapping
>> > entry with IO-APIC IRE on every 32 bits operation and foward IOAPIC RTE
>> > updates after interrupt remapping table has been changed.
>> I don't think this is correct: Without the patch, the filling of ioapic_rte
>> takes into account the value already written. Now that you only write
>> the value at the end of the function, you should overwrite the
>> affected half with "value" immediately before calling
>> update_intremap_entry_from_ioapic(). 
> Sorry, not quite understand your point. My thought is, no matter dom0 tried 
> to 
> updates lower half or upper half of RTE, we always updates interrupt table 
> from the lower half. This will keep iommu table strictly identically to RTE. 
> The old code has an assumption that both lower half and upper of RTE should 
> be updated together. But this might not be always true. If by incident, dom0 
> only updates the upper half and we don't sync iommu with it, then the 
> destination in RTE and iommu table will be different.

No, that's not my point. The problem I'm seeing is that you pass the
old value (as read from the IO-APIC) to
update_intremap_entry_from_ioapic(), but the function certainly
should use the to-be-written one. Previously this was implicit because
the IO-APIC register write happened first.

>> Eliminating the double write if reg == rte_lo would also seem desirable
>> (and in no case should you write back the old value after having called
>> update_intremap_entry_from_ioapic()).
> It not a write back, It just finishes IO-APIC RTE writes. After updating 
> interrupt remapping table we still have to update RTE. It is just a copy of 
> __io_apic_write (maybe I should just call it). Old code updates ioapic 
> earlier than interrupt remapping table and sata device might generate 
> interrupt right after this, which is not expected.

No. If reg == ret_lo, you write that IO-APIC register twice, which is
pointless. With the other problem unaddressed, you actually first write
back the old value (with the mask bit restored), which gets IO-APIC
and remapping tables out of sync for a brief period of time (which is
a problem by itself), then write the new value. With the other problem
addressed, you would simply write the new value twice, which is
wasteful given that these writes are uncached.


Xen-devel mailing list