Jan, How dose this one look like to you?
Signed-off-by Wei Wang <wei.wang2@xxxxxxx>
Advanced Micro Devices GmbH
Sitz: Dornach, Gemeinde Aschheim,
Landkreis München Registergericht München,
HRB Nr. 43632
WEEE-Reg-Nr: DE 12919551
Alberto Bozzo, Andrew Bowd
On Friday 08 April 2011 17:06:16 Wei Wang2 wrote:
> On Friday 08 April 2011 16:39:13 Jan Beulich wrote:
> > >>> On 08.04.11 at 16:26, Wei Wang2 <wei.wang2@xxxxxxx> wrote:
> > >
> > > On Friday 08 April 2011 15:43:57 Jan Beulich wrote:
> > >> >>> On 08.04.11 at 13:35, Wei Wang2 <wei.wang2@xxxxxxx> wrote:
> > >> >
> > >> > Some device could generate bogus interrupts if an IO-APIC RTE and an
> > >> > iommu interrupt remapping entry are not consistent during 2 adjacent
> > >> > 64bits IO-APIC RTE updates. For example, if the 2nd operation
> > >> > updates destination bits in RTE for SATA device and unmask it, in
> > >> > some case, SATA device will assert ioapic pin to generate interrupt
> > >> > immediately using new destination but iommu could still translate it
> > >> > into the old destination, then dom0 would be confused. To fix that,
> > >> > we sync up interrupt remapping entry with IO-APIC IRE on every 32
> > >> > bits operation and foward IOAPIC RTE updates after interrupt
> > >> > remapping table has been changed.
> > >>
> > >> I don't think this is correct: Without the patch, the filling of
> > >> ioapic_rte takes into account the value already written. Now that you
> > >> only write the value at the end of the function, you should overwrite
> > >> the
> > >> affected half with "value" immediately before calling
> > >> update_intremap_entry_from_ioapic().
> > >
> > > Sorry, not quite understand your point. My thought is, no matter dom0
> > > tried to
> > > updates lower half or upper half of RTE, we always updates interrupt
> > > table from the lower half. This will keep iommu table strictly
> > > identically to RTE. The old code has an assumption that both lower half
> > > and upper of RTE should be updated together. But this might not be
> > > always true. If by incident, dom0 only updates the upper half and we
> > > don't sync iommu with it, then the destination in RTE and iommu table
> > > will be different.
> > No, that's not my point. The problem I'm seeing is that you pass the
> > old value (as read from the IO-APIC) to
> > update_intremap_entry_from_ioapic(), but the function certainly
> > should use the to-be-written one. Previously this was implicit because
> > the IO-APIC register write happened first.
> OK, got it. That is definitely problematic. will fix it.
> > >> Eliminating the double write if reg == rte_lo would also seem
> > >> desirable (and in no case should you write back the old value after
> > >> having called update_intremap_entry_from_ioapic()).
> > >
> > > It not a write back, It just finishes IO-APIC RTE writes. After
> > > updating interrupt remapping table we still have to update RTE. It is
> > > just a copy of __io_apic_write (maybe I should just call it). Old code
> > > updates ioapic earlier than interrupt remapping table and sata device
> > > might generate interrupt right after this, which is not expected.
> > No. If reg == ret_lo, you write that IO-APIC register twice, which is
> > pointless. With the other problem unaddressed, you actually first write
> > back the old value (with the mask bit restored), which gets IO-APIC
> > and remapping tables out of sync for a brief period of time (which is
> > a problem by itself), then write the new value. With the other problem
> > addressed, you would simply write the new value twice, which is
> > wasteful given that these writes are uncached.
> True. I will rework the patch try to eliminate this.
> > Jan
> Xen-devel mailing list
Xen-devel mailing list