>>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> Note that this patch is specifically for upstream Xen, which doesn't
> have any pirq support in it at present.
I understand that, but saw that you had parallel changes to the
pirq handling in your Dom0 tree.
> However, I did consider using fasteoi, but I couldn't see how to make
> it work. The problem is that it only does a single call into the
> irq_chip for EOI after calling the interrupt handler, but there is no
> call beforehand to ack the interrupt (which means clear the event flag
> in our case). This leads to a race where an event can be lost after the
> interrupt handler has returned, but before the event flag has been
> cleared (because Xen won't set pending or call the upcall function if
> the event is already set). I guess I could pre-clear the event in the
> upcall function, but I'm not sure that's any better.
That's precisely what we're doing.
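For readers following along, the lost-event window described in the quoted paragraph can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not code from either tree: `struct evtchn_model`, `model_send_event()`, `model_handler()` and `model_run()` are hypothetical names, and the "hypervisor" is reduced to the one rule quoted above (no new upcall while the event flag is still set).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a single event channel; not a real Xen structure. */
struct evtchn_model {
    bool pending;     /* event flag shared with the hypervisor */
    int  work_queued; /* work items produced by the event source */
    int  work_done;   /* work items consumed by the handler */
};

/* Hypervisor side: queue work, but only raise an upcall if the flag is clear. */
static bool model_send_event(struct evtchn_model *ch)
{
    ch->work_queued++;
    if (ch->pending)
        return false;          /* flag already set: no new upcall */
    ch->pending = true;
    return true;               /* upcall delivered */
}

/* Guest interrupt handler: consumes everything queued so far. */
static void model_handler(struct evtchn_model *ch)
{
    ch->work_done = ch->work_queued;
    assert(ch->work_done <= ch->work_queued);
}

/*
 * Run one fixed interleaving: a second event arrives after the handler
 * has returned. Returns the number of work items left unprocessed.
 */
int model_run(bool clear_before_handler)
{
    struct evtchn_model ch = { false, 0, 0 };
    bool upcall;

    model_send_event(&ch);             /* first event, first upcall */
    if (clear_before_handler)
        ch.pending = false;            /* pre-clear in the upcall path */
    model_handler(&ch);

    upcall = model_send_event(&ch);    /* second event lands in the window */

    if (!clear_before_handler)
        ch.pending = false;            /* late clear, as an EOI-only flow would do */

    if (upcall) {                      /* only reached if the flag was clear */
        ch.pending = false;
        model_handler(&ch);
    }
    return ch.work_queued - ch.work_done;
}
```

In this model `model_run(false)` leaves one work item unprocessed because the second event found the flag still set, while `model_run(true)` leaves none. It only covers this single interleaving and says nothing about the cost of the extra clear.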
> In the dom0 kernels, I followed the example of the IOAPIC irq_chip
> implementation, which does the hardware EOI in the ack call at the start
> of handle_edge_irq, and did the EOI hypercall (when necessary) there. I
> welcome a reviewer's eye on this though.
This I didn't actually notice so far.
That doesn't look right, at least in combination with ->mask() being
wired to disable_pirq(), which is empty (and btw., if the latter was
right, you should also wire ->mask_ack() to disable_pirq() to avoid
a pointless indirect call).
But even with ->mask() actually masking the IRQ I'm not certain
this is right. At the very least it'll make Xen see a potential
second instance of the same IRQ much earlier than you will
really be able to handle it. This is particularly bad for level
triggered ones, as Xen will see them again right after you
pass it the EOI notification - as a result there'll be twice as
many interrupts seen by Xen on the respective lines.
The native I/O APIC can validly do this as ->ack() only gets
called for edge triggered interrupts (which is why ->eoi() is
wired to ack_apic_level()).
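The doubling for level triggered lines can be sketched with a toy model as well. Again this is an illustration under assumptions rather than a description of how Xen is implemented: `struct line_model` and the `model_*()` functions are hypothetical, the line is assumed to stay asserted until the guest handler has serviced the device, and the hypervisor is assumed to re-sample the line as soon as it receives the EOI.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one level-triggered line as the hypervisor sees it. */
struct line_model {
    bool asserted;    /* device still holds the line active */
    bool in_service;  /* delivered to the guest, EOI not yet received */
    int  seen;        /* interrupts the hypervisor has taken on this line */
};

/* Hypervisor samples the line; an asserted, not-in-service line fires. */
static void model_sample(struct line_model *l)
{
    if (l->asserted && !l->in_service) {
        l->in_service = true;
        l->seen++;
    }
}

/* Guest sends EOI; the hypervisor immediately re-samples the line. */
static void model_eoi(struct line_model *l)
{
    l->in_service = false;
    model_sample(l);
}

/* Guest handler services the device, which then drops the line. */
static void model_service_device(struct line_model *l)
{
    l->asserted = false;
}

/* Count interrupts the hypervisor sees for a number of device interrupts. */
int model_irqs_seen(int device_irqs, bool eoi_before_handler)
{
    struct line_model l = { false, false, 0 };

    assert(device_irqs >= 0);
    for (int i = 0; i < device_irqs; i++) {
        l.asserted = true;              /* device raises the line */
        model_sample(&l);
        while (l.in_service) {          /* guest handles each delivered instance */
            if (eoi_before_handler) {
                model_eoi(&l);          /* EOI in ->ack(): line is still asserted */
                model_service_device(&l);
            } else {
                model_service_device(&l);
                model_eoi(&l);          /* EOI after the handler: line is quiet */
            }
        }
    }
    return l.seen;
}
```

With three device interrupts, this model reports six interrupts when the EOI precedes the handler and three when it follows it. The second instance in the early-EOI case finds nothing left to service by the time the guest handles it.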
> I was thinking specifically of the timer, debug and console virqs. The
> last is the only one which could conceivably be non-percpu, but in
> practice I think it would be a bad idea to put it on anything other than
> cpu0. What other global virqs are there? Nothing that's high-frequency
> enough to be worth migrating?
Once supported in your tree, oprofile could have high interrupt
rates, and the trace buffer ones might too. Admittedly both are
debugging aids, but I don't think it'd be nice for them to influence
performance more than necessary.