On 08/25/2010 11:46 PM, Jan Beulich wrote:
> >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
>> Note that this patch is specifically for upstream Xen, which doesn't
>> have any pirq support in it at present.
> I understand that, but saw that you had paralleling changes to the
> pirq handling in your Dom0 tree.
>> However, I did consider using fasteoi, but I couldn't see how to make
>> it work. The problem is that it only does a single call into the
>> irq_chip for EOI after calling the interrupt handler, but there is no
>> call beforehand to ack the interrupt (which means clear the event flag
>> in our case). This leads to a race where an event can be lost after the
>> interrupt handler has returned, but before the event flag has been
>> cleared (because Xen won't set pending or call the upcall function if
>> the event is already set). I guess I could pre-clear the event in the
>> upcall function, but I'm not sure that's any better.
> That's precisely what we're doing.
You mean pre-clearing the event? OK.
But aren't you still subject to the bug the switch to handle_edge_irq fixed?
cpu A cpu B
<migrate event channel to B>
: get event
: INPROGRESS set? -> EOI, return
The event arriving on B is lost, and there's no record made of it ever
having arrived. How do you avoid this?
With handle_edge_irq, the second event will set PENDING in the irq_desc,
and a loop will keep running on cpu A until PENDING is clear, so nothing
is ever lost.
>> In the dom0 kernels, I followed the example of the IOAPIC irq_chip
>> implementation, which does the hardware EOI in the ack call at the start
>> of handle_edge_irq, can did the EOI hypercall (when necessary) there. I
>> welcome a reviewer's eye on this though.
> This I didn't actually notice so far.
> That doesn't look right, at least in combination with ->mask() being
> wired to disable_pirq(), which is empty (and btw., if the latter was
> right, you should also wire ->mask_ack() to disable_pirq() to avoid
> a pointless indirect call).
> But even with ->mask() actually masking the IRQ I'm not certain
> this is right. At the very least it'll make Xen see a potential
> second instance of the same IRQ much earlier than you will
> really be able to handle it. This is particularly bad for level
> triggered ones, as Xen will see them right again after you
> passed it the EOI notification - as a result there'll be twice as
> many interrupts seen by Xen on the respective lines.
> The native I/O APIC can validly do this as ->ack() only gets
> called for edge triggered interrupts (which is why ->eoi() is
> wired to ack_apic_level()).
Yes. The code as-is works OK, but I haven't checked to see if Xen it
taking many spurious interrupts. It probably helps that my test machine
has MSI for almost everything.
But does that mean the pirq code needs to have different ack/eoi
behaviour depending on the triggering of the ioapic interrupt?
>> I was thinking specifically of the timer, debug and console virqs. The
>> last is the only one which could conceivably be non-percpu, but in
>> practice I think it would be a bad idea to put it on anything other than
>> cpu0. What other global virqs are there? Nothing that's high-frequency
>> enough to be worth migrating?
> Once supported in your tree, oprofile could have high interrupt
> rates, and the trace buffer ones might too. Admittedly both are
> debugging aids, but I don't think it'd be nice for them to influence
> performance more than necessary.
VIRQ_XENOPROF is percpu as well. VIRQ_TBUF is global, but it has to
happen on *some* vcpu - but yes, I guess it would be useful to put it on
another vcpu so that it doesn't affect all the stuff tied to vcpu 0.
(At the moment it would be tied to whatever vcpu set up the virq, so you
could control initial placement that way...)
Xen-devel mailing list