On 08/27/2010 01:43 PM, Daniel Stodden wrote:
> On Fri, 2010-08-27 at 04:56 -0400, Jan Beulich wrote:
>>>>> On 26.08.10 at 18:32, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
>>> On 08/25/2010 11:46 PM, Jan Beulich wrote:
>>>> >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
>>>>> Note that this patch is specifically for upstream Xen, which doesn't
>>>>> have any pirq support in it at present.
>>>> I understand that, but saw that you had paralleling changes to the
>>>> pirq handling in your Dom0 tree.
>>>>> However, I did consider using fasteoi, but I couldn't see how to make
>>>>> it work. The problem is that it only does a single call into the
>>>>> irq_chip for EOI after calling the interrupt handler, but there is no
>>>>> call beforehand to ack the interrupt (which means clear the event flag
>>>>> in our case). This leads to a race where an event can be lost after the
>>>>> interrupt handler has returned, but before the event flag has been
>>>>> cleared (because Xen won't set pending or call the upcall function if
>>>>> the event is already set). I guess I could pre-clear the event in the
>>>>> upcall function, but I'm not sure that's any better.
>>>> That's precisely what we're doing.
>>> You mean pre-clearing the event? OK.
>>> But aren't you still subject to the bug the switch to handle_edge_irq fixed?
>>> With handle_fasteoi_irq:
>>> cpu A cpu B
>>> get event
>> mask and clear event
> Argh. Right, I guess that's my fault, I was the one who came up with the
> PENDING theory, but indeed I failed to see the event masking bits.
> However, please read on.
>>> set INPROGRESS
>>> call action
>>> <migrate event channel to B>
>>> : get event
>> Cannot happen, event is masked (i.e. all that would happen is
>> that the event occurrence would be logged evtchn_pending).
>>> : INPROGRESS set? -> EOI, return
>>> action returns
>>> clear INPROGRESS
>> unmask event, checking for whether the event got re-bound (and
>> doing the unmask through a hypercall if necessary), thus re-raising
>> the event in any case
> Yes. I agree. So let's come up with a new theory. Right now I'm still
> looking at xen/next. Correct me if I'm mistaken:
> mask_ack_pirq will:
> 1. chip->mask
> 2. chip->ack
> Where chip->ack will:
> 1. move_native_irq
> 2. clear_evtchn.
> Now if you look into move_native_irq, it will:
> 1. chip->mask (gratuitous)
> 2. move
> 3. chip->unmask (aiiiiiie).
> That explains why edge_irq still fixed the problem.
Good point. I guess the simplest fix in that case would have been to
The current fix is not wrong, so we can leave it as-is upstream for now.
But I think I will try Jan's idea about masking/clearing in the event
upcall then using fasteoi as the handler.
Xen-devel mailing list