Re: [Xen-devel] [GIT PULL] Fix lost interrupt race in Xen event channels

To:	Jan Beulich <JBeulich@xxxxxxxxxx>
Subject:	Re: [Xen-devel] [GIT PULL] Fix lost interrupt race in Xen event channels
From:	Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Date:	Fri, 27 Aug 2010 14:49:29 -0700
Cc:	Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>, Tom Kopec <tek@xxxxxxx>

On Fri, 2010-08-27 at 13:43 -0700, Daniel Stodden wrote:
> On Fri, 2010-08-27 at 04:56 -0400, Jan Beulich wrote:
> > >>> On 26.08.10 at 18:32, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> > > On 08/25/2010 11:46 PM, Jan Beulich wrote:
> > >>  >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> > >>> Note that this patch is specifically for upstream Xen, which doesn't
> > >>> have any pirq support in it at present.
> > >> I understand that, but saw that you had paralleling changes to the
> > >> pirq handling in your Dom0 tree.
> > >>
> > >>> However,  I did consider using fasteoi, but I couldn't see how to make
> > >>> it work.  The problem is that it only does a single call into the
> > >>> irq_chip for EOI after calling the interrupt handler, but there is no
> > >>> call beforehand to ack the interrupt (which means clear the event flag
> > >>> in our case).  This leads to a race where an event can be lost after the
> > >>> interrupt handler has returned, but before the event flag has been
> > >>> cleared (because Xen won't set pending or call the upcall function if
> > >>> the event is already set).  I guess I could pre-clear the event in the
> > >>> upcall function, but I'm not sure that's any better.
> > >> That's precisely what we're doing.
> > > 
> > > You mean pre-clearing the event?  OK.
> > > 
> > > But aren't you still subject to the bug the switch to handle_edge_irq 
> > > fixed?
> > > 
> > > With handle_fasteoi_irq:
> > > 
> > > cpu A                     cpu B
> > > get event
> > 
> > mask and clear event
> 
> Argh. Right, I guess that's my fault, I was the one who came up with the
> PENDING theory, but indeed I failed to see the event masking bits.
> 
> However, please read on.
> 
> > > set INPROGRESS
> > > call action
> > >    :
> > >    :
> > > <migrate event channel to B>
> > >    :                      get event
> > 
> > Cannot happen, event is masked (i.e. all that would happen is
> > that the event occurrence would be logged in evtchn_pending).
> > 
> > >    :                      INPROGRESS set? -> EOI, return
> > >    :
> > > action returns
> > > clear INPROGRESS
> > > EOI
> > 
> > unmask event, checking for whether the event got re-bound (and
> > doing the unmask through a hypercall if necessary), thus re-raising
> > the event in any case
> 
> Yes. I agree. So let's come up with a new theory. Right now I'm still
> looking at xen/next. Correct me if I'm mistaken:
> 
> mask_ack_pirq will:
>  1. chip->mask
>  2. chip->ack
> 
> Where chip->ack will:
>  1. move_native_irq
>  2. clear_evtchn.
> 
> Now if you look into move_native_irq, it will:
>  1. chip->mask (gratuitous)
>  2. move
>  3. chip->unmask (aiiiiiie).
> 
> That explains why edge_irq still fixed the problem.
> 
> Price question is if that's the kind of fix we wanted then.

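XCP has presumably older mask_ack() and ack() handlers in
core/evtchn.c. Those do, in order:
 1. move
 2. mask
 3. ack

and therefore don't have that problem, since the mask is applied after
the move and nothing unmasks the event again before the handler runs.
So maybe this was caused by some pvops-specific patch a while ago?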
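
To make the difference between the two orderings concrete, here is a toy model in Python. It is only a sketch of the sequences as described in this thread, not kernel code: the EventChannel class and the mask_ack_pvops/mask_ack_xcp helpers are hypothetical names, and move() simply assumes that move_native_irq does mask, move, unmask as listed above. For the XCP case it assumes the same move(), which may be more pessimistic than what XCP really does.

```python
class EventChannel:
    """Toy model of one event channel (hypothetical, not kernel code)."""

    def __init__(self):
        self.masked = False
        self.pending = True   # an event has just been delivered
        self.log = []

    def mask(self):
        self.masked = True
        self.log.append("mask")

    def unmask(self):
        self.masked = False
        self.log.append("unmask")

    def ack(self):
        # Stands in for clear_evtchn: drop the pending flag.
        self.pending = False
        self.log.append("ack")

    def move(self):
        # Stands in for move_native_irq as described in the thread:
        # mask, move, unmask.
        self.mask()
        self.log.append("move")
        self.unmask()


def mask_ack_pvops(ch):
    # Order described for xen/next: chip->mask, then chip->ack,
    # where ack itself moves first and then clears the event.
    ch.mask()
    ch.move()
    ch.ack()
    return ch


def mask_ack_xcp(ch):
    # Order described for the XCP handlers: move, mask, ack.
    ch.move()
    ch.mask()
    ch.ack()
    return ch


pvops = mask_ack_pvops(EventChannel())
xcp = mask_ack_xcp(EventChannel())
print("pvops:", pvops.log, "masked =", pvops.masked)
print("xcp:  ", xcp.log, "masked =", xcp.masked)
```

In this model the pvops ordering leaves the channel unmasked by the time the handler would run, because the unmask inside move() undoes the earlier mask, while the XCP ordering leaves it masked.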

Cheers,
Daniel

