This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] [GIT PULL] Fix lost interrupt race in Xen event channels

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [GIT PULL] Fix lost interrupt race in Xen event channels
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Mon, 30 Aug 2010 10:15:42 +0100
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>, Tom Kopec <tek@xxxxxxx>, Daniel Stodden <Daniel.Stodden@xxxxxxxxxx>
Delivery-date: Mon, 30 Aug 2010 02:16:31 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4C7B909F0200007800012C52@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: ActIIpUS6uiiBM0rQmGDrft/UXWltwAAVZId
Thread-topic: [Xen-devel] [GIT PULL] Fix lost interrupt race in Xen event channels
User-agent: Microsoft-Entourage/
On 30/08/2010 10:06, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:

> However, do you also think that pirq_unmask_and_notify() is safe
> to be called twice? I would think the double EOI potentially sent to
> Xen could lead to an interrupt getting ack-ed that didn't even get
> started to be serviced yet.

Erm, well if this is a race that happens only occasionally, does it matter?
Worst case you get another interrupt straight away. Only a problem if it
happens often enough to cause a performance issue or even livelock interrupt

The obvious fix would be for the kernel to privately keep track of which
event channels or pirqs are masked and/or disabled (e.g., with two bitflags
per port), and then have the evtchn_mask flag be the OR of the two. That is,
if this is actually a real problem at all: I doubt move_native_irq() should
be doing work very often when it is called.
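
As a rough illustration of the two-bitflags-per-port idea, here is a minimal
user-space sketch in C. It is not taken from any kernel tree: port_state,
PORT_MASKED, PORT_DISABLED, port_set(), port_clear(), NR_PORTS and the
unmask_notifications counter are all hypothetical names, and the evtchn_mask
array here is only a stand-in for the real shared mask bits. The point is
just that the shared flag is recomputed as the OR of the two private
reasons, so an unmask (and any notification that goes with it) happens only
when the last reason is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PORTS 64               /* hypothetical size for this sketch */

/* Two private reasons for holding a port off, one bit each (hypothetical). */
#define PORT_MASKED   (1u << 0)   /* masked while an interrupt is in service */
#define PORT_DISABLED (1u << 1)   /* disabled, e.g. around an IRQ move */

static uint8_t port_state[NR_PORTS];      /* kernel-private bookkeeping */
static bool evtchn_mask[NR_PORTS];        /* stand-in for the shared mask flag */
static unsigned int unmask_notifications; /* counts real masked -> unmasked transitions */

/* Recompute the shared flag as the OR of the private bits. */
static void sync_mask(unsigned int port)
{
    bool was_masked = evtchn_mask[port];
    bool now_masked = port_state[port] != 0;

    evtchn_mask[port] = now_masked;

    /* Only a genuine masked -> unmasked transition counts as an unmask. */
    if (was_masked && !now_masked)
        unmask_notifications++;
}

/* Record one reason for holding the port off. */
static void port_set(unsigned int port, uint8_t reason)
{
    assert(port < NR_PORTS);
    port_state[port] |= reason;
    sync_mask(port);
}

/* Drop one reason; the port is unmasked only if no reason remains. */
static void port_clear(unsigned int port, uint8_t reason)
{
    assert(port < NR_PORTS);
    port_state[port] &= (uint8_t)~reason;
    sync_mask(port);
}
```

In this model, a disable/enable pair issued while the port is still masked
for servicing leaves evtchn_mask set, so the later clear of PORT_MASKED is
the only call that produces an unmask. Clearing a bit that is already clear
is a no-op, which is the property that would make a repeated unmask harmless
under these assumptions.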

 -- Keir

> And this, afaict, can happen in 2.6.18
> as well (ack_pirq() -> move_native_irq() -> disable_pirq()/
> enable_pirq() -> pirq_unmask_and_notify() followed by end_pirq()
> -> pirq_unmask_and_notify()). Here, however, you couldn't even
> use the mask bit to detect the situation, since the masking only
> happens after already having called move_native_irq() (i.e. the
> event channel will be masked when you get into
> pirq_unmask_and_notify() the second time).
