This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] fix evtchn cpu affinity ?

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] fix evtchn cpu affinity ?
From: Pascal Bouchareine <pascal@xxxxxxxxx>
Date: Thu, 3 Apr 2008 02:10:55 +0200
Delivery-date: Wed, 02 Apr 2008 17:11:18 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.11

When rebinding user event channels to a cpu in dom0, evtchn/evtchn.c
calls rebind_evtchn_to_cpu, which updates the cpu_evtchn[] struct
to mask the event to a specific cpu and notifies xen about this
using an EVTCHNOP_bind_vcpu.

In core/evtchn.c:evtchn_do_upcall, the evtchn_pending array from
the shared info page is checked against this cpu_evtchn mask, 
effectively preventing domU from handling events when "upcalled" 
on the wrong cpu.
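
To make the check above concrete, here is a minimal toy model of it in
plain C. This is only a sketch and not the kernel code: the single
64-bit word per bitmap, the sizes, and the names active_evtchns() and
bind_port_to_cpu() are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4
#define NR_PORTS 64   /* toy size: one 64-bit word of ports (assumption) */

/* Toy stand-ins for the shared info bitmaps and the per-cpu mask. */
static uint64_t evtchn_pending;            /* ports marked pending by "xen" */
static uint64_t evtchn_mask;               /* ports masked by the guest */
static uint64_t cpu_evtchn_mask[NR_CPUS];  /* ports each cpu may handle */

/* Ports that an upcall arriving on `cpu` would actually process. */
static uint64_t active_evtchns(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return evtchn_pending & cpu_evtchn_mask[cpu] & ~evtchn_mask;
}

/* Clear the port's bit on every cpu, then set it on `cpu` only. */
static void bind_port_to_cpu(unsigned int port, unsigned int cpu)
{
    assert(port < NR_PORTS && cpu < NR_CPUS);
    for (unsigned int i = 0; i < NR_CPUS; i++)
        cpu_evtchn_mask[i] &= ~(UINT64_C(1) << port);
    cpu_evtchn_mask[cpu] |= UINT64_C(1) << port;
}
```

In this model, once a port is bound to cpu 1, a pending bit for that
port is invisible to an upcall running on cpu 0, which is the situation
described above.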

On an EVTCHNOP_close, xen rebinds the event channel to vcpu 0.

This seems to mean that anyone calling EVTCHNOP_close on an event channel
that was bound to a vcpu (and so touched the cpu_evtchn[] array) must
also rebind the event channel to cpu 0, or we lose events
(they stay pending, unmasked, and never make it to do_IRQ).

I'm currently seeing such "lost" events, and some debugging shows they're
bound in cpu_evtchn[] to a vcpu xen does not agree with (seen with
an EVTCHNOP_status call and status->vcpu). I think I tracked it down
to the EVTCHNOP_close calls in evtchn/evtchn.c, though I might be missing
some other cases here ?

If this is the right cause, maybe a close() wrapper in core/evtchn.c should
be used to avoid such deadlocks, or some other mechanism should ensure we
don't get out of sync with xen ?
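
A toy model of the wrapper idea, for illustration only. It is not the
attached patch and not kernel code: xen_vcpu[], guest_cpu[] and all the
toy_* functions are hypothetical names that only model the two views of
a port's binding (xen's and the guest's cpu_evtchn[]).

```c
#include <assert.h>

#define NR_CPUS  4
#define NR_PORTS 8

static unsigned int xen_vcpu[NR_PORTS];   /* toy model of xen's binding */
static unsigned int guest_cpu[NR_PORTS];  /* toy model of cpu_evtchn[] */

/* Models rebind_evtchn_to_cpu: both views are updated together. */
static void toy_bind_vcpu(unsigned int port, unsigned int cpu)
{
    assert(port < NR_PORTS && cpu < NR_CPUS);
    guest_cpu[port] = cpu;
    xen_vcpu[port] = cpu;
}

/* Models a bare EVTCHNOP_close: xen falls back to vcpu 0,
 * while the guest-side binding is left untouched. */
static void toy_raw_close(unsigned int port)
{
    assert(port < NR_PORTS);
    xen_vcpu[port] = 0;
}

/* Hypothetical wrapper: reset the guest view to cpu 0, then close. */
static void toy_close_and_rebind(unsigned int port)
{
    assert(port < NR_PORTS);
    guest_cpu[port] = 0;
    toy_raw_close(port);
}

/* Nonzero when the guest and xen agree on the port's cpu. */
static int toy_in_sync(unsigned int port)
{
    assert(port < NR_PORTS);
    return guest_cpu[port] == xen_vcpu[port];
}
```

With the bare close the two views disagree for any port previously
bound to a cpu other than 0; with the wrapper they agree again.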

Or maybe let xen publish this mask in the shared info page and rely
only on hypercalls for masking/unmasking events ?

The attached patch is an attempt to fix the above.


\o/   Pascal Bouchareine - Gandi 
 g    0170393757           15, place de la Nation - 75011 Paris      

Attachment: evtchn_rebind_on_close.patch
Description: Text document

Xen-devel mailing list