Keir,
Well yes, the code is broken, but I thought the problem may be more general.
Looking at the code more carefully, I think you're right in that the code is
even more broken than I thought and that there is no general problem. This is
the tasklet in question from xen/arch/x86/mm/mem_event.c:
-static void mem_event_notify(struct domain *d)
-{
- prepare_wait_on_xen_event_channel(d->mem_event.xen_port);
- notify_via_xen_event_channel(d->mem_event.xen_port);
-}
The prepare_wait_on_xen_event_channel() looks to be totally bogus. The
notify_via_xen_event_channel() is the source of my problem. The first thing it
does is grab the lock on the current domain. If the tasklet is running on the
idle domain, it hangs because the event_lock of the current domain is not
initialized. Looking at the code for notify_via_xen_event_channel() more
carefully, it seems it should be calling evtchn_send() instead.
I could provide a patch that does this. As I said in the my first e-mail, I
don't see any advantage in doing the notification from a tasklet and my
inclination would be to delete it as well and just make the call directly. I'm
a little worried that I am only interested in xenpaging at the moment and it
looks like it might impact the sharing path as well.
John Byrne
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, April 22, 2010 2:18 PM
> To: Byrne, John (HP Labs); xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] event_lock not initialized in the idle domain
> (permitted actions in a tasklet?)
>
> It ought to be the destination domain's event-channel info that gets
> accessed, not that of the domain that happens to be running currently.
> Idle
> domains have no event-channel info because it makes no sense for them
> to
> have it. I don't know what xenpaging code you are specifically
> referring to,
> but sounds like it is simply broken.
>
> K.
>
> On 22/04/2010 21:57, "Byrne, John (HP Labs)" <john.l.byrne@xxxxxx>
> wrote:
>
> > I am playing with the xenpaging code that was checked into xen 4.0
> and I hit a
> > bug because it sends an event channel notification from a tasklet
> which was
> > getting run from the idle domain on my box. Since domain_create()
> does not
> > perform the evtchn_init for the idle domain, the event_lock was not
> > initialized and the tasklet would hang the cpu when it tried to
> acquire the
> > lock.
> >
> > While my immediate problem seems easy enough to work around --- I
> really can't
> > see the reason for the tasklet in the first place, so I just got rid
> of it ---
> > the underlying issue needs a look. Should domain_create() simply
> initialize
> > all of the idle_domain structure? As far as I can tell, the only
> reason it
> > doesn't is to save a little memory. While the event_lock issue can be
> dealt
> > with simply enough by breaking out the initialization separately, it
> seems
> > that having the special-case code for the idle domain opens up the
> possibility
> > for bugs with respect to operations from tasklets.
> >
> > Thanks,
> >
> > John Byrne
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|