On Thu, Mar 23, 2006 at 02:32:34PM +0000, Harry Butterworth wrote:
> On Thu, 2006-03-23 at 14:06 +0000, Ewan Mellor wrote:
> > On Thu, Mar 23, 2006 at 11:45:25AM +0000, Harry Butterworth wrote:
> >
> > > I updated bug #515. If you want to suggest an approach for a solution,
> > > I'll have a go at putting a patch together.
> >
> > Good catch, Harry!
> >
> > If you look at the block and xen-hotplug-cleanup scripts, you'll see
> > that they claim a lock to make sure that the teardown doesn't interfere
> > with the check when a device comes up. I think that a similar thing for
> > vif teardown would suffice.
>
> I saw the block locking.
>
> The problem isn't mutual exclusion of the vif-route and cleanup scripts
> it's that the cleanup script must be serialised _after_ the vif-route
> script.
>
> I think the kernel can reorder the event injection into user space which
> is why udevsend uses sequence numbers to put events back in-order.
>
> So, in the case where the cleanup event is injected first and gets the
> lock first, the problem will still occur.
>
> With no ordering guarantee on the event transport between kernel and
> user space, the options I could think of were:
>
> 1) Use sequence numbers and reserialise events as is done by
> udevsend---the easiest way to do this is to drop support for hotplug and
> require udev.
>
> 2) Change the protocol such that the kernel code waits for the offline
> script to complete before issuing the cleanup event. This would require
> a state change in the backend when the offline script completes which
> would trigger the backend to unregister the xenbus device. I'm not sure
> that I understand the full implications of this.
>
> 3) Somehow combine the offline and cleanup into one event. I don't know
> exactly how.
>
> 4) Use the lock even though we think the design is flawed and hope it
> will work most of the time.
Yes, well I was kind of hoping we'd get away with 4) ! Especially since
the move to 2.6.16 and onwards might move people towards udev in any
case.
I'm not sure number 3 would work, so I'm inclined to avoid that.
Number 2 sounds like the best option if you really want to do this
properly -- register a watch on "vif/<domid>/<devid>/hotplug-teardown-status",
issue the offline event, and then only call device_unregister once
hotplug-teardown-status goes to "done".
Cheers,
Ewan.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|