2011/7/22 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> On Fri, 2011-07-22 at 18:36 +0100, Roger Pau Monne wrote:
>> @@ -455,6 +460,10 @@ int libxl__devices_destroy(libxl__gc *gc
>> fe_path = libxl__sprintf(gc, "/local/domain/%d/device/%s/%s",
>> domid, l1[i], l2[j]);
>> be_path = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc,
>> "%s/backend", fe_path));
>> if (be_path != NULL) {
>> +#ifdef DONT_REMOVE_VBD_FROM_STORE
>> + if (!strcmp(l1[i], "vbd"))
>> + continue;
>> +#endif
>> if (libxl__device_destroy(gc, be_path, force) > 0)
>> n_watches++;
>> } else {
>
> Most of this patch looks good but I'm still not convinced by this bit.
>
> It misses the libxl_device_disk_del case which goes via
> libxl__device_destroy path not libxl__devices_destroy, as well as
> (maybe?) breaking the forced-destroy case (where the toolstack is
> responsible for actually nuking everything without exceptions) but
> really it's just that this special casing doesn't really pass the sniff
> test and makes me suspect it is just papering over a more fundamental
> issue somewhere else.
>
> The Linux hotplug scripts also consults and then removes the backend dir
> and it doesn't seem to cause visible issues, so what is it about
> xenbackendd and/or the NetBSD scripts which doesn't like libxl's
> behaviour I wonder? If there's a race there then perhaps Linux also has
> the issue but in a benign form -- in which case it should be worth
> putting a generic fix in instead of special casing NetBSD. If this
> really is correct NetBSD specific behaviour then I think it needs a lot
> more rationale in the changelog etc.
I don't like doing it this way either. How are Linux hotplug scripts
called? I've been looking arround, and there seems to be a set of udev
rules that call the scripts based on changes regarding devfs, the
problem is that NetBSD doesn't have devfs, and we have to watch the
xenstore for changes in the status of the devices to trigger the
correct scripts. I also see that Linux hotplug scripts also remove
entries from xenstore after detaching (xen-hotplug-cleanup), isn't it
wrong if the entries are deleted in libxl?
The basic problem with this (if I got it right) is synchronization,
entries should be deleted after the corresponding hotplug scripts are
done. This happens sometimes in NetBSD, but it is not guaranted, since
there's no rendezvous point to tell libxl that all is fine and
xenstore can be cleaned. I think it would be good to set some kind of
indication, that hotplug scripts have finished execution before libxl
deletes the entries in xenstore, and possibly set a timeout, so libxl
is not waiting forever if something goes wrong.
Also with xend hotplug scripts where in charge of removing entries
from xenstore, but in libxl Linux has no support for block-hotplug
using the loop device (only 'phy' which doesn't involve any read from
xenstore to unplug), that's why I think this problem has not appeared
before.
>
> The libxl device teardown stuff is pretty convoluted and I'm reasonably
> sure it is wrong in several respects, I've been meaning to untangle it,
> perhaps I should get to that sooner rather than later and maybe fix this
> issue as a side effect.
>
That would be great, I would like to give a hand if I can.
> Ian.
>
>
Regards, Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|