This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] xl create should refuse to share block devices RW betwee

On Tue, 2010-07-27 at 06:55 -0400, Ian Campbell wrote:
> On Tue, 2010-07-27 at 11:40 +0100, Stefano Stabellini wrote:
> > On Tue, 27 Jul 2010, Ian Campbell wrote:
> > > On Tue, 2010-07-27 at 01:36 +0100, Jeremy Fitzhardinge wrote:
> > > > When creating a domain, "xl create" should fail if a block device is 
> > > > shared RW between domains, like xm create does.
> > > > 
> > > > I'm not sure how this would be implemented.  Search xenstore for 
> > > > references to the device when setting up a domain?
> > > 
> > > The hotplug scripts have locking and calls to a function called 
> > > "check_device_sharing" in them, I've been wondering why that wasn't
> > > kicking in for xl created domains for a little while but never got to
> > > investigating.
> > > 
> >  
> > those scripts are called by udev and theoretically should work
> > exactly the same way with xend or libxl.
> > I didn't test this but I believe that since libxl always uses blktap2,
> > the script called is block and the codepath taken is the following:
> > 
> >     phys=$(xenstore_read_default "$XENBUS_PATH/physical-device" 'MISSING')
> >     if [ "$phys" != 'MISSING' ]
> >     then
> >       # Depending upon the hotplug configuration, it is possible for this
> >       # script to be called twice, so just bail.
> >       exit 0
> >     fi
> > 
> > so we never do any checks.
> ...and in any case physical-device would be a unique /dev/xen/tapdisk-N
> path, regardless of any sharing of the files tapdisk is backing onto so
> it is rather hard to check for sharing at this level anyway.

Nope. The xl/tapctl code does a reverse map from the leaf type:/path to
the minor number, so the result would be a shared device node. 

That holds provided the same physical node isn't accessed through some
alias. But I think at some point in the not so distant future we'll
start resolving (fs;node) pairs anyway, to better identify storage types.
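
To illustrate what resolving such pairs could look like: a minimal sketch
in Python, assuming "(fs;node)" means the (st_dev, st_ino) pair that stat()
returns for a file. That reading and the helper names node_identity and
same_node are my own assumptions, not existing xl or tap-ctl code.

```python
import os
import tempfile


def node_identity(path):
    """Return (st_dev, st_ino) for path; os.stat() follows symlinks."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_node(path_a, path_b):
    """True if both paths resolve to the same file on the same filesystem."""
    return node_identity(path_a) == node_identity(path_b)


def demo():
    """Compare an image with a symlink alias and with an unrelated file."""
    with tempfile.TemporaryDirectory() as d:
        image = os.path.join(d, "disk.img")
        alias = os.path.join(d, "alias.img")
        other = os.path.join(d, "other.img")
        for p in (image, other):
            with open(p, "wb") as f:
                f.write(b"\0")
        os.symlink(image, alias)
        return same_node(image, alias), same_node(image, other)
```

Comparing identities rather than path strings is what would catch the
alias case: the symlink and its target compare equal here, while the
unrelated file does not.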

Also provided that xl create won't race. The tap-ctl calls don't
serialize themselves.

All of that is different from blktap1, where sharing e.g. a VHD causes
disaster in any case.

I guess xl/xend doesn't have a config bit for sharing disks? On a single
host it's altogether possible, although not exactly popular.

On shared storage the locking doesn't buy you anything; that's part of
the reason why nobody ever cared to implement it. XCP used to work with
killing metadata headers in VHD for a little while, exactly until people
became aware of the crash resilience issues involved :o)

> > I also think that tap_ctl_create is the right place to do these checks,
> > not a script called by udev after the device has been created.
> Agreed. tapdisk should be taking out a flock() or something similar on
> any vhd files it is going to write to and should fail if it can't lock
> the file.

Yes. If so, it would have to be tapdisk doing the extra checks during
tap-ctl open, not tap-ctl itself.
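
A rough sketch of the kind of check meant here, in Python for brevity
(tapdisk itself is C, and none of this is existing tapdisk code). The
helper name lock_image is hypothetical. It takes an exclusive,
non-blocking flock() on the image and refuses to proceed if another
opener already holds it.

```python
import fcntl
import os
import tempfile


def lock_image(path):
    """Open path read-write and take an exclusive, non-blocking flock().

    Returns the open fd; the lock is held until the fd is closed.
    Raises BlockingIOError if another open file description holds it.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


def demo():
    """Try to lock the same image twice; the second attempt should fail."""
    with tempfile.NamedTemporaryFile() as img:
        first = lock_image(img.name)
        try:
            try:
                second = lock_image(img.name)
            except BlockingIOError:
                refused = True
            else:
                os.close(second)
                refused = False
        finally:
            os.close(first)
        # Once the first holder has closed its fd, locking succeeds again.
        third = lock_image(img.name)
        os.close(third)
        return refused
```

Note that flock() is advisory and tied to the open file description, so
this only stops cooperating openers on the same host. It does nothing
for the shared storage case mentioned earlier.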

For the above reasons, that only helps to avoid breaking the metadata by
running two tapdisks on non-sharable image containers. The guest fs is
still at risk.

So it's either hotplug (I optimistically believe a lot in userspace
problem solving) or, if paranoia prevails, it could be done relatively
safely at a lower level in blkback.

