
xen-devel

Re: [Xen-devel] Bug in Xen 4.1.0: Xen leaks tapdisk2 processes

On Mon, 2011-05-09 at 04:23 -0400, Ian Campbell wrote:
> On Sat, 2011-05-07 at 00:25 +0100, Nathan March wrote:
> > 
> > On 5/6/2011 11:27 AM, Jim Fehlig wrote:
> > >>
> > >>> I don't have a spare server to test the patch with at the moment, but I
> > >>> can try this out later this week.
> > >>>     
> > >> If you are running xm/xend rather than xl then it won't help.
> > >>
> > >> But I'm not sure how one tells with libvirt which you are running, I'd
> > >> expect that if xend were running it would be used by default. Jim?
> > >>   
> > > If xend is running, libvirt will use it.  If not, it will attempt to use
> > > libxenlight.  'virsh version' will tell which xen backend you are using.
> > >
> > > E.g. if xend is running:
> > > xen33: # virsh version
> > > Compiled against library: libvir 0.9.0
> > > Using library: libvir 0.9.0
> > > Using API: Xen 3.0.1
> > >
> > > If xend is not running:
> > > xen33: # virsh version
> > > Compiled against library: libvir 0.9.0
> > > Using library: libvir 0.9.0
> > > Using API: xenlight 0.9.0
> > >
> > > Looks like I need to put libxenlight's version in there instead of
> > > libvirt's version, but 'Xen' vs. 'xenlight' will tell which libvirt
> > > backend is being used.
> > >
> > In that case, I can confirm that I'm using xend:
> 
> Hrm, then my earlier patch is irrelevant and I've got no idea what is
> supposed to cause the tapdisk process to exit in either the xend or xl
> case but it seems like the issue is common to both -- Daniel, any
> ideas?

This code was originally written with toolstacks in mind which already
manage storage in more detail than just plug/unplug, so tap-ctl only
provides the minimum tool, not the framework. XCP refcounts the node's
usage and shuts the tapdisk down once the count drops back to zero.

Does XL promote storage shared across VMs? Does XL have a big lock? If
the answers are no and yes respectively, then shutting down after the
VBD should have worked.

Otherwise it gets more complicated. Trial and error, i.e. calling
destroy and bailing out if the device node is found busy, is not fully
reliable; see my other mail regarding bdev access noise. And interleaved
plug/unplug calls from concurrent XL instances will obviously race.

I can offer a patch which adds a timeout to destroy (possibly 0), but
in theory the same issue obviously remains.

Usually it boils down to a refcount. It could go into xenstore,
something like /local/domain/<me>/blktap/<minor>/refs, plus
transactions. I think that way even XCP might use it at some point.
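
To make the refcount idea concrete, here is a rough sketch of how plug and
unplug could update such a node inside a transaction and retry on conflict.
It is only an illustration under stated assumptions: FakeStore is a
hypothetical in-memory stand-in for xenstore, and plug, unplug, refs_path and
the destroy callback are made-up names, not part of tap-ctl, xl or any
existing binding. It also does not close the window between the final
decrement and the actual teardown.

```python
import threading


class Conflict(Exception):
    """Raised when a transaction cannot commit because the store changed."""


class FakeStore:
    """Hypothetical in-memory stand-in for xenstore, for illustration only."""

    def __init__(self):
        self._data = {}
        self._version = 0
        self._lock = threading.Lock()

    def begin(self):
        # A transaction is a private snapshot plus the version it was taken at.
        with self._lock:
            return {"version": self._version, "view": dict(self._data)}

    def commit(self, txn):
        # Commit only if nobody else has committed since begin().
        with self._lock:
            if txn["version"] != self._version:
                raise Conflict()
            self._data = dict(txn["view"])
            self._version += 1


def refs_path(domid, minor):
    # Path layout taken from the suggestion above; domid stands in for <me>.
    return f"/local/domain/{domid}/blktap/{minor}/refs"


def _update_refs(store, path, delta):
    # Read-modify-write inside a transaction; start over if the commit loses.
    while True:
        txn = store.begin()
        refs = int(txn["view"].get(path, "0")) + delta
        if refs < 0:
            raise ValueError("refcount underflow on " + path)
        if refs == 0:
            txn["view"].pop(path, None)
        else:
            txn["view"][path] = str(refs)
        try:
            store.commit(txn)
        except Conflict:
            continue
        return refs


def plug(store, domid, minor):
    """Take one reference on the tap device; returns the new count."""
    return _update_refs(store, refs_path(domid, minor), +1)


def unplug(store, domid, minor, destroy):
    """Drop one reference; call destroy(minor) when the count reaches zero."""
    refs = _update_refs(store, refs_path(domid, minor), -1)
    if refs == 0:
        # Assumption: destroy is whatever actually tears the tapdisk down.
        # A concurrent plug could still slip in between commit and destroy.
        destroy(minor)
    return refs
```

With two users of minor 3, the first unplug only drops the count to 1 and
leaves the tapdisk alone; the second one brings it to zero and triggers
destroy. A real version would also have to decide what happens when a
toolstack dies without dropping its reference.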

Daniel


