On Fri, 2011-07-22 at 06:01 -0400, Sébastien RICCIO wrote:
> > The processes, really? Where do they hang? (check out the wait state --
> > ps -eopid,wchan:25,cmd or so).
> >
> > Or do you mean they're stuck waiting for I/Os?
> >
> > Daniel
> >
> >
>
> They seem to work and do their job, but they are in a strange state.
> For example, a ps -aux on dom0 hangs when it reaches
> the line for the tapdisk process. It also cannot be detached from the
> vm, and a reboot of the host hangs too (the process can't be killed,
> so the host doesn't reboot).
>
> I fought with this quite a lot on a debian6 + xen 4.1.x box and found
> out that disabling multipath-tools and multipath-tools-boot
> corrected the problem (but I need them). I thought that maybe it was
> because multipathd tries to "multipath" the block device
> handled by blktap2 and somehow locks it. But that's speculation :)

The multipathing is in a dm node to which tapdisk issues I/O. There's no
special handling involved in there whatsoever. It's completely
transparent to blktap and tapdisk, as it should be.

I could imagine tapdisk wedging in dm code during some I/O operations.
These should be fully asynchronous, but for some storage types under
special conditions that's sometimes wishful thinking. That would be the
case if you find a tap-ctl call (even just a list command) blocking.

The blktap module does not do anything unusual to the tapdisk task.

Anyway, it'd initially be a matter of figuring out where exactly it
blocks. If ps is borked, try to get another shell and
cat /proc/<pid>/wchan. That makes sense for both the ps and tapdisk2 tasks.
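
A minimal sketch of that wchan check, for when ps itself is stuck. The
helper name show_wchan is made up for illustration, and the PIDs have to
be supplied by hand (from /proc or an earlier listing):

```shell
# Print the kernel wait channel of each PID given, reading /proc directly
# so that a hung ps does not get in the way.
# show_wchan is a hypothetical helper name, not part of any blktap tooling.
show_wchan() {
    for pid in "$@"; do
        if [ -r "/proc/$pid/wchan" ]; then
            # The wchan file has no trailing newline, so add one here.
            printf '%s %s\n' "$pid" "$(cat "/proc/$pid/wchan")"
        else
            printf '%s (no such task, or wchan not readable)\n' "$pid"
        fi
    done
}
```

Called as "show_wchan <pid-of-tapdisk2> <pid-of-ps>", it prints one line
per task, with the symbol the task is sleeping in, if any.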

You say from the guest I/O perspective it still makes progress? If not,
that would explain why you're unable to detach: Blkback won't be able to
release the device before all pending I/O is flushed.

To check tapdev I/O state from the host side, do a

  cat /sys/class/blktap2/tapdisk<n>/debug

That will dump some task stuff and a list of outstanding requests, if
there are any.
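
To look at every tap device in one go, something like the following
might do. It is only a sketch: the function name dump_tapdev_debug is
made up, and the sysfs layout is assumed to be exactly the
tapdisk<n>/debug path given above.

```shell
# Dump the debug node of every tap device under the given sysfs directory.
# dump_tapdev_debug is a hypothetical helper; the default path is the one
# quoted in this mail and is assumed, not verified against every kernel.
dump_tapdev_debug() {
    base=${1:-/sys/class/blktap2}
    for f in "$base"/tapdisk*/debug; do
        # Skip the unexpanded glob when no device is present.
        [ -r "$f" ] || continue
        printf '== %s ==\n' "$f"
        cat "$f"
    done
}
```

Running dump_tapdev_debug without arguments reads from
/sys/class/blktap2; a different directory can be passed as the first
argument.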

> I do not have my hands on the box at the moment to give you more
> information, and I do not want to hijack this thread. It's just that it
> looked like the problem I encountered, but I will send you more
> information when I am at the box.

Thanks!

Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel