On Sun, 2010-05-09 at 15:03 -0400, Heiko Wundram wrote:
> As I'm seeing a similar behavior of tapdisk2 (see my recent posts to
> xen-users, especially "blktap2, also broken in current pv_ops
> stable-2.6.32.x?"), I can confirm that at least in my testing (I've done some
> more over the weekend, re. that message), this is indeed an "SMP-related"
> problem, but only for HVM-64bit domains.
> What I can basically say is that:
> 1) Uni/Multi/32-bit/64-bit PV domains run properly.
> 2) Uni-VCPU, 32-bit HVM domains run properly.
> 3) Multi-VCPU, 32-bit HVM domains run properly.
> 4) Uni-VCPU, 64-bit HVM domains run properly.
> 5) Multi-VCPU, 64-bit HVM domains cause tapdisk2 to segfault, sometimes,
> under heavy I/O, and if that happens, causes the Dom0-kernel to freeze/lock
> up, Bug, and/or all other kinds of undefined behavior, where I really haven't
> made out a pattern yet.
> Interestingly, these errors do not happen when using the "normal"
> blkback-driver, and I'm very positive (at least that's what happened during
> my testing) that it's specific to Multi-VCPU, 64-bit HVM domains that the
> crash occurs, independent of the number of VCPUs bound to Dom0.
Okay, one thing which was going to happen soon is a patch to make
tapdisk run the device queue synchronously. From your description, I'm
not convinced that this will resolve these issues as well. Looks
like it needs some 64-bit testing beforehand.
Thanks for the hints. Does HVM up there mean it's rather triggered by
qemu alone? Were you running PV drivers?
Xen-devel mailing list