The 64-bit HVM domain I tested was a Windows (2008 R2 Web Server) domain,
without PV drivers, so yeah, it was triggered by qemu alone in my specific test.
Looking back at my testing, I'm not 100% certain that the bug doesn't occur on
64-bit Multi-VCPU PV domains. I thought I had tested those "properly", but I
only ran one test case, with Ubuntu 10.04 running "fully" PV. After the system
booted successfully, I didn't check whether bonnie++ I/O or anything similar
would trigger the behavior, and the installation was done in a Dom0 chroot
anyway, so I can't actually say whether there is a bug here as well. I am
positive, however, that Multi-VCPU 32-bit HVM and PV domains do not exhibit the
errant behavior, just as Uni-VCPU 64-bit HVM and PV domains don't.
If you have a patch to test (regarding the synchronous queuing of tapdisk2),
I'd be happy to give it a shot with my specific setup.
PS: again, sorry for top-posting; I currently don't have access to a
"functioning" email client... Which raises the question: does anybody know of a
way to make Outlook behave sanely when quoting mails? ;-)
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On behalf of Daniel Stodden
Sent: Monday, 10 May 2010 09:49
To: Heiko Wundram
Subject: Re: AW: [Xen-devel] VHD BUG in xen4.0 when install windows2008
On Sun, 2010-05-09 at 15:03 -0400, Heiko Wundram wrote:
> As I'm seeing a similar behavior of tapdisk2 (see my recent posts to
> xen-users, especially "blktap2, also broken in current pv_ops
> stable-2.6.32.x?"), I can confirm that at least in my testing (I've done some
> more over the weekend, re. that message), this is indeed an "SMP-related"
> problem, but only for HVM-64bit domains.
> What I can basically say is that:
> 1) Uni/Multi/32-bit/64-bit PV domains run properly.
> 2) Uni-VCPU, 32-bit HVM domains run properly.
> 3) Multi-VCPU, 32-bit HVM domains run properly.
> 4) Uni-VCPU, 64-bit HVM domains run properly.
> 5) Multi-VCPU, 64-bit HVM domains cause tapdisk2 to segfault, sometimes,
> under heavy I/O, and if that happens, causes the Dom0-kernel to freeze/lock
> up, Bug, and/or all other kinds of undefined behavior, where I really haven't
> made out a pattern yet.
> Interestingly, these errors do not happen when using the "normal"
> blkback-driver, and I'm very positive (at least that's what happened during
> my testing) that it's specific to Multi-VCPU, 64-bit HVM domains that the
> crash occurs, independent of the number of VCPUs bound to Dom0.
Okay, one thing which was going to happen soon is a patch to make
tapdisk run the device queue synchronously. From your description I'm
just not very convinced that this resolves such issues as well. Looks
like it needs some 64 bit testing beforehand.
Thanks for the hints. Does HVM up there mean it's rather triggered by
qemu alone? Were you running pv drivers?
Xen-devel mailing list