On Thu, 2010-07-08 at 11:03 +0100, George Dunlap wrote:
> If both cpus are idling with EFLAGS.IF=1, this would imply that the
> kernel thinks it's waiting on a device, yes? One thing you could do
> is to track the interaction between the guest and the devices, and see
> if you can figure out what it's waiting for and why the thing it's
> waiting for isn't happening. You can use xentrace + xenalyze
> (http://xenbits.xensource.com/ext/xenalyze.hg) to see all the PIO,
> MMIO, and interrupts delivered to the guest.
>
> Unfortunately this would mean understanding at some level the
> interface the device presents, which may involve a lot of going
> through driver code / going through QEMU, which doesn't sound fun. :-/
> Maybe someone else will have some suggestions...

Hmm, yeah, usually that's a headache to do for one device, never mind the
whole system...

> I ended up with a similar-looking problem during boot with a stock
> 2.6.18.8 kernel, after hacking up a work-around to allow it to get
> past the timer synchronization stage. It might be easier to track
> down if you have a failure mode that's quicker to reproduce and a
> guest kernel that's easier to modify. (But of course there's always
> the possibility that it's a different bug with similar symptoms...)

Well, this reproduces relatively quickly, but because it's a vendor
kernel + custom initrd it's a bit harder to modify components. Just
rebuilding the original turns out to be a pain.

I think for now my time is probably best spent trying to minimise the
code required to reproduce the thing and hopefully, in turn, minimise
the amount of PIO + MMIO + IRQ traces to go through.

Argh :)