xen-devel

Re: [Xen-devel] HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1

To: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] HVM SMP linux guest hangs in cpu_idle() with EFLAGS.IF = 1
From: Gianni Tedesco <gianni.tedesco@xxxxxxxxxx>
Date: Thu, 8 Jul 2010 12:55:38 +0100
Cc: Xen Devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <AANLkTim8CBDMOw5NF4yZvNzb9uqdbvXOvozH1xdHvFzc@xxxxxxxxxxxxxx>
References: <1278528155.1723.139.camel@xxxxxxxxxxxxxxxxxxxxxx> <AANLkTim8CBDMOw5NF4yZvNzb9uqdbvXOvozH1xdHvFzc@xxxxxxxxxxxxxx>

On Thu, 2010-07-08 at 11:03 +0100, George Dunlap wrote:
> If both cpus are idling with EFLAGS.IF=1, this would imply that the
> kernel thinks it's waiting on a device, yes?  One thing you could do
> is to track the interaction between the guest and the devices, and see
> if you can figure out what it's waiting for and why the thing it's
> waiting for isn't happening.  You can use xentrace + xenalyze
> (http://xenbits.xensource.com/ext/xenalyze.hg) to see all the PIO,
> MMIO, and interrupts delivered to the guest.
> 
> Unfortunately this would mean understanding at some level the
> interface the device presents, which may involve a lot of going
> through driver code / going through QEMU, which doesn't sound fun. :-/
>  Maybe someone else will have some suggestions...

Hmm, yeah, that's usually a headache to do for one device, never mind the
whole system...

> I ended up with a similar-looking problem during boot with a stock
> 2.6.18.8 kernel, after hacking up a work-around to allow it to get
> past the timer synchronization stage.  It might be easier to track
> down if you have a failure mode that's quicker to reproduce and a
> guest kernel that's easier to modify.  (But of course there's always
> the possibility that it's a different bug with similar symptoms...)

Well, this reproduces relatively quickly, but because it's a vendor kernel +
custom initrd it's a bit harder to modify components. Just rebuilding
the original turns out to be a pain.

I think for now my time is probably best spent trying to minimise the
code required to reproduce the thing and hopefully, in turn, minimise
the amount of PIO + MMIO + IRQ traces to go through.

Argh :)

