This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-bugs] [Bug 1727] Hypevisor hangs on boot.

To: xen-bugs@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-bugs] [Bug 1727] Hypevisor hangs on boot.
From: bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
Date: Tue, 25 Jan 2011 13:20:14 -0800
Delivery-date: Tue, 25 Jan 2011 13:20:24 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <bug-1727-3@xxxxxxxxxxxxxxxxxxxxxxxxxxx/bugzilla/>
List-help: <mailto:xen-bugs-request@lists.xensource.com?subject=help>
List-id: Xen Bugzilla <xen-bugs.lists.xensource.com>
List-post: <mailto:xen-bugs@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-bugs>, <mailto:xen-bugs-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-bugs>, <mailto:xen-bugs-request@lists.xensource.com?subject=unsubscribe>
Reply-to: bugs@xxxxxxxxxxxxxxxxxx
Sender: xen-bugs-bounces@xxxxxxxxxxxxxxxxxxx

------- Comment #3 from dmitry.trikoz@xxxxxxxxxxx  2011-01-25 13:20 -------
I tried booting with 'x2apic=off' or 'iommu=off', with the same result.
I also reproduced the issue on xen-unstable.
The problem is easier to reproduce with the 'cpuinfo=1' boot option.

After many tries I figured out the exact line of code after which the system
starts hanging, so I am now able to expose the bug and reproduce the problem
on every boot. To do that, just add these lines to xen/arch/x86/smpboot.c:

--- xen/arch/x86/smpboot.c.original     2011-01-25 13:35:32.000000000 -0500
+++ xen/arch/x86/smpboot.c      2011-01-25 16:02:20.000000000 -0500
@@ -426,7 +426,9 @@
+    for(i=0; i<200;i++) {
+       printk(" %d ====================================================\n", i);
+    }
     Dprintk("Waiting for send to finish...\n");
     timeout = 0;
     do {

I also tried to force an NMI on the boot CPU while it is hanging. The boot CPU
does not react to NMIs during the hang; eventually it wakes up and handles the
pending NMI. Since NMIs are blocked while a CPU is in SMM, this behaviour makes
me believe that the hang happens in SMM.

Also in the log I can see messages like this:

(XEN) MCE: The hardware reports a non fatal, correctable incident occurred on
CPU 1.
(XEN) Bank 2: d40000c000040150 at         8fbe5ae9

Address 8fbe5ae9 belongs to a reserved BIOS area.
Decoding the status value d40000c000040150 gives: instruction fetch exception
at level 0.
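
For anyone who wants to check that decoding, here is a minimal sketch in C. It assumes the value printed for Bank 2 is the raw IA32_MCi_STATUS register and follows the architectural layout in the Intel SDM (flag bits 63..57, MCA error code in bits 15..0, cache hierarchy errors encoded as 000F 0001 RRRR TTLL). The helper names (mca_code, mca_request, mca_print and so on) are made up for illustration and are not Xen functions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural flag bits of IA32_MCi_STATUS (assumed Intel SDM layout). */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register contents are valid     */
#define MCI_STATUS_OVER  (1ULL << 62)  /* an earlier error was overwritten */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was NOT corrected         */
#define MCI_STATUS_EN    (1ULL << 60)  /* reporting enabled for this error */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address  */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt       */

/* Names for the RRRR, TT and LL sub-fields of a cache hierarchy error. */
static const char *const rrrr_names[] = {
    "ERR", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"
};
static const char *const tt_names[] = { "instruction", "data", "generic", "reserved" };
static const char *const ll_names[] = { "level 0", "level 1", "level 2", "generic" };

/* Hypothetical helpers: extract the MCA error code and its sub-fields. */
static unsigned mca_code(uint64_t s)    { return (unsigned)(s & 0xffffu); }
static unsigned mca_request(uint64_t s) { return (mca_code(s) >> 4) & 0xfu; }
static unsigned mca_type(uint64_t s)    { return (mca_code(s) >> 2) & 0x3u; }
static unsigned mca_level(uint64_t s)   { return mca_code(s) & 0x3u; }

/* Cache hierarchy errors match 000F 0001 RRRR TTLL; bit 12 (F) is ignored. */
static int mca_is_cache_error(uint64_t s)
{
    return (mca_code(s) & 0xef00u) == 0x0100u;
}

/* Print the flags and, for cache hierarchy errors, the decoded sub-fields. */
static void mca_print(uint64_t s)
{
    printf("status %016" PRIx64 ":%s%s%s%s%s%s\n", s,
           (s & MCI_STATUS_VAL)   ? " VAL"   : "",
           (s & MCI_STATUS_OVER)  ? " OVER"  : "",
           (s & MCI_STATUS_UC)    ? " UC"    : "",
           (s & MCI_STATUS_EN)    ? " EN"    : "",
           (s & MCI_STATUS_ADDRV) ? " ADDRV" : "",
           (s & MCI_STATUS_PCC)   ? " PCC"   : "");

    if (mca_is_cache_error(s)) {
        unsigned r = mca_request(s);
        printf("cache hierarchy error: request=%s type=%s %s\n",
               r < 9 ? rrrr_names[r] : "reserved",
               tt_names[mca_type(s)], ll_names[mca_level(s)]);
    } else {
        printf("MCA error code %04x is not a cache hierarchy error\n",
               mca_code(s));
    }
}
```

Calling mca_print(0xd40000c000040150ULL) from a small main() walks through the value from the log. Under the layout assumed above, UC is clear (the error was corrected), ADDRV is set, and the low 16 bits 0x0150 break down as RRRR = 5 (IRD, instruction fetch), TT = 0 (instruction) and LL = 0 (level 0).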

I have two machines of this type, and I reproduced the problem on both of them.

As I said before, it looks very much like a BIOS problem, but the Linux kernel
from CentOS 5.5 boots on my machines just fine, even with the printk loop
(above) added.
