WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] dom0 hang in xen-4.0.0-rc5 - possible acpi issue? [WAS: Usin

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Pasi Kärkkäinen <pasik@xxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] dom0 hang in xen-4.0.0-rc5 - possible acpi issue? [WAS: Using xen-unstable, dom0 hangs during boot]
From: "Nadolski, Ed" <Ed.Nadolski@xxxxxxx>
Date: Tue, 2 Mar 2010 12:23:23 -0700
Accept-language: en-US
Acceptlanguage: en-US
Cc:
Delivery-date: Tue, 02 Mar 2010 11:25:04 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100301151005.GA7881@xxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <8115AF16522A3D4383C1FF753036713F9B1B522B@xxxxxxxxxxxxxxxxx> <4B8703B9.9000207@xxxxxxxx> <8115AF16522A3D4383C1FF753036713F9B1B52D8@xxxxxxxxxxxxxxxxx> <20100226144622.GP2761@xxxxxxxxxxx> <8115AF16522A3D4383C1FF753036713F9B1B54B0@xxxxxxxxxxxxxxxxx> <8115AF16522A3D4383C1FF753036713F9B21E600@xxxxxxxxxxxxxxxxx> <20100301151005.GA7881@xxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acq5VTMGRhpYjDhsR9mD4E0QlVnWzAAbL6Ew
Thread-topic: dom0 hang in xen-4.0.0-rc5 - possible acpi issue? [WAS: Using xen-unstable, dom0 hangs during boot]
> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> Sent: Monday, March 01, 2010 8:10 AM
> To: Nadolski, Ed
> Cc: Pasi Kärkkäinen; Jeremy Fitzhardinge; Xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] Using xen-unstable, dom0 hangs during boot
> 
> On Sun, Feb 28, 2010 at 04:47:21PM -0700, Nadolski, Ed wrote:
> > > -----Original Message-----
> > On 02/25/2010 02:18 PM, Nadolski, Ed wrote:
> > > I'm running Fedora 12 (kernel 2.6.31.5-127.fc12.x86_64) on a Dell
> T7500 Xeon with VT-x and VT-d. After building xen-unstable and
> rebooting, the dom0 Linux hangs a few seconds after it gets control
> from Xen, and I have to power-cycle to recover.   Here are the last
> messages before it hangs:
> > >
> > > [    2.766882] loop: module loaded
> > > [    2.767736] input: Macintosh mouse button emulation as
> /devices/virtual/input/input2
> > > [    2.769396] xen_set_ioapic_routing: irq 20 gsi 20 vector 20
> ioapic 0 pin 20 triggering 1 polarity 1
> > > [    2.770342] achi 0000:00:1f.2: PCI INT C ->  GSI 20 (level, low)
> ->  IRQ 20
> > > [    2.771158] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3
> Gbps 0x27 impl SATA mode
> > > [    2.772078] ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio
> ems
> > > <<hangs at this point>>
> >
> >
> >
> > I've added a bunch of trace prints. With serial ports enabled for
> trace capture, the hang actually occurs earlier than the ahci code
> above. It now occurs during the serial8250_config_port() function in
> the 8250/16650 serial driver initialization. There is a call to
> probe_irq_on(), which calls msleep(20), but the msleep() never returns.
> (see below)
> >
> > If I hit the power button on the front panel, it generates an
> interrupt that forces the msleep() to return.  Also, if I replace the
> msleep(20) with mdelay(20), the code does not hang at that point.  (In
> either case, the code does hang again a short while later.)
> >
> > I'm not too familiar with kernel internals - what could cause the
> msleep() not to return?  Possibly an interrupt gets missed, or is not
> getting unmasked?
> 
> I think you are hot on the trail. Try hitting 'i' (or maybe it is 'I')
> and see what Xen prints out for the IRQ mapping.  Earlier on you
> mentioned that you saw: "Xen: Cannot share IRQ0 with guest." which is a
> bit strange, considering you are booting Dom0. IRQ0 is usually the
> timer, but it looks as if the serial port is on interrupt 0? It
> shouldn't be - try adding some more printk's and find out what IRQ it
> thinks it is.
> 
> Also try to boot the kernel without Xen and see what IRQ the serial
> port driver uses then.


I've found out a bit more.  First, I've upgraded to Xen 4.0.0-rc5, but the 
problem persists. 

I've pasted some more trace below, including a WARN_ON() before the call to 
msleep().  The jumps in the timestamps show where msleep() hung and I hit the 
power button to force it to resume.

Looks like the serial8250 driver gets IRQ 3 for ttyS1.  I'm not clear what the 
"will not share" message for IRQ 0 means -- maybe it means Xen won't allow the 
IRQ to be shared with a guest?  It seems to happen in a loop that is 
initializing all the IRQs, not just the IRQ for the serial port.

Interestingly, I can make the hang go away by specifying 
"acpi_skip_timer_override" to xen in grub.conf.  AFAICT this is meant for some 
BIOS issues, but I don't think this system has a problem BIOS, since it cleanly 
boots Xen 3.4.1 & CentOS 5.3 dom0 without acpi_skip_timer_override.  Does that 
sound like maybe some kind of issue in the recent ACPI code?  Would that be in 
Xen or in the dom0 Linux?  

Thanks again,
Ed


Here is the partial trace, full trace is attached:

(XEN) Xen version 4.0.0-rc5 (root@) (gcc version 4.4.2 20091027 (Red Hat 
4.4.2-7) (GCC) ) Mon Mar  1 12:55:52 MST 2010
(XEN) Latest ChangeSet: Mon Mar 01 16:50:30 2010 +0000 20990:46bfb4a318e9
(XEN) Console output is synchronous.
(XEN) Command line: loglvl=all guest_loglvl=all sync_console console_to_ring 
com1=115200,8n1 console=com1
....
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.31.6 (root@truckee) (gcc version 4.4.2 
20091027 (Red Hat 4.4.2-7) (GCC) ) #3 SMP Mon Mar 1 12:54:12 MST 2010
[    0.000000] Command line: ro root=UUID=d9c5bf5d-23d1-445e-9210-e6ad0798a0ba 
nomodeset LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc 
KEYTABLE=us console=hvc0 earlyprintk=xen
[    0.000000] KERNEL supported cpus:
....
[    5.936124] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    5.942676] probe_irq_on: ENTRY!
(XEN) irq.c:1182:d0 Cannot bind IRQ 0 to guest. Will not share with others.
[    5.952512] ------------[ cut here ]------------
[    5.957180] WARNING: at 
/root/xen/xen-unstable.hg/linux-2.6-pvops.git/kernel/irq/autoprobe.c:69 
probe_irq_on+0xb3/0x213()
[    5.968172] Hardware name: Precision WorkStation T7500
[    5.973537] Modules linked in:
[    5.976648] Pid: 1, comm: swapper Not tainted 2.6.31.6 #3
[    5.982105] Call Trace:
[    5.984618]  [<ffffffff8106938f>] warn_slowpath_common+0x77/0x8f
[    5.990670]  [<ffffffff810693b6>] warn_slowpath_null+0xf/0x11
[    5.996467]  [<ffffffff810ae040>] probe_irq_on+0xb3/0x213
[    6.001926]  [<ffffffff812c79a9>] serial8250_config_port+0x781/0x98d
[    6.008324]  [<ffffffff812c3ed6>] uart_add_one_port+0x11d/0x301
[    6.014298]  [<ffffffff811f3b3d>] ? kobject_init+0x43/0x83
[    6.019842]  [<ffffffff81961a24>] serial8250_init+0xfe/0x143
[    6.025546]  [<ffffffff81961926>] ? serial8250_init+0x0/0x143
[    6.031347]  [<ffffffff8100a087>] do_one_initcall+0x59/0x179
[    6.037058]  [<ffffffff81938f7a>] kernel_init+0x16f/0x1c5
[    6.042515]  [<ffffffff81033d6a>] child_rip+0xa/0x20
[    6.047528]  [<ffffffff81032f27>] ? int_ret_from_sys_call+0x7/0x1b
[    6.053760]  [<ffffffff810336dd>] ? retint_restore_args+0x5/0x6
[    6.059731]  [<ffffffff81033d60>] ? child_rip+0x0/0x20
[    6.064930] ---[ end trace 11878b47d03d9332 ]---
[    6.069595] probe_irq_on: calling msleep(20)
<<<HANGS, PRESS POWER BUTTON>>>>
[   60.833667] probe_irq_on: Returned from msleep(20)
(XEN) irq.c:1182:d0 Cannot bind IRQ 0 to guest. Will not share with others.
[   60.845033] probe_irq_on: calling msleep(100) 
<<<HANGS, PRESS POWER BUTTON>>>>
[   76.386382] probe_irq_on: Returned from msleep(100)
[   76.391279] probe_irq_on: EXIT!
[   76.394535] probe_irq_on: ENTRY!
(XEN) irq.c:1182:d0 Cannot bind IRQ 0 to guest. Will not share with others.
[   76.404698] ------------[ cut here ]------------
[   76.409360] WARNING: at 
/root/xen/xen-unstable.hg/linux-2.6-pvops.git/kernel/irq/autoprobe.c:69 
probe_irq_on+0xb3/0x213()
[   76.420351] Hardware name: Precision WorkStation T7500
[   76.425718] Modules linked in:
[   76.428830] Pid: 1, comm: swapper Tainted: G        W  2.6.31.6 #3
[   76.435059] Call Trace:
[   76.437577]  [<ffffffff8106938f>] warn_slowpath_common+0x77/0x8f
[   76.443630]  [<ffffffff810693b6>] warn_slowpath_null+0xf/0x11
[   76.449428]  [<ffffffff810ae040>] probe_irq_on+0xb3/0x213
[   76.454885]  [<ffffffff812c79e2>] serial8250_config_port+0x7ba/0x98d
[   76.461285]  [<ffffffff812c3ed6>] uart_add_one_port+0x11d/0x301
[   76.467257]  [<ffffffff811f3b3d>] ? kobject_init+0x43/0x83
[   76.472800]  [<ffffffff81961a24>] serial8250_init+0xfe/0x143
[   76.478506]  [<ffffffff81961926>] ? serial8250_init+0x0/0x143
[   76.484305]  [<ffffffff8100a087>] do_one_initcall+0x59/0x179
[   76.490016]  [<ffffffff81938f7a>] kernel_init+0x16f/0x1c5
[   76.495473]  [<ffffffff81033d6a>] child_rip+0xa/0x20
[   76.500488]  [<ffffffff81032f27>] ? int_ret_from_sys_call+0x7/0x1b
[   76.506720]  [<ffffffff810336dd>] ? retint_restore_args+0x5/0x6
[   76.512691]  [<ffffffff81033d60>] ? child_rip+0x0/0x20
[   76.517886] ---[ end trace 11878b47d03d9333 ]---
[   76.522554] probe_irq_on: calling msleep(20) 
<<<HANGS, PRESS POWER BUTTON>>>>
[  109.284906] probe_irq_on: Returned from msleep(20)
(XEN) irq.c:1182:d0 Cannot bind IRQ 0 to guest. Will not share with others.
[  109.296271] probe_irq_on: calling msleep(100) 
<<<HANGS, PRESS POWER BUTTON>>>>
[  111.941064] probe_irq_on: Returned from msleep(100)
[  111.945863] probe_irq_on: EXIT!
[  111.949166] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[  111.956146] 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[  111.964407] brd: module loaded
[  111.968666] loop: module loaded
[  111.971867] input: Macintosh mouse button emulation as 
/devices/virtual/input/input2
[  111.980123] xen_set_ioapic_routing: irq 20 gsi 20 vector 20 ioapic 0 pin 20 
triggering 1 polarity 1
[  111.989062] ahci 0000:00:1f.2: PCI INT C -> GSI 20 (level, low) -> IRQ 20
[  111.996048] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x27 
impl SATA mode
[  112.004121] ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pio ems
<<<HANGS>>>








Attachment: trace_with_hang_ xen-4.0.0-rc5.txt
Description: trace_with_hang_ xen-4.0.0-rc5.txt

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel