WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

[Xen-devel] Re: 2.6.39 crashes BUG: unable to handle kernel NULL pointer

To: John Stultz <john.stultz@xxxxxxxxxx>
Subject: [Xen-devel] Re: 2.6.39 crashes BUG: unable to handle kernel NULL pointer dereference at 000000000000042 .. cmos_checkintr+0x4d/0x55 under Xen as PV guest.
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Fri, 18 Mar 2011 22:51:35 -0400
Cc: tglx@xxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
Delivery-date: Fri, 18 Mar 2011 19:52:39 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1300485566.2731.46.camel@work-vm>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110318203830.GA9262@xxxxxxxxxxxx> <1300485566.2731.46.camel@work-vm>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Mar 18, 2011 at 02:59:26PM -0700, John Stultz wrote:
> On Fri, 2011-03-18 at 16:38 -0400, Konrad Rzeszutek Wilk wrote:
> > Haven't done any bisection, but looking at the latest set of
> > patches John's name is on them. (John, congrats on a new job)
> 
> No new job, I'm still at IBM. Just a new email address, as I'm working
> as part of the Linaro effort. My old addresses still work too, and I'll
> continue to use them for non-Linaro work.
> 

Aah, OK.
> 
> > With the latest linus/master I get this when starting a Xen Linux PV
> > guest:
> > 
> > [    0.404760] initcall psmouse_init+0x0/0x79 returned 0 after 59 usecs
> > [    0.404767] calling  cmos_init+0x0/0x6a @ 1
> > [    0.464855] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
> > [    0.464867] IP: [<ffffffff8105d347>] queue_work_on+0x4/0x1d
> [snip]
> > [    0.465018] Call Trace:
> > [    0.465023]  [<ffffffff8105d38f>] queue_work+0x1a/0x1c
> > [    0.465029]  [<ffffffff8105d3a4>] schedule_work+0x13/0x15
> > [    0.465035]  [<ffffffff81331b2e>] rtc_update_irq+0x10/0x12
> > [    0.465041]  [<ffffffff81333939>] cmos_checkintr+0x4d/0x55
> > [    0.465047]  [<ffffffff81333987>] cmos_irq_disable+0x46/0x4e
> > [    0.465051]  [<ffffffff8133481d>] cmos_set_alarm+0xd9/0x16e
> > [    0.465051]  [<ffffffff813320a4>] __rtc_set_alarm+0x7d/0x88
> > [    0.465051]  [<ffffffff813321fa>] rtc_timer_enqueue+0x71/0xb8
> > [    0.465051]  [<ffffffff81331707>] ? rtc_tm_to_time+0x2f/0x38
> > 
> > ... full log at the end.
> > 
> > From a brief look it looks as if rtc_device_register was never
> > called, so
> > 
> > INIT_WORK(&rtc->irqwork, rtc_timer_do_work);
> > 
> > was never called, and hence schedule_work tries to dereference an
> > uninitialized rtc->irqwork.
> > 
> > Which actually sounds right - the rtc_device_register should not
> > be called since there are no RTC clocks exposed.
> 
> 
> Huh. Did you see this with 2.6.38 vanilla? Just want to clarify if this

No. 2.6.38 vanilla works great.
> is 2.6.39 only or not.

It is something new.
> 
> 
> > There are probably two ways of fixing this - making rtc_update_irq
> > check the rtc->irqwork (not attempted) or inhibit cmos_pnp_probe from
> > setting this up. Looking at cmos_pnp_probe and its friends, they all
> > call cmos_wake_setup, but never check whether that function works properly.
> > 
> > The cmos_wake_setup checks for ACPI (which is disabled for PV guests)
> > and just returns.
> > 
> > This little patch seems to work, but not sure if that is the correct
> > way to do it?
> 
> So I'm still trying to get my head around this (sorry, just back from
> vacation).

No problem. I just noticed it today.
> 
> So the issue is that somehow the cmos code is calling rtc_update_irq
> even though there is no cmos rtc device registered. That clearly seems
> problematic.
> 
> However, it's unclear from both the code and your patch if
> cmos_wake_setup or cmos_do_probe is causing the rtc_update_irq to be
> called.

> cmos_do_probe() has lots of checks for the hardware and even registers
> the rtc device (which should init the irqwork), so I don't see how the
> null irqwork would trip after that point.

It isn't a pointer actually. I thought it would be, but then I realized it
was a 'struct work_struct' which just hadn't been initialized.
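
To illustrate the first option mentioned above (having rtc_update_irq check
the work item before scheduling it), here is a rough userspace sketch. It is
not kernel code: struct work_item, struct rtc_dev, init_work() and
update_irq_checked() are made-up stand-ins, and treating a NULL func pointer
as "never initialized" is an assumption that only holds if the containing
structure was zeroed when it was allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item; not the real definition. */
struct work_item {
    void (*func)(struct work_item *work);   /* stays NULL until init_work() */
};

/* Hypothetical stand-in for the RTC device structure. */
struct rtc_dev {
    int irq_count;
    struct work_item irqwork;               /* embedded member, not a pointer */
};

int handled;                                /* counts how often the handler ran */

/* Example handler, playing the role of the deferred RTC work function. */
void rtc_do_work(struct work_item *work)
{
    (void)work;
    handled++;
}

/* Plays the role of INIT_WORK(): attach a handler to the work item. */
void init_work(struct work_item *work, void (*func)(struct work_item *))
{
    assert(func != NULL);
    work->func = func;
}

/*
 * Guarded update: refuse to touch the work item if the device is missing
 * or the work item was never initialized. Returns true if the handler ran.
 */
bool update_irq_checked(struct rtc_dev *rtc)
{
    if (rtc == NULL || rtc->irqwork.func == NULL)
        return false;

    rtc->irq_count++;
    rtc->irqwork.func(&rtc->irqwork);       /* run inline instead of queueing */
    return true;
}
```

Called on a zeroed struct rtc_dev, update_irq_checked() returns false and does
nothing; once init_work(&rtc.irqwork, rtc_do_work) has run, it returns true
and the handler executes. A check like this only hides the symptom; it does
not explain why the update path is reached before registration.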

> 
> Any insight there?

I hoped you might have :-)
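
For comparison, the second option (having the probe path check whether the
wake setup succeeded, and bail out if not) might look roughly like the
userspace sketch below. The patch itself is not reproduced in this message, so
this is only a guess at the general shape: struct platform, wake_setup(),
do_probe() and pnp_probe() are hypothetical names, and returning -ENODEV when
ACPI is disabled is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the environment the driver is probing. */
struct platform {
    bool acpi_disabled;     /* true in the PV guest case described above */
    bool has_rtc_ports;     /* whether any RTC hardware is exposed */
};

int registered;             /* counts successful device registrations */

/* Reports failure instead of silently returning when ACPI is unavailable. */
int wake_setup(const struct platform *p)
{
    assert(p != NULL);
    if (p->acpi_disabled)
        return -ENODEV;
    return 0;
}

/* Stand-in for the real probe: registers the device if hardware exists. */
int do_probe(const struct platform *p)
{
    if (!p->has_rtc_ports)
        return -ENODEV;
    registered++;
    return 0;
}

/* Probe entry point: stop early if the wake setup did not succeed. */
int pnp_probe(const struct platform *p)
{
    int err = wake_setup(p);

    if (err)
        return err;
    return do_probe(p);
}
```

With acpi_disabled set, pnp_probe() returns -ENODEV and nothing is registered;
with ACPI available and RTC ports present, it returns 0 and registers once.
Whether failing the whole probe is acceptable on non-ACPI hardware that does
have a CMOS RTC is exactly the open question.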
> 
> thanks
> -john
> 
> 
