This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: BUG: unable to handle kernel NULL pointer dereference at

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: [Xen-devel] Re: BUG: unable to handle kernel NULL pointer dereference at IP: [<ffffffff8105ae4c>] process_one_work+
From: Scott Garron <xen-devel@xxxxxxxxxxxxxxxxxx>
Date: Tue, 14 Jun 2011 17:55:47 -0400
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Delivery-date: Tue, 14 Jun 2011 14:56:37 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110614135543.GA27849@xxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110606191725.GZ32595@xxxxxxxxxxx> <4DED478E.5070607@xxxxxxxxxxxxxxxxxx> <20110607191949.GB2075@xxxxxxxxxxxx> <4DEFBE7F.5060909@xxxxxxxxxxxxxxxxxx> <20110608192916.GA4909@xxxxxxxxxxxx> <4DF12747.2090900@xxxxxxxxxxxxxxxxxx> <20110610125906.GA10831@xxxxxxxxxxxx> <4DF24B79.8050909@xxxxxxxxxxxxxxxxxx> <20110613220352.GA23755@xxxxxxxxxxxx> <4DF69B42.4080908@xxxxxxxxxxxxxxxxxx> <20110614135543.GA27849@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20101226 Icedove/3.0.11
On 06/14/2011 09:55 AM, Konrad Rzeszutek Wilk wrote:
> But the curious thing is that you have two CPUs assigned to Dom0 and
> while CPU0 looks to be bouncing back and forth, CPU1 is doing
> something. The RIP is 0xffffffff8108820c. Can you try to run this
> through System.map? Or the whole bunch of these:
>
> ffffffff8108820c ffffffff81088100 ffffffff810881a7 ffffffff8108811a
> ffffffff816101a8 ffffffff81006c32 ffffffff816114a4 ffffffff8108803a
> ffffffff8105f5bd ffffffff81618564 ffffffff81617973 ffffffff816117a1
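For anyone following along, resolving each of those addresses amounts to finding the last System.map symbol whose address is at or below the RIP. A minimal sketch (the three-line map below is a made-up stand-in, not the real System.map; since the addresses are fixed-width lowercase hex, a plain string comparison in awk sorts the same as a numeric one):

```shell
# Made-up stand-in for the real /boot/System.map-<version>:
cat > /tmp/System.map.example <<'EOF'
ffffffff81088100 T worker_thread
ffffffff81088200 t manage_workers
ffffffff81088300 T flush_workqueue
EOF

# Print the last symbol at or below the given address.  System.map is
# sorted, and fixed-width lowercase hex compares correctly as strings,
# so no numeric conversion is needed:
resolve() {
    awk -v rip="$1" '$1 <= rip { sym = $3 } END { print sym }' \
        /tmp/System.map.example
}

resolve ffffffff8108820c    # lands inside manage_workers in this toy map
```

Running each of the dozen addresses through a helper like this gives the symbol (and, by subtraction, the offset) for every sampled RIP.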

     I grabbed code snippets for each of these locations and put them here:


> The other idea is to limit Dom0 to only run on one CPU. You can do
> this by booting with 'dom0_max_vcpus=1 dom0_vcpus_pin' and seeing if
> it fails somewhere else. It will probably die in 0xffffffff810013aa :-(

     After setting dom0_max_vcpus=1 and dom0_vcpus_pin, the boot got
as far as "Trying to unpack rootfs image as initramfs..." and hung
there.  The serial console output, along with the Ctrl-A (x3)
debug-key output, is here:


> But regardless of what I mentioned above, we need to find out why
> process_one_work got a toxic parameter. Can you disassemble
> 0xffffffff8105ae4c and see what it does and how it corresponds to
> 'process_one_work' in kernel/workqueue.c?
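One way to follow that request (a sketch; the symbol start address below is a hypothetical placeholder, not taken from the real System.map): disassemble the function containing the RIP against the vmlinux that matches the crashing kernel, then compute the instruction's offset into process_one_work by subtracting the symbol's start address:

```shell
# Needs the vmlinux (with symbols) matching the crashing kernel:
#   gdb vmlinux -batch -ex 'disassemble 0xffffffff8105ae4c'
# or, bracketing the address by hand:
#   objdump -d vmlinux --start-address=0xffffffff8105ade0 \
#                      --stop-address=0xffffffff8105af00
#
# The offset into process_one_work is the RIP minus the symbol's start
# address from System.map (the start address here is a made-up
# placeholder for illustration):
printf 'process_one_work+0x%x\n' \
    $(( 0xffffffff8105ae4c - 0xffffffff8105ade0 ))
```

The `process_one_work+0x<offset>` form can then be matched against the disassembly to find the exact faulting instruction.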

     I put the disassembly of it in the hailstorm-debugnotes.txt file
that I mentioned above.  Let me know if you need more than that.

> You can also instrument the code to find out what this is:
>
> 1804         work_func_t f = work->func;


     I think this request is starting to go a little beyond what I know
how to do.

Scott Garron

Xen-devel mailing list