Re: [Xen-devel] Hunting down an oops in Xen 3.1.0's 2.6.18 kernel

To:	"Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
Subject:	Re: [Xen-devel] Hunting down an oops in Xen 3.1.0's 2.6.18 kernel
From:	"Michael Marineau" <mike@xxxxxxxxxxxx>
Date:	Wed, 3 Oct 2007 13:39:24 -0700
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date:	Wed, 03 Oct 2007 13:40:07 -0700
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxx
In-reply-to:	<c0526ddf0709171656n784e95fbp7cb4874106c06d57@xxxxxxxxxxxxxx>
List-help:	<mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>
List-post:	<mailto:xen-devel@lists.xensource.com>
List-subscribe:	<http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe:	<http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References:	<c0526ddf0709141551x29edd2b6j85dd740ebd8cf929@xxxxxxxxxxxxxx> <C3114346.D95B%Keir.Fraser@xxxxxxxxxxxx> <c0526ddf0709171656n784e95fbp7cb4874106c06d57@xxxxxxxxxxxxxx>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

On 9/17/07, Michael Marineau <mike@xxxxxxxxxxxx> wrote:
> On 9/15/07, Keir Fraser <Keir.Fraser@xxxxxxxxxxxx> wrote:
> > On 14/9/07 23:51, "Michael Marineau" <mike@xxxxxxxxxxxx> wrote:
> >
> > > I have been unable to reproduce this with 3.0.4's 2.6.16 kernel but
> > > 2.6.18 will oops on both 3.0.4 and 3.1.0. Also, x86_64 appears to be
> > > ok.
> > >
> > > I'm guessing this issue is the same as the oops reported here:
> > > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=975
> > >
> > > Below is an example of the oops on my 2.6.18 pae kernel with a couple
> > > extra debugging lines added:
> >
> > Looks like xen_l1_entry_update() is passed a virtual address which has no
> > corresponding machine address. So the pte page or its mapping is corrupted
> > somehow. deadbeef in the register dumps is also not a good sign. I'll have a
> > go at repro'ing.
> >
> >  -- Keir
> >
> >
> >
>
> As for the deadbeef, I kind of doubt it is important. Those values
> show up after the hypercall to Xen. Using the attached patch, which
> checks for the bogus value prior to the call, I get the following oops:
>
> virtptr: f57b40c0 machineptr: 7fffffff0c0
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/hypervisor.c:64!
> invalid opcode: 0000 [#1]
> SMP
> Modules linked in:
> CPU:    0
> EIP:    0061:[<c0117893>]    Not tainted VLI
> EFLAGS: 00010286   (2.6.18-xen-r5-try2 #10)
> EIP is at xen_l1_entry_update+0xd7/0x100
> eax: 0000002d   ebx: 00000000   ecx: 00000000   edx: 00000001
> esi: fffff0c0   edi: 000007ff   ebp: ed45cd10   esp: ed45ccd8
> ds: 007b   es: 007b   ss: 0069
> Process bash (pid: 5044, ti=ed45c000 task=ec835a70 task.ti=ed45c000)
> Stack: c037b964 f57b40c0 fffff0c0 000007ff 00000000 00000000 f57b40c0 fffff0c0
>        000007ff 00000000 00000000 00000000 00000000 00000000 ed45cd84 c01586b7
>        35371025 00000000 ecd95ec0 ecd95f08 c04bce70 00000000 00000004 00000000
> Call Trace:
>  [<c01586b7>] zap_pte_range+0x265/0x658
>  [<c0158c16>] unmap_page_range+0x16c/0x2b4
>  [<c0158e2c>] unmap_vmas+0xce/0x1cb
>  [<c015f0b8>] exit_mmap+0x7d/0xf4
>  [<c011e0f3>] mmput+0x36/0x8c
>  [<c01782d3>] exec_mmap+0x156/0x229
>  [<c0178a78>] flush_old_exec+0x59/0x25a
>  [<c0198a18>] load_elf_binary+0x33c/0xc52
>  [<c0178f2a>] search_binary_handler+0x89/0x23c
>  [<c017922f>] do_execve+0x152/0x1be
>  [<c010391c>] sys_execve+0x32/0x84
>  [<c0104dfb>] syscall_call+0x7/0xb
>  [<b7efd899>] 0xb7efd899
> Code: b4 97 fe ff 85 c0 78 42 83 c4 2c 5b 5e 5f 5d c3 8b 45 e0 89 74
> 24 08 89 7c 24 0
> EIP: [<c0117893>] xen_l1_entry_update+0xd7/0x100 SS:ESP 0069:ed45ccd8
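
For anyone reading the dump above, here is a minimal sketch of how the printed machineptr breaks down. It assumes edi:esi hold the high and low 32 bits of the 64-bit PAE machine address, which matches the printed value. The INVALID_P2M_ENTRY and FOREIGN_FRAME_BIT values are quoted from memory of the 2.6.18-xen headers, so treat them as assumptions rather than confirmed definitions.

```python
# Decode the addresses from the oops; all inputs are copied from the dump.
PAGE_SHIFT = 12                      # 4 KiB pages on i386/PAE
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Assumed constants (from memory of the 2.6.18-xen headers, not verified here).
INVALID_P2M_ENTRY = 0xFFFFFFFF       # assumed: ~0UL on a 32-bit kernel
FOREIGN_FRAME_BIT = 1 << 31          # assumed: top bit of a p2m entry


def combine(hi, lo):
    """Join two 32-bit register values into one 64-bit address."""
    return (hi << 32) | lo


def split(addr):
    """Return (frame number, offset within page) for an address."""
    return addr >> PAGE_SHIFT, addr & PAGE_MASK


virtptr = 0xF57B40C0                             # "virtptr" from the debug line
machineptr = combine(0x000007FF, 0xFFFFF0C0)     # edi:esi from the register dump

mfn, machine_off = split(machineptr)
_, virt_off = split(virtptr)

print("machineptr:", hex(machineptr))
print("mfn:", hex(mfn), "offset:", hex(machine_off))
print("offsets match:", machine_off == virt_off)
print("mfn is masked invalid entry:",
      mfn == (INVALID_P2M_ENTRY & ~FOREIGN_FRAME_BIT))
```

If those constants are right, the frame number 0x7fffffff is what an invalid p2m entry looks like once the foreign-frame bit is masked off, which would fit Keir's reading that the virtual address has no corresponding machine address. The page offset (0x0c0) is carried over from the virtual pointer unchanged.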

I can still reproduce this problem on the 3.1.1-rc2 xen kernel. Has
anyone had a chance to take a look at this or try to reproduce it? I
can reproduce this far too easily :-(

Is there any further debugging information I can provide?

-- 
Michael Marineau
Oregon State University
mike@xxxxxxxxxxxx

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
