xen-devel

Re: [Xen-devel] Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel pag

To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
From: Jed Smith <jsmith@xxxxxxxxxx>
Date: Thu, 20 Aug 2009 15:11:03 -0400
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Thu, 20 Aug 2009 12:11:43 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4A8B0BD2.2060304@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Linode, LLC
References: <4A89F539.2070801@xxxxxxxxxxxx> <4A8A2E74.9010202@xxxxxxxx> <4A8AB7B8.4070905@xxxxxxxxxxxx> <4A8B0BD2.2060304@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.22 (Macintosh/20090605)
Jeremy Fitzhardinge wrote:
>>> Is it new with 2.6.30.5?

Perhaps earlier, and we're just now running into it.  I am able to
reproduce on the v2.6.30 release.  My initial bisect leads me here (from
bad=v2.6.30 and good=v2.6.29 in linux-2.6.git):

commit 9049a11de73d3ecc623f1903100d099f82ede56c
Merge: c47c1b1 e4d0407
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
Date:   Wed Feb 11 11:52:22 2009 -0800

    Merge commit 'remotes/tip/x86/paravirt' into x86/untangle2

I astutely note, however, that this is a pretty large merge commit.
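
For anyone following along, the search that git bisect performs between a good and a bad revision can be sketched roughly as below. This is only an illustration under simplifying assumptions: a linear history and a test that fails reliably. The commit names and the is_bad callback are hypothetical, and real git bisect also has to walk merges like the one above.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history c0 (good) .. c7 (bad), with the regression at c5.
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 5)
print(culprit)
```

Each step halves the remaining range, which is why a range as large as v2.6.29..v2.6.30 is still practical to test by hand.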

> Have you tried any other distros?  I'll try to repro with a current Xen
> and my Fedora system.

I used an Arch domU to test, as this happens a few steps into init's run
there.  The process that bugs varies widely, but it's always a few
scripts in.  We can reproduce this on two versions of our software
stack, which both run Xen 3.2.1-rc5 (xm info from one):

release                : 2.6.18.8-524-1
version                : #1 SMP Tue Apr 22 16:31:28 EDT 2008
machine                : i686

xen_major              : 3
xen_minor              : 2
xen_extra              : .1-rc5
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096

cc_compiler            : gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
cc_compile_date        : Fri Apr 11 11:24:13 EDT 2008

Newer hypervisors starting with v3.3.0 do not exhibit this behavior.
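
To check a number of hosts against that observation, something like the following sketch could read `xm info` output and flag hypervisors older than 3.3.0. The function names are my own, the sample text is trimmed from the output above, and the 3.3.0 threshold is only what I have observed here, not a confirmed boundary for the bug.

```python
import re


def parse_xm_info(text: str) -> dict:
    """Parse 'key : value' lines; lines without a colon are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def xen_version(info: dict) -> tuple:
    """Build (major, minor, patch) from xen_major, xen_minor and xen_extra."""
    m = re.match(r"\.(\d+)", info.get("xen_extra", ""))
    patch = int(m.group(1)) if m else 0
    return int(info["xen_major"]), int(info["xen_minor"]), patch


def possibly_affected(info: dict) -> bool:
    # Assumption: anything below 3.3.0 may show the BUG, per the tests above.
    return xen_version(info) < (3, 3, 0)


sample = """\
xen_major              : 3
xen_minor              : 2
xen_extra              : .1-rc5
xen_scheduler          : credit
cc_compile_date        : Fri Apr 11 11:24:13 EDT 2008
"""
info = parse_xm_info(sample)
print(xen_version(info), possibly_affected(info))
```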

Now then, the bisection --

I ended up at 9049a11 in linux-2.6.git as noted above, and I tried to
identify those patches in xen.git.  I'm not entirely sure my bisection
from that point was accurate (I could not reproduce a stack trace), and
I'll let you bisect it given your familiarity with xen.git.

I have a feeling the hypervisor version is important here as, again,
v3.3.0 and up do not BUG.

What's interesting is that they all produce a stack trace, but the
location of the BUG changes.  Here is an example from my bisection at
f402a65:

-------

kernel BUG at kernel/sched.c:1184!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/ram11/removable
Modules linked in:

Pid: 1196, comm: sed Not tainted (2.6.29-rc4-bisect-00246-gf402a65 #12)
EIP: 0061:[<c011f1d3>] EFLAGS: 00010046 CPU: 2
EIP is at resched_task+0x63/0x70
EAX: 00000000 EBX: c05b3a80 ECX: 00000000 EDX: 00000000
ESI: d60d37f0 EDI: c12db200 EBP: 00000001 ESP: d4de9e20
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process sed (pid: 1196, ti=d4de8000 task=d5ae7030 task.ti=d4de8000)
Stack:
 c05b3a80 d6252030 c01260c7 00000001 00000003 00000000 d4f09e44 c12d1068
 00000001 00000001 c013f78b d4de9ea0 d4f09e44 c12d1068 c0120513 d4de9ea0
 00000003 c12d1074 c12d1070 d4de9ea0 00000001 00000200 c0120c4e 00000000
Call Trace:
 [<c01260c7>] try_to_wake_up+0xb7/0x1f0
 [<c013f78b>] autoremove_wake_function+0x1b/0x50
 [<c0120513>] __wake_up_common+0x43/0x70
 [<c0120c4e>] __wake_up+0x3e/0x60
 [<c013f6de>] __wake_up_bit+0x2e/0x40
 [<c0177719>] __do_fault+0x239/0x450
 [<c0165500>] filemap_fault+0x0/0x400
 [<c017962a>] handle_mm_fault+0x16a/0x900
 [<c01051ee>] __raw_callee_save_xen_restore_fl+0x6/0x8
 [<c018872c>] kfree+0x6c/0x80
 [<c0118e24>] do_page_fault+0x114/0x240
 [<c0118d10>] do_page_fault+0x0/0x240
 [<c05af91a>] error_code+0x72/0x78
Code: a1 04 61 79 c0 39 c2 74 0e 0f ae f0 89 f6 8b 46 04 f6 40 0c 04 74 09 5b 5e c3 8d b6 00 00 00 00 89 d0 ff 15 50 df 6d c0 5b 5e c3 <0f> 0b eb fe 89 f6 8d bc 27 00 00 00 00 53 89 c3 8b 0c 85 80 f4
EIP: [<c011f1d3>] resched_task+0x63/0x70 SS:ESP 0069:d4de9e20

-------
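
Since the BUG location moves between runs, it may help to reduce each saved trace to its BUG site, faulting symbol and call chain before comparing them. The sketch below is one way to do that, assuming the 32-bit oops layout shown above. The summarize_oops name and the regular expressions are mine, and the sample is a shortened copy of the trace above.

```python
import re

# One call-trace entry, e.g. "[<c01260c7>] try_to_wake_up+0xb7/0x1f0".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(text: str) -> dict:
    """Extract the BUG site, the faulting symbol and the call-trace symbols."""
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", text)
    eip = re.search(r"EIP is at ([\w.]+)\+0x", text)
    frames = []
    in_trace = False
    for line in text.splitlines():
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            m = FRAME.match(line.strip())
            if not m:
                break  # first non-frame line ends the call trace
            frames.append(m.group(1))
    return {
        "bug": (bug.group(1), int(bug.group(2))) if bug else None,
        "eip": eip.group(1) if eip else None,
        "frames": frames,
    }


sample_oops = """\
kernel BUG at kernel/sched.c:1184!
EIP is at resched_task+0x63/0x70
Call Trace:
 [<c01260c7>] try_to_wake_up+0xb7/0x1f0
 [<c013f78b>] autoremove_wake_function+0x1b/0x50
 [<c0120513>] __wake_up_common+0x43/0x70
EIP: [<c011f1d3>] resched_task+0x63/0x70 SS:ESP 0069:d4de9e20
"""
result = summarize_oops(sample_oops)
print(result)
```

Running this over each bisect log and grouping by the "bug" and "eip" fields would show at a glance how many distinct failure sites there are.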

I have saved everything from every bisect run, and uploaded it here:

   http://lateralus.jedsmith.org/

Let me know if I can help further.


Yours,

Jed Smith
Systems Developer
Linode, LLC
+1 (609) 593-7103 x1209
jsmith@xxxxxxxxxx
PGP: 0xA6611ED6


> 
>     J
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
> 

