WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] [SPAM] Re: kernel BUG at arch/x86/xen/mmu.c:1860! - ideas.

To: Andreas Olsowski <andreas.olsowski@xxxxxxxxxxx>
Subject: Re: [Xen-devel] [SPAM] Re: kernel BUG at arch/x86/xen/mmu.c:1860! - ideas.
From: Teck Choon Giam <giamteckchoon@xxxxxxxxx>
Date: Mon, 28 Mar 2011 20:29:22 +0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Delivery-date: Mon, 28 Mar 2011 05:29:57 -0700
In-reply-to: <4D907314.1040603@xxxxxxxxxxx>
References: <AANLkTimin0OZZUmvUr2KcXugc_2GGuEhRHLdug1ufha6@xxxxxxxxxxxxxx> <20110308192950.GA4562@xxxxxxxxxxxx> <20110308201002.GA5721@xxxxxxxxxxxx> <AANLkTikdA0vnxYzU7MFZNk3m6SH6=ns-WGVry2zCfws+@xxxxxxxxxxxxxx> <1299617407852-3414620.post@xxxxxxxxxxxxx> <20110309004318.GB10007@xxxxxxxxxxxx> <4D77251F.8070709@xxxxxxxxxxx> <20110309150023.GB6247@xxxxxxxxxxxx> <4D77DC0A.9090705@xxxxxxxxxxx> <4D78D5DE.4000609@xxxxxxxxxxx> <20110316155220.GA15150@xxxxxxxxxxxx> <4D907314.1040603@xxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Mon, Mar 28, 2011 at 7:37 PM, Andreas Olsowski
<andreas.olsowski@xxxxxxxxxxx> wrote:
>
>>  - turn on CONFIG_DEBUG_PAGEALLOC
>>  - turn on CONFIG_DEBUG_LIST
>>  - turn on CONFIG_DEBUG_KMEMLEAK
>>  - turn on CONFIG_JBD_DEBUG, CONFIG_JBD2_DEBUG
>>  - turn on CONFIG_SLUB_DEBUG_ON
>
> After I enabled those options (I don't use SLUB, I use SLAB) I no longer
> encounter any errors.
>
> I completed 1000 loops of snapshot/mount/umount/remove-snapshot.
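
For anyone who wants to repeat the 1000-loop test described above, a
minimal sketch of such a loop follows. The volume group, logical volume,
snapshot name, mount point and snapshot size are all hypothetical
placeholders (the thread does not give them), and the script only
collects the commands unless dry_run is turned off.

```python
import subprocess


def snapshot_cycle_commands(vg, lv, snap, mountpoint, size="1G"):
    """Return the commands for one snapshot/mount/umount/remove cycle.

    All names and the snapshot size are placeholders, not values from
    the original report.
    """
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        ["lvcreate", "-s", "-L", size, "-n", snap, origin],  # create snapshot
        ["mount", snap_dev, mountpoint],                      # mount it
        ["umount", mountpoint],                               # unmount it
        ["lvremove", "-f", snap_dev],                         # remove snapshot
    ]


def run_loops(loops, vg, lv, snap, mountpoint, dry_run=True):
    """Collect (and, if dry_run is False, execute) `loops` cycles.

    With dry_run=False the first failing command raises
    subprocess.CalledProcessError and stops the run.
    """
    executed = []
    for _ in range(loops):
        for cmd in snapshot_cycle_commands(vg, lv, snap, mountpoint):
            executed.append(cmd)
            if not dry_run:
                subprocess.run(cmd, check=True)
    return executed


if __name__ == "__main__":
    # Dry run: print what two cycles would execute.
    for cmd in run_loops(2, "vg0", "testlv", "testsnap", "/mnt/snap"):
        print(" ".join(cmd))
```

Running it for real needs root and a scratch volume; since lvremove can
hang for a long time when the bug triggers, it is worth watching dmesg in
a second terminal.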

Did you try with just CONFIG_DEBUG_PAGEALLOC=y, leaving the rest of
your config unchanged?  All my testing narrows it down to
CONFIG_DEBUG_PAGEALLOC=y being the option that prevents this BUG.
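
To compare configs between a crashing and a non-crashing kernel, a small
sketch like the one below can report which of the debug options from this
thread are set. It assumes the usual kernel .config text format
("CONFIG_X=y" and "# CONFIG_X is not set"); the sample text is made up
for illustration and is not anyone's real config.

```python
import re

# Options mentioned in this thread.
DEBUG_OPTIONS = [
    "CONFIG_DEBUG_PAGEALLOC",
    "CONFIG_DEBUG_LIST",
    "CONFIG_DEBUG_KMEMLEAK",
    "CONFIG_JBD_DEBUG",
    "CONFIG_JBD2_DEBUG",
    "CONFIG_SLUB_DEBUG_ON",
]


def parse_kernel_config(text):
    """Map option name -> value; explicitly unset options map to 'n'."""
    opts = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def report(text, wanted=DEBUG_OPTIONS):
    """Return the state of each wanted option ('absent' if not listed)."""
    opts = parse_kernel_config(text)
    return {name: opts.get(name, "absent") for name in wanted}


if __name__ == "__main__":
    # Illustrative sample only; read your own .config file instead.
    sample = """
CONFIG_DEBUG_PAGEALLOC=y
# CONFIG_DEBUG_LIST is not set
CONFIG_SLAB=y
"""
    for name, state in report(sample).items():
        print(f"{name}: {state}")
```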

>
>
> Without those options in 2.6.32.35 I hit a different bug earlier today:
>
> But you really have to be patient to see some output, because lvremove will
> hang quite a while:
> (a "while" being roughly the time it takes to: wait 5 min for the
> error, leave office, get coffee, smoke a cigarette, go to the restroom,
> return to office, finally see the error)
>
> kernel: BUG: unable to handle kernel paging request
> ...
> kernel: RIP  [<ffffffff8100f2bf>] xen_set_pmd+0x2f/0xb0
> syslog/dmesg output is attached as crash.2.6.32.35-xen_01 or available at:
> http://pastebin.com/Ad8MhUzD

I hit this before:

# grep 'xen_set_pmd' /var/log/messages*
/var/log/messages:Mar 27 09:31:14 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 09:31:14 xen05 kernel: RIP: e030:[<ffffffff8100e2d4>]  [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 09:31:14 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 09:06:10 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 09:06:10 xen05 kernel: RIP: e030:[<ffffffff8100e2d4>]  [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 09:06:10 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 15:18:57 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 15:18:57 xen05 kernel: RIP: e030:[<ffffffff8100e2d4>]  [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages:Mar 27 15:18:57 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages.1:Mar 23 11:00:16 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages.1:Mar 23 11:00:16 xen05 kernel: RIP: e030:[<ffffffff8100e2d4>]  [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b
/var/log/messages.1:Mar 23 11:00:17 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b

But I am unable to reproduce it with CONFIG_DEBUG_PAGEALLOC=y.
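
Each crash shows up as several matching lines in the grep output, so the
raw line count overstates how often the bug was hit. The sketch below
groups matches into events by symbol and timestamp, truncated to the
minute. The regular expression is an assumption fitted to the syslog
format seen in this mail (with wrapped lines rejoined), not a general
syslog parser, and a crash that straddles a minute boundary would be
counted twice.

```python
import re
from collections import Counter

# file, then ':' (match) or '-' (context), then "Mon DD HH:MM:SS host kernel: ..."
LINE_RE = re.compile(
    r"^(?P<file>\S+?)[:-](?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel:.*?"
    r"(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def count_events(lines):
    """Count log lines per (symbol, timestamp-to-the-minute) event."""
    events = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            minute = m.group("stamp")[:-3]  # drop ":SS"
            events[(m.group("sym"), minute)] += 1
    return events


if __name__ == "__main__":
    sample = [
        "/var/log/messages:Mar 27 09:31:14 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b",
        "/var/log/messages:Mar 27 09:31:14 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b",
        "/var/log/messages.1:Mar 23 11:00:16 xen05 kernel: IP: [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b",
        "/var/log/messages.1:Mar 23 11:00:17 xen05 kernel: RIP [<ffffffff8100e2d4>] xen_set_pmd+0x16/0x2b",
    ]
    for (sym, minute), n in count_events(sample).items():
        print(f"{minute}  {sym}  ({n} lines)")
```

Applied to the full grep output above, this kind of grouping gives four
separate xen_set_pmd crashes (three on Mar 27, one on Mar 23).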

>
> After that happened I did a kernel recompile without rebooting the machine
> first and encountered system_call_fastpath as the last call once more, as
> shown in crash.2.6.32.35-xen_02 or http://pastebin.com/kB38W5mp

I hit this at least once, but am unable to with CONFIG_DEBUG_PAGEALLOC=y:

/var/log/messages-Mar 27 17:04:39 xen05 kernel: ------------[ cut here ]------------
/var/log/messages-Mar 27 17:04:39 xen05 kernel: kernel BUG at arch/x86/xen/mmu.c:1872!
/var/log/messages-Mar 27 17:04:39 xen05 kernel: invalid opcode: 0000 [#1] SMP
/var/log/messages-Mar 27 17:04:39 xen05 kernel: last sysfs file: /sys/block/sdd/dev
/var/log/messages-Mar 27 17:04:39 xen05 kernel: CPU 2
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge stp be2iscsi iscsi_tcp bnx2i cnic uio ipv6 cxgb3i cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath scsi_dh video backlight output sbs sbshc power_meter hwmon battery acpi_memhotplug xen_acpi_memhotplug ac parport_pc lp parport tg3 libphy sg ide_cd_mod cdrom serio_raw button tpm_tis tpm tpm_bios i2c_i801 i2c_core shpchp iTCO_wdt pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ata_piix libata sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Pid: 5874, comm: lvcreate Not tainted 2.6.32.35-4.xen.pvops.choon.centos5 #1 PowerEdge 860
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RIP: e030:[<ffffffff8100cb5b>]  [<ffffffff8100cb5b>] pin_pagetable_pfn+0x53/0x59
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RSP: e02b:ffff8800303d1c28  EFLAGS: 00010282
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RAX: 00000000ffffffea RBX: 000000000003032d RCX: 0000000000000181
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RDX: 00000000deadbeef RSI: 00000000deadbeef RDI: 00000000deadbeef
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RBP: ffff8800303d1c48 R08: 0000000000000968 R09: ffff880000000000
/var/log/messages-Mar 27 17:04:39 xen05 kernel: R10: 00000000deadbeef R11: ffff8800303d1d08 R12: 0000000000000003
/var/log/messages-Mar 27 17:04:39 xen05 kernel: R13: 000000000003032d R14: ffff880030360000 R15: 00007fd324a00000
/var/log/messages-Mar 27 17:04:39 xen05 kernel: FS: 00007fd327d2e710(0000) GS:ffff880028089000(0000) knlGS:0000000000000000
/var/log/messages-Mar 27 17:04:39 xen05 kernel: CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
/var/log/messages-Mar 27 17:04:39 xen05 kernel: CR2: 00000000004612f0 CR3: 000000003a025000 CR4: 0000000000002660
/var/log/messages-Mar 27 17:04:39 xen05 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
/var/log/messages-Mar 27 17:04:39 xen05 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Process lvcreate (pid: 5874, threadinfo ffff8800303d0000, task ffff880030360000)
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Stack:
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  0000000000000000 00000000002027a9 000000013eb43318 000000000003032d
/var/log/messages-Mar 27 17:04:39 xen05 kernel: <0> ffff8800303d1c68 ffffffff8100e07c ffff880032be05c0 ffff880032aa9928
/var/log/messages-Mar 27 17:04:39 xen05 kernel: <0> ffff8800303d1c78 ffffffff8100e0af ffff8800303d1cb8 ffffffff810a4433
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Call Trace:
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff8100e07c>] xen_alloc_ptpage+0x64/0x69
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff8100e0af>] xen_alloc_pte+0xe/0x10
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a4433>] __pte_alloc+0x70/0xce
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a45d1>] handle_mm_fault+0x140/0x8b9
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a50c9>] __get_user_pages+0x37f/0x479
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a76ca>] __mlock_vma_pages_range+0xc0/0x16f
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff8131c03f>] ? _spin_unlock_irqrestore+0x11/0x13
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a78db>] mlock_fixup+0x162/0x199
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a7989>] do_mlockall+0x77/0x8d
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff81139016>] ? security_capable+0x27/0x29
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  [<ffffffff810a7ce2>] sys_mlockall+0x8f/0xb9
/var/log/messages:Mar 27 17:04:39 xen05 kernel:  [<ffffffff81012ac2>] system_call_fastpath+0x16/0x1b
/var/log/messages-Mar 27 17:04:39 xen05 kernel: Code: 48 b8 ff ff ff ff ff ff ff 7f 48 21 c2 48 89 55 e8 48 8d 7d e0 be 01 00 00 00 31 d2 41 ba f0 7f 00 00 e8 e9 c7 ff ff 85 c0 74 04 <0f> 0b eb fe c9 c3 55 40 f6 c7 01 48 89 e5 53 48 89 fb 74 5b 48
/var/log/messages-Mar 27 17:04:39 xen05 kernel: RIP [<ffffffff8100cb5b>] pin_pagetable_pfn+0x53/0x59
/var/log/messages-Mar 27 17:04:39 xen05 kernel:  RSP <ffff8800303d1c28>
/var/log/messages-Mar 27 17:04:39 xen05 kernel: ---[ end trace bf36c55d2ecd52e5 ]---
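
One detail in the register dump may be worth decoding. The bytes just
before the faulting <0f> 0b in the Code: line (85 c0 74 04) look like a
test of %eax followed by a conditional jump over the BUG, so RAX
probably still holds the return value of the call that
pin_pagetable_pfn had just made. That reading is an inference from the
dump, not something confirmed in this thread. A minimal sketch of the
arithmetic, reading the low 32 bits of RAX as a signed errno-style
return code and assuming Linux errno numbering:

```python
import errno


def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def describe_return(rax_hex):
    """Render a register value as a signed return code with an errno name.

    The errno name comes from the host's errno table, which is assumed
    (not verified here) to match the numbering used by the callee.
    """
    code = as_signed32(int(rax_hex, 16))
    if code < 0:
        name = errno.errorcode.get(-code, "unknown")
        return f"{code} (-{name})"
    return str(code)


if __name__ == "__main__":
    # RAX value copied from the oops above.
    print(describe_return("00000000ffffffea"))
```

On a Linux host this reports -22, i.e. -EINVAL, which would suggest the
request was rejected as invalid rather than failing for lack of memory.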

>
>
> Maybe this helps, but I think, if anything, this makes it worse, as the debug
> options actually suppressed the problem that needs to be debugged.

True.  At least now we have narrowed it down to something related to
CONFIG_DEBUG_PAGEALLOC.  Maybe Konrad or Jeremy can have a closer look
at the related code...

Thanks.

Kindest regards,
Giam Teck Choon

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
