WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

[Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61

To: <giamteckchoon@xxxxxxxxx>
Subject: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Thu, 14 Apr 2011 15:56:49 +0800
Cc: jeremy@xxxxxxxx, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, konrad.wilk@xxxxxxxxxx
Delivery-date: Thu, 14 Apr 2011 00:58:04 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <BANLkTinNxLnJxtZD68ODLSJqafq0tDRPfw@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <COL0-MC1-F14hmBzxHs00230882@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, <BLU157-w488E5FEBD5E2DBC0666EF1DAA70@xxxxxxx>, <BLU157-w5025BFBB4B1CDFA7AA0966DAA90@xxxxxxx>, <BLU157-w540B39FBA137B4D96278D2DAA90@xxxxxxx>, <BANLkTimgh_iip27zkDPNV9r7miwbxHmdVg@xxxxxxxxxxxxxx>, <BANLkTimkMgYNyANcKiZu5tJTL4==zdP3xg@xxxxxxxxxxxxxx>, <BLU157-w116F1BB57ABFDE535C7851DAA80@xxxxxxx>, <4DA3438A.6070503@xxxxxxxx>, <BLU157-w2C6CD57CEA345B8D115E8DAAB0@xxxxxxx>, <BLU157-w36F4E0A7503A357C9DE6A3DAAB0@xxxxxxx>, <20110412100000.GA15647@xxxxxxxxxxxx>, <BLU157-w14B84A51C80B41AB72B6CBDAAD0@xxxxxxx>, <BANLkTinNxLnJxtZD68ODLSJqafq0tDRPfw@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

 
> Date: Thu, 14 Apr 2011 15:26:14 +0800
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> From: giamteckchoon@xxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; jeremy@xxxxxxxx; konrad.wilk@xxxxxxxxxx
>
> 2011/4/14 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> > Hi:
> >
> >       I've run the test with "cpuidle=0 cpufreq=none"; two machines crashed.
> >
> > blktap_sysfs_destroy
> > blktap_sysfs_destroy
> > blktap_sysfs_create: adding attributes for dev ffff8800ad581000
> > blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
> > ------------[ cut here ]------------
> > kernel BUG at arch/x86/mm/tlb.c:61!
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/block/tapdeve/dev
> > CPU 0
> > Modules linked in: 8021q garp blktap xen_netback xen_blkback blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iTCO_wdt pata_acpi soundcore iTCO_vendor_support ata_generic snd_page_alloc pcspkr ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> > Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
> > RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> > RSP: e02b:ffff88002803ee48  EFLAGS: 00010046
> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980
> > RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200
> > R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80
> > R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000
> > FS:  00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:0000000000000000
> > CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff8800a9ed0000)
> > Stack:
> >  ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88
> > <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee78
> > <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100e8
> > Call Trace:
> >  <IRQ>
> >  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>
> >  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> > RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> >  RSP <ffff88002803ee48>
> > ---[ end trace 1522f17fdfc9162d ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Pid: 8022, comm: khelper Tainted: G      D    2.6.32.36xen #1
> > Call Trace:
> >  <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
> >  [<ffffffff8144006a>] ? init_amd+0x296/0x37a
>
> Hmmm... are both machines using AMD CPUs? Did you hit the same bug on an Intel CPU?
>
>
 
It is an Intel CPU, not AMD.
 
model name      : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
 

> >  [<ffffffff8100f169>] ? xen_force_evtchn_callback+0xd/0xf
> >  [<ffffffff8100f8c2>] ? check_events+0x12/0x20
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
> >  [<ffffffff81448165>] oops_end+0xb6/0xc6
> >  [<ffffffff810166e5>] die+0x5a/0x63
> >  [<ffffffff81447a3c>] do_trap+0x115/0x124
> >  [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100f6e6>] ? xen_clocksource_read+0x21/0x23
> >  [<ffffffff8100f258>] ? HYPERVISOR_vcpu_op+0xf/0x11
> >  [<ffffffff8100f753>] ? xen_vcpuop_set_next_event+0x52/0x67
> >  [<ffffffff81013b3b>] invalid_op+0x1b/0x20
> >  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
> >
> >> Date: Tue, 12 Apr 2011 06:00:00 -0400
> >> From: konrad.wilk@xxxxxxxxxx
> >> To: tinnycloud@xxxxxxxxxxx
> >> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; giamteckchoon@xxxxxxxxx;
> >> jeremy@xxxxxxxx
> >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> >>
> >> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
> >> >
> >> > Hi :
> >> >
> >> > We are using the pvops kernel 2.6.32.36 + Xen 4.0.1, but have hit a
> >> > kernel panic.
> >> >
> >> > 2.6.32.36 Kernel:
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> >> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
> >> >
> >> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM
> >> > rebooting in a loop every 15 minutes.
> >>
> >> What is the storage that you are using for your guests? AoE? Local disks?
> >>
> >> > About 17 machines are involved in the test. After a 10-hour run, one
> >> > hit a crash at arch/x86/mm/tlb.c:61.
> >> >
> >> > Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's
> >> > suggestion.
> >> >
> >> > Any comments, thanks.
> >> >
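A note for anyone repeating the "cpuidle=0 cpufreq=none" run: those are Xen hypervisor boot options, so they belong on the xen.gz line, not on the dom0 kernel line. An illustrative grub.conf entry (device names and image paths below are assumptions, not taken from the reporter's setup):

```
# /boot/grub/grub.conf -- illustrative entry only
title Xen 4.0.1, pv_ops dom0 (2.6.32.36)
        root (hd0,0)
        kernel /xen-4.0.1.gz cpuidle=0 cpufreq=none
        module /vmlinuz-2.6.32.36 ro root=/dev/sda1 console=tty0
        module /initrd-2.6.32.36.img
```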


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel