xen-devel

[Xen-devel] Re: [Help] Bad page state in some process and #GPF exception

To: fanliang <fanliang@xxxxxxxxxx>
Subject: [Xen-devel] Re: [Help] Bad page state in some process and #GPF exception made the Dom0 crashed !
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue, 07 Dec 2010 10:25:40 -0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, keir@xxxxxxx
Delivery-date: Tue, 07 Dec 2010 10:28:38 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <001401cb95fe$d87af640$8750a60a@xxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <001401cb95fe$d87af640$8750a60a@xxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101103 Fedora/1.0-0.33.b2pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.6
On 12/07/2010 03:06 AM, fanliang wrote:
> Hi all, I need urgent help because my Dom0 crashed. My dom0
> kernel is 2.6.32.12-xen. The server ran for about one week, then printed
> lots of "Bad page ..." messages in /var/log/message. The "Bad page" message
> was printed about 60 times and at last the "general protection fault"
> exception happened. Of course, the server crashed.

So it ran fine for a week, and then died with a sudden burst of these
bad page messages?

> I am trying to analyse the mm of the dom0 but have not made any progress.
> If you have any thoughts on how to approach these scenarios, I would
> appreciate it if you could shed some light.

Could you send the complete output? It's hard to analyse these messages
in isolation, since there might be a clue earlier which indicates how it
got into this state. Also are you using a debug build of Xen? Are there
any Xen console messages?

And just to be sure: is this hardware definitely known to be stable when
running the same kernel natively? I want to confirm that we're chasing a
Xen-specific bug here.

> The whole log message is here:
> [946038.527830] BUG: Bad page state in process sh pfn:dab21
> [946038.528001] page:ffff8800077d8f38 flags:40000000004000d0 count:1
> mapcount:1 mapping:ffff8800f6e37979 index:7f25191d3
> [946038.528239] Pid: 19520, comm: sh Tainted: G N 2.6.32.12-0.7-xen #1
> [946038.528394] Call Trace:
> [946038.528549] [<ffffffff80009a75>] dump_trace+0x65/0x180
> [946038.528723] [<ffffffff8036d496>] dump_stack+0x69/0x73
> [946038.528876] [<ffffffff8009ccff>] bad_page+0xdf/0x160
> [946038.529033] [<ffffffff8009d9c8>] get_page_from_freelist+0x328/0x750
> [946038.529187] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946038.529347] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946038.529508] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946038.529657] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946038.529825] [<ffffffff80371738>] page_fault+0x28/0x30
> [946038.529992] [<00000000004258f7>] 0x4258f7
> [946038.530135] Disabling lock debugging due to kernel taint
> ...
> [946043.505745] BUG: Bad page state in process sh pfn:6cc5d
> [946043.509984] page:ffff880005fcd458 flags:40000000004000d0 count:2
> mapcount:2 mapping:ffff8800fd7558d1 index:7ffc76c65
> [946043.515684] Pid: 19520, comm: sh Tainted: G B D N 2.6.32.12-0.7-xen #1
> [946043.526914] Call Trace:
> [946043.533962] [<ffffffff80009a75>] dump_trace+0x65/0x180
> [946043.536785] [<ffffffff8036d496>] dump_stack+0x69/0x73
> [946043.541021] [<ffffffff8009ccff>] bad_page+0xdf/0x160
> [946043.546660] [<ffffffff8009d9c8>] get_page_from_freelist+0x328/0x750
> [946043.552307] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946043.557962] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946043.565000] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946043.570644] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946043.576286] [<ffffffff80371738>] page_fault+0x28/0x30
> [946043.581936] [<00000000004258f7>] 0x4258f7
> [946043.603080] general protection fault: 0000 [#3] SMP
> [946043.605030] last sysfs file:
> /sys/devices/pci0000:00/0000:00:1e.0/0000:08:00.0/irq
> [946043.605776] CPU 1
> [946043.607084] Modules linked in: tun(N) fuse(N) iptable_mangle(N)
> xt_physdev(N) xt_pkttype(N) ipt_MASQUERADE(N) iptable_nat(N) nf_nat(N)
> xt_tcpudp(N) bridge(N) domctl(N) ipmi_devintf(N) ipmi_si(N)
> ipmi_msghandler(N) cryptomgr(N) aead(N) pcompress(N)
> crypto_blkcipher(N) crc32c(N) crypto_hash(N) crypto_algapi(N)
> iscsi_tcp(N) libiscsi_tcp(N) libiscsi(N) scsi_transport_iscsi(N)
> 8021q(N) garp(N) stp(N) llc(N) bonding(N) microcode(N) binfmt_misc(N)
> ip6t_REJECT(N) nf_conntrack_ipv6(N) ip6table_raw(N) xt_NOTRACK(N)
> ipt_REJECT(N) xt_state(N) iptable_raw(N) iptable_filter(N)
> ip6table_mangle(N) nf_conntrack_netbios_ns(N) nf_conntrack_ipv4(N)
> nf_conntrack(N) nf_defrag_ipv4(N) ip_tables(N) ip6table_filter(N)
> ip6_tables(N) x_tables(N) ipv6(N) usbhid(N) hid(N) loop(N) dm_mod(N)
> i2c_i801(N) tpm_tis(N) tpm(N) 8250_pnp(N) tpm_bios(N) pcspkr(N)
> serio_raw(N) iTCO_wdt(N) i2c_core(N) iTCO_vendor_support(N) tg3(N)
> 8250(N) mptctl(N) serial_core(N) shpchp(N) pci_hotplug(N) button(N)
> uhci_hcd(N) ehci_hcd(N) usbcore(N) cdrom(N) edd(N) fan(N) thermal(N)
> processor(N) thermal_sys(N) ata_piix(N) libata(N) mptsas(N)
> mptscsih(N) mptbase(N) scsi_transport_sas(N) sg(N) sd_mod(N)
> crc_t10dif(N) scsi_mod(N)
> [946043.725795] Supported: Yes
> [946043.726476] Pid: 19520, comm: sh Tainted: G B D N
> 2.6.32.12-0.7-xen #1 Tecal BH620
> [946043.728684] RIP: e030:[<ffffffff8009d8c5>] [<ffffffff8009d8c5>]
> get_page_from_freelist+0x225/0x750
> [946043.742788] RSP: e02b:ffff88004ad9dc28 EFLAGS: 00010006
> [946043.748382] RAX: ffffffff805f4530 RBX: 00000000000200da RCX:
> dead000000200200
> [946043.752667] RDX: dead000000100100 RSI: dead000000100100 RDI:
> 0000000000000000
> [946043.761128] RBP: ffffffff805f4400 R08: 0000000000000100 R09:
> ffffffffa0a9fa60
> [946043.768182] R10: 0000000000000000 R11: 0000000000000001 R12:
> 00000000000200da
> [946043.775231] R13: ffff8800077d8f38 R14: 0000000000000001 R15:
> 0000000000000000
> [946043.782296] FS: 00007f5ef6aee700(0000) GS:ffff88000401a000(0000)
> knlGS:0000000000000000
> [946043.789343] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [946043.797751] CR2: 00000000006997a0 CR3: 00000000b3c6e000 CR4:
> 0000000000002660
> [946043.803444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [946043.812923] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [946043.818965] Process sh (pid: 19520, threadinfo ffff88004ad9c000,
> task ffff8800dee7c6c0)
> [946043.826025] Stack:
> [946043.834424] 0000000000000000 ffff88004ad9dfd8 0000003800000038
> 00000000000067c0
> [946043.841475] <0> 00000000000067c0 0000000000000001 00000000000088a0
> 0000000000000002
> [946043.849937] <0> 0000000000000000 00000000000067c0 0000000000000041
> ffff88004ad9dfd8
> [946043.859809] Call Trace:
> [946043.860490] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946043.862650] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946043.868296] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946043.873933] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946043.879579] [<ffffffff80371738>] page_fault+0x28/0x30
> [946043.885220] [<00000000004258f7>] 0x4258f7
> [946043.890856] Code: e0 04 48 8b 44 05 08 4c 8d 68 d8 49 8b 55 28 49
> 8b 45 30 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de
> 49 c1 e0 07 <48> 89 42 08 48 89 10 49 89 75 28 49 89 4d 30 42 83 6c 05
> 00 01
> [946043.936802] RIP [<ffffffff8009d8c5>]
> get_page_from_freelist+0x225/0x750
> [946043.938107] RSP <ffff88004ad9dc28>
> [946043.938781] ---[ end trace 39e1fc4956333a45 ]---
>
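
When the complete output is available, the repeated "Bad page state"
records in a log like the one quoted above can be tabulated before reading
the individual traces. The following is only a rough sketch: the function
names (strip_quoting, parse_bad_pages, summarise) are made up for this
illustration, and the regular expression assumes the field layout seen in
the quoted 2.6.32 messages, which other kernel versions may not share.

```python
import re
from collections import Counter

# Matches a "Bad page state" header and the page-detail fields after it.
# Assumption: the layout is the one in the quoted 2.6.32 log; it is not a
# general parser for every kernel version.
BAD_PAGE_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+BUG: Bad page state in process "
    r"(?P<comm>\S+)\s+pfn:(?P<pfn>[0-9a-f]+)"
    r".*?flags:(?P<flags>[0-9a-f]+)\s+count:(?P<count>-?\d+)"
    r"\s+mapcount:(?P<mapcount>-?\d+)",
    re.DOTALL,
)


def strip_quoting(text):
    """Drop leading '> ' mail-quote markers so wrapped lines rejoin cleanly."""
    return "\n".join(re.sub(r"^\s*>\s?", "", line) for line in text.splitlines())


def parse_bad_pages(text):
    """Return one dict per "Bad page state" record found in the log text."""
    records = []
    for m in BAD_PAGE_RE.finditer(strip_quoting(text)):
        records.append({
            "time": float(m.group("time")),      # seconds since boot
            "comm": m.group("comm"),             # process name
            "pfn": int(m.group("pfn"), 16),      # page frame number
            "flags": int(m.group("flags"), 16),  # raw page flags word
            "count": int(m.group("count")),
            "mapcount": int(m.group("mapcount")),
        })
    return records


def summarise(records):
    """Report how many records each process produced and the time they span."""
    if not records:
        return {"total": 0, "span_seconds": 0.0, "by_process": {}}
    times = [r["time"] for r in records]
    return {
        "total": len(records),
        "span_seconds": max(times) - min(times),
        "by_process": dict(Counter(r["comm"] for r in records)),
    }
```

Passing the text of the saved log to parse_bad_pages() and then to
summarise() shows how many pages were reported, by which processes, and
over what interval, which helps tell a sudden burst from a slow build-up.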


Thanks,
J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
