[Xen-devel] Re: [Help] Bad page state in some process and #GPF exception
On 12/07/2010 03:06 AM, fanliang wrote:
> Hi all, I need urgent help because my Dom0 crashed. My dom0
> kernel is 2.6.32.12-xen. The server ran for about one week, then printed
> lots of "Bad page ..." messages in /var/log/messages. The "Bad page"
> message was printed about 60 times, and at last the "general protection
> fault" exception happened. Of course, the server crashed.
So it ran fine for a week, and then died with a sudden burst of these
bad page messages?
> I am trying to analyse the mm of the dom0 but have not made any progress.
> If you have any thoughts on how to approach these scenarios, I would
> appreciate it if you could shed some light.
Could you send the complete output? It's hard to analyse these messages
in isolation, since there might be a clue earlier which indicates how it
got into this state. Also, are you using a debug build of Xen? Are there
any Xen console messages?
And just to be sure: this hardware is definitely known to be stable when
running the same kernel native? Just to make sure we're chasing a
Xen-specific bug here.
> The whole log message is here:
> [946038.527830] BUG: Bad page state in process sh pfn:dab21
> [946038.528001] page:ffff8800077d8f38 flags:40000000004000d0 count:1
> mapcount:1 mapping:ffff8800f6e37979 index:7f25191d3
> [946038.528239] Pid: 19520, comm: sh Tainted: G N 2.6.32.12-0.7-xen #1
> [946038.528394] Call Trace:
> [946038.528549] [<ffffffff80009a75>] dump_trace+0x65/0x180
> [946038.528723] [<ffffffff8036d496>] dump_stack+0x69/0x73
> [946038.528876] [<ffffffff8009ccff>] bad_page+0xdf/0x160
> [946038.529033] [<ffffffff8009d9c8>] get_page_from_freelist+0x328/0x750
> [946038.529187] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946038.529347] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946038.529508] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946038.529657] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946038.529825] [<ffffffff80371738>] page_fault+0x28/0x30
> [946038.529992] [<00000000004258f7>] 0x4258f7
> [946038.530135] Disabling lock debugging due to kernel taint
> ...
> [946043.505745] BUG: Bad page state in process sh pfn:6cc5d
> [946043.509984] page:ffff880005fcd458 flags:40000000004000d0 count:2
> mapcount:2 mapping:ffff8800fd7558d1 index:7ffc76c65
> [946043.515684] Pid: 19520, comm: sh Tainted: G B D N 2.6.32.12-0.7-xen #1
> [946043.526914] Call Trace:
> [946043.533962] [<ffffffff80009a75>] dump_trace+0x65/0x180
> [946043.536785] [<ffffffff8036d496>] dump_stack+0x69/0x73
> [946043.541021] [<ffffffff8009ccff>] bad_page+0xdf/0x160
> [946043.546660] [<ffffffff8009d9c8>] get_page_from_freelist+0x328/0x750
> [946043.552307] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946043.557962] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946043.565000] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946043.570644] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946043.576286] [<ffffffff80371738>] page_fault+0x28/0x30
> [946043.581936] [<00000000004258f7>] 0x4258f7
> [946043.603080] general protection fault: 0000 [#3] SMP
> [946043.605030] last sysfs file:
> /sys/devices/pci0000:00/0000:00:1e.0/0000:08:00.0/irq
> [946043.605776] CPU 1
> [946043.607084] Modules linked in: tun(N) fuse(N) iptable_mangle(N)
> xt_physdev(N) xt_pkttype(N) ipt_MASQUERADE(N) iptable_nat(N) nf_nat(N)
> xt_tcpudp(N) bridge(N) domctl(N) ipmi_devintf(N) ipmi_si(N)
> ipmi_msghandler(N) cryptomgr(N) aead(N) pcompress(N)
> crypto_blkcipher(N) crc32c(N) crypto_hash(N) crypto_algapi(N)
> iscsi_tcp(N) libiscsi_tcp(N) libiscsi(N) scsi_transport_iscsi(N)
> 8021q(N) garp(N) stp(N) llc(N) bonding(N) microcode(N) binfmt_misc(N)
> ip6t_REJECT(N) nf_conntrack_ipv6(N) ip6table_raw(N) xt_NOTRACK(N)
> ipt_REJECT(N) xt_state(N) iptable_raw(N) iptable_filter(N)
> ip6table_mangle(N) nf_conntrack_netbios_ns(N) nf_conntrack_ipv4(N)
> nf_conntrack(N) nf_defrag_ipv4(N) ip_tables(N) ip6table_filter(N)
> ip6_tables(N) x_tables(N) ipv6(N) usbhid(N) hid(N) loop(N) dm_mod(N)
> i2c_i801(N) tpm_tis(N) tpm(N) 8250_pnp(N) tpm_bios(N) pcspkr(N)
> serio_raw(N) iTCO_wdt(N) i2c_core(N) iTCO_vendor_support(N) tg3(N)
> 8250(N) mptctl(N) serial_core(N) shpchp(N) pci_hotplug(N) button(N)
> uhci_hcd(N) ehci_hcd(N) usbcore(N) cdrom(N) edd(N) fan(N) thermal(N)
> processor(N) thermal_sys(N) ata_piix(N) libata(N) mptsas(N)
> mptscsih(N) mptbase(N) scsi_transport_sas(N) sg(N) sd_mod(N)
> crc_t10dif(N) scsi_mod(N)
> [946043.725795] Supported: Yes
> [946043.726476] Pid: 19520, comm: sh Tainted: G B D N
> 2.6.32.12-0.7-xen #1 Tecal BH620
> [946043.728684] RIP: e030:[<ffffffff8009d8c5>] [<ffffffff8009d8c5>]
> get_page_from_freelist+0x225/0x750
> [946043.742788] RSP: e02b:ffff88004ad9dc28 EFLAGS: 00010006
> [946043.748382] RAX: ffffffff805f4530 RBX: 00000000000200da RCX:
> dead000000200200
> [946043.752667] RDX: dead000000100100 RSI: dead000000100100 RDI:
> 0000000000000000
> [946043.761128] RBP: ffffffff805f4400 R08: 0000000000000100 R09:
> ffffffffa0a9fa60
> [946043.768182] R10: 0000000000000000 R11: 0000000000000001 R12:
> 00000000000200da
> [946043.775231] R13: ffff8800077d8f38 R14: 0000000000000001 R15:
> 0000000000000000
> [946043.782296] FS: 00007f5ef6aee700(0000) GS:ffff88000401a000(0000)
> knlGS:0000000000000000
> [946043.789343] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> [946043.797751] CR2: 00000000006997a0 CR3: 00000000b3c6e000 CR4:
> 0000000000002660
> [946043.803444] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [946043.812923] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [946043.818965] Process sh (pid: 19520, threadinfo ffff88004ad9c000,
> task ffff8800dee7c6c0)
> [946043.826025] Stack:
> [946043.834424] 0000000000000000 ffff88004ad9dfd8 0000003800000038
> 00000000000067c0
> [946043.841475] <0> 00000000000067c0 0000000000000001 00000000000088a0
> 0000000000000002
> [946043.849937] <0> 0000000000000000 00000000000067c0 0000000000000041
> ffff88004ad9dfd8
> [946043.859809] Call Trace:
> [946043.860490] [<ffffffff8009e089>] __alloc_pages_nodemask+0x109/0x630
> [946043.862650] [<ffffffff800b696d>] do_wp_page+0x3bd/0xb80
> [946043.868296] [<ffffffff800b78b5>] handle_mm_fault+0x785/0xd90
> [946043.873933] [<ffffffff80373acb>] do_page_fault+0x21b/0x400
> [946043.879579] [<ffffffff80371738>] page_fault+0x28/0x30
> [946043.885220] [<00000000004258f7>] 0x4258f7
> [946043.890856] Code: e0 04 48 8b 44 05 08 4c 8d 68 d8 49 8b 55 28 49
> 8b 45 30 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de
> 49 c1 e0 07 <48> 89 42 08 48 89 10 49 89 75 28 49 89 4d 30 42 83 6c 05
> 00 01
> [946043.936802] RIP [<ffffffff8009d8c5>]
> get_page_from_freelist+0x225/0x750
> [946043.938107] RSP <ffff88004ad9dc28>
> [946043.938781] ---[ end trace 39e1fc4956333a45 ]---
>
Thanks,
J
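
Since the bad page messages are hard to read in isolation, it can help to pull every report out of the full log and look at them together. The following is only a minimal sketch in Python, assuming the log has been saved as text in the same format as the excerpt quoted above. The function name `parse_bad_page_reports` and the dictionary keys are hypothetical, chosen for illustration, and are not part of any kernel or Xen tool.

```python
import re

# Header line of one report, for example:
#   [946038.527830] BUG: Bad page state in process sh pfn:dab21
HEADER = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)"
)
# struct page details printed on the line after the header, for example:
#   page:ffff8800077d8f38 flags:40000000004000d0 count:1
DETAILS = re.compile(r"page:([0-9a-f]+)\s+flags:([0-9a-f]+)\s+count:(-?\d+)")


def parse_bad_page_reports(log_text):
    """Return one dict per "Bad page state" report found in log_text.

    Both patterns are applied with search(), so a mail quoting prefix
    such as "> " in front of a line does not prevent a match.
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "time": float(header.group(1)),   # seconds since boot
                "comm": header.group(2),          # process name
                "pfn": int(header.group(3), 16),  # page frame number
            }
            reports.append(current)
            continue
        details = DETAILS.search(line)
        # Attach the details only to a report that does not have them yet.
        if details and current is not None and "page" not in current:
            current["page"] = int(details.group(1), 16)
            current["flags"] = int(details.group(2), 16)
            current["count"] = int(details.group(3))
    return reports
```

Passing the text of the complete log to `parse_bad_page_reports` gives a list that can be sorted or grouped by `pfn`, `flags` or `comm`, which makes it easier to see whether the reports share a pattern and how quickly they arrived before the general protection fault.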
[Xen-devel] Re: [Help] Bad page state in some process and #GPF exception made the Dom0 crashed !, Jeremy Fitzhardinge