Re: [Xen-devel] PVops domain 0 crash on NUMA system only Node==1 present
To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] PVops domain 0 crash on NUMA system only Node==1 present (Was: Re: Bug#603632: linux-image-2.6.32-5-xen-amd64: Linux kernel 2.6.32/xen/amd64 booting fine on bare metal, but not as dom0 with Xen 4.0.1 (Dell R410))
From: Vincent Caron <vcaron@xxxxxxxxxxxxx>
Date: Thu, 25 Nov 2010 13:51:57 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Vincent CARON <zerodeux@xxxxxxxxxxxx>, Cris Daniluk <cris.daniluk@xxxxxxxxx>, 603632@xxxxxxxxxxxxxxx, Ian Campbell <ijc@xxxxxxxxxxxxxx>
In-reply-to: <4CEC06C1.5010500@xxxxxxxx>
Organization: Bearstech
References: <20101115233253.11935.35707.reportbug@zerohal> <1290513067.31507.7699.camel@xxxxxxxxxxxxxxxxxxxxxx> <4CEC06C1.5010500@xxxxxxxx>
On Tue, 2010-11-23 at 10:24 -0800, Jeremy Fitzhardinge wrote:
> On 11/23/2010 03:51 AM, Ian Campbell wrote:
> > I'm not sure but looking at the complete bootlog it looks as if the
> > system may only have node==1 i.e. no 0 node which could plausibly lead
> > to this sort of issue:
> > [ 0.000000] Bootmem setup node 1 0000000000000000-0000000040000000
> > [ 0.000000] NODE_DATA [0000000000008000 - 000000000000ffff]
> > [ 0.000000] bootmap [0000000000010000 - 0000000000017fff] pages 8
> > [ 0.000000] (8 early reservations) ==> bootmem [0000000000 - 0040000000]
> > [ 0.000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
> > [ 0.000000] #1 [0003446000 - 0003465000] XEN PAGETABLES ==> [0003446000 - 0003465000]
> > [ 0.000000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
> > [ 0.000000] #3 [0001000000 - 0001694994] TEXT DATA BSS ==> [0001000000 - 0001694994]
> > [ 0.000000] #4 [00016b5000 - 0003244e00] RAMDISK ==> [00016b5000 - 0003244e00]
> > [ 0.000000] #5 [0003245000 - 0003446000] XEN START INFO ==> [0003245000 - 0003446000]
> > [ 0.000000] #6 [0001695000 - 000169532d] BRK ==> [0001695000 - 000169532d]
> > [ 0.000000] #7 [0000100000 - 00002e0000] PGTABLE ==> [0000100000 - 00002e0000]
> > [ 0.000000] found SMP MP-table at [ffff8800000fe710] fe710
> > [ 0.000000] Zone PFN ranges:
> > [ 0.000000] DMA 0x00000000 -> 0x00001000
> > [ 0.000000] DMA32 0x00001000 -> 0x00100000
> > [ 0.000000] Normal 0x00100000 -> 0x00100000
> > [ 0.000000] Movable zone start PFN for each node
> > [ 0.000000] early_node_map[2] active PFN ranges
> > [ 0.000000] 1: 0x00000000 -> 0x000000a0
> > [ 0.000000] 1: 0x00000100 -> 0x00040000
> > [ 0.000000] On node 1 totalpages: 262048
> > [ 0.000000] DMA zone: 56 pages used for memmap
> > [ 0.000000] DMA zone: 483 pages reserved
> > [ 0.000000] DMA zone: 3461 pages, LIFO batch:0
> > [ 0.000000] DMA32 zone: 3528 pages used for memmap
> > [ 0.000000] DMA32 zone: 254520 pages, LIFO batch:31
> >
> > Perhaps we should be passing numa_node_id() (e.g. current node) instead
> > of node 0? There doesn't seem to be another obvious alternative to
> > passing in an explicit node number to this callchain (some places cope
> > with -1 but not this path AFAICT).
>
> Does booting native get the same configuration?
Booting native with the same Xen-enabled kernel gives:
[ 0.000000] Bootmem setup node 0 0000000130000000-0000000230000000
[ 0.000000] NODE_DATA [0000000130000000 - 0000000130007fff]
[ 0.000000] bootmap [0000000130008000 - 0000000130027fff] pages 20
[ 0.000000] (8 early reservations) ==> bootmem [0130000000 - 0230000000]
[ 0.000000] #0 [0000000000 - 0000001000] BIOS data page
[ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE
[ 0.000000] #2 [0001000000 - 0001694994] TEXT DATA BSS
[ 0.000000] #3 [0037656000 - 0037fefb18] RAMDISK
[ 0.000000] #4 [000009ec00 - 0000100000] BIOS reserved
[ 0.000000] #5 [0001695000 - 000169532d] BRK
[ 0.000000] #6 [0000008000 - 000000c000] PGTABLE
[ 0.000000] #7 [000000c000 - 0000011000] PGTABLE
[ 0.000000] Bootmem setup node 1 0000000000000000-0000000130000000
[ 0.000000] NODE_DATA [0000000000011000 - 0000000000018fff]
[ 0.000000] bootmap [0000000000019000 - 000000000003efff] pages 26
[ 0.000000] (8 early reservations) ==> bootmem [0000000000 - 0130000000]
[ 0.000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
[ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
[ 0.000000] #2 [0001000000 - 0001694994] TEXT DATA BSS ==> [0001000000 - 0001694994]
[ 0.000000] #3 [0037656000 - 0037fefb18] RAMDISK ==> [0037656000 - 0037fefb18]
[ 0.000000] #4 [000009ec00 - 0000100000] BIOS reserved ==> [000009ec00 - 0000100000]
[ 0.000000] #5 [0001695000 - 000169532d] BRK ==> [0001695000 - 000169532d]
[ 0.000000] #6 [0000008000 - 000000c000] PGTABLE ==> [0000008000 - 000000c000]
[ 0.000000] #7 [000000c000 - 0000011000] PGTABLE ==> [000000c000 - 0000011000]
[ 0.000000] found SMP MP-table at [ffff8800000fe710] fe710
[ 0.000000] [ffffea0004280000-ffffea00043fffff] potential offnode page_structs
[ 0.000000] [ffffea0000000000-ffffea00043fffff] PMD -> [ffff880001800000-ffff8800051fffff] on node 1
[ 0.000000] [ffffea0004400000-ffffea0007bfffff] PMD -> [ffff880130200000-ffff8801339fffff] on node 0
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000000 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal 0x00100000 -> 0x00230000
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[4] active PFN ranges
[ 0.000000] 1: 0x00000000 -> 0x000000a0
[ 0.000000] 1: 0x00000100 -> 0x000cf679
[ 0.000000] 1: 0x00100000 -> 0x00130000
[ 0.000000] 0: 0x00130000 -> 0x00230000
[ 0.000000] On node 0 totalpages: 1048576
[ 0.000000] Normal zone: 14336 pages used for memmap
[ 0.000000] Normal zone: 1034240 pages, LIFO batch:31
[ 0.000000] On node 1 totalpages: 1046041
[ 0.000000] DMA zone: 56 pages used for memmap
[ 0.000000] DMA zone: 109 pages reserved
[ 0.000000] DMA zone: 3835 pages, LIFO batch:0
[ 0.000000] DMA32 zone: 14280 pages used for memmap
[ 0.000000] DMA32 zone: 831153 pages, LIFO batch:31
[ 0.000000] Normal zone: 2688 pages used for memmap
[ 0.000000] Normal zone: 193920 pages, LIFO batch:31
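To compare the two boots, the node IDs that own memory can be read from the `early_node_map` lines of each log. The following is only a minimal sketch in Python, assuming the log layout shown in this thread; the function name `nodes_with_memory` and the two log constants are hypothetical and not part of any Xen or kernel tool.

```python
import re

# Matches range lines such as "[ 0.000000] 1: 0x00000000 -> 0x000000a0",
# i.e. "<node id>: <start PFN> -> <end PFN>" listed under early_node_map.
RANGE_RE = re.compile(r"\]\s+(\d+):\s+0x([0-9a-fA-F]+)\s+->\s+0x([0-9a-fA-F]+)")


def nodes_with_memory(log_text):
    """Return {node_id: page_count}, summed over the active PFN ranges."""
    totals = {}
    for line in log_text.splitlines():
        match = RANGE_RE.search(line)
        if match:
            node = int(match.group(1))
            start = int(match.group(2), 16)
            end = int(match.group(3), 16)
            totals[node] = totals.get(node, 0) + (end - start)
    return totals


# Range lines copied from the dom0 boot log quoted earlier in the thread.
DOM0_LOG = """\
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 1: 0x00000000 -> 0x000000a0
[ 0.000000] 1: 0x00000100 -> 0x00040000
"""

# Range lines copied from the native boot log above.
NATIVE_LOG = """\
[ 0.000000] early_node_map[4] active PFN ranges
[ 0.000000] 1: 0x00000000 -> 0x000000a0
[ 0.000000] 1: 0x00000100 -> 0x000cf679
[ 0.000000] 1: 0x00100000 -> 0x00130000
[ 0.000000] 0: 0x00130000 -> 0x00230000
"""

print("dom0:  ", nodes_with_memory(DOM0_LOG))
print("native:", nodes_with_memory(NATIVE_LOG))
```

The summed ranges agree with the `totalpages` lines in both logs: the dom0 boot reports memory on node 1 only (262048 pages), while the native boot reports it on node 0 (1048576 pages) and node 1 (1046041 pages).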
> > It's also not obvious if dom0 should be seeing the tables which describe
> > the host's nodes anyway or if we should be clobbering something. Given
> > that dom0 sees a pseudo-physical address map I'm not convinced seeing
> > the real SRAT is in any way beneficial. Perhaps we should simply be
> > clobbering NUMAness until actual PV understanding of NUMA is ready?
>
> Yes, the host SRAT is meaningless in the domain and we really should
> ignore it. I'm not sure what happens if you boot on a really NUMA system.
>
> > One thing I notice when googling R410 issues is that they apparently
> > have a "Cores per CPU" BIOS option which might be worth playing with,
> > since configuring a reduced number of cores might remove node 0 but not
> > node 1 (odd but not invalid?). Presumably it is also worth making sure
> > you have the latest BIOS etc.
>
> Also, what's the DIMM configuration? Are the slots fully populated?
8 slots, 4 populated: slots #0, #1, #4 and #5 each hold a 2 GiB DIMM
(according to lshw; set up by Dell).
I switched off hyperthreading in the BIOS settings (the default is 'on')
because I had issues with it under Xen 3.2 (related to floating vcpus,
which I had to pin to fix random crashes). Also, I don't think HT is
significant for my usage. I'm used to seeing strange bugs as soon as I
tweak Dell BIOSes, so I thought I'd mention it.
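
As a rough cross-check of the DIMM configuration against the native boot log, the per-node `totalpages` figures can be converted to GiB. This is only a sketch in Python: it assumes the standard 4 KiB x86-64 page size, and the helper name `pages_to_gib` is hypothetical.

```python
PAGE_SIZE = 4096  # bytes; assumes the standard 4 KiB x86-64 page


def pages_to_gib(pages):
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / 2**30


# "On node N totalpages" values copied from the two boot logs in this thread.
native = {0: 1048576, 1: 1046041}
dom0 = {1: 262048}
installed_gib = 4 * 2  # four populated slots, 2 GiB each

for node, pages in sorted(native.items()):
    print(f"native node {node}: {pages_to_gib(pages):.2f} GiB")
total = pages_to_gib(sum(native.values()))
print(f"native total: {total:.2f} GiB of {installed_gib} GiB installed")
print(f"dom0 node 1: {pages_to_gib(dom0[1]):.2f} GiB")
```

Node 0 works out to exactly 4 GiB and node 1 to slightly less, which would fit two DIMMs behind each node, although the mapping of slots to nodes is an assumption here and not something the logs state.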