WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split

To: Andre Przywara <andre.przywara@xxxxxxx>
Subject: Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split
From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Date: Mon, 31 Jan 2011 08:04:45 +0100
Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>
Delivery-date: Sun, 30 Jan 2011 23:05:18 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4D42C153.5050104@xxxxxxx>
Organization: Fujitsu Technology Solutions
References: <4D41FD3A.5090506@xxxxxxx> <4D426673.7020200@xxxxxxxxxxxxxx> <4D42A35D.3050507@xxxxxxx> <4D42AC00.8050109@xxxxxxxxxxxxxx> <4D42C153.5050104@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20101226 Iceowl/1.0b1 Icedove/3.0.11
On 01/28/11 14:14, Andre Przywara wrote:

>> Do I understand correctly?
>> No crash with only dom0_max_vcpus= and no crash with only dom0_mem= ?
> Yes, see my previous mail to George.
>
>> Could you try this patch?
> Ok, the crash dump is as follows:

Hmm, is the new crash reproducible as well?
Seems not to be directly related to my diagnosis patch...

Currently I have no NUMA machine available. I tried the numa=fake=...
boot parameter, but it seems to fake only the NUMA memory nodes; all CPUs are
still in node 0:

(XEN) 'u' pressed -> dumping numa info (now-0x120:5D5E0203)
(XEN) idx0 -> NODE0 start->0 size->524288
(XEN) phys_to_nid(0000000000001000) -> 0 should be 0
(XEN) idx1 -> NODE1 start->524288 size->524288
(XEN) phys_to_nid(0000000080001000) -> 1 should be 1
(XEN) idx2 -> NODE2 start->1048576 size->524288
(XEN) phys_to_nid(0000000100001000) -> 2 should be 2
(XEN) idx3 -> NODE3 start->1572864 size->1835008
(XEN) phys_to_nid(0000000180001000) -> 3 should be 3
(XEN) CPU0 -> NODE0
(XEN) CPU1 -> NODE0
(XEN) CPU2 -> NODE0
(XEN) CPU3 -> NODE0
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 3003121):
(XEN)     Node 0: 433864
(XEN)     Node 1: 258522
(XEN)     Node 2: 514315
(XEN)     Node 3: 1796420
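
The mismatch in the dump above (four memory nodes, but every CPU in node 0) can be checked mechanically. The following is a minimal sketch, not something from the Xen tree: the function name summarize_numa_dump and the regular expressions are assumptions made for illustration, and they only match the line shapes that appear in this particular 'u' key output.

```python
import re

# Excerpt of the 'u' debug-key output quoted above.
NUMA_DUMP = """\
(XEN) idx0 -> NODE0 start->0 size->524288
(XEN) idx1 -> NODE1 start->524288 size->524288
(XEN) idx2 -> NODE2 start->1048576 size->524288
(XEN) idx3 -> NODE3 start->1572864 size->1835008
(XEN) CPU0 -> NODE0
(XEN) CPU1 -> NODE0
(XEN) CPU2 -> NODE0
(XEN) CPU3 -> NODE0
(XEN) Domain 0 (total: 3003121):
(XEN)     Node 0: 433864
(XEN)     Node 1: 258522
(XEN)     Node 2: 514315
(XEN)     Node 3: 1796420
"""

# Hypothetical patterns, written only against the lines shown above.
MEM_RE = re.compile(r"idx\d+ -> NODE(\d+) start->(\d+) size->(\d+)")
CPU_RE = re.compile(r"CPU(\d+) -> NODE(\d+)")
DOM_NODE_RE = re.compile(r"Node (\d+): (\d+)")
DOM_TOTAL_RE = re.compile(r"\(total: (\d+)\)")


def summarize_numa_dump(text):
    """Collect memory nodes, the CPU-to-node mapping and per-node page counts."""
    memory_nodes = set()
    cpus_by_node = {}
    pages_by_node = {}
    domain_total = None

    for line in text.splitlines():
        m = MEM_RE.search(line)
        if m:
            memory_nodes.add(int(m.group(1)))
            continue
        m = CPU_RE.search(line)
        if m:
            cpus_by_node.setdefault(int(m.group(2)), []).append(int(m.group(1)))
            continue
        m = DOM_TOTAL_RE.search(line)
        if m:
            domain_total = int(m.group(1))
            continue
        m = DOM_NODE_RE.search(line)
        if m:
            pages_by_node[int(m.group(1))] = int(m.group(2))

    return {
        "memory_nodes": sorted(memory_nodes),
        "cpus_by_node": cpus_by_node,
        # Nodes that own memory but have no CPU assigned to them.
        "cpuless_nodes": sorted(memory_nodes - set(cpus_by_node)),
        "domain_pages_sum": sum(pages_by_node.values()),
        "domain_total": domain_total,
    }


summary = summarize_numa_dump(NUMA_DUMP)
print(summary)
```

For this dump the summary lists nodes 1, 2 and 3 as having memory but no CPUs, and the per-node counts for Domain 0 add up to the reported total of 3003121.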

I suspect a problem with the __cpuinit stuff overwriting some node info.
Andre, could you check this? I hope to reproduce your problem on my machine.

(XEN) Xen BUG at sched_credit.c:384
(XEN) ----[ Xen-4.1.0-rc2-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: e008:[<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) RFLAGS: 0000000000010093 CONTEXT: hypervisor
(XEN) rax: ffff830434322000 rbx: ffff830434418748 rcx: 0000000000000024
(XEN) rdx: ffff82c4802d3ec0 rsi: 0000000000000003 rdi: ffff8304343c9100
(XEN) rbp: ffff83043457fce8 rsp: ffff83043457fca8 r8: 0000000000000001
(XEN) r9: ffff830434418748 r10: ffff82c48021a0a0 r11: 0000000000000286
(XEN) r12: 0000000000000024 r13: ffff83123a3b2b60 r14: ffff830434418730
(XEN) r15: 0000000000000024 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 00000008061df000 cr2: ffff8817a21f87a0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83043457fca8:
(XEN) ffff83043457fcb8 ffff83123a3b2b60 0000000000000286 0000000000000024
(XEN) ffff830434418820 ffff83123a3b2a70 0000000000000024 ffff82c4802b0880
(XEN) ffff83043457fd58 ffff82c48011fa63 ffff82f60102aa80 0000000000081554
(XEN) ffff8300c7cfa000 0000000000000000 0000400000000000 ffff82c480248e00
(XEN) 0000000000000002 0000000000000024 ffff830434418820 0000000000305000
(XEN) ffff82c4802550e4 ffff82c4802b0880 ffff83043457fd78 ffff82c48010188c
(XEN) ffff83043457fe40 0000000000000024 ffff83043457fdb8 ffff82c480101b94
(XEN) ffff83043457fdb8 ffff82c4801836f2 fffffffe00000286 ffff83043457ff18
(XEN) 0000000002170004 0000000000305000 ffff83043457fef8 ffff82c480125281
(XEN) ffff83043457fdd8 0000000180153c9d 0000000000000000 ffff82c4801068f8
(XEN) 0000000000000296 ffff8300c7e0a1c8 aaaaaaaaaaaaaaaa 0000000000000000
(XEN) ffff88007d1ac170 ffff88007d1ac170 ffff83043457fef8 ffff82c480113d8a
(XEN) ffff83043457fe78 ffff83043457fe88 0000000800000012 0000000600000004
(XEN) 0000000000000000 ffffffff00000024 0000000000000000 00007fac2e0e5a00
(XEN) 0000000002170000 0000000000000000 0000000000000000 ffffffffffffffff
(XEN) 0000000000000000 0000000000000080 000000000000002f 0000000002170004
(XEN) 0000000002172004 0000000002174004 00007fff878f1c80 0000000000000033
(XEN) ffff83043457fed8 ffff8300c7e0a000 00007fff878f1b30 0000000000305000
(XEN) 0000000000000003 0000000000000003 00007cfbcba800c7 ffff82c480207dd8
(XEN) ffffffff8100946a 0000000000000023 0000000000000003 0000000000000003
(XEN) Xen call trace:
(XEN) [<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) [<ffff82c48011fa63>] schedule_cpu_switch+0x75/0x1eb
(XEN) [<ffff82c48010188c>] cpupool_assign_cpu_locked+0x44/0x8b
(XEN) [<ffff82c480101b94>] cpupool_do_sysctl+0x1fb/0x461
(XEN) [<ffff82c480125281>] do_sysctl+0x921/0xa30
(XEN) [<ffff82c480207dd8>] syscall_enter+0xc8/0x122
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at sched_credit.c:384
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
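
To look up the frames of the call trace above in a symbol table or disassembly, it helps to split each entry into its parts. The sketch below assumes the usual reading of the trace format, [<address>] symbol+0xoffset/0xsize, so that address minus offset gives the start of the function. The helper name parse_call_trace is hypothetical and not part of any Xen tooling.

```python
import re

# Call trace lines copied from the crash dump above.
CALL_TRACE = """\
(XEN) [<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) [<ffff82c48011fa63>] schedule_cpu_switch+0x75/0x1eb
(XEN) [<ffff82c48010188c>] cpupool_assign_cpu_locked+0x44/0x8b
(XEN) [<ffff82c480101b94>] cpupool_do_sysctl+0x1fb/0x461
(XEN) [<ffff82c480125281>] do_sysctl+0x921/0xa30
(XEN) [<ffff82c480207dd8>] syscall_enter+0xc8/0x122
"""

# Assumed frame layout: [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_call_trace(text):
    """Return one dict per frame, innermost frame first."""
    frames = []
    for m in FRAME_RE.finditer(text):
        address = int(m.group(1), 16)
        offset = int(m.group(3), 16)
        frames.append({
            "symbol": m.group(2),
            "address": address,
            "offset": offset,
            "size": int(m.group(4), 16),
            # Start of the function, under the assumption stated above.
            "start": address - offset,
        })
    return frames


frames = parse_call_trace(CALL_TRACE)
for frame in frames:
    print(f'{frame["symbol"]:28s} start={frame["start"]:#x} '
          f'offset={frame["offset"]:#x} size={frame["size"]:#x}')
```

Read from the bottom, the trace shows the path from the sysctl hypercall through cpupool_do_sysctl and cpupool_assign_cpu_locked into schedule_cpu_switch, with the BUG hit inside csched_alloc_pdata.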


Juergen

--
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
