Re: [Xen-devel] xen dom0 2.6.32.15 kernel BUG at drivers/xen/grant-table
On Monday, 14 June 2010, 14:44:36, Arnd Hannemann wrote:
>
> FYI: I got lucky and reproduced the error within only 15 minutes and
> hypervisor version:
>
The problem still exists. Maybe something in the following information rings
a bell. If more information is needed, just ask.
vmhost2 runs small (128 MB RAM) domUs that simulate wireless mesh network nodes
(network bridged, no blk; they boot over NFS). It runs Ubuntu 10.04 over NFS.
After some hours of starting and shutting down domUs, it hangs with the
following (or a similar) traceback.
The issue may have gotten worse with the recent kernel updates: some weeks ago
we managed to start and shut down over 500 domUs, but now it hangs sooner,
often after 50-60 iterations.
Our "old" setup runs Ubuntu 8.04 with Xen 3.4 and runs stably for weeks doing
the same thing.
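For anyone who wants to try reproducing this, the start/shutdown cycling described above could be scripted roughly as follows. This is only a minimal sketch, not the script used on vmhost2: the function names, the `boot_wait` parameter and the config path in the usage note are made up for illustration, and it assumes the `xm` toolstack (`xm create`, `xm shutdown -w`) is available in dom0.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def cycle_domu(config: str, name: str, iterations: int,
               boot_wait: float = 0.0,
               run: Callable[[Sequence[str]], None] = run_checked) -> int:
    """Start and shut down one domU repeatedly; return the completed count.

    `config`, `name` and `boot_wait` are illustrative parameters, not taken
    from the original setup.
    """
    completed = 0
    for _ in range(iterations):
        try:
            run(["xm", "create", config])        # boot the guest
            time.sleep(boot_wait)                # give the guest time to come up
            run(["xm", "shutdown", "-w", name])  # -w: wait until it is down
        except (subprocess.CalledProcessError, OSError):
            break  # stop at the first failing command
        completed += 1
    return completed
```

A call such as `cycle_domu("/etc/xen/vmrouter313.cfg", "vmrouter313", 500, boot_wait=60)` (hypothetical path and wait time) would run up to 500 iterations. If dom0 panics as in the trace below, the script never returns, so the iteration count would have to be logged elsewhere, for example over netconsole.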
Jun 17 13:13:51 vmhost2 [16259.938609] ------------[ cut here ]------------
Jun 17 13:13:51 vmhost2 [16259.938658] kernel BUG at
drivers/xen/grant-table.c:583!
Jun 17 13:13:51 vmhost2 [16259.938698] invalid opcode: 0000 [#1]
Jun 17 13:13:51 vmhost2 SMP
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.938764] last sysfs file:
/sys/devices/virtual/net/br0/bridge/topology_change_detected
Jun 17 13:13:51 vmhost2 [16259.938824] Modules linked in:
Jun 17 13:13:51 vmhost2 nf_conntrack_ipv4
Jun 17 13:13:51 vmhost2 nf_defrag_ipv4
Jun 17 13:13:51 vmhost2 xt_state
Jun 17 13:13:51 vmhost2 nf_conntrack
Jun 17 13:13:51 vmhost2 xt_physdev
Jun 17 13:13:51 vmhost2 iptable_filter
Jun 17 13:13:51 vmhost2 ip_tables
Jun 17 13:13:51 vmhost2 x_tables
Jun 17 13:13:51 vmhost2 netconsole
Jun 17 13:13:51 vmhost2 raid0
Jun 17 13:13:51 vmhost2 md_mod
Jun 17 13:13:51 vmhost2 rtc_cmos
Jun 17 13:13:51 vmhost2 rtc_core
Jun 17 13:13:51 vmhost2 rtc_lib
Jun 17 13:13:51 vmhost2 pl2303
Jun 17 13:13:51 vmhost2 thermal
Jun 17 13:13:51 vmhost2 usbserial
Jun 17 13:13:51 vmhost2 processor
Jun 17 13:13:51 vmhost2 thermal_sys
Jun 17 13:13:51 vmhost2 button
Jun 17 13:13:51 vmhost2 hwmon
Jun 17 13:13:51 vmhost2 acpi_processor
Jun 17 13:13:51 vmhost2 sr_mod
Jun 17 13:13:51 vmhost2 cdrom
Jun 17 13:13:51 vmhost2 evdev
Jun 17 13:13:51 vmhost2 ipv6
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.939240]
Jun 17 13:13:51 vmhost2 [16259.939273] Pid: 0, comm: swapper Not tainted
(2.6.32.15-xen4.0.0-dom0 #2) System Product Name
Jun 17 13:13:51 vmhost2 [16259.939335] EIP: 0061:[<c120f170>] EFLAGS: 00010282
CPU: 0
Jun 17 13:13:51 vmhost2 [16259.939385] EIP is at
gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.939428] EAX: ffffffea EBX: c153be74 ECX:
00000001 EDX: 00000000
Jun 17 13:13:51 vmhost2 [16259.939469] ESI: 00007ff0 EDI: 0000001c EBP:
c28f9ae0 ESP: c153be40
Jun 17 13:13:51 vmhost2 [16259.939510] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS:
0069
Jun 17 13:13:51 vmhost2 [16259.939550] Process swapper (pid: 0, ti=c153a000
task=c1543760 task.ti=c153a000)
Jun 17 13:13:51 vmhost2 [16259.939608] Stack:
Jun 17 13:13:51 vmhost2 [16259.939640] 00000000
Jun 17 13:13:51 vmhost2 00214815
Jun 17 13:13:51 vmhost2 c28cefe0
Jun 17 13:13:51 vmhost2 0002bf57
Jun 17 13:13:51 vmhost2 ebf57000
Jun 17 13:13:51 vmhost2 ebc42078
Jun 17 13:13:51 vmhost2 0000001c
Jun 17 13:13:51 vmhost2 ebf57000
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.939762] <0>
Jun 17 13:13:51 vmhost2 00000000
Jun 17 13:13:51 vmhost2 ea9ff000
Jun 17 13:13:51 vmhost2 00000000
Jun 17 13:13:51 vmhost2 0000001c
Jun 17 13:13:51 vmhost2 00000000
Jun 17 13:13:51 vmhost2 14815001
Jun 17 13:13:51 vmhost2 00000000
Jun 17 13:13:51 vmhost2 0002bf57
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.939914] <0>
Jun 17 13:13:51 vmhost2 00000000
Jun 17 13:13:51 vmhost2 edab8a24
Jun 17 13:13:51 vmhost2 edab78a4
Jun 17 13:13:51 vmhost2 edab8a24
Jun 17 13:13:51 vmhost2 edab7818
Jun 17 13:13:51 vmhost2 c1220dc2
Jun 17 13:13:51 vmhost2 00000100
Jun 17 13:13:51 vmhost2 00000001
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.940094] Call Trace:
Jun 17 13:13:51 vmhost2 [16259.940135] [<c1220dc2>] ? net_tx_action+0x1d2/0x9f0
Jun 17 13:13:51 vmhost2 [16259.940179] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 17 13:13:51 vmhost2 [16259.940221] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 17 13:13:51 vmhost2 [16259.940263] [<c1210057>] ?
__xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.940307] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 17 13:13:51 vmhost2 [16259.940348] [<c121063a>] ?
xen_evtchn_do_upcall+0x2a/0x40
Jun 17 13:13:51 vmhost2 [16259.940400] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 17 13:13:51 vmhost2 [16259.940443] [<c10013a7>] ?
hypercall_page+0x3a7/0x1010
Jun 17 13:13:51 vmhost2 [16259.940486] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 17 13:13:51 vmhost2 [16259.940528] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 17 13:13:51 vmhost2 [16259.940569] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 17 13:13:51 vmhost2 [16259.940611] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 17 13:13:51 vmhost2 [16259.940653] [<c1578367>] ?
unknown_bootoption+0x0/0x190
Jun 17 13:13:51 vmhost2 [16259.940696] [<c157b0e6>] ?
xen_start_kernel+0x624/0x62c
Jun 17 13:13:51 vmhost2 [16259.940735] Code:
Jun 17 13:13:51 vmhost2 8d
Jun 17 13:13:51 vmhost2 5c
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 34
Jun 17 13:13:51 vmhost2 c1
Jun 17 13:13:51 vmhost2 e0
Jun 17 13:13:51 vmhost2 0c
Jun 17 13:13:51 vmhost2 83
Jun 17 13:13:51 vmhost2 c8
Jun 17 13:13:51 vmhost2 01
Jun 17 13:13:51 vmhost2 89
Jun 17 13:13:51 vmhost2 44
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 34
Jun 17 13:13:51 vmhost2 8b
Jun 17 13:13:51 vmhost2 44
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 0c
Jun 17 13:13:51 vmhost2 c7
Jun 17 13:13:51 vmhost2 44
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 40
Jun 17 13:13:51 vmhost2 00
Jun 17 13:13:51 vmhost2 00
Jun 17 13:13:51 vmhost2 00
Jun 17 13:13:51 vmhost2 00
Jun 17 13:13:51 vmhost2 89
Jun 17 13:13:51 vmhost2 44
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 3c
Jun 17 13:13:51 vmhost2 e8
Jun 17 13:13:51 vmhost2 b8
Jun 17 13:13:51 vmhost2 1e
Jun 17 13:13:51 vmhost2 df
Jun 17 13:13:51 vmhost2 ff
Jun 17 13:13:51 vmhost2 85
Jun 17 13:13:51 vmhost2 c0
Jun 17 13:13:51 vmhost2 0f
Jun 17 13:13:51 vmhost2 84
Jun 17 13:13:51 vmhost2 2c
Jun 17 13:13:51 vmhost2 ff
Jun 17 13:13:51 vmhost2 ff
Jun 17 13:13:51 vmhost2 ff
Jun 17 11:13:50 vmhost2 unparseable log message: "<0f> "
Jun 17 13:13:51 vmhost2 0b
Jun 17 13:13:51 vmhost2 eb
Jun 17 13:13:51 vmhost2 fe
Jun 17 13:13:51 vmhost2 0f
Jun 17 13:13:51 vmhost2 0b
Jun 17 13:13:51 vmhost2 eb
Jun 17 13:13:51 vmhost2 fe
Jun 17 13:13:51 vmhost2 0f
Jun 17 13:13:51 vmhost2 0b
Jun 17 13:13:51 vmhost2 eb
Jun 17 13:13:51 vmhost2 fe
Jun 17 13:13:51 vmhost2 8b
Jun 17 13:13:51 vmhost2 54
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 04
Jun 17 13:13:51 vmhost2 8b
Jun 17 13:13:51 vmhost2 44
Jun 17 13:13:51 vmhost2 24
Jun 17 13:13:51 vmhost2 0c
Jun 17 13:13:51 vmhost2 e8
Jun 17 13:13:51 vmhost2
Jun 17 13:13:51 vmhost2 [16259.941540] EIP: [<c120f170>]
Jun 17 13:13:51 vmhost2 gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 SS:ESP 0069:c153be40
Jun 17 13:13:51 vmhost2 [16259.941864] ---[ end trace 0514df71b6948a8c ]---
Jun 17 13:13:51 vmhost2 [16259.941934] Kernel panic - not syncing: Fatal
exception in interrupt
Jun 17 13:13:51 vmhost2 [16259.942009] Pid: 0, comm: swapper Tainted: G D
2.6.32.15-xen4.0.0-dom0 #2
Jun 17 13:13:51 vmhost2 [16259.942100] Call Trace:
Jun 17 13:13:51 vmhost2 [16259.942170] [<c141d9d5>] ? panic+0x42/0xe1
Jun 17 13:13:51 vmhost2 [16259.942243] [<c100cc56>] ? oops_end+0x96/0xa0
Jun 17 13:13:51 vmhost2 [16259.942316] [<c100a73f>] ? do_invalid_op+0x7f/0x90
Jun 17 13:13:51 vmhost2 [16259.942390] [<c120f170>] ?
gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.942468] [<c10741e4>] ?
__alloc_pages_nodemask+0xe4/0x5b0
Jun 17 13:13:51 vmhost2 [16259.942543] [<c1210057>] ?
__xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.942621] [<c14200c6>] ? error_code+0x66/0x6c
Jun 17 13:13:51 vmhost2 [16259.942694] [<c100a6c0>] ? do_invalid_op+0x0/0x90
Jun 17 13:13:51 vmhost2 [16259.942767] [<c120f170>] ?
gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.942844] [<c1220dc2>] ? net_tx_action+0x1d2/0x9f0
Jun 17 13:13:51 vmhost2 [16259.942919] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 17 13:13:51 vmhost2 [16259.942992] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 17 13:13:51 vmhost2 [16259.943066] [<c1210057>] ?
__xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.943142] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 17 13:13:51 vmhost2 [16259.943215] [<c121063a>] ?
xen_evtchn_do_upcall+0x2a/0x40
Jun 17 13:13:51 vmhost2 [16259.943289] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 17 13:13:51 vmhost2 [16259.943369] [<c10013a7>] ?
hypercall_page+0x3a7/0x1010
Jun 17 13:13:51 vmhost2 [16259.943451] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 17 13:13:51 vmhost2 [16259.943525] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 17 13:13:51 vmhost2 [16259.943598] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 17 13:13:51 vmhost2 [16259.943671] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 17 13:13:51 vmhost2 [16259.943744] [<c1578367>] ?
unknown_bootoption+0x0/0x190
Jun 17 13:13:51 vmhost2 [16259.943819] [<c157b0e6>] ?
xen_start_kernel+0x624/0x62c
Unfortunately, our test machine has no native serial port, so we have no
access to the hypervisor output at the moment, although we suspect the
hypervisor to be the problem.
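One detail in the register dump may be worth decoding. The bytes just before the trapping instruction in the Code: line (`85 c0 0f 84 ...` followed by `0f 0b`) correspond to `test %eax,%eax`, a `je`, and `ud2`, which is the usual shape of a BUG check on a non-zero return value. Assuming EAX still holds that return value at the time of the trap (an assumption, not confirmed against the source at grant-table.c:583), the value `ffffffea` can be read as a negative errno. The helper names below are made up for this sketch:

```python
import errno


def decode_register(hex_value: str, bits: int = 32) -> int:
    """Interpret a register dump value as a signed two's-complement integer."""
    value = int(hex_value, 16)
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def errno_name(ret: int) -> str:
    """Map a negative kernel-style return code to its errno symbol, if known."""
    return errno.errorcode.get(-ret, "unknown")


eax = decode_register("ffffffea")  # EAX from the oops above
print(eax, errno_name(eax))        # prints "-22 EINVAL"
```

If that reading is right, the call preceding the BUG returned -EINVAL. Which call that is would still have to be checked in the kernel tree.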
Kernel: latest xen/stable-2.6.32.x (config attached):
samsel@vmhost2:~$ uname -a
Linux vmhost2 2.6.32.15-xen4.0-dom0 #2 SMP Wed Jun 16 14:02:14 CEST 2010 i686
GNU/Linux
samsel@vmhost2:~/build/linux-2.6.32-xen$ git log | head
commit 01d9fbca207ec232c758d991d66466fc6e38349e
Merge: cfce2d4 0a904db
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
Date: Tue Jun 15 14:09:48 2010 +0100
Merge branch 'xen/next' into xen/stable-2.6.32.x
* xen/next:
netback: minor code formatting fixup
Netback: Set allocated memory to zero from vmalloc.
$ sudo xm info
host : vmhost2
release : 2.6.32.15-xen4.0.0-dom0
version : #2 SMP Wed Jun 16 14:02:14 CEST 2010
machine : i686
nr_cpus : 8
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 2
cpu_mhz : 2808
hw_caps :
bfebfbff:28100000:00000000:00001b40:0098e3fd:00000000:00000001:00000000
virt_caps : hvm hvm_directio
total_memory : 8183
free_memory : 5032
node_to_cpu : node0:0-7
node_to_memory : node0:5032
node_to_dma32_mem : node0:2925
max_node_id : 0
xen_major : 4
xen_minor : 0
xen_extra : .1-rc3-pre
xen_caps : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : Fri Jun 11 14:04:36 2010 +0100 21203:3903d95733f7
xen_commandline : dom0_mem=4G
cc_compiler : gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
cc_compile_by : samsel
cc_compile_domain : umic-mesh.net
cc_compile_date : Mon Jun 14 12:43:49 CEST 2010
xend_config_format : 4
PXE boot config:
KERNEL /linux/default/vmeshhost-mboot.c32
APPEND /initrd/vmeshhost-xen-testing dom0_mem=4G ---
/linux/default/vmeshhost-vmlinuz-testing id=default image=vmeshhost/testing
nodetype=vmeshhost rw root=/dev/ram0
initrd=/initrd/vmeshhost-dom0-initrd-testing init=/linuxrc
--- /initrd/vmeshhost-dom0-initrd-testing
samsel@vmhost2:~$ lspci
00:00.0 Host bridge: Intel Corporation Device d131 (rev 11)
00:03.0 PCI bridge: Intel Corporation Device d138 (rev 11)
00:08.0 System peripheral: Intel Corporation Device d155 (rev 11)
00:08.1 System peripheral: Intel Corporation Device d156 (rev 11)
00:08.2 System peripheral: Intel Corporation Device d157 (rev 11)
00:08.3 System peripheral: Intel Corporation Device d158 (rev 11)
00:10.0 System peripheral: Intel Corporation Device d150 (rev 11)
00:10.1 System peripheral: Intel Corporation Device d151 (rev 11)
00:1a.0 USB Controller: Intel Corporation Device 3b3c (rev 05)
00:1b.0 Audio device: Intel Corporation Device 3b56 (rev 05)
00:1c.0 PCI bridge: Intel Corporation Device 3b42 (rev 05)
00:1c.4 PCI bridge: Intel Corporation Device 3b4a (rev 05)
00:1c.5 PCI bridge: Intel Corporation Device 3b4c (rev 05)
00:1c.6 PCI bridge: Intel Corporation Device 3b4e (rev 05)
00:1c.7 PCI bridge: Intel Corporation Device 3b50 (rev 05)
00:1d.0 USB Controller: Intel Corporation Device 3b34 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation Device 3b02 (rev 05)
00:1f.2 SATA controller: Intel Corporation Device 3b22 (rev 05)
00:1f.3 SMBus: Intel Corporation Device 3b30 (rev 05)
01:00.0 VGA compatible controller: ATI Technologies Inc Device 954f
01:00.1 Audio device: ATI Technologies Inc Device aa38
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI
Express Gigabit Ethernet controller (rev 03) <- not used
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI
Controller (rev 03)
03:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI
Controller (rev 03)
07:01.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet
Controller <- used
07:04.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller
(rev c0)
samsel@vmhost2:/proc$ cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 30
model name : Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
stepping : 5
cpu MHz : 2808.822
cache size : 8192 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu de tsc msr pae mce cx8 apic mtrr mca cmov pat clflush
acpi mmx fxsr sse sse2 ss ht nx constant_tsc nonstop_tsc aperfmperf pni est
ssse3 sse4_1 sse4_2 popcnt hypervisor ida
bogomips : 5617.64
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
DomUs (example, we run various kernels):
samsel@vmrouter313:~$ uname -a
Linux vmrouter313 2.6.24.7-pae-um #7 SMP Thu Apr 9 15:35:55 CEST 2009 i686
GNU/Linux
Attachment: vmhost2-config (text data)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel