To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] xen dom0 2.6.32.15 kernel BUG at drivers/xen/grant-table.c:583
From: Christian Samsel <csamsel@xxxxxxxxx>
Date: Mon, 21 Jun 2010 10:37:44 +0200
Delivery-date: Mon, 21 Jun 2010 01:38:58 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Am Montag, 14. Juni 2010, 14:44:36 schrieb Arnd Hannemann:
>
> FYI: I got lucky and reproduced the error within only 15 minutes and
> hypervisor version:
> 
The problem still exists. Maybe some of the following information rings a
bell. If more information is needed, just ask.

vmhost2 runs small (128 MB RAM) domUs which simulate wireless mesh network
nodes (network bridged, no blk; they boot over NFS). It runs Ubuntu 10.04
over NFS. After some hours of starting and shutting down domUs it hangs with
the following (or a similar) traceback.
The issue might have gotten worse with the last kernel updates: some weeks
ago we managed to start and shut down over 500 domUs, but now it hangs
sooner, often after 50-60 iterations.
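
The start/shutdown cycling described above could be scripted roughly as
follows. This is only a sketch: the actual test harness is not part of this
mail, and the function names, the config path and the boot wait are
assumptions made for illustration. Only `xm create` and `xm shutdown -w` are
real xm subcommands.

```python
import subprocess
import time


def cycle_commands(config_path, domain_name):
    """Return the xm command lines for one create/shutdown cycle."""
    return [
        ["xm", "create", config_path],
        # -w makes xm wait until the domain has finished shutting down
        ["xm", "shutdown", "-w", domain_name],
    ]


def stress(config_path, domain_name, iterations, boot_wait=60,
           run=subprocess.check_call, sleep=time.sleep):
    """Create and shut down one domU repeatedly.

    Returns the number of completed iterations. `run` and `sleep` are
    injectable so the loop can be dry-run without a Xen host.
    """
    completed = 0
    for _ in range(iterations):
        create_cmd, shutdown_cmd = cycle_commands(config_path, domain_name)
        run(create_cmd)
        sleep(boot_wait)  # assumed: give the guest time to boot over NFS
        run(shutdown_cmd)
        completed += 1
    return completed
```

On a dom0 this would be called as, for example,
`stress("/etc/xen/vmrouter313.cfg", "vmrouter313", 100)`, where the config
path is hypothetical. If dom0 panics, the loop simply never returns, so the
iteration count has to be logged elsewhere (e.g. via netconsole).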

Our "old" setup runs Ubuntu 8.04 with Xen 3.4 and has been running stable for
weeks doing the same thing.

Jun 17 13:13:51 vmhost2 [16259.938609] ------------[ cut here ]------------
Jun 17 13:13:51 vmhost2 [16259.938658] kernel BUG at drivers/xen/grant-table.c:583!
Jun 17 13:13:51 vmhost2 [16259.938698] invalid opcode: 0000 [#1] SMP
Jun 17 13:13:51 vmhost2 [16259.938764] last sysfs file: /sys/devices/virtual/net/br0/bridge/topology_change_detected
Jun 17 13:13:51 vmhost2 [16259.938824] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables netconsole raid0 md_mod rtc_cmos rtc_core rtc_lib pl2303 thermal usbserial processor thermal_sys button hwmon acpi_processor sr_mod cdrom evdev ipv6
Jun 17 13:13:51 vmhost2 [16259.939240]
Jun 17 13:13:51 vmhost2 [16259.939273] Pid: 0, comm: swapper Not tainted (2.6.32.15-xen4.0.0-dom0 #2) System Product Name
Jun 17 13:13:51 vmhost2 [16259.939335] EIP: 0061:[<c120f170>] EFLAGS: 00010282 CPU: 0
Jun 17 13:13:51 vmhost2 [16259.939385] EIP is at gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.939428] EAX: ffffffea EBX: c153be74 ECX: 00000001 EDX: 00000000
Jun 17 13:13:51 vmhost2 [16259.939469] ESI: 00007ff0 EDI: 0000001c EBP: c28f9ae0 ESP: c153be40
Jun 17 13:13:51 vmhost2 [16259.939510]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Jun 17 13:13:51 vmhost2 [16259.939550] Process swapper (pid: 0, ti=c153a000 task=c1543760 task.ti=c153a000)
Jun 17 13:13:51 vmhost2 [16259.939608] Stack:
Jun 17 13:13:51 vmhost2 [16259.939640]  00000000 00214815 c28cefe0 0002bf57 ebf57000 ebc42078 0000001c ebf57000
Jun 17 13:13:51 vmhost2 [16259.939762] <0> 00000000 ea9ff000 00000000 0000001c 00000000 14815001 00000000 0002bf57
Jun 17 13:13:51 vmhost2 [16259.939914] <0> 00000000 edab8a24 edab78a4 edab8a24 edab7818 c1220dc2 00000100 00000001
Jun 17 13:13:51 vmhost2 [16259.940094] Call Trace:
Jun 17 13:13:51 vmhost2 [16259.940135]  [<c1220dc2>] ? net_tx_action+0x1d2/0x9f0
Jun 17 13:13:51 vmhost2 [16259.940179]  [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 17 13:13:51 vmhost2 [16259.940221]  [<c103c378>] ? __do_softirq+0x88/0x110
Jun 17 13:13:51 vmhost2 [16259.940263]  [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.940307]  [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 17 13:13:51 vmhost2 [16259.940348]  [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 17 13:13:51 vmhost2 [16259.940400]  [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 17 13:13:51 vmhost2 [16259.940443]  [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 17 13:13:51 vmhost2 [16259.940486]  [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 17 13:13:51 vmhost2 [16259.940528]  [<c100382c>] ? xen_idle+0x1c/0x30
Jun 17 13:13:51 vmhost2 [16259.940569]  [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 17 13:13:51 vmhost2 [16259.940611]  [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 17 13:13:51 vmhost2 [16259.940653]  [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 17 13:13:51 vmhost2 [16259.940696]  [<c157b0e6>] ? xen_start_kernel+0x624/0x62c
Jun 17 13:13:51 vmhost2 [16259.940735] Code: 8d 5c 24 34 c1 e0 0c 83 c8 01 89 44 24 34 8b 44 24 0c c7 44 24 40 00 00 00 00 89 44 24 3c e8 b8 1e df ff 85 c0 0f 84 2c ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8b 54 24 04 8b 44 24 0c e8
Jun 17 13:13:51 vmhost2 [16259.941540] EIP: [<c120f170>] gnttab_copy_grant_page+0x1f0/0x260 SS:ESP 0069:c153be40
Jun 17 13:13:51 vmhost2 [16259.941864] ---[ end trace 0514df71b6948a8c ]---
Jun 17 13:13:51 vmhost2 [16259.941934] Kernel panic - not syncing: Fatal exception in interrupt
Jun 17 13:13:51 vmhost2 [16259.942009] Pid: 0, comm: swapper Tainted: G      D    2.6.32.15-xen4.0.0-dom0 #2
Jun 17 13:13:51 vmhost2 [16259.942100] Call Trace:
Jun 17 13:13:51 vmhost2 [16259.942170]  [<c141d9d5>] ? panic+0x42/0xe1
Jun 17 13:13:51 vmhost2 [16259.942243]  [<c100cc56>] ? oops_end+0x96/0xa0
Jun 17 13:13:51 vmhost2 [16259.942316]  [<c100a73f>] ? do_invalid_op+0x7f/0x90
Jun 17 13:13:51 vmhost2 [16259.942390]  [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.942468]  [<c10741e4>] ? __alloc_pages_nodemask+0xe4/0x5b0
Jun 17 13:13:51 vmhost2 [16259.942543]  [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.942621]  [<c14200c6>] ? error_code+0x66/0x6c
Jun 17 13:13:51 vmhost2 [16259.942694]  [<c100a6c0>] ? do_invalid_op+0x0/0x90
Jun 17 13:13:51 vmhost2 [16259.942767]  [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 17 13:13:51 vmhost2 [16259.942844]  [<c1220dc2>] ? net_tx_action+0x1d2/0x9f0
Jun 17 13:13:51 vmhost2 [16259.942919]  [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 17 13:13:51 vmhost2 [16259.942992]  [<c103c378>] ? __do_softirq+0x88/0x110
Jun 17 13:13:51 vmhost2 [16259.943066]  [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 17 13:13:51 vmhost2 [16259.943142]  [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 17 13:13:51 vmhost2 [16259.943215]  [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 17 13:13:51 vmhost2 [16259.943289]  [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 17 13:13:51 vmhost2 [16259.943369]  [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 17 13:13:51 vmhost2 [16259.943451]  [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 17 13:13:51 vmhost2 [16259.943525]  [<c100382c>] ? xen_idle+0x1c/0x30
Jun 17 13:13:51 vmhost2 [16259.943598]  [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 17 13:13:51 vmhost2 [16259.943671]  [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 17 13:13:51 vmhost2 [16259.943744]  [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 17 13:13:51 vmhost2 [16259.943819]  [<c157b0e6>] ? xen_start_kernel+0x624/0x62c
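
One possible reading of the dump above, offered as a sketch rather than a
diagnosis: EAX holds ffffffea, which is -22 as a signed 32-bit value, i.e.
-EINVAL in Linux errno numbering. In the Code: bytes, the sequence before the
marked instruction (e8 = call, 85 c0 = test eax,eax, 0f 84 = je) followed by
<0f> 0b (ud2, the instruction BUG() compiles to on x86) suggests the BUG
fires because a call returned nonzero. Which call that is cannot be told from
the log alone. The helper names below are made up for illustration.

```python
import errno
import re


def to_signed32(value):
    """Interpret an unsigned 32-bit register value as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def errno_name(reg_hex):
    """Map a register value such as 'ffffffea' to an errno name.

    Returns None if the value does not look like a small negative errno.
    """
    signed = to_signed32(int(reg_hex, 16))
    if -4095 <= signed < 0:
        return errno.errorcode.get(-signed)
    return None


def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the <..> marker of a 'Code:' line."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-f]{2})>", tok)
        if m:
            return [m.group(1)] + tokens[i + 1:i + count]
    return []
```

For this oops, `errno_name("ffffffea")` gives `EINVAL`, and `faulting_bytes`
applied to the tail of the Code: bytes gives `0f 0b`.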

Unfortunately our test machine has no native serial port, so we have no
access to the hypervisor output at the moment, although we suspect the
hypervisor to be the problem.

Kernel: latest xen/stable-2.6.32.x (config attached):

samsel@vmhost2:~$ uname -a
Linux vmhost2 2.6.32.15-xen4.0-dom0 #2 SMP Wed Jun 16 14:02:14 CEST 2010 i686 GNU/Linux

samsel@vmhost2:~/build/linux-2.6.32-xen$ git log | head
commit 01d9fbca207ec232c758d991d66466fc6e38349e
Merge: cfce2d4 0a904db
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
Date:   Tue Jun 15 14:09:48 2010 +0100

    Merge branch 'xen/next' into xen/stable-2.6.32.x
    
    * xen/next:
      netback: minor code formatting fixup
      Netback: Set allocated memory to zero from vmalloc.

$ sudo xm info
host                   : vmhost2
release                : 2.6.32.15-xen4.0.0-dom0
version                : #2 SMP Wed Jun 16 14:02:14 CEST 2010
machine                : i686
nr_cpus                : 8
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 2
cpu_mhz                : 2808
hw_caps                : bfebfbff:28100000:00000000:00001b40:0098e3fd:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 8183
free_memory            : 5032
node_to_cpu            : node0:0-7
node_to_memory         : node0:5032
node_to_dma32_mem      : node0:2925
max_node_id            : 0
xen_major              : 4
xen_minor              : 0
xen_extra              : .1-rc3-pre
xen_caps               : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p 
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xf5800000
xen_changeset          : Fri Jun 11 14:04:36 2010 +0100 21203:3903d95733f7
xen_commandline        : dom0_mem=4G 
cc_compiler            : gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) 
cc_compile_by          : samsel
cc_compile_domain      : umic-mesh.net
cc_compile_date        : Mon Jun 14 12:43:49 CEST 2010
xend_config_format     : 4

PXE boot config:
KERNEL /linux/default/vmeshhost-mboot.c32
APPEND /initrd/vmeshhost-xen-testing dom0_mem=4G --- /linux/default/vmeshhost-vmlinuz-testing id=default image=vmeshhost/testing nodetype=vmeshhost rw root=/dev/ram0 initrd=/initrd/vmeshhost-dom0-initrd-testing init=/linuxrc --- /initrd/vmeshhost-dom0-initrd-testing

samsel@vmhost2:~$ lspci
00:00.0 Host bridge: Intel Corporation Device d131 (rev 11)
00:03.0 PCI bridge: Intel Corporation Device d138 (rev 11)
00:08.0 System peripheral: Intel Corporation Device d155 (rev 11)
00:08.1 System peripheral: Intel Corporation Device d156 (rev 11)
00:08.2 System peripheral: Intel Corporation Device d157 (rev 11)
00:08.3 System peripheral: Intel Corporation Device d158 (rev 11)
00:10.0 System peripheral: Intel Corporation Device d150 (rev 11)
00:10.1 System peripheral: Intel Corporation Device d151 (rev 11)
00:1a.0 USB Controller: Intel Corporation Device 3b3c (rev 05)
00:1b.0 Audio device: Intel Corporation Device 3b56 (rev 05)
00:1c.0 PCI bridge: Intel Corporation Device 3b42 (rev 05)
00:1c.4 PCI bridge: Intel Corporation Device 3b4a (rev 05)
00:1c.5 PCI bridge: Intel Corporation Device 3b4c (rev 05)
00:1c.6 PCI bridge: Intel Corporation Device 3b4e (rev 05)
00:1c.7 PCI bridge: Intel Corporation Device 3b50 (rev 05)
00:1d.0 USB Controller: Intel Corporation Device 3b34 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation Device 3b02 (rev 05)
00:1f.2 SATA controller: Intel Corporation Device 3b22 (rev 05)
00:1f.3 SMBus: Intel Corporation Device 3b30 (rev 05)
01:00.0 VGA compatible controller: ATI Technologies Inc Device 954f
01:00.1 Audio device: ATI Technologies Inc Device aa38
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03) <- not used
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
03:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
07:01.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller <- used
07:04.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev c0)

samsel@vmhost2:/proc$ cat cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Core(TM) i7 CPU         860  @ 2.80GHz
stepping        : 5
cpu MHz         : 2808.822
cache size      : 8192 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc nonstop_tsc aperfmperf pni est ssse3 sse4_1 sse4_2 popcnt hypervisor ida
bogomips        : 5617.64
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

DomUs (example, we run various kernels):
samsel@vmrouter313:~$ uname -a
Linux vmrouter313 2.6.24.7-pae-um #7 SMP Thu Apr 9 15:35:55 CEST 2009 i686 GNU/Linux

Attachment: vmhost2-config
Description: Text Data
