xen-devel
Re: [Xen-devel] xen dom0 2.6.32.15 kernel BUG at drivers/xen/grant-table
Hi,
On 14.06.2010 12:57, Stefano Stabellini wrote:
> On Mon, 14 Jun 2010, Arnd Hannemann wrote:
>> Hi,
>>
>> we have regular but hard to reproduce (wait for a day or two starting domUs)
>> kernel panics (see below) with latest
>> "xen/stable-2.6.32.x" git tree.
>>
>> Any idea, anyone?
>>
>
> this CS from origin/xen/dom0/gntdev should fix your problem:
>
> sstabellini@kaball-desktop:~/xensource/linux-pvops-latest$ git show
> ad469f0da31bc16b945f9a06710b9d45434d0091
> commit ad469f0da31bc16b945f9a06710b9d45434d0091
> Author: Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
> Date: Wed Jun 9 12:34:02 2010 -0700
>
> xen/gntdev: use spinlocks rather than rwsem for locking
>
> The mmu notifier mechanism calls its callbacks with an rcu lock,
> which disables preemption. This means we cannot use any blocking
> synchronization for locking.
>
> Convert all the rwsemas to plain spinlocks. This requires that
> the memory allocation and copying to/from userspace be split
> from the actual datastructure updates since they can't be done
> under spinlock.
>
> Signed-off-by: Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
>
Unfortunately, this patch does not seem to help. We get a very similar
backtrace after one hour of stress testing with a script that starts and
stops domUs in a loop.
Maybe the problem is in the hypervisor itself?
We are currently running 4.0.1-rc2-pre. We updated from 4.0.0 because of what
we believed was the same problem, although we had no working netconsole back
then.
Jun 14 14:07:22 vmhost2 [ 2418.542425] ------------[ cut here ]------------
Jun 14 14:07:22 vmhost2 [ 2418.542475] kernel BUG at drivers/xen/grant-table.c:583!
Jun 14 14:07:22 vmhost2 [ 2418.542515] invalid opcode: 0000 [#1] SMP
Jun 14 14:07:22 vmhost2 [ 2418.542574] last sysfs file: /sys/devices/virtual/net/br0/bridge/topology_change_detected
Jun 14 14:07:22 vmhost2 [ 2418.542640] Modules linked in: netconsole raid0 md_mod rtc_cmos rtc_core rtc_lib ipv6 thermal processor thermal_sys hwmon pl2303 button acpi_processor usbserial sr_mod evdev cdrom
Jun 14 14:07:22 vmhost2 [ 2418.542937]
Jun 14 14:07:22 vmhost2 [ 2418.542970] Pid: 0, comm: swapper Not tainted (2.6.32.15-xen4.0.0-dom0-stefano #2) System Product Name
Jun 14 14:07:22 vmhost2 [ 2418.543034] EIP: 0061:[<c120f170>] EFLAGS: 00010282 CPU: 0
Jun 14 14:07:22 vmhost2 [ 2418.543077] EIP is at gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:07:22 vmhost2 [ 2418.543117] EAX: ffffffea EBX: c153be84 ECX: 00000001 EDX: 00000000
Jun 14 14:07:22 vmhost2 [ 2418.543158] ESI: 00007ff0 EDI: 00000013 EBP: c290e660 ESP: c153be50
Jun 14 14:07:22 vmhost2 [ 2418.543199] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Jun 14 14:07:22 vmhost2 [ 2418.543239] Process swapper (pid: 0, ti=c153a000 task=c1543760 task.ti=c153a000)
Jun 14 14:07:22 vmhost2 [ 2418.543297] Stack:
Jun 14 14:07:22 vmhost2 [ 2418.543329] 00000000 00213784 c2904dc0 0002c233 ec233000 ecf85bec 00000013 ec233000
Jun 14 14:07:22 vmhost2 [ 2418.543461] <0> 00000000 ebd6e000 00000000 00000013 c1350000 13784001 00000000 0002c233
Jun 14 14:07:22 vmhost2 [ 2418.543616] <0> 00000000 c1628284 c155b978 c1628284 00560014 c12200c1 00000001 00000000
Jun 14 14:07:22 vmhost2 [ 2418.543797] Call Trace:
Jun 14 14:07:22 vmhost2 [ 2418.543838] [<c1350000>] ? sock_release+0x10/0x80
Jun 14 14:07:22 vmhost2 [ 2418.543882] [<c12200c1>] ? net_tx_action+0x1d1/0x9b0
Jun 14 14:07:22 vmhost2 [ 2418.543925] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 14 14:07:22 vmhost2 [ 2418.543967] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 14 14:07:22 vmhost2 [ 2418.544009] [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 14 14:07:22 vmhost2 [ 2418.544053] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 14 14:07:22 vmhost2 [ 2418.544094] [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 14 14:07:22 vmhost2 [ 2418.544147] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 14 14:07:22 vmhost2 [ 2418.544190] [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 14 14:07:22 vmhost2 [ 2418.544234] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 14 14:07:22 vmhost2 [ 2418.544275] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 14 14:07:22 vmhost2 [ 2418.544316] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 14 14:07:22 vmhost2 [ 2418.544359] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 14 14:07:22 vmhost2 [ 2418.544401] [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 14 14:07:22 vmhost2 [ 2418.544444] [<c157b0e6>] ? xen_start_kernel+0x624/0x62c
Jun 14 14:07:22 vmhost2 [ 2418.544483] Code: 8d 5c 24 34 c1 e0 0c 83 c8 01 89 44 24 34 8b 44 24 0c c7 44 24 40 00 00 00 00 89 44 24 3c e8 b8 1e df ff 85 c0 0f 84 2c ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8b 54 24 04 8b 44 24 0c e8
Jun 14 14:07:22 vmhost2 [ 2418.545277] EIP: [<c120f170>] gnttab_copy_grant_page+0x1f0/0x260 SS:ESP 0069:c153be50
Jun 14 14:07:22 vmhost2 [ 2418.545597] ---[ end trace f877a40240218318 ]---
Jun 14 14:07:22 vmhost2 [ 2418.545669] Kernel panic - not syncing: Fatal exception in interrupt
Jun 14 14:07:22 vmhost2 [ 2418.545746] Pid: 0, comm: swapper Tainted: G D 2.6.32.15-xen4.0.0-dom0-stefano #2
Jun 14 14:07:22 vmhost2 [ 2418.545840] Call Trace:
Jun 14 14:07:22 vmhost2 [ 2418.545912] [<c141d3b5>] ? panic+0x42/0xe1
Jun 14 14:07:22 vmhost2 [ 2418.545986] [<c100cc56>] ? oops_end+0x96/0xa0
Jun 14 14:07:22 vmhost2 [ 2418.546060] [<c100a73f>] ? do_invalid_op+0x7f/0x90
Jun 14 14:07:22 vmhost2 [ 2418.546135] [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:07:22 vmhost2 [ 2418.546223] [<c10741e4>] ? __alloc_pages_nodemask+0xe4/0x5b0
Jun 14 14:07:22 vmhost2 [ 2418.546303] [<c1006197>] ? xen_force_evtchn_callback+0x17/0x30
Jun 14 14:07:22 vmhost2 [ 2418.546380] [<c1006a98>] ? check_events+0x8/0xc
Jun 14 14:07:22 vmhost2 [ 2418.546455] [<c141faa6>] ? error_code+0x66/0x6c
Jun 14 14:07:22 vmhost2 [ 2418.546530] [<c100a6c0>] ? do_invalid_op+0x0/0x90
Jun 14 14:07:22 vmhost2 [ 2418.546606] [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:07:22 vmhost2 [ 2418.546687] [<c1350000>] ? sock_release+0x10/0x80
Jun 14 14:07:22 vmhost2 [ 2418.546763] [<c12200c1>] ? net_tx_action+0x1d1/0x9b0
Jun 14 14:07:22 vmhost2 [ 2418.546839] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 14 14:07:22 vmhost2 [ 2418.546915] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 14 14:07:22 vmhost2 [ 2418.546993] [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 14 14:07:22 vmhost2 [ 2418.547070] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 14 14:07:22 vmhost2 [ 2418.547145] [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 14 14:07:22 vmhost2 [ 2418.547222] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 14 14:07:22 vmhost2 [ 2418.547299] [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 14 14:07:22 vmhost2 [ 2418.547385] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 14 14:07:22 vmhost2 [ 2418.547463] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 14 14:07:22 vmhost2 [ 2418.547537] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 14 14:07:22 vmhost2 [ 2418.547615] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 14 14:07:22 vmhost2 [ 2418.547690] [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 14 14:07:22 vmhost2 [ 2418.547766] [<c157b0e6>] ? xen_start_kernel+0x624/0x62c
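
A side note on reading the register dump: EAX is ffffffea at the BUG. Read
as a signed 32-bit value that is -22, which matches -EINVAL in the Linux
errno numbering. Whether EAX really holds the return value of the failing
grant-table operation at that point is only an assumption, not something
the trace confirms. A minimal sketch of the conversion (the helper names
to_signed32 and describe_register are made up for illustration):

```python
import errno


def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement signed int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def describe_register(hex_text: str) -> str:
    """Render a hex register value as signed decimal, plus an errno name
    when the negated value is a known errno on this host.

    Assumption: the register holds a kernel-style negative errno and the
    host's errno table matches the one used by the code that set it.
    """
    signed = to_signed32(int(hex_text, 16))
    name = errno.errorcode.get(-signed) if signed < 0 else None
    if name:
        return f"{hex_text} = {signed} (-{name})"
    return f"{hex_text} = {signed}"


# EAX value taken from the oops above.
print(describe_register("ffffffea"))
```

If that reading is right, the operation was rejected as invalid rather than
failing for lack of memory, but the log alone does not prove it.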
Best regards,
Arnd
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel