xen-devel

Re: [Xen-devel] Kernel BUG at "arch/xen/x86_64/mm/hypervisor.c":198

Hi again,

first the good news:    your 3.0 xen kernel doesn't seem to crash

the bad news:           the xen kernel from your web page doesn't really work
                        in the SUSE 10.0 without magic changes ?!


thanks for the hints with the serial console for xen boot command line,
so here is one full serial console log for the crash with SUSE 10.0+you
64bit xen kernel, see below.


On Jan 11, Jan Beulich wrote:

> From all I can tell this problem will only be fixed with the next SL10 kernel 
> update; I can't, however, tell you when
> this can be expected. Jan

Jan, can you please give some details about the reasons for this instability
problem?
is this crash triggered by a bug in the xen hypervisor, or in the dom0 kernel?

is it possible to get a source patch which I can try to apply to the
SUSE xen sources for testing ? using a "non-SUSE" plain xen kernel seems to be 
difficult and painful (at least for me?!).




about my problems running SUSE 10.0 with the 64 bit SMP tar ball from
http://bits.xensource.com/Xen/latest/xen-3.0.0-install-x86_64.tgz.

with all setups with your xen kernel below, the SUSE 10.0 doesn't 
load the "tg3" kernel module for the ethernet cards, so there is 
no network connection at all until I use the console and load module
and start the net manually... 


a)
first I only used the /boot/* hypervisor plus kernel and the modules in /lib/*,
everything else still plain SUSE 10.0.

here, xend doesn't start (Connection refused) and locks up as a zombie.
this causes the "rc" script to hang -- no getty processes get started
(but xdm is already running, so X11 login is possible...)

one obvious problem is that SUSE 10.0 /etc/init.d/boot.udev
needs /proc/config.gz to detect udev support in the kernel,
so it doesn't start udev without this patch:

-------------------------------------------------------------------------------
--- /etc/init.d/boot.udev.orig  2006-01-12 17:00:50.000000000 +0100
+++ /etc/init.d/boot.udev       2006-01-12 17:00:19.000000000 +0100
@@ -62,7 +62,7 @@
                #
                # Check whether we can use uevent messages
                #
-               if zcat /proc/config.gz | grep -q KOBJECT_UEVENT ; then
+               if grep -q kobject_uevent /proc/kallsyms ; then
                    # Yes, just use udevd
                    export UDEVD_EXPECTED_SEQNUM=$(cat /sys/kernel/hotplug_seqnum)
                    export UDEVD_EVENT_TIMEOUT=1
-------------------------------------------------------------------------------
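
The check from the patch above can also be written as a small standalone function that tries both sources. This is only a sketch: the function name has_uevent_support and its two path arguments are made up for illustration and are not part of SUSE's boot.udev. It keeps the original grep patterns unchanged.

```shell
#!/bin/sh
# Sketch only: has_uevent_support and its arguments are illustrative
# names, not taken from SUSE's /etc/init.d/boot.udev.
# Usage: has_uevent_support [kallsyms-file] [config.gz-file]
has_uevent_support() {
    kallsyms=${1:-/proc/kallsyms}
    config_gz=${2:-/proc/config.gz}

    # Original SUSE check: search the kernel configuration, but only
    # if the running kernel actually exports it (gzip -dc is zcat).
    if [ -r "$config_gz" ] && gzip -dc "$config_gz" 2>/dev/null | grep -q KOBJECT_UEVENT; then
        return 0
    fi

    # Fallback from the patch: search the kernel symbol table instead.
    [ -r "$kallsyms" ] && grep -q kobject_uevent "$kallsyms"
}
```

Called without arguments it reads /proc/kallsyms and /proc/config.gz, so a kernel built without /proc/config.gz support still passes the check as long as the symbol is present.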

but this doesn't really help, "xend start" still locks up.
some strace output pointed to xenstored, but replacing only
the SUSE xenstored with your binary didn't work either,
xend is still not working.


b)
so I did a "full" install of your tar ball, overwriting some SUSE stuff.
now, xend boots and runs ok, but still e.g. the SUSE network setup
in dom0 doesn't work -- I need to load the "tg3" module and init the network
from the console :-(


I'm a bit surprised (shocked?!) that it seems to be pretty painful 
to use a non-SUSE xen kernel with SUSE 10.0... 



I'm open to any suggestions & patches which make SUSE 10.0 x86_64
work stably -- preferably with minimal changes (e.g. only patching & recompiling
the SUSE kernel sources, not touching xen-tools, udev setup etc.).

thanks in advance for any help -- and now to the (slightly shortened -- ask for
full logs if needed) serial console log for a crash with the SUSE 10.0 xen 64 bit
kernel:



---- 8< ------- 8< ------- 8< ------- 8< ------- 8< ------- 8< ------- 8< ----
 __  __            _____  ___    _____ __    ___   ___    ____    _ 
 \ \/ /___ _ __   |___ / / _ \  |___  / /_  / _ \ ( _ )  |___ \  / |
  \  // _ \ '_ \    |_ \| | | |    / / '_ \| | | |/ _ \ __ __) | | |
  /  \  __/ | | |  ___) | |_| |   / /| (_) | |_| | (_) |__/ __/ _| |
 /_/\_\___|_| |_| |____(_)___/___/_/  \___/ \___/ \___/  |_____(_)_|
                            |_____|                                 
 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

 Xen version 3.0_7608-2.1 (abuild@xxxxxxx) (gcc version 4.0.2 20050901 (prerelease) (SUSE Linux)) Fri Nov 18 01:01:08 UTC 2005
 Latest ChangeSet: Wed Nov  2 11:12:30 2005 +0100 7608:76fbcb25d174

(XEN) Console output is synchronous.
...
(XEN) System RAM: 8191MB (8388156kB)
(XEN) Xen heap: 14MB (14424kB)
(XEN) found SMP MP-table at 000ff780
...
(XEN) Initializing CPU#0
(XEN) Detected 1794.911 MHz processor.
(XEN) Using scheduler: Simple EDF Scheduler (sedf)
(XEN) CPU0: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) CPU 0(2) -> Core 0
(XEN) CPU0: AMD Dual Core AMD Opteron(tm) Processor 265 stepping 02
(XEN) Booting processor 1/1 eip 90000
(XEN) Initializing CPU#1
(XEN) CPU1: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) CPU 1(2) -> Core 0
(XEN) CPU1: AMD Dual Core AMD Opteron(tm) Processor 265 stepping 02
(XEN) Booting processor 2/2 eip 90000
(XEN) Initializing CPU#2
(XEN) CPU2: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) CPU 2(2) -> Core 0
(XEN) CPU2: AMD Dual Core AMD Opteron(tm) Processor 265 stepping 02
(XEN) Booting processor 3/3 eip 90000
(XEN) Initializing CPU#3
(XEN) CPU3: AMD Flush Filter disabled
(XEN) CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
(XEN) CPU: L2 Cache: 1024K (64 bytes/line)
(XEN) CPU 3(2) -> Core 0
(XEN) CPU3: AMD Dual Core AMD Opteron(tm) Processor 265 stepping 02
(XEN) Total of 4 processors activated.
...
(XEN) Scrubbing Free RAM: 
.............................................................................................done.
(XEN) Xen trace buffers: disabled
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
Linux version 2.6.13-15.7-xen (geeko@buildhost) (gcc version 4.0.2 20050901 (prerelease) (SUSE Linux)) #1 SMP Tue Nov 29 14:32:29 UTC 2005
kernel direct mapping tables upto 1fc00000 @ c79000-d79000
...
Initializing CPU#1
Initializing CPU#2
Brought up 4 CPUs
Initializing CPU#3
...
NET: Registered protocol family 17
microcode: Unknown symbol sys_munlock
microcode: Unknown symbol sys_mlock
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "arch/xen/x86_64/mm/hypervisor.c":198
invalid operand: 0000 [1] SMP 
CPU 0 
Modules linked in: bridge button battery ac af_packet tg3 ohci_hcd usbcore 
i2c_amd756 hw_random i2c_amd8111 i2c_core generic parport_pc lp parport ipv6 
ext3 jbd dm_mod reiserfs fan thermal processor sg aic79xx ide_cd cdrom 
scsi_transport_spi amd74xx sd_mod scsi_mod ide_disk ide_core
Pid: 6984, comm: lops1 Not tainted 2.6.13-15.7-xen
RIP: e030:[<ffffffff80122ea7>] <ffffffff80122ea7>{xen_pgd_pin+71}
RSP: e02b:ffff88000f1c9df0  EFLAGS: 00010282
RAX: 00000000ffffffea RBX: ffff880018f52dc0 RCX: ffffffff80122ea3
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88000f1c9df0
RBP: ffff880018f52e30 R08: ffff88001d91c000 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000246 R12: ffff880018f52dc0
R13: ffff88001a0d8040 R14: ffff8800017fad20 R15: 0000000000000000
FS:  00002aaaab297f60(0000) GS:ffffffff804df000(0000) knlGS:00000000556af6c0
CS:  e033 DS: 0000 ES: 0000
Process lops1 (pid: 6984, threadinfo ffff88000f1c8000, task ffff880019993760)
Stack: ffff880000000003 00000000000b006e ffff88001a0d8040 ffff8800017fad20 
       0000000000000000 ffffffff80121dc3 0000000000000000 0000000000000000 
       ffff88000f1c9f70 ffffffff8035f724 
Call Trace:<ffffffff80121dc3>{mm_pin+355} <ffffffff8035f724>{schedule+2964}
       <ffffffff8011739f>{do_softirq+79} <ffffffff801177d9>{do_IRQ+57}
       <ffffffff8010dee5>{evtchn_do_upcall+149} <ffffffff80113599>{retint_careful+41}
       

Code: 0f 0b a3 d0 e8 37 80 ff ff ff ff c2 c6 00 48 83 c4 28 c3 66 
RIP <ffffffff80122ea7>{xen_pgd_pin+71} RSP <ffff88000f1c9df0>
---- 8< ------- 8< ------- 8< ------- 8< ------- 8< ------- 8< ------- 8< ----
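
A note on reading the register dump above: the low 32 bits of RAX are 0xffffffea, which as a signed 32-bit integer is -22, and 22 is EINVAL on Linux. That RAX holds the return value of the failed pin hypercall at the point of the BUG is an assumption, not something the log states. The conversion can be checked with a small helper (to_signed32 is an illustrative name, not a standard tool):

```shell
#!/bin/sh
# Reinterpret the low 32 bits of a value as a signed 32-bit integer.
# to_signed32 is an illustrative helper name, not a standard tool.
# Needs a shell with 64-bit arithmetic (bash, dash on 64-bit Linux).
to_signed32() {
    low=$(( $1 & 0xffffffff ))
    if [ "$low" -ge 2147483648 ]; then
        # Top bit set: subtract 2^32 to get the negative value.
        echo $(( low - 4294967296 ))
    else
        echo "$low"
    fi
}

# RAX from the register dump above; prints -22
to_signed32 0x00000000ffffffea
```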




Harald Koenig
-- 
"I hope to die                                      ___       _____
before I *have* to use Microsoft Word.",           0--,|    /OOOOOOO\
Donald E. Knuth, 02-Oct-2001 in Tuebingen.        <_/  /  /OOOOOOOOOOO\
                                                    \  \/OOOOOOOOOOOOOOO\
                                                      \ OOOOOOOOOOOOOOOOO|//
Harald Koenig                                          \/\/\/\/\/\/\/\/\/
science+computing ag                                    //  /     \\  \
koenig@xxxxxxxxxxxxxxxxxxxx                            ^^^^^       ^^^^^

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel