Hi,
I've built an entire army of Xen PV and HVM guests, and I managed to resolve
every problem I encountered in my Xen projects. Everything was just great in
my "perfect-xen-world" until I changed something simple.
THE PROBLEM: I upgraded from 2 GB to 4 GB of RAM and everything broke:
zombie domUs, corrupted tap:aio files, tons of errors, etc.
If I go back to the old configuration with 2 GB of RAM, everything
returns to normal.
At first I thought the new DDR2 modules were the cause of this cataclysm,
but memtest did not report any errors. That was on my Intel server with 4 x
1 GB DDR2 modules at 533 MHz and 1 x Intel(R) Pentium(R) D CPU 3.40GHz
with 2 cores and 2 MB of L2 cache per core.
I ran additional tests on a separate machine: a Dell PowerEdge 1950
with 8 x 512 MB DDR2 modules at 677 MHz and 2 x Intel(R) Xeon(R) CPU 5130 @
2.00GHz with 2 cores per CPU and 4 MB of L2 cache per core. The problem is
exactly the same... but if I use only 2 GB of RAM everything is OK.
Issues:
* I don't think the Intel SE7230NH1 and the Dell PE 1950 servers both have
hardware problems that could cause this
* A non-PAE kernel can only address up to 4 GB of memory... but I do not
have more than 4 GB of RAM
* I found some articles about Xen having problems with more than 4
GB of network traffic
* I found some articles about LSI Logic RAID controllers combined
with the Linux driver having problems on machines with 4 GB of memory
* Is it something related to memory holes?
* Is it something specific to the Fedora Core Xen kernels?
Here are some details about my setup:
---------------------------------------------------
** Details on dom0:
title Fedora Core (2.6.19-1.2911.6.5.fc6xen)
root (hd0,0)
kernel /xen.gz-2.6.19-1.2911.6.5.fc6
module /vmlinuz-2.6.19-1.2911.6.5.fc6xen ro root=/dev/sda1
module /initrd-2.6.19-1.2911.6.5.fc6xen.img
host : localhost.localdomain
release : 2.6.19-1.2911.6.5.fc6xen
version : #1 SMP Sun Mar 4 16:59:41 EST 2007
machine : i686
nr_cpus : 4
nr_nodes : 1
sockets_per_node : 2
cores_per_socket : 2
threads_per_core : 1
cpu_mhz : 1995
hw_caps : bfebfbff:20100000:00000000:00000140:0004e33d:00000000:00000001
total_memory : 4095
free_memory : 512
xen_major : 3
xen_minor : 0
xen_extra : .3-0-1.2911.6.5
xen_caps : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)
cc_compile_by : brewbuilder
cc_compile_domain : build.redhat.com
cc_compile_date : Sun Mar 4 15:44:49 EST 2007
xend_config_format : 2
** Parts from xm dmesg
Xen version 3.0.3-0-1.2911.6.5.fc6 (brewbuilder@xxxxxxxxxxxxxxxx) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) Sun Mar 4 15:44:49 EST 2007
Latest ChangeSet: unavailable
(XEN) Command line: /boot/xen.gz-2.6.19-1.2911.6.5.fc6
(XEN) Physical RAM map:
(XEN) 0000000000000000 - 00000000000a0000 (usable)
(XEN) 0000000000100000 - 00000000cffa8000 (usable)
(XEN) 00000000cffa8000 - 00000000cffb7c00 (ACPI data)
(XEN) 00000000cffb7c00 - 00000000d0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fe000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000130000000 (usable)
(XEN) System RAM: 4095MB (4193568kB)
(XEN) Xen heap: 9MB (10224kB)
(XEN) PAE enabled, limit: 16 GB
(XEN) found SMP MP-table at 000fe710
(XEN) DMI 2.4 present.
(XEN) Using APIC driver default
(XEN) VMXON is done
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) checking TSC synchronization across 4 CPUs: passed.
(XEN) Platform timer is 14.318MHz HPET
(XEN) Brought up 4 CPUs
(XEN) Machine check exception polling timer started.
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Domain 0 kernel supports features = { 0000001f }.
(XEN) Domain 0 kernel requires features = { 00000000 }.
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000000007000000->0000000008000000 (999720 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: c0400000->c085aa08
(XEN) Init. ramdisk: c085b000->c0b8e000
(XEN) Phys-Mach map: c0b8e000->c0f624a0
(XEN) Start info: c0f63000->c0f6346c
(XEN) Page tables: c0f64000->c0f71000
(XEN) Boot stack: c0f71000->c0f72000
(XEN) TOTAL: c0000000->c1000000
(XEN) ENTRY ADDRESS: c0400000
(XEN) Dom0 has maximum 4 VCPUs
(XEN) Initrd len 0x333000, start at 0xc085b000
(XEN) Scrubbing Free RAM:
.................................................done.
(XEN) Xen trace buffers: disabled
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
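As a rough sanity check on the memory-hole question, the Physical RAM map
above can be summed with a short Python sketch. The map lines are copied
from the xm dmesg output in this post; the function names (parse_ram_map,
usable_split) are made up for illustration. It only shows how much usable
RAM sits below and above the 4 GiB boundary. It does not prove that the
remapped region is what triggers the failures.

```python
import re

# Physical RAM map lines copied from the xm dmesg output in this post.
RAM_MAP = """\
(XEN) 0000000000000000 - 00000000000a0000 (usable)
(XEN) 0000000000100000 - 00000000cffa8000 (usable)
(XEN) 00000000cffa8000 - 00000000cffb7c00 (ACPI data)
(XEN) 00000000cffb7c00 - 00000000d0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fe000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000130000000 (usable)
"""

FOUR_GIB = 1 << 32  # highest address reachable with 32-bit physical addressing
LINE_RE = re.compile(r"([0-9a-f]{16}) - ([0-9a-f]{16}) \((.+)\)")


def parse_ram_map(text):
    """Return a list of (start, end, kind) tuples from the map text."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in LINE_RE.finditer(text)
    ]


def usable_split(regions):
    """Return (bytes_below_4gib, bytes_above_4gib) of usable RAM."""
    below = above = 0
    for start, end, kind in regions:
        if kind != "usable":
            continue
        # Clamp each region to the part below and the part above 4 GiB.
        below += max(0, min(end, FOUR_GIB) - min(start, FOUR_GIB))
        above += max(0, max(end, FOUR_GIB) - max(start, FOUR_GIB))
    return below, above


if __name__ == "__main__":
    below, above = usable_split(parse_ram_map(RAM_MAP))
    print("usable below 4 GiB: %d MiB" % (below // (1 << 20)))
    print("usable above 4 GiB: %d MiB" % (above // (1 << 20)))
    print("total usable:       %d kB" % ((below + above) // 1024))
```

The total comes to 4193568 kB, which matches the "System RAM" line, and
768 MiB of it is remapped above 4 GiB, where only a PAE hypervisor and
kernel can reach it. With 2 GB installed, nothing would live up there.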
** xm list
Name ID Mem(MiB) VCPUs State Time(s)
Domain-0 0 3024 4 r----- 115.8
Zombie-test5 16 256 1 ----cd 0.5
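To spot stuck domains without reading the table by eye, the xm list output
above can be parsed with a small Python sketch. It assumes the six-column
layout shown here and the usual xm state letters (r running, b blocked,
p paused, s shutdown, c crashed, d dying); the helper names are
hypothetical.

```python
# Output copied from the "xm list" section of this post.
XM_LIST = """\
Name ID Mem(MiB) VCPUs State Time(s)
Domain-0 0 3024 4 r----- 115.8
Zombie-test5 16 256 1 ----cd 0.5
"""


def parse_xm_list(text):
    """Parse xm list output into a list of dicts, skipping the header."""
    rows = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 6:
            continue  # ignore lines that do not match the assumed layout
        name, domid, mem, vcpus, state, time_s = fields
        rows.append({
            "name": name,
            "id": int(domid),
            "mem_mib": int(mem),
            "vcpus": int(vcpus),
            "state": state,
            "time_s": float(time_s),
        })
    return rows


def stuck_domains(rows):
    """Return names of domains flagged as crashed (c) or dying (d)."""
    return [r["name"] for r in rows if "c" in r["state"] or "d" in r["state"]]


if __name__ == "__main__":
    rows = parse_xm_list(XM_LIST)
    print("crashed or dying:", stuck_domains(rows))
    print("memory assigned: %d MiB" % sum(r["mem_mib"] for r in rows))
```

Here the only flagged domain is Zombie-test5, whose state shows both the
crashed and dying flags.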
** Details on domU:
Configuration
name = "test1"
memory = "256"
disk = [ 'tap:aio:/huge/test1.img,xvda,w', ]
bootloader="/usr/bin/pygrub"
on_reboot = 'restart'
on_crash = 'restart'
Errors extracted from xm console test1
Loading xenblk.ko module
Registering block device major 202
xvda:<1>WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Buffer I/O error on device xvda, logical block 0
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Buffer I/O error on device xvda, logical block 0
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Buffer I/O error on device xvda, logical block 0
unable to read partition table
Creating root device.
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 20971392
Buffer I/O error on device xvda, logical block 2621424
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 20971392
Buffer I/O error on device xvda, logical block 2621424
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Buffer I/O error on device xvda, logical block 0
Buffer I/O error on device xvda, logical block 1
Buffer I/O error on device xvda, logical block 2
Buffer I/O error on device xvda, logical block 3
Buffer I/O error on device xvda, logical block 4
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Mounting root filesystem.
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 20971392
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 20971392
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
mount: could not find filesystem '/dev/root'
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init!
---------------------------------------------------
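The console output above is repetitive, so a short Python sketch can boil
it down to a count of the grant-entry ("g.e.") warnings and the distinct
sectors that failed. Only a short excerpt of the log is embedded here. The
512-byte sector and 4096-byte block sizes are assumptions, chosen because
they reproduce the sector/logical block pairs printed in the log. The
function names are made up for illustration.

```python
import re
from collections import Counter

# Short excerpt copied from the "xm console test1" output in this post.
CONSOLE_EXCERPT = """\
xvda:<1>WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 0
Buffer I/O error on device xvda, logical block 0
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
end_request: I/O error, dev xvda, sector 20971392
Buffer I/O error on device xvda, logical block 2621424
"""

SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
SECTOR_BYTES = 512   # assumption: standard block-layer sector size
BLOCK_BYTES = 4096   # assumption: buffer block size used in the log


def sector_to_block(sector):
    """Convert a sector number to the logical block that contains it."""
    return sector * SECTOR_BYTES // BLOCK_BYTES


def summarize(text):
    """Count grant-entry warnings and failed (device, sector) pairs."""
    grant_warnings = sum(
        1 for line in text.splitlines()
        if "WARNING: g.e. still in use!" in line
    )
    failed = Counter(
        (m.group(1), int(m.group(2))) for m in SECTOR_RE.finditer(text)
    )
    return {"grant_warnings": grant_warnings, "failed_sectors": failed}


if __name__ == "__main__":
    summary = summarize(CONSOLE_EXCERPT)
    print("grant-entry warnings:", summary["grant_warnings"])
    for (dev, sector), count in sorted(summary["failed_sectors"].items()):
        print("%s sector %d (block %d): %d error(s)"
              % (dev, sector, sector_to_block(sector), count))
```

Run against the full log, this should show only two distinct sectors
failing (0 and 20971392), each preceded by grant-entry warnings. That
pattern is at least consistent with the guest failing to map pages shared
with the backend, rather than with scattered corruption in the image file.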
I can accept that (for the moment) I cannot rely on Xen in production
environments, BUT I never imagined that Xen could be so unpredictable!
--
Sergiu Strat,
HQN - High Quality Networks
www.hqn.ro
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users