Hello,
after setting up quite a lot of Xen servers without any issues over the past
few months, I now have serious trouble getting Xen 3.1 to run stably on new
server hardware. In my humble opinion it's related to PAE (once again).
Starting domains works in general, but only as long as less than about
500-800 MB of RAM is in use. Every domain started after that amount of RAM
is already in use crashes (see the kernel messages below).
Surprisingly, live-migrating domains from another system to this one seems to
work without any trouble, even if the migrated domain consumes memory above
the critical point.
I am already starting the hypervisor with the option "dom0_mem=256M", but this
doesn't help.
But first of all the technical specs of the system:
- Dual dual-core Xeon 5130 (2 GHz)
- 8 GB RAM
- Intel chipset (5000 or 5000p)
- Intel Network, USB, PCI Express Controller
- ATI ES1000 graphics card
- 3Ware SATA Raid Controller (9xxx?)
- IPMI Remote Management Card (not visible with lspci)
Software:
Debian Etch with the 2.6.18-4-xen-686 kernel (PAE) and backported Xen 3.1
Debian packages. Besides the backported Xen 3.1 packages, nothing really
special. DRBD and LVM are used to handle the block devices.
In the past, the 3Ware SATA controller caused trouble with Xen in PAE
mode, but this was solved with Ian Pratt's help. A patch was subsequently
included in Xen 3.0.x (3.0.2, if I remember correctly). Just an idea,
but maybe this patch is no longer included in 3.1? It really could be a
Xen/PAE/3Ware problem again.
Help is very much appreciated.
regards,
Ralph
P.S.:
Here are some more technical facts. If more is needed to debug this
problem, please let me know. SSH access could also be provided for a few
days, if needed.
First of all, the output of the crashing domU:
Started domain vm-test
Linux version 2.6.18-4-xen-686 (Debian 2.6.18.dfsg.1-12etch2) (dannf@xxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu May 10 03:24:35 UTC 2007
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 0000000020800000 (usable)
0MB HIGHMEM available.
520MB LOWMEM available.
NX (Execute Disable) protection: active
ACPI in unprivileged domain disabled
Built 1 zonelists. Total pages: 133120
Kernel command line: root=/dev/hda1 ro
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Xen reported: 2000.068 MHz processor.
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Software IO TLB disabled
vmalloc area: e1000000-f51fe000, maxmem 2d7fe000
Memory: 503168k/532480k available (1582k kernel code, 20936k reserved, 585k data, 148k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode...
<1>BUG: unable to handle kernel paging request at virtual address c00fc72c
printing eip:
c01106c2
00f66000 -> *pde = 00000000:2f0be001
00f67000 -> *pme = 00000000:2f0bd067
00f68000 -> *pte = 00000000:2ff29061
Oops: 0003 [#1]
SMP
Modules linked in:
CPU: 0
EIP: e019:[<c01106c2>] Not tainted VLI
EFLAGS: 00010202 (2.6.18-4-xen-686 #1)
EIP is at __set_fixmap+0x1b0/0x341
eax: 00000002 ebx: 00224d9a ecx: 00000000 edx: ffffffff
esi: c00fc728 edi: 000fc728 ebp: 000000fc esp: c0321f80
ds: e021 es: e021 ss: e021
Process swapper (pid: 0, ti=c0320000 task=c02d0660 task.ti=c0320000)
Stack: 00000000 00000000 f54e5000 f500e000 2f0be001 00000000 00000249 00000001
00000000 00082000 00000094 00000249 c032f869 00000000 00000000 00000025
80000000 c029e35a 0000062e 00001472 c1433000 c0349404 c142d364 00000020
Call Trace:
[<c032f869>] mem_init+0x345/0x392
[<c032557f>] start_kernel+0x1fb/0x37f
Code: 00 c0 75 0e a1 a8 bc 34 c0 8b 1c 98 81 e3 ff ff ff 7f 8b 4c 24 38 89 d8
c1 e8 14 89 ca 09 d0 8b 15 08 e0 31 c0 23 05 0c e0 31 c0 <89> 46 04 c1 e3 0c
0b 5c 24 34 21 d3 89 9f 00 00 00 c0 e9 63 01
EIP: [<c01106c2>] __set_fixmap+0x1b0/0x341 SS:ESP e021:c0321f80
<0>Kernel panic - not syncing: Attempted to kill the idle task!
----------------------------------------------------------------------------------------------
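The "Oops: 0003" value in the domU output above can be read as an x86 page-fault error code. The following is a minimal sketch that decodes the three low bits; the function name is made up for illustration, and only the architectural bit meanings are assumed.

```python
# x86 page-fault error code bits (the three low, architectural bits).
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access,      1: write access
PF_USER = 0x4     # 0: kernel mode,      1: user mode


def decode_oops_code(code_hex):
    """Describe an x86 page-fault error code given as a hex string."""
    code = int(code_hex, 16)
    cause = "protection violation" if code & PF_PRESENT else "page not present"
    access = "write" if code & PF_WRITE else "read"
    mode = "user" if code & PF_USER else "kernel"
    return f"{access} in {mode} mode, {cause}"


# Value taken from the "Oops: 0003 [#1]" line of the crashing domU.
print(decode_oops_code("0003"))
```

Read this way, the fault is a kernel-mode write to a page that is mapped but not writable, which fits the dumped pte flags (0x061: present, accessed, dirty, but no write bit). A paravirtualized guest has its page-table pages mapped read-only, so this may well be a page-table update that Xen refused to emulate, but that is an interpretation, not something the log states.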
Output of "xm dmesg":
Xen version 3.1.0-1 (Debian 3.1.0-0-tha6) (ralph@xxxxxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) Sat Jul 21 18:55:55 UTC 2007
Latest ChangeSet: unavailable
(XEN) Command line: /boot/xen-3.1.0-1-i386-pae.gz
(XEN) 0000000000000000 - 000000000009c000 (usable)
(XEN) 000000000009c400 - 00000000000a0000 (reserved)
(XEN) 00000000000ce000 - 00000000000d0000 (reserved)
(XEN) 00000000000e4000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 00000000cff60000 (usable)
(XEN) 00000000cff60000 - 00000000cff69000 (ACPI data)
(XEN) 00000000cff69000 - 00000000cff80000 (ACPI NVS)
(XEN) 00000000cff80000 - 00000000d0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fec00000 - 00000000fec10000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000ff000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000230000000 (usable)
(XEN) System RAM: 8190MB (8387568kB)
(XEN) Xen heap: 9MB (10004kB)
(XEN) Domain heap initialised: DMA width 32 bits
(XEN) PAE enabled, limit: 16 GB
(XEN) Processor #0 6:15 APIC version 20
(XEN) Processor #6 6:15 APIC version 20
(XEN) Processor #1 6:15 APIC version 20
(XEN) Processor #7 6:15 APIC version 20
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47
(XEN) Enabling APIC mode: Flat. Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2000.123 MHz processor.
(XEN) HVM: VMX enabled
(XEN) VMX: MSR intercept bitmap enabled
(XEN) CPU0: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz stepping 06
(XEN) Mapping cpu 0 to node 255
(XEN) Booting processor 1/6 eip 90000
(XEN) Mapping cpu 1 to node 255
(XEN) CPU1: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz stepping 06
(XEN) Booting processor 2/1 eip 90000
(XEN) Mapping cpu 2 to node 255
(XEN) CPU2: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz stepping 06
(XEN) Booting processor 3/7 eip 90000
(XEN) Mapping cpu 3 to node 255
(XEN) CPU3: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz stepping 06
(XEN) Total of 4 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) Platform timer is 1.193MHz PIT
(XEN) Brought up 4 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 32-bit, PAE, lsb
(XEN) Dom0 kernel: 32-bit, PAE, lsb, paddr 0xc0100000 -> 0xc0396b54
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 000000007e000000->0000000080000000 (2036959 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: c0100000->c0396b54
(XEN) Init. ramdisk: c0397000->c0ee2400
(XEN) Phys-Mach map: c0ee3000->c16b037c
(XEN) Start info: c16b1000->c16b146c
(XEN) Page tables: c16b2000->c16c3000
(XEN) Boot stack: c16c3000->c16c4000
(XEN) TOTAL: c0000000->c1800000
(XEN) ENTRY ADDRESS: c0100000
(XEN) Dom0 has maximum 4 VCPUs
(XEN) PIT Timer HW error: 23865
(XEN) Initrd len 0xb4b400, start at 0xc0397000
(XEN) Scrubbing Free RAM: .done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) mm.c:612:d5 Non-privileged (5) attempt to map I/O space 0024489b
(XEN) mm.c:3267:d5 ptwr_emulate: could not get_page_from_l1e()
(XEN) mm.c:636:d9 Error getting mfn 22fcce (pfn 55555555) from L1 entry 000000022fcce025 for dom9
(XEN) mm.c:3267:d9 ptwr_emulate: could not get_page_from_l1e()
(XEN) mm.c:636:d10 Error getting mfn 22fcd2 (pfn 55555555) from L1 entry 000000022fcd2025 for dom10
(XEN) mm.c:3267:d10 ptwr_emulate: could not get_page_from_l1e()
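To relate the failing frame numbers to the memory map, here is a rough sketch that sums the usable ranges from the "xm dmesg" output above and checks where each reported frame lies. It assumes that the values in the mm.c messages are machine frame numbers of 4096-byte pages; the helper names are hypothetical.

```python
import re

PAGE_SIZE = 4096
FOUR_GIB = 1 << 32

# Usable ranges copied from the memory map printed by Xen above.
E820_USABLE = """
0000000000000000 - 000000000009c000 (usable)
0000000000100000 - 00000000cff60000 (usable)
0000000100000000 - 0000000230000000 (usable)
"""


def parse_usable(text):
    """Return (start, end) byte ranges for every line marked usable."""
    pattern = r"([0-9a-f]{16}) - ([0-9a-f]{16}) \(usable\)"
    return [(int(m.group(1), 16), int(m.group(2), 16))
            for m in re.finditer(pattern, text)]


def classify_mfn(mfn, ranges):
    """Say where the machine address of a frame falls in the memory map."""
    addr = mfn * PAGE_SIZE
    if not any(start <= addr < end for start, end in ranges):
        return "outside usable RAM"
    return "above 4 GiB" if addr >= FOUR_GIB else "below 4 GiB"


ranges = parse_usable(E820_USABLE)
total_kb = sum(end - start for start, end in ranges) // 1024
print("usable kB:", total_kb)

# Frame numbers taken from the mm.c messages for d5, d9 and d10.
for mfn in (0x24489B, 0x22FCCE, 0x22FCD2):
    print(hex(mfn), classify_mfn(mfn, ranges))
```

Under these assumptions the total matches the 8387568kB that Xen reports, the frames for dom9 and dom10 lie in the range above 4 GiB close to the top of RAM, and the frame in the "attempt to map I/O space" message lies past the end of the last usable range. That would fit the suspicion that the problem only shows up once high memory is handed to a guest.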
----------------------------------------------------------------------------------------------
"lspci" in dom0:
00:00.0 Host bridge: Intel Corporation 5000P Chipset Memory Controller Hub (rev b1)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 2-3 (rev b1)
00:04.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 4-5 (rev b1)
00:06.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 6-7 (rev b1)
00:08.0 System peripheral: Intel Corporation 5000 Series Chipset DMA Engine (rev b1)
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
00:1f.3 SMBus: Intel Corporation 631xESB/632xESB/3100 Chipset SMBus Controller (rev 09)
01:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
01:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
02:02.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E3 (rev 01)
03:00.0 PCI bridge: Intel Corporation 6702PXH PCI Express-to-PCI Bridge A (rev 09)
04:01.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
05:00.0 Ethernet controller: Intel Corporation 631xESB/632xESB DPT LAN Controller Copper (rev 01)
05:00.1 Ethernet controller: Intel Corporation 631xESB/632xESB DPT LAN Controller Copper (rev 01)
09:01.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
09:02.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
----------------------------------------------------------------------------------------------
output of "lsmod" in dom0:
Module Size Used by
sha1 3424 1
drbd 176680 2
cn 9056 1 drbd
xt_physdev 3792 1
iptable_filter 3872 1
ip_tables 13892 1 iptable_filter
x_tables 14084 2 xt_physdev,ip_tables
netloop 7360 0
bridge 50268 1 xt_physdev
button 7440 0
ac 5956 0
battery 10404 0
ipv6 229088 32
loop 15944 0
i2c_i801 8236 0
serio_raw 7428 0
floppy 51684 0
rtc 13300 0
serial_core 20288 0
i2c_core 20480 1 i2c_i801
psmouse 35880 0
pcspkr 3840 0
shpchp 33632 0
pci_hotplug 29472 1 shpchp
joydev 9856 0
evdev 9856 0
ext3 120072 1
jbd 53224 1 ext3
mbcache 9124 1 ext3
dm_mirror 20048 0
dm_snapshot 16320 0
dm_mod 51000 9 dm_mirror,dm_snapshot
ide_cd 36832 0
cdrom 33312 1 ide_cd
sd_mod 19808 4
generic 6244 0 [permanent]
usbhid 38208 0
3w_xxxx 26176 3
scsi_mod 125160 2 sd_mod,3w_xxxx
piix 10212 0 [permanent]
ide_core 112392 3 ide_cd,generic,piix
e1000 110432 0
ehci_hcd 29288 0
uhci_hcd 22188 0
usbcore 114372 4 usbhid,ehci_hcd,uhci_hcd
thermal 14376 0
processor 29608 1 thermal
fan 5572 0
----------------------------------------------------------------------------------------------
Output of "xm info":
host : testsystem1
release : 2.6.18-4-xen-686
version : #1 SMP Thu May 10 03:24:35 UTC 2007
machine : i686
nr_cpus : 4
nr_nodes : 1
sockets_per_node : 2
cores_per_socket : 2
threads_per_core : 1
cpu_mhz : 2000
hw_caps : bfebfbff:20100000:00000000:00000140:0004e33d:00000000:00000001
total_memory : 8190
free_memory : 518
xen_major : 3
xen_minor : 1
xen_extra : .0-1
xen_caps : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
cc_compile_by : ralph
cc_compile_domain : debianbase.de
cc_compile_date : Sat Jul 21 18:55:55 UTC 2007
xend_config_format : 4
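For anyone who wants to compare the memory figures across several hosts, the "key : value" layout of "xm info" is easy to parse. This is only a small sketch with a hypothetical helper name; the sample text is a subset of the output above.

```python
# A few lines copied from the "xm info" output above.
XM_INFO = """\
total_memory : 8190
free_memory : 518
xen_major : 3
xen_minor : 1
xen_pagesize : 4096
"""


def parse_xm_info(text):
    """Turn 'key : value' lines into a dict, splitting on the first colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            info[key.strip()] = value.strip()
    return info


info = parse_xm_info(XM_INFO)
# Memory (in MB) that Xen no longer reports as free.
allocated_mb = int(info["total_memory"]) - int(info["free_memory"])
print("allocated MB:", allocated_mb)
```

Splitting on the first colon only keeps values such as the compile date, which contain further colons, in one piece.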