To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Xen 3.1 (pae-mode) causing domU's to crash if RAM above about 500-800MB is used
From: Ralph Passgang <ralph@xxxxxxxxxxxxx>
Date: Thu, 2 Aug 2007 14:50:43 +0200
Delivery-date: Thu, 02 Aug 2007 05:48:58 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.9.7
Hello,

after setting up quite a lot of Xen servers without any issues over the past
few months, I now have serious trouble getting Xen 3.1 to run stably on new
server hardware. In my humble opinion it is related to PAE (once again).

The problem is that starting domains works in general, but only as long as
less than about 500-800 MB of RAM is in use. Every domain started after that
much RAM is already in use crashes (see the kernel messages below).
Surprisingly, live-migrating domains from another system to this one seems to
work without any trouble, even if the migrated domain consumes memory above
the critical point.

I am already starting the hypervisor with the option "dom0_mem=256M", so
that does not help either.
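
Whether the option actually reached the hypervisor can be checked on the
"(XEN) Command line:" line of "xm dmesg". Below is a minimal Python sketch,
assuming the dmesg text is available as a string; the function name
xen_boot_options is only illustrative. In the "xm dmesg" output quoted further
down, that line shows only the image path, which may mean the option was not
passed on this boot.

```python
import re

def xen_boot_options(xm_dmesg_text):
    """Return the tokens of the '(XEN) Command line:' line as a dict.

    Hypothetical helper: 'key=value' tokens map to their value,
    bare tokens (such as the image path) map to True.
    """
    match = re.search(r"^\(XEN\) Command line:\s*(.*)$",
                      xm_dmesg_text, re.MULTILINE)
    if match is None:
        return None
    options = {}
    for token in match.group(1).split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else True
    return options

# Line taken from the "xm dmesg" output in this mail.
sample = "(XEN) Command line: /boot/xen-3.1.0-1-i386-pae.gz\n"
opts = xen_boot_options(sample)
print("dom0_mem" in opts)
```

If the check prints False on a live system, the boot loader entry is the
first place to look.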

But first of all the technical specs of the system:

- Dual dual-core Xeon 5130 (2 GHz)
- 8 GB RAM
- Intel chipset (5000 or 5000P)
- Intel network, USB, PCI Express controller
- ATI ES1000 graphics card
- 3ware SATA RAID controller (9xxx?)
- IPMI remote management card (not visible with lspci)

Software:
Debian Etch with the 2.6.18-4-xen-686 kernel (PAE) and backported Xen 3.1
Debian packages. Apart from the backported Xen 3.1 packages, nothing really
special. DRBD and LVM are used to handle the block devices.

In the past, the 3ware SATA controller caused trouble with Xen in PAE mode,
but with Ian Pratt's help this was solved, and a patch was then included in
Xen 3.0.x (3.0.2, if I remember correctly). Just an idea, but maybe this
patch is no longer included in 3.1? It really could be a Xen/PAE/3ware
problem again.

Help is very much appreciated.

regards,
 Ralph

P.S.:
Here are some more technical details. If more is needed to debug this
problem, please let me know. For a few days even SSH access would be
possible, if needed.

First of all, the output of the crashing domU:

Started domain vm-test
Linux version 2.6.18-4-xen-686 (Debian 2.6.18.dfsg.1-12etch2) (dannf@xxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Thu May 10 03:24:35 UTC 2007
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 0000000020800000 (usable)
0MB HIGHMEM available.
520MB LOWMEM available.
NX (Execute Disable) protection: active
ACPI in unprivileged domain disabled
Built 1 zonelists.  Total pages: 133120
Kernel command line: root=/dev/hda1 ro
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Xen reported: 2000.068 MHz processor.
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Software IO TLB disabled
vmalloc area: e1000000-f51fe000, maxmem 2d7fe000
Memory: 503168k/532480k available (1582k kernel code, 20936k reserved, 585k data, 148k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... 
<1>BUG: unable to handle kernel paging request at virtual address c00fc72c
 printing eip:
c01106c2
00f66000 -> *pde = 00000000:2f0be001
00f67000 -> *pme = 00000000:2f0bd067
00f68000 -> *pte = 00000000:2ff29061
Oops: 0003 [#1]
SMP
Modules linked in:
CPU:    0
EIP:    e019:[<c01106c2>]    Not tainted VLI
EFLAGS: 00010202   (2.6.18-4-xen-686 #1)
EIP is at __set_fixmap+0x1b0/0x341
eax: 00000002   ebx: 00224d9a   ecx: 00000000   edx: ffffffff
esi: c00fc728   edi: 000fc728   ebp: 000000fc   esp: c0321f80
ds: e021   es: e021   ss: e021
Process swapper (pid: 0, ti=c0320000 task=c02d0660 task.ti=c0320000)
Stack: 00000000 00000000 f54e5000 f500e000 2f0be001 00000000 00000249 00000001
       00000000 00082000 00000094 00000249 c032f869 00000000 00000000 00000025
       80000000 c029e35a 0000062e 00001472 c1433000 c0349404 c142d364 00000020
Call Trace:
 [<c032f869>] mem_init+0x345/0x392
 [<c032557f>] start_kernel+0x1fb/0x37f
Code: 00 c0 75 0e a1 a8 bc 34 c0 8b 1c 98 81 e3 ff ff ff 7f 8b 4c 24 38 89 d8 c1 e8 14 89 ca 09 d0 8b 15 08 e0 31 c0 23 05 0c e0 31 c0 <89> 46 04 c1 e3 0c 0b 5c 24 34 21 d3 89 9f 00 00 00 c0 e9 63 01
EIP: [<c01106c2>] __set_fixmap+0x1b0/0x341 SS:ESP e021:c0321f80
 <0>Kernel panic - not syncing: Attempted to kill the idle task!
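
The "Oops: 0003" code and the low bits of the "*pte" value in the trace above
can be decoded with the standard x86 page-fault error-code and page-table flag
layout. A small Python sketch, with illustrative helper names that do not come
from any Xen or kernel tool:

```python
def decode_fault(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 1 = protection violation
        "write": bool(code & 0x2),         # 1 = fault on a write access
        "user_mode": bool(code & 0x4),     # 1 = fault raised in user mode
    }

# Low flag bits of an x86 page-table entry (PAE entries use the same low bits).
PTE_FLAGS = [(0, "present"), (1, "writable"), (2, "user"),
             (5, "accessed"), (6, "dirty")]

def decode_pte_flags(pte):
    """Return the names of the flag bits set in a page-table entry."""
    return [name for bit, name in PTE_FLAGS if pte & (1 << bit)]

print(decode_fault(0x0003))
print(decode_pte_flags(0x2ff29061))
```

For the values in this trace, the fault is a kernel-mode write to a page that
is present, and the pte has no writable bit set.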

----------------------------------------------------------------------------------------------

Output of "xm dmesg":

 Xen version 3.1.0-1 (Debian 3.1.0-0-tha6) (ralph@xxxxxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) Sat Jul 21 18:55:55 UTC 2007
 Latest ChangeSet: unavailable

(XEN) Command line: /boot/xen-3.1.0-1-i386-pae.gz
(XEN)  0000000000000000 - 000000000009c000 (usable)
(XEN)  000000000009c400 - 00000000000a0000 (reserved)
(XEN)  00000000000ce000 - 00000000000d0000 (reserved)
(XEN)  00000000000e4000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000cff60000 (usable)
(XEN)  00000000cff60000 - 00000000cff69000 (ACPI data)
(XEN)  00000000cff69000 - 00000000cff80000 (ACPI NVS)
(XEN)  00000000cff80000 - 00000000d0000000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec10000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000230000000 (usable)
(XEN) System RAM: 8190MB (8387568kB)
(XEN) Xen heap: 9MB (10004kB)
(XEN) Domain heap initialised: DMA width 32 bits
(XEN) PAE enabled, limit: 16 GB
(XEN) Processor #0 6:15 APIC version 20
(XEN) Processor #6 6:15 APIC version 20
(XEN) Processor #1 6:15 APIC version 20
(XEN) Processor #7 6:15 APIC version 20
(XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
(XEN) IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47
(XEN) Enabling APIC mode:  Flat.  Using 2 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2000.123 MHz processor.
(XEN) HVM: VMX enabled
(XEN) VMX: MSR intercept bitmap enabled
(XEN) CPU0: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz stepping 06
(XEN) Mapping cpu 0 to node 255
(XEN) Booting processor 1/6 eip 90000
(XEN) Mapping cpu 1 to node 255
(XEN) CPU1: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz stepping 06
(XEN) Booting processor 2/1 eip 90000
(XEN) Mapping cpu 2 to node 255
(XEN) CPU2: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz stepping 06
(XEN) Booting processor 3/7 eip 90000
(XEN) Mapping cpu 3 to node 255
(XEN) CPU3: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz stepping 06
(XEN) Total of 4 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) Platform timer is 1.193MHz PIT
(XEN) Brought up 4 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 32-bit, PAE, lsb
(XEN)  Dom0 kernel: 32-bit, PAE, lsb, paddr 0xc0100000 -> 0xc0396b54
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   000000007e000000->0000000080000000 (2036959 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: c0100000->c0396b54
(XEN)  Init. ramdisk: c0397000->c0ee2400
(XEN)  Phys-Mach map: c0ee3000->c16b037c
(XEN)  Start info:    c16b1000->c16b146c
(XEN)  Page tables:   c16b2000->c16c3000
(XEN)  Boot stack:    c16c3000->c16c4000
(XEN)  TOTAL:         c0000000->c1800000
(XEN)  ENTRY ADDRESS: c0100000
(XEN) Dom0 has maximum 4 VCPUs
(XEN) PIT Timer HW error: 23865
(XEN) Initrd len 0xb4b400, start at 0xc0397000
(XEN) Scrubbing Free RAM: .done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) mm.c:612:d5 Non-privileged (5) attempt to map I/O space 0024489b
(XEN) mm.c:3267:d5 ptwr_emulate: could not get_page_from_l1e()
(XEN) mm.c:636:d9 Error getting mfn 22fcce (pfn 55555555) from L1 entry 000000022fcce025 for dom9
(XEN) mm.c:3267:d9 ptwr_emulate: could not get_page_from_l1e()
(XEN) mm.c:636:d10 Error getting mfn 22fcd2 (pfn 55555555) from L1 entry 000000022fcd2025 for dom10
(XEN) mm.c:3267:d10 ptwr_emulate: could not get_page_from_l1e()
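
As a cross-check of the numbers above, the usable regions of the RAM map can
be summed and the failing L1 entries decoded. A minimal Python sketch; the
helper names are illustrative, and the 40-bit frame mask assumes the usual PAE
entry layout (frame number in bits 12-51):

```python
import re

# The three "usable" lines from the RAM map in the xm dmesg output above.
RAM_MAP = """\
(XEN)  0000000000000000 - 000000000009c000 (usable)
(XEN)  0000000000100000 - 00000000cff60000 (usable)
(XEN)  0000000100000000 - 0000000230000000 (usable)
"""

def usable_regions(text):
    """Return (start, end) byte addresses of all regions marked usable."""
    pattern = re.compile(r"([0-9a-f]{16}) - ([0-9a-f]{16}) \(usable\)")
    return [(int(a, 16), int(b, 16)) for a, b in pattern.findall(text)]

def l1e_to_mfn(entry):
    """Extract the machine frame number from a PAE L1 entry."""
    return (entry >> 12) & ((1 << 40) - 1)

regions = usable_regions(RAM_MAP)
total_kb = sum(end - start for start, end in regions) // 1024
print(total_kb)  # matches "System RAM: 8190MB (8387568kB)"

mfn = l1e_to_mfn(0x000000022FCCE025)
addr = mfn << 12
print(hex(mfn), any(start <= addr < end for start, end in regions))
```

Both failing frames (22fcce and 22fcd2) decode to machine addresses inside the
last usable region, a few MB below its upper end at 0x230000000.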

----------------------------------------------------------------------------------------------

"lspci" in dom0:

00:00.0 Host bridge: Intel Corporation 5000P Chipset Memory Controller Hub (rev b1)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 2-3 (rev b1)
00:04.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 4-5 (rev b1)
00:06.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 6-7 (rev b1)
00:08.0 System peripheral: Intel Corporation 5000 Series Chipset DMA Engine (rev b1)
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset Error Reporting Registers (rev b1)
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
00:1f.3 SMBus: Intel Corporation 631xESB/632xESB/3100 Chipset SMBus Controller (rev 09)
01:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
01:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
02:02.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E3 (rev 01)
03:00.0 PCI bridge: Intel Corporation 6702PXH PCI Express-to-PCI Bridge A (rev 09)
04:01.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
05:00.0 Ethernet controller: Intel Corporation 631xESB/632xESB DPT LAN Controller Copper (rev 01)
05:00.1 Ethernet controller: Intel Corporation 631xESB/632xESB DPT LAN Controller Copper (rev 01)
09:01.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
09:02.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)

----------------------------------------------------------------------------------------------

Output of "lsmod" in dom0:

Module                  Size  Used by
sha1                    3424  1
drbd                  176680  2
cn                      9056  1 drbd
xt_physdev              3792  1
iptable_filter          3872  1
ip_tables              13892  1 iptable_filter
x_tables               14084  2 xt_physdev,ip_tables
netloop                 7360  0
bridge                 50268  1 xt_physdev
button                  7440  0
ac                      5956  0
battery                10404  0
ipv6                  229088  32
loop                   15944  0
i2c_i801                8236  0
serio_raw               7428  0
floppy                 51684  0
rtc                    13300  0
serial_core            20288  0
i2c_core               20480  1 i2c_i801
psmouse                35880  0
pcspkr                  3840  0
shpchp                 33632  0
pci_hotplug            29472  1 shpchp
joydev                  9856  0
evdev                   9856  0
ext3                  120072  1
jbd                    53224  1 ext3
mbcache                 9124  1 ext3
dm_mirror              20048  0
dm_snapshot            16320  0
dm_mod                 51000  9 dm_mirror,dm_snapshot
ide_cd                 36832  0
cdrom                  33312  1 ide_cd
sd_mod                 19808  4
generic                 6244  0 [permanent]
usbhid                 38208  0
3w_xxxx                26176  3
scsi_mod              125160  2 sd_mod,3w_xxxx
piix                   10212  0 [permanent]
ide_core              112392  3 ide_cd,generic,piix
e1000                 110432  0
ehci_hcd               29288  0
uhci_hcd               22188  0
usbcore               114372  4 usbhid,ehci_hcd,uhci_hcd
thermal                14376  0
processor              29608  1 thermal
fan                     5572  0

----------------------------------------------------------------------------------------------

Output of "xm info":
host                   : testsystem1
release                : 2.6.18-4-xen-686
version                : #1 SMP Thu May 10 03:24:35 UTC 2007
machine                : i686
nr_cpus                : 4
nr_nodes               : 1
sockets_per_node       : 2
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 2000
hw_caps                : bfebfbff:20100000:00000000:00000140:0004e33d:00000000:00000001
total_memory           : 8190
free_memory            : 518
xen_major              : 3
xen_minor              : 1
xen_extra              : .0-1
xen_caps               : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xf5800000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
cc_compile_by          : ralph
cc_compile_domain      : debianbase.de
cc_compile_date        : Sat Jul 21 18:55:55 UTC 2007
xend_config_format     : 4
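
To compare free_memory with the memory a new domU asks for, the "xm info"
fields can be read like this. A minimal Python sketch, assuming one
"key : value" pair per line; parse_xm_info and fits_in_free_memory are
illustrative names, and the 520 MB request is taken from the "520MB LOWMEM
available" line of the crashing domU:

```python
def parse_xm_info(text):
    """Parse 'key : value' lines as printed by "xm info" into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def fits_in_free_memory(info, requested_mb):
    """True if the requested domU size is at most the reported free_memory."""
    return requested_mb <= int(info["free_memory"])

# Two fields copied from the "xm info" output above.
sample = """\
total_memory           : 8190
free_memory            : 518
"""
info = parse_xm_info(sample)
print(info["free_memory"], "MB free of", info["total_memory"], "MB")
print(fits_in_free_memory(info, 520))
```

This only compares the reported numbers; it says nothing about whether xend
balloons dom0 down to make room for a new domain.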
