This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-bugs] [Bug 729] New: domU migration causes fatal page fault crash

To: xen-bugs@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-bugs] [Bug 729] New: domU migration causes fatal page fault crash
From: bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
Date: Tue, 08 Aug 2006 10:26:03 -0700
Delivery-date: Tue, 08 Aug 2006 10:26:58 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-bugs-request@lists.xensource.com?subject=help>
List-id: Xen Bugzilla <xen-bugs.lists.xensource.com>
List-post: <mailto:xen-bugs@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-bugs>, <mailto:xen-bugs-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-bugs>, <mailto:xen-bugs-request@lists.xensource.com?subject=unsubscribe>
Reply-to: bugs@xxxxxxxxxxxxxxxxxx
Sender: xen-bugs-bounces@xxxxxxxxxxxxxxxxxxx

           Summary: domU migration causes fatal page fault crash
           Product: Xen
           Version: unstable
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Unspecified
        AssignedTo: xen-bugs@xxxxxxxxxxxxxxxxxxx
        ReportedBy: krysans@xxxxxxxxxx

Running xm-test with xen-unstable changeset 10949 on a Unisys ES7000/one
(x86_64, 32 physical processors, 128 GB RAM) results in a fatal page fault
crash.  dom0 was ballooned down to 5GB before running xm-test.  The tail of
the serial console output shows:
(XEN)    00000800f0000001 0000000000000000 0000000080000003 fffffffffffffff3
(XEN)    00007fffffc9dff0 0000000000305000 000000000000000a 000000000000000a
(XEN)    ffff830000fe7d88 ffff830000106b3c ffff830000f15ff8 ffff832007da0ff8
(XEN)    00000000000001ff 0000000000000200 0300000100000019 0000000b04680094
(XEN)    0000000000501140 0000000000004800 00007fffffc9e020 000000000010e3a8
(XEN)    0000000000000003 000000000000000a 0000000000000001 0000000000003b00
(XEN)    0000000000000000 0000000000000000 20c49ba500000001 0000000000000000
(XEN)    0000000000000000 00002b66043fb436 00002b66043f4748 ffff830000111b63
(XEN)    0000000000000001 ffff8284041e35d0 0000000000000001 0000000000000001
(XEN)    0000000000800067 0000000000000296 ffff830000efe080 ffff830000f1e080
(XEN)    0000000000000015 ffff8300001089f0 ffff830000f1e080 ffff8800e24f7d01
(XEN) Xen call trace:
(XEN)    [<ffff83000013be35>] free_shadow_page+0x165/0xba0
(XEN)    [<ffff83000013d34f>] free_shadow_pages+0x23f/0x2d0
(XEN)    [<ffff83000013de3b>] shadow_mode_control+0x18b/0x500
(XEN)    [<ffff83000011b791>] arch_do_dom0_op+0x101/0xc50
(XEN)    [<ffff830000120a1a>] apic_timer_interrupt+0x2a/0x30
(XEN)    [<ffff830000120a1a>] apic_timer_interrupt+0x2a/0x30
(XEN)    [<ffff830000106b3c>] do_dom0_op+0xe1c/0xf10
(XEN)    [<ffff830000111b63>] csched_vcpu_wake+0x213/0x220
(XEN)    [<ffff8300001089f0>] evtchn_set_pending+0x80/0x110
(XEN)    [<ffff830000108cb4>] evtchn_send+0xf4/0x110
(XEN)    [<ffff830000109531>] do_event_channel_op+0x861/0xc90
(XEN)    [<ffff8300001696cb>] do_iret+0x7b/0x140
(XEN)    [<ffff830000168702>] syscall_enter+0x62/0x67
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4 = 0000000056e7b067 0000000001f8b67b
(XEN)   L3 = 00000001e88b5067 0000000001f8ec0a
(XEN)    L2 = 0000000000000000 ffffffffffffffff
(XEN) ****************************************
(XEN) Panic on CPU 4:
(XEN) [error_code=0000]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN) Reboot in five seconds...
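
The pagetable walk above can be read mechanically. The following is a minimal Python sketch, not part of Xen or xm-test, that decodes the low flag bits of the first hex column on each walk line, assuming the standard x86-64 page-table entry layout (bit 0 = present, bit 1 = writable, and so on). The helper names `decode_walk` and `first_not_present` are hypothetical, and the second hex column is deliberately left uninterpreted.

```python
import re

# Architectural low flag bits of an x86-64 page-table entry.
# Bit 7 means "large page" at L2/L3 and PAT at L1, hence the combined name.
PTE_FLAGS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    3: "PWT",
    4: "PCD",
    5: "ACCESSED",
    6: "DIRTY",
    7: "PSE/PAT",
}

# Matches lines such as "(XEN)  L4 = 0000000056e7b067 0000000001f8b67b".
WALK_RE = re.compile(r"\(XEN\)\s+(L[1-4]) = ([0-9a-f]{16}) ([0-9a-f]{16})")


def decode_walk(log_text):
    """Return a list of (level, entry, flag_names) for each walk line."""
    result = []
    for match in WALK_RE.finditer(log_text):
        level = match.group(1)
        entry = int(match.group(2), 16)
        flags = [name for bit, name in PTE_FLAGS.items() if entry & (1 << bit)]
        result.append((level, entry, flags))
    return result


def first_not_present(walk):
    """Return the first level whose entry lacks the PRESENT bit, or None."""
    for level, _entry, flags in walk:
        if "PRESENT" not in flags:
            return level
    return None


# The walk lines copied from the console output above.
SAMPLE = """\
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4 = 0000000056e7b067 0000000001f8b67b
(XEN)   L3 = 00000001e88b5067 0000000001f8ec0a
(XEN)    L2 = 0000000000000000 ffffffffffffffff
"""

if __name__ == "__main__":
    walk = decode_walk(SAMPLE)
    for level, entry, flags in walk:
        print(level, hex(entry), flags)
    print("walk stops at:", first_not_present(walk))
```

Under that assumption, the L4 and L3 entries are present and the L2 entry is zero, which is consistent with a not-present fault on address 0 (error_code=0000).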

We suspected that the 5GB we allotted to dom0 might not be enough for a host
with 128GB RAM, so after the system rebooted we ballooned dom0 down to 10GB
and attempted to migrate a 256MB domU.  This caused the domU to crash and
dom0 to hang.  The tail of the serial console output showed:
(XEN) Xen stack trace from rsp=ffff830000f77d98:
(XEN)    00000002b1f70f0f 0000000005568588 00000000002229bd ffff828405568588
(XEN)    ffff830000f77f28 0000000080000000 ffff830000ec0080 0000000080000000
(XEN)    000000000170c728 00000000000003c0 ffff830000f77f28 ffff83000013e3ef
(XEN)    00000000000003e8 ffff8284399f1e40 0000000090000001 ffff8284399f1e40
(XEN)    ffff8300001db500 0000000080000000 ffff8300001842c8 ffff830000125596
(XEN)    ffff830000e9c080 0000000080000001 0000000080000000 ffff8284399f1e40
(XEN)    0fffffff00000000 ffff830000125844 ffff8300001db500 000000000170c728
(XEN)    ffff830000ec0080 ffff8284399f1e40 ffff8300001db500 ffff830000e9c080
(XEN)    0000000000000001 ffff83000012a1e6 00007ff00170c728 0000000000000000
(XEN)    0000001400000000 ffff880000639e28 0000000000000004 000000000170c728
(XEN)    00000002ffffffff 000000000012f095 00007fffffb387f8 ffff830000e9c080
(XEN)    ffff88000fe73760 ffff88000e5f08c0 0000000000000001 0000000000000001
(XEN)    00007fffffb387f8 ffff830000168702 00007fffffb387f8 0000000000000001
(XEN)    0000000000000001 ffff88000e5f08c0 ffff88000fe73760 ffff88000e5f08c0
(XEN)    0000000000000246 0000000000007ff0 0000000000000001 ffff88000e5f08c0
(XEN)    000000000000001a ffffffff8010734a 0000000000000000 0000000000000001
(XEN)    ffff880000639e28 0000010000000000 ffffffff8010734a 000000000000e033
(XEN)    0000000000000246 ffff880000639e10 000000000000e02b 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000014
(XEN)    ffff830000e9c080
(XEN) Xen call trace:
(XEN)    [<ffff83000013be35>] free_shadow_page+0x165/0xba0
(XEN)    [<ffff83000013e3ef>] remove_shadow+0x1cf/0x210
(XEN)    [<ffff830000125596>] free_page_type+0x1b6/0x3d0
(XEN)    [<ffff830000125844>] put_page_type+0x94/0xf0
(XEN)    [<ffff83000012a1e6>] do_mmuext_op+0x356/0x820
(XEN)    [<ffff830000168702>] syscall_enter+0x62/0x67
(XEN) Pagetable walk from ffffffffffffffff:
(XEN)  L4 = 000000075a209067 00000000000101fc
(XEN)   L3 = 000000081ef06067 000000000000f606
(XEN)    L2 = 0000000000000000 ffffffffffffffff
(XEN) ****************************************
(XEN) Panic on CPU 20:
(XEN) [error_code=0000]
(XEN) Faulting linear address: ffffffffffffffff
(XEN) ****************************************
(XEN) Reboot in five seconds...
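
Both hypervisor panics above list free_shadow_page+0x165 as the innermost frame, although they are reached through different callers. The following is a minimal Python sketch, not part of Xen or xm-test, for extracting the frames from a "Xen call trace" section so that two console logs can be compared. The helper names `parse_frames` and `same_fault_site` are hypothetical, and each sample is abbreviated to the first two frames of the corresponding trace.

```python
import re

# Matches lines such as
# "(XEN)    [<ffff83000013be35>] free_shadow_page+0x165/0xba0".
FRAME_RE = re.compile(
    r"\(XEN\)\s+\[<([0-9a-f]{16})>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(log_text):
    """Return a list of (address, symbol, offset) tuples in trace order."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        address = int(match.group(1), 16)
        symbol = match.group(2)
        offset = int(match.group(3), 16)
        frames.append((address, symbol, offset))
    return frames


def same_fault_site(trace_a, trace_b):
    """True if both traces share the same innermost (first) frame."""
    return bool(trace_a) and bool(trace_b) and trace_a[0] == trace_b[0]


# First two frames of the first panic (dom0 ballooned to 5GB).
FIRST = """\
(XEN) Xen call trace:
(XEN)    [<ffff83000013be35>] free_shadow_page+0x165/0xba0
(XEN)    [<ffff83000013d34f>] free_shadow_pages+0x23f/0x2d0
"""

# First two frames of the second panic (dom0 ballooned to 10GB).
SECOND = """\
(XEN) Xen call trace:
(XEN)    [<ffff83000013be35>] free_shadow_page+0x165/0xba0
(XEN)    [<ffff83000013e3ef>] remove_shadow+0x1cf/0x210
"""

if __name__ == "__main__":
    a = parse_frames(FIRST)
    b = parse_frames(SECOND)
    print("innermost frames match:", same_fault_site(a, b))
    print("callers:", a[1][1], "vs", b[1][1])
```

Comparing only the innermost frame is a deliberate simplification: it shows that both panics fault at the same instruction without asserting anything about why the callers differ.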

Next, after another system reboot, we attempted to migrate the 256MB domU
without ballooning down dom0, allowing it to consume most of the 128GB host
memory.  This caused the domU to crash and dom0 to hang.  The tail of the
serial console output showed:

ACPI: PCI Interrupt 0000:10:01.1[B] -> GSI 73 (level, low) -> IRQ 20
e1000: 0000:10:01.1: e1000_probe: (PCI-X:133MHz:64-bit) 08:00:0b:1e:1a:15
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@xxxxxxxxxx
ACPI: Power Button (FF) [PWRF]
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
(XEN) mtrr: type mismatch for f6000000,400000 old: uncachable new:
(XEN) mtrr: type mismatch for f6000000,400000 old: uncachable new:
(XEN) Couldn't alloc shadow page! dom1 count=169
(XEN) Shadow table counts: l1=0 l2=0 hl2=0 snapshot=0
(XEN) domain_crash_sync called from shadow.c:445
(XEN) Domain 1 (vcpu#1) crashed on cpu#14:
(XEN) ----[ Xen-3.0-unstable    Not tainted ]----
(XEN) CPU:    14
(XEN) RIP:    e033:[<00002b081a4ca6c1>]
(XEN) RFLAGS: 0000000000010246   CONTEXT: guest
(XEN) rax: 00002b081a6e27b8   rbx: 000000000040b160   rcx: 0000000041ee8f60
(XEN) rdx: 0000000000910a58   rsi: 00000000091c4165   rdi: 00000000004109d8
(XEN) rbp: 0000000000000596   rsp: 0000000041ee8df0   r8:  00002b081a4dd2b0
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 00002b081a4f5000   r14: 0000000000000000
(XEN) r15: 0000000000402b50   cr0: 000000008005003b   cr3: 0000000058da2000
(XEN) ds: 0000   es: 0000   fs: 0063   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=0000000041ee8df0:
(XEN)    0000000041ee8e60 0000000000000001 0000000041ee8f60 000000000040b160
(XEN)    00000000091c4165 00000000004109d8 0000000000000013 000000000040bd90
(XEN)    0000000000414592 00000000004b1670 0000000000000000 0000000000910a58
(XEN)    00000000000001af 00002b081ab9da93 0000000100000000 00002b081a4dd348
(XEN)    0000000041ee8fa0 0000000041ee8f60 00000000091c4165 0000000000000000
(XEN)    0000000000000000 00002b081a4caad7 0000000000000000 0000000000000001
(XEN)    0000000000000000 00002aaa00000001 00002b081aba2518 0000000000000001
(XEN)    00002b081ab9cdb9 0000000000000000 0000000041ee8fb0 0000000000000000
(XEN)    0000000141ee8ff0 00002b081a4dd348 0000000041ee8fd0 00002b081a4dd000
(XEN)    00000000004109d8 00002b081af8d27d 0000000000000b07 00000000004b6ed7
(XEN)    0000000000000000 00000000004b11ff 0000000000000401 0000000000000000
(XEN)    0000000000000102 00000000004b9894 0000000000000000 0000000000000000
(XEN)    0000000000000401 000000000069a888 00002aaaaaacd5c0 000000000049ae60
(XEN)    00002aaaaaad78c0 0000000041ee9940 0000000000000000 00002b081a4ce0e5
(XEN)    ff0a00a300000001 0000000000000000 00002aaa00000000 0000000041ee9100
(XEN)    000000000040b160 00002aaaaaacd5c0 0000000000b99f60 00002b081a4d3542
(XEN)    0000000000000000 0000000041ee9940 0000000041ee9940 000000000048bee0
(XEN)    000000000070e9b0 000000000069bd28 0000000000000c1c 00002b081a4dd000
(XEN)    0000000000000164 000000000048bed3 00002aaaaaacd5c0 000000000046a477
(XEN)    000000000049ae60 0000000041ee908c 0000000000b99f60 000000000046b608

Next we rebooted the host with only 32GB of host RAM and did not balloon down
dom0; the migration of the 256MB domU was successful.

