WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 

xen-ppc-devel

[XenPPC] [RFC] 'xm restore' following boot

'xm restore' immediately following boot usually wedges the cpu.
However, xm save followed by xm restore works fine (even when
guest domain and htab are relocated to new memory areas).

^AAA shows the following, with .plpar_hcall_norets  @ c00000000003af78
                          and  .HYPERVISOR_sched_op @ c00000000004415c:
(XEN) *** Dumping CPU3 state: ***
(XEN) ----[ Xen-3.0-unstable     ]----
(XEN) CPU: 00000003   DOMID: 00000001
(XEN) pc c00000000003af88 msr 8000000000009032
(XEN) lr c000000000044210 ctr c000000000044238
(XEN) srr0 ffffffffffffffff srr1 ffffffffffffffff
(XEN) r00: 0000000024555548 c00000000065bcb0 c000000000656630 0000000000000000
(XEN) r04: 0000000000000001 0000000000000000 0000000024555542 c00000000000fc24
(XEN) r08: 00000000ecf515a8 c000000000044238 0000000000989680 c0000000000441a4
(XEN) r12: 0000000001a9f9f8 c00000000052e300 5555555555555555 5555555555555555
(XEN) r16: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r20: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r24: 5555555555555555 5555555555555555 4000000000000000 c000000000000000
(XEN) r28: 0000000000000000 0000000000000010 c00000000053d3c8 0000000000000001
(XEN) reprogram_timer[00] Timeout in the past 0x0000004332DBA479 > 0x00000042C2424DF3
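
The two values in the reprogram_timer message can be compared directly. This is a minimal sketch, assuming both are Xen system-time values in nanoseconds and that the first is the current time and the second the requested timeout. The message itself does not say which is which, so treat both as assumptions:

```python
# Values copied from the reprogram_timer line in the dump above.
FIRST = 0x0000004332DBA479   # assumed: current time when the check ran
SECOND = 0x00000042C2424DF3  # assumed: requested timeout

def gap_seconds(first: int, second: int, units_per_second: int = 1_000_000_000) -> float:
    """Distance between the two values, assuming nanosecond units."""
    return (first - second) / units_per_second

if __name__ == "__main__":
    gap = FIRST - SECOND
    print(f"gap: {gap:#x} units, about {gap_seconds(FIRST, SECOND):.3f} s if the units are ns")
```

If the nanosecond assumption holds, the timeout was a little under two seconds in the past when the check ran.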


Here is typical console output, with debug prints and exceptions.
If 'xm restore' is run several times, it will often start working,
though the exceptions still occur (the user domain has a ramdisk and networking).
At the bottom is a disassembly of the code at the addresses named in a couple of the exceptions.


1. 'xm restore' following xm save:

cso84:~ # xm console 6
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6200120000000042
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315899
__sti()
xencons_resume() 
xenbus_resume()
smp_resume()
mfdec: 63024
returning
netfront: device eth0 has copying receive path.

[user@bringup /]# 


2. reboot with 'xm restore' that worked 1st time:

cso84:~ # xm console 1
mfdec: -14
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315924
__sti()
xencons_resume() 
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
smp_resume()
mfdec: 90178
returning
netfront: device eth0 has copying receive path.

[user@bringup /]# 


3. reboot with typical wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315903
__sti()
xencons_resume() 
xenbus_resume()
smp_resume()
mfdec: 14218880
returning
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
cso84:~ # 
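
One visible difference between the runs is the last mfdec value: 63024 and 90178 where the restore completed, 14218880 in the wedge above. The sketch below converts these to time. It assumes that mfdec prints the decrementer register in timebase ticks and that TIMEBASE_FREQ is in Hz; neither is stated in the output:

```python
TIMEBASE_FREQ = 71592390  # from the console output; assumed to be ticks per second

def ticks_to_ms(ticks: int, freq: int = TIMEBASE_FREQ) -> float:
    """Convert a decrementer value in timebase ticks to milliseconds."""
    return ticks * 1000.0 / freq

# Final mfdec value printed in examples 1, 2 and 3.
SAMPLES = {
    "1. restore after save": 63024,
    "2. restore after reboot, worked": 90178,
    "3. restore after reboot, wedged": 14218880,
}

if __name__ == "__main__":
    for label, ticks in SAMPLES.items():
        print(f"{label}: {ticks} ticks = {ticks_to_ms(ticks):.2f} ms")
```

Under those assumptions the decrementer is about a millisecond from expiry in the two working runs and close to 200 ms from expiry in the wedged run.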


4. reboot with another wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315908
__sti()
xencons_resume() 
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C000000001AA3650] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C000000001AA3700] [C00000000008956C] .softlockup_tick+0x100/0x128
[C000000001AA37C0] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C000000001AA3840] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C000000001AA3970] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_event_channel_op+0x34/0x50
[C000000001AA3C60] [C0000000000442E4] .HYPERVISOR_event_channel_op+0x1c/0x50 (unreliable)
[C000000001AA3CF0] [C0000000002BD1F0] .xb_read+0x190/0x2ac
[C000000001AA3E30] [C0000000002BEFD4] .xenbus_thread+0x84/0x278
[C000000001AA3EE0] [C000000000074D08] .kthread+0x158/0x1a8
[C000000001AA3F90] [C000000000028310] .kernel_thread+0x4c/0x68
cso84:~ # 
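
The exception numbers in these traces can be mapped back to PowerPC interrupt vectors. This is a minimal sketch, assuming the usual ppc64 Linux convention that the low bit of the printed trap value is a flag rather than part of the vector. The table covers only the two vectors seen here, and the names agree with the decrementer_common and hardware_interrupt_entry frames in the traces:

```python
# Only the vectors that appear in this mail; not a complete table.
VECTORS = {
    0x500: "external interrupt",
    0x900: "decrementer",
}

def decode_trap(text: str) -> tuple[int, str]:
    """Map a trap value such as '901' to (vector, name)."""
    value = int(text, 16)
    vector = value & ~0xF  # drop the low flag bits (assumed convention)
    return vector, VECTORS.get(vector, "unknown")

if __name__ == "__main__":
    for trap in ("501", "901"):
        vector, name = decode_trap(trap)
        print(f"Exception {trap}: vector {vector:#x} ({name})")
```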



Disassembly around the two exception addresses in example 3:

--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c : c000000000089d2c
0:mon> di c000000000089d20
c000000000089d20  7c0000a6      mfmsr   r0
c000000000089d24  60008000      ori     r0,r0,32768
c000000000089d28  7c010164      mtmsrd  r0,1
c000000000089d2c  7c7d07b4      extsw   r29,r3
c000000000089d30  48000010      b       c000000000089d40        # .handle_IRQ_event+0x60/0x13c
c000000000089d34  ebff0028      ld      r31,40(r31)
c000000000089d38  2fbf0000      cmpdi   cr7,r31,0
c000000000089d3c  419e005c      beq     cr7,c000000000089d98    # .handle_IRQ_event+0xb8/0x13c


--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c : c00000000003af88
0:mon> di c00000000003af78
c00000000003af78  7c421378      mr      r2,r2
c00000000003af7c  7c000026      mfcr    r0
c00000000003af80  90010008      stw     r0,8(r1)
c00000000003af84  44000022      svca    8
c00000000003af88  80010008      lwz     r0,8(r1)
c00000000003af8c  7c0ff120      mtcr    r0
c00000000003af90  4e800020      blr
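
As a cross-check, the symbol addresses given with the CPU dump plus the offsets in the traces reproduce the pc and lr values in that dump. This small sketch uses only numbers from this mail; the helper name is made up for illustration:

```python
# Symbol addresses as given next to the CPU dump.
SYMBOLS = {
    ".plpar_hcall_norets": 0xC00000000003AF78,
    ".HYPERVISOR_sched_op": 0xC00000000004415C,
}

def resolve(symbol_offset: str) -> int:
    """Turn a trace entry like '.sym+0x10/0x1c' into an absolute address."""
    name, rest = symbol_offset.split("+")
    offset = int(rest.split("/")[0], 16)
    return SYMBOLS[name] + offset

if __name__ == "__main__":
    pc = resolve(".plpar_hcall_norets+0x10/0x1c")
    lr = resolve(".HYPERVISOR_sched_op+0xb4/0x10c")
    print(f"pc {pc:016x}")
    print(f"lr {lr:016x}")
```

Both results match the dump (pc c00000000003af88, lr c000000000044210), and the pc is the instruction following `svca 8` in the listing above.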
