xen-devel

Re: [Xen-devel] pv 2.6.31 (kernel.org) and save/migrate, domU BUG()

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] pv 2.6.31 (kernel.org) and save/migrate, domU BUG()
From: Pasi Kärkkäinen <pasik@xxxxxx>
Date: Sun, 8 Nov 2009 16:17:43 +0200
Cc: "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 08 Nov 2009 06:18:07 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <373a54d1-3dec-4185-b1ca-0363e14329b4@default>
References: <20091107110905.GB1434@xxxxxxxxxxx> <373a54d1-3dec-4185-b1ca-0363e14329b4@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.13 (2006-08-11)
On Sat, Nov 07, 2009 at 07:32:49AM -0800, Dan Magenheimer wrote:
> > > Well, first, I got 2.6.31.5 to boot in a PV guest in another
> > > machine and it fails to save also.  Are you able to save
> > > 2.6.31{,.5} successfully?  On latest xen-unstable?
> > > (NOTE: Yes, I do have CONFIG_XEN_SAVE_RESTORE=y... don't
> > > know if that is important.)
> > 
> > I'll have to try it later today..
> 
> Let me know.
> 

Ok. I just tried with a Fedora 12 (rawhide) PV guest. I was able to 
"xm save" and "xm restore" it without problems. 

But I noticed there was a BUG printed on the guest console:
http://pasik.reaktio.net/xen/debug/dmesg-2.6.31.5-122.fc12.x86_64-saverestore.txt

BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 1052, name: kstop/0
Pid: 1052, comm: kstop/0 Not tainted 2.6.31.5-122.fc12.x86_64 #1
Call Trace:
 [<ffffffff8104021f>] __might_sleep+0xe6/0xe8
 [<ffffffff81419c84>] mutex_lock+0x22/0x4e
 [<ffffffff812afdce>] dpm_resume_noirq+0x21/0x11f
 [<ffffffff81272b05>] xen_suspend+0xca/0xd1
 [<ffffffff8108c172>] stop_cpu+0x8c/0xd2
 [<ffffffff8106350c>] worker_thread+0x18a/0x224
 [<ffffffff81067ae7>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff8141ab29>] ? _spin_unlock_irqrestore+0x19/0x1b
 [<ffffffff81063382>] ? worker_thread+0x0/0x224
 [<ffffffff81067765>] kthread+0x91/0x99
 [<ffffffff81012daa>] child_rip+0xa/0x20
 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81012da0>] ? child_rip+0x0/0x20


More information about my setup:

Host/dom0: Fedora 12 (latest rawhide) with the included Xen 3.4.1-5 and
a custom 2.6.31.5 x86_64 pv_ops dom0 kernel (a couple of days old).

Guest/domU: Fedora 12 (latest rawhide) with the included/default
2.6.31.5-122.fc12.x86_64 kernel.

> > > (On the machine I couldn't boot 2.6.31.5 as a PV guest, there
> > > was absolutely no console output.  However, I think tools
> > > are out-of-date on that machine so ignore that.)
> > 
> > Did you have "console=hvc0 earlyprintk=xen" in the domU kernel
> > parameters?
> 
> No, but that didn't work either.
> 

OK, so it is crashing very early in boot.
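
For reference, a minimal sketch of how those kernel parameters are usually passed to a PV guest started with xm. It assumes the guest kernel is loaded directly from the config file (not through a bootloader that supplies its own command line). xm guest config files use Python syntax, and "extra" is appended to the guest kernel command line:

```python
# Fragment of an xm guest config file (Python syntax).
# Assumption: the rest of the config (kernel, memory, disk, ...) already exists.
# "extra" is appended to the guest kernel command line.
extra = "console=hvc0 earlyprintk=xen"
```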

> > You might also set on_crash=preserve in the Xen guest config file,
> > and then run this once the PV guest has crashed:
> > 
> > /usr/lib/xen/bin/xenctx -s System.map-domUkernelversion <domid>
> > 
> > (on a 64-bit host the xenctx binary might be under /usr/lib64/)
> > 
> > to get a stack trace..
> 
> Very interesting and useful!  I was completely unaware of
> xenctx and could have used it many times in tmem development!
> 
> The results explain why I can get it to run on
> one machine (an older laptop) and not run on another
> machine (a Nehalem system)... looks like this is maybe
> related to the cpuid-extended-topology-leaf bug that Jeremy
> sent a fix for upstream recently.
> 

Did you try with that patch applied? 
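
The xenctx step quoted above can be sketched as a small helper that builds the command line. This is only an illustration: the function name xenctx_command is made up here, and the exact lib64 path is an assumption based on "might be under /usr/lib64/". It expects the guest config to contain on_crash = "preserve" so that the crashed domain is still around to inspect.

```python
import os

# Candidate locations for xenctx. The first is the path given in the message;
# the second is an assumed lib64 equivalent for 64-bit hosts.
CANDIDATES = ("/usr/lib/xen/bin/xenctx", "/usr/lib64/xen/bin/xenctx")

def xenctx_command(system_map, domid, exists=os.path.exists):
    """Build the argv for: xenctx -s <System.map> <domid>.

    `exists` can be replaced for testing; by default it checks the filesystem.
    """
    for path in CANDIDATES:
        if exists(path):
            return [path, "-s", system_map, str(domid)]
    raise FileNotFoundError("xenctx not found in: " + ", ".join(CANDIDATES))
```

The returned list can be handed to subprocess.run() on the dom0, with the System.map file matching the domU kernel version.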

-- Pasi

> cs:eip: e019:c040342d xen_cpuid+0x46 
> flags: 00001206 i nz p
> ss:esp: e021:c0779ee4
> eax: 00000001 ebx: 00000002   ecx: 00000100   edx: 00000001
> esi: c0779f1c edi: c0779f18   ebp: c0779f24
>  ds:     e021  es:     e021    fs:     00d8    gs:     0000
> Code (instr addr c040342d)
> 24 04 8b 15 a4 02 7c c0 89 54 24 08 8b 0e 0f 0b 78 65 6e 0f a2 <89> 45 00 8b 04 24 89 18 89 0e 89
> 
> 
> Stack:
>  c0779f20 ffffffff ffffffff c07c0360 c0779f18 c0779f1c c0779f20 c066fd0f
>  c0779f18 c0779f24 00000002 16aee301 00000001 00000001 16aee301 00000002
>  0000000b c07c03cc c07c0360 c07c0360 c07c03d8 c0670ed8 c0779f58 00000001
>  c07c0360 c0779f60 c066fe6a c0779f60 c0779f60 00000003 00000001 00000000
> 
> Call Trace:
>   [<c040342d>] xen_cpuid+0x46  <--
>   [<c066fd0f>] detect_extended_topology+0xae 
>   [<c0670ed8>] init_intel+0x140 
>   [<c066fe6a>] init_scattered_cpuid_features+0x82 
>   [<c06705e2>] identify_cpu+0x22d 
>   [<c040584c>] xen_force_evtchn_callback+0xc 
>   [<c0405e78>] check_events+0x8 
>   [<c07c9dec>] identify_boot_cpu+0xa 
>   [<c07c9e9a>] check_bugs+0x8 
>   [<c07c27bd>] start_kernel+0x2a0 
>   [<c07c5206>] xen_start_kernel+0x340 



