This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] RE: what could cause crash on restore from save

To: "Paul Durrant" <Paul.Durrant@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: what could cause crash on restore from save
From: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
Date: Mon, 24 Jan 2011 20:01:01 +1100
In-reply-to: <291EDFCB1E9E224A99088639C47620228CFFAD73AB@xxxxxxxxxxxxxxxxxxxxxxxxx>
References: <AEC6C66638C05B468B556EA548C1A77D01BB91E0@trantor> <291EDFCB1E9E224A99088639C47620228CFFAD73AB@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Does your VM have multiple vCPUs?

Just the one in my minimal test case.

> Did you corral them into some known code to
> make sure everything was in a known state?

For MP, yes. CPU0 waits until all others are spinning at HIGH_LEVEL
before making the suspend hypercall. This all used to work just fine but
it's been a while since I tested it.

> How 'shortly' after return from suspend are things dying?

Pretty much as soon as usermode code gets a chance to execute. Sometimes
vbd is part way through initialisation, but most of the time it's well
before that.

I've cut things down so that all my code is left in a stalled state
(e.g. set things up but turn off the Xen interrupt so no requests are
ever answered) and it still hangs, so I'm pretty sure it's not something
I'm doing on resume. Either it's something I'm doing on suspend, or it's
something Xen is doing to my memory.

I've also tried doing everything _except_ making the hypercall, and that
all works fine (e.g. xenbus gets shut down and then resumed, etc.).

The Windows debugger doesn't give me anything particularly useful, and
at least one thread ends up executing at an address where no known code
lives, so I'm wondering if I'm doing something that is preventing memory
from being restored correctly. I'm trying to build some tests for that now.
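
One possible shape for such a test, as a minimal sketch only: fill a buffer
with a position-dependent pattern before the suspend hypercall and check it
again after resume. The names fill_pattern, first_mismatch and pattern_word
are hypothetical helpers made up for illustration, not GPLPV or Xen code, and
how the buffer is allocated in the driver is left open.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the value expected at word index i for a given seed.
 * The value depends on the index, so a page that comes back at the wrong
 * offset shows up as a mismatch, not only a page that comes back zeroed. */
static uint32_t pattern_word(uint32_t seed, size_t i)
{
    return seed + (uint32_t)i * 2654435761u;
}

/* Call before suspend: stamp the whole buffer with the pattern. */
void fill_pattern(uint32_t *buf, size_t nwords, uint32_t seed)
{
    for (size_t i = 0; i < nwords; i++)
        buf[i] = pattern_word(seed, i);
}

/* Call after resume: return the index of the first word that no longer
 * matches the pattern, or -1 if the buffer is intact. */
long first_mismatch(const uint32_t *buf, size_t nwords, uint32_t seed)
{
    for (size_t i = 0; i < nwords; i++)
        if (buf[i] != pattern_word(seed, i))
            return (long)i;
    return -1;
}
```

Dividing a mismatching word index by the number of words per page would give
the affected page within the buffer, which might help narrow down whether
particular frames are not being restored.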


>   Paul
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
> > bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of James Harper
> > Sent: 23 January 2011 21:47
> > To: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: [Xen-devel] what could cause crash on restore from save
> >
> > GPLPV is crashing very shortly after restoring from a save. The bug
> > check code is 0x7F (0xd, 0, 0, 0) which is described as "some other
> > exception". This doesn't seem to happen if I omit the actual suspend
> > hypercall, so I'm wondering if I'm not accounting for something
> > somewhere.
> >
> > Any suggestions?
> >
> > Thanks
> >
> > James
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > http://lists.xensource.com/xen-devel
