This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [PATCH] stack overflow during pv-guest restore

To: Jan Beulich <jbeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] stack overflow during pv-guest restore
From: Don Dutile <ddutile@xxxxxxxxxx>
Date: Fri, 01 Feb 2008 09:32:56 -0500
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 01 Feb 2008 06:33:22 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <47A2F567.76E4.0078.0@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <47A22A15.8060403@xxxxxxxxxx> <47A2F567.76E4.0078.0@xxxxxxxxxx>
Reply-to: ddutile@xxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird (X11/20070718)
Jan (& Keir)

That's what I did the first time (used a simple static) I generated
a fix & tested it, because a lock is taken out (long) before __cpu_up() is 
to ensure only one cpu is in that code path anyhow.

*But* as I explained below, a modified version of cpu_initialize_context()
was already upstream and it went with a kzalloc-kfree implementation, so
maybe you want to ask Jeremy why he didn't go with a simple static as well.

So, the bottom line: the fix tries to follow upstream... typically that's
what we do (or change upstream, so all of us can follow it).

- Don

Jan Beulich wrote:
I think this can be fixed with a much smaller change: Since the only caller
of cpu_initialize_context() is __cpu_up(), and since __cpu_up() invocation
is serialized, the variable in question could simply be static.


Don Dutile <ddutile@xxxxxxxxxx> 31.01.08 21:05 >>>

When secondary cpus are initialized during an i386 pv-guest restore
(due to save/restore or live migration), and the guest has
a load that generates a fair number of interrupts (e.g., parallel kernel make),
a stack overflow can occur because cpu_initialize_context() has
a 2800 byte structure it declares on its stack.  linux-i386 has 4K stacks, by 
Using 2800 bytes out of 4K by a single function in a call list isn't nice;
add the beginning of interrupt handling at just the right point, before the
switch to the interrupt stack is made... and... the scale tips...

Simple fix: malloc & free structure as needed.

Would fail save/restore testing of an i386-guest running a parallel, kernel 
make after
50->100 save/restores;  with the fix, haven't seen it fail after 650 

Note: this is a basic port of this function in Jeremy Fitzharinge's Xen 
implementation of pv-ops
       in upstream Linux, part of 15/32 patch, Xen SMP guest support.

Original patch done on rhel5; did a simple diff & merge to xen-unstable's 
of smpboot.c to generate the attached patch, so it cleanly applies; but
haven't built/run/tested the xen-unstable version.

Signed-off-by: Donald Dutile <ddutile@xxxxxxxxxx>

Xen-devel mailing list

<Prev in Thread] Current Thread [Next in Thread>