WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: A race condition introduced by changeset 15175: Re-init

To: "Cui, Dexuan" <dexuan.cui@xxxxxxxxx>, "'xen-devel@xxxxxxxxxxxxxxxxxxx'" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: A race condition introduced by changeset 15175: Re-init hypercall stubs page after HVM save/restore
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Tue, 07 Oct 2008 11:16:23 +0100
Cc:
Delivery-date: Tue, 07 Oct 2008 03:16:47 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <F4AE3CDE26E0164D9E990A34F2D4E0DF08A5F0CFA1@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AckoZJqd7Bw334uKTW2HgFcuLGFhCgAASUbm
Thread-topic: A race condition introduced by changeset 15175: Re-init hypercall stubs page after HVM save/restore
User-agent: Microsoft-Entourage/11.4.0.080122
Hi Dexuan,

Are you really sure that this is the problem? The suspend_lock was
introduced specifically to solve this problem. Note that the BSP takes this
lock before messing with the hypercall page.

 -- Keir

On 7/10/08 11:08, "Cui, Dexuan" <dexuan.cui@xxxxxxxxx> wrote:

> For an SMP Linux HVM guest with PV drivers inserted, when we do save/restore
> (or LiveMigration) for the guest, it might panic after it's restored.
> The panic point is inside ap_suspend():
>  ....
>     while (info->do_spin) {
>         cpu_relax();
>         read_lock(&suspend_lock);
>         HYPERVISOR_yield();      ----> guest might panic on the invocation of
> this function.
>         read_unlock(&suspend_lock);
>     }
> ...
> 
> The root cause is: ap might be invoking the hypercall while bsp is asking the
> hypervisor to re-initialize the hypercall page when the guest has been just
> restored!
> 
> What's the purpose of re-initializing the hypercall page here? To improve the
> compatibility in the case the src/target hosts have different hypercall stub
> codes?
> 
> PS, I'm using c/s 18353 to debug the issue. (the latest xen-unstable.hg's S/R
> and L/M are broken by 18383.
> 
> Thanks,
> -- Dexuan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel