WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 


To: Ian Campbell <ian.campbell@xxxxxxxxxx>
Subject: LAST_CHECKPOINT and save/restore/migrate compatibility (was Re: [Xen-devel] [PATCH 3 of 4] libxc: provide notification of final checkpoint to restore end)
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Fri, 1 Apr 2011 12:02:28 +0100
Cc: Shriram Rajagopalan <rshriram@xxxxxxxxx>, Brendan Cully <brendan@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 01 Apr 2011 04:04:15 -0700
On Mon, Sep 6, 2010 at 11:03 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> # HG changeset patch
> # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> # Date 1283766891 -3600
> # Node ID bdf8ce09160d715451e1204babe5f80886ea6183
> # Parent  5f96f36feebdb87eaadbbcab0399f32eda86f735
> libxc: provide notification of final checkpoint to restore end
>
> When the restore code sees this notification it will restore the
> currently in-progress checkpoint when it completes.
>
> This allows the restore end to finish up without waiting for a
> spurious timeout on the receive fd and thereby avoids unnecessary
> error logging in the case of a successful migration or restore.
>
> In the normal migration or restore case the first checkpoint is always
> the last. For a rolling checkpoint (such as Remus) the notification is
> currently unused but could be used in the future for example to
> provide a controlled failover for reasons other than error

Unfortunately, this introduces a backwards-compatibility problem,
since older versions of the tools never send LAST_CHECKPOINT.

Would it make sense to invert the logic on this -- i.e., default to
only one checkpoint, and have the save side send "MORE_CHECKPOINTS" if
it expects this *not* to be the last (either by expecting the
checkpoint() function to do it, or by sending it from xc_domain_save)?
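
To make the idea concrete, here is a rough sketch of what the restore
side might look like with the default inverted. It is not a patch:
XC_SAVE_ID_MORE_CHECKPOINTS and its value (-10) are placeholders I
made up, and restore_state and the helper functions are hypothetical
stand-ins for restore_ctx and pagebuf_get_one().

```c
#include <assert.h>
#include <stddef.h>

/* Existing ID from xg_save_restore.h; 4.1.0 senders emit it. */
#define XC_SAVE_ID_LAST_CHECKPOINT   -9
/* HYPOTHETICAL: name and value are placeholders, not part of libxc. */
#define XC_SAVE_ID_MORE_CHECKPOINTS  -10

/* Hypothetical stand-in for the relevant part of struct restore_ctx. */
struct restore_state {
    int last_checkpoint; /* 1 = commit once the current checkpoint completes */
};

/* Inverted default: assume a single checkpoint unless told otherwise. */
static void restore_state_init(struct restore_state *st)
{
    assert(st != NULL);
    st->last_checkpoint = 1;
}

/*
 * Handle one chunk ID. Returns 1 if the ID was a checkpoint marker and
 * was consumed, 0 if it is some other chunk the caller must handle.
 */
static int handle_checkpoint_marker(struct restore_state *st, int id)
{
    switch ( id )
    {
    case XC_SAVE_ID_MORE_CHECKPOINTS:
        /* Sender is checkpointing (e.g. Remus): keep receiving. */
        st->last_checkpoint = 0;
        return 1;

    case XC_SAVE_ID_LAST_CHECKPOINT:
        /* Still accepted so that streams from 4.1.0 senders parse. */
        st->last_checkpoint = 1;
        return 1;

    default:
        return 0;
    }
}

/* Scan the chunk IDs of one stream; the list is terminated by 0. */
static int stream_is_single_checkpoint(const int *ids)
{
    struct restore_state st;

    restore_state_init(&st);
    for ( ; *ids != 0; ids++ )
        handle_checkpoint_marker(&st, *ids);

    return st.last_checkpoint;
}
```

With this, a stream from an older sender that contains no marker at
all is treated as a single checkpoint, as is a 4.1.0 stream that
contains LAST_CHECKPOINT; only a stream carrying the new marker keeps
the receiver waiting for further checkpoints.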

Now that 4.1 is out, we have to consider version compatibility.  But
we should be able to make it so that there would only be a problem if:
1) You're running REMUS
2) One of your toolstacks is 4.1.0, and one is not.

That seems like it shouldn't be too bad.
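
Written out as a predicate, the two conditions above amount to the
following. This is only an illustration of the claim, not code from
the tree; the function and parameter names are made up.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * HYPOTHETICAL helper: returns true when the inverted scheme would
 * still leave a compatibility problem, i.e. Remus is in use and
 * exactly one of the two toolstacks is 4.1.0.
 */
static bool checkpoint_compat_problem(bool using_remus,
                                      bool sender_is_4_1_0,
                                      bool receiver_is_4_1_0)
{
    return using_remus && (sender_is_4_1_0 != receiver_is_4_1_0);
}

/* Usage example: walk through the interesting combinations. */
static void checkpoint_compat_examples(void)
{
    /* Plain save/restore/migrate: never a problem. */
    assert(!checkpoint_compat_problem(false, true, false));
    assert(!checkpoint_compat_problem(false, false, true));

    /* Remus between two matching toolstacks: fine. */
    assert(!checkpoint_compat_problem(true, true, true));
    assert(!checkpoint_compat_problem(true, false, false));

    /* Remus with exactly one 4.1.0 end: the remaining bad case. */
    assert(checkpoint_compat_problem(true, true, false));
    assert(checkpoint_compat_problem(true, false, true));
}
```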

Thoughts?

 -George

>
> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> Acked-by: Brendan Cully <brendan@xxxxxxxxx>
>
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_restore.c
> --- a/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xc_domain_restore.c   Mon Sep 06 10:54:51 2010 +0100
> @@ -42,6 +42,7 @@ struct restore_ctx {
>     xen_pfn_t *p2m; /* A table mapping each PFN to its new MFN. */
>     xen_pfn_t *p2m_batch; /* A table of P2M mappings in the current region. */
>     int completed; /* Set when a consistent image is available */
> +    int last_checkpoint; /* Set when we should commit to the current checkpoint when it completes. */
>     struct domain_info_context dinfo;
>  };
>
> @@ -765,6 +766,11 @@ static int pagebuf_get_one(xc_interface
>         // DPRINTF("console pfn location: %llx\n", buf->console_pfn);
>         return pagebuf_get_one(xch, ctx, buf, fd, dom);
>
> +    case XC_SAVE_ID_LAST_CHECKPOINT:
> +        ctx->last_checkpoint = 1;
> +        // DPRINTF("last checkpoint indication received");
> +        return pagebuf_get_one(xch, ctx, buf, fd, dom);
> +
>     default:
>         if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
>             ERROR("Max batch size exceeded (%d). Giving up.", count);
> @@ -1296,10 +1302,23 @@ int xc_domain_restore(xc_interface *xch,
>             goto out;
>         }
>         ctx->completed = 1;
> -        /* shift into nonblocking mode for the remainder */
> -        if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> -            flags = 0;
> -        fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> +
> +        /*
> +         * If more checkpoints are expected then shift into
> +         * nonblocking mode for the remainder.
> +         */
> +        if ( !ctx->last_checkpoint )
> +        {
> +            if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> +                flags = 0;
> +            fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> +        }
> +    }
> +
> +    if ( ctx->last_checkpoint )
> +    {
> +        // DPRINTF("Last checkpoint, finishing\n");
> +        goto finish;
>     }
>
>     // DPRINTF("Buffered checkpoint\n");
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_save.c
> --- a/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xc_domain_save.c      Mon Sep 06 10:54:51 2010 +0100
> @@ -1616,6 +1616,20 @@ int xc_domain_save(xc_interface *xch, in
>         }
>     }
>
> +    if ( !callbacks->checkpoint )
> +    {
> +        /*
> +         * If this is not a checkpointed save then this must be the first and
> +         * last checkpoint.
> +         */
> +        i = XC_SAVE_ID_LAST_CHECKPOINT;
> +        if ( wrexact(io_fd, &i, sizeof(int)) )
> +        {
> +            PERROR("Error when writing last checkpoint chunk");
> +            goto out;
> +        }
> +    }
> +
>     /* Zero terminate */
>     i = 0;
>     if ( wrexact(io_fd, &i, sizeof(int)) )
> diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xg_save_restore.h
> --- a/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> +++ b/tools/libxc/xg_save_restore.h     Mon Sep 06 10:54:51 2010 +0100
> @@ -131,6 +131,7 @@
>  #define XC_SAVE_ID_TMEM_EXTRA         -6
>  #define XC_SAVE_ID_TSC_INFO           -7
>  #define XC_SAVE_ID_HVM_CONSOLE_PFN    -8 /* (HVM-only) */
> +#define XC_SAVE_ID_LAST_CHECKPOINT    -9 /* Commit to restoring after completion of current iteration. */
>
>  /*
>  ** We process save/restore/migrate in batches of pages; the below
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
