On Fri, 2011-04-01 at 12:02 +0100, George Dunlap wrote:
> On Mon, Sep 6, 2010 at 11:03 AM, Ian Campbell <ian.campbell@xxxxxxxxxx> wrote:
> > # HG changeset patch
> > # User Ian Campbell <ian.campbell@xxxxxxxxxx>
> > # Date 1283766891 -3600
> > # Node ID bdf8ce09160d715451e1204babe5f80886ea6183
> > # Parent 5f96f36feebdb87eaadbbcab0399f32eda86f735
> > libxc: provide notification of final checkpoint to restore end
> >
> > When the restore code sees this notification it will restore the
> > currently in-progress checkpoint when it completes.
> >
> > This allows the restore end to finish up without waiting for a
> > spurious timeout on the receive fd and thereby avoids unnecessary
> > error logging in the case of a successful migration or restore.
> >
> > In the normal migration or restore case the first checkpoint is always
> > the last. For a rolling checkpoint (such as Remus) the notification is
> > currently unused, but it could be used in the future, for example,
> > to provide a controlled failover for reasons other than error.
>
> Unfortunately, this introduces a backwards-compatibility problem,
> since older versions of the tools never send LAST_CHECKPOINT.
Out of interest, what actually happens? (I think you mentioned
something about failing when restoring from a file?)

My intention at the time was that in the absence of a LAST_CHECKPOINT
things would continue to behave as before, e.g. you would still see
the spurious timeout on the receive fd but would restore correctly.

Perhaps the associated changes to the O_NONBLOCK'ness of the fd at
various stages are what broke this for non-Remus migrations?
> Would it make sense to invert the logic on this -- i.e., default to
> only one checkpoint, and send "MORE_CHECKPOINTS" if it expects this
> *not* to be the last (either by expecting the checkpoint() function to
> do it, or by sending it from xc_domain_save)?

I'd quite like to understand what I broke and to investigate how
trivial (or otherwise) a fix might be, but this change would probably
make sense independently too.
> Now that 4.1 is out, we have to consider version compatibilities. But
> we should be able to make it so that there would only be a problem if:
> 1) You're running REMUS

Is it the case that Remus only cares about checkpointing between like
versions of the toolstack?

> 2) One of your toolstacks is 4.1.0, and one is not.
>
> That seems like it shouldn't be too bad.
>
> Thoughts?
>
> -George
>
> >
> > Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> > Acked-by: Brendan Cully <brendan@xxxxxxxxx>
> >
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_restore.c
> > --- a/tools/libxc/xc_domain_restore.c Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xc_domain_restore.c Mon Sep 06 10:54:51 2010 +0100
> > @@ -42,6 +42,7 @@ struct restore_ctx {
> > xen_pfn_t *p2m; /* A table mapping each PFN to its new MFN. */
> > xen_pfn_t *p2m_batch; /* A table of P2M mappings in the current region. */
> > int completed; /* Set when a consistent image is available */
> > + int last_checkpoint; /* Set when we should commit to the current checkpoint when it completes. */
> > struct domain_info_context dinfo;
> > };
> >
> > @@ -765,6 +766,11 @@ static int pagebuf_get_one(xc_interface
> > // DPRINTF("console pfn location: %llx\n", buf->console_pfn);
> > return pagebuf_get_one(xch, ctx, buf, fd, dom);
> >
> > + case XC_SAVE_ID_LAST_CHECKPOINT:
> > + ctx->last_checkpoint = 1;
> > + // DPRINTF("last checkpoint indication received");
> > + return pagebuf_get_one(xch, ctx, buf, fd, dom);
> > +
> > default:
> > if ( (count > MAX_BATCH_SIZE) || (count < 0) ) {
> > ERROR("Max batch size exceeded (%d). Giving up.", count);
> > @@ -1296,10 +1302,23 @@ int xc_domain_restore(xc_interface *xch,
> > goto out;
> > }
> > ctx->completed = 1;
> > - /* shift into nonblocking mode for the remainder */
> > - if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> > - flags = 0;
> > - fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> > +
> > + /*
> > + * If more checkpoints are expected then shift into
> > + * nonblocking mode for the remainder.
> > + */
> > + if ( !ctx->last_checkpoint )
> > + {
> > + if ( (flags = fcntl(io_fd, F_GETFL,0)) < 0 )
> > + flags = 0;
> > + fcntl(io_fd, F_SETFL, flags | O_NONBLOCK);
> > + }
> > + }
> > +
> > + if ( ctx->last_checkpoint )
> > + {
> > + // DPRINTF("Last checkpoint, finishing\n");
> > + goto finish;
> > }
> >
> > // DPRINTF("Buffered checkpoint\n");
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xc_domain_save.c
> > --- a/tools/libxc/xc_domain_save.c Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xc_domain_save.c Mon Sep 06 10:54:51 2010 +0100
> > @@ -1616,6 +1616,20 @@ int xc_domain_save(xc_interface *xch, in
> > }
> > }
> >
> > + if ( !callbacks->checkpoint )
> > + {
> > + /*
> > + * If this is not a checkpointed save then this must be the first and
> > + * last checkpoint.
> > + */
> > + i = XC_SAVE_ID_LAST_CHECKPOINT;
> > + if ( wrexact(io_fd, &i, sizeof(int)) )
> > + {
> > + PERROR("Error when writing last checkpoint chunk");
> > + goto out;
> > + }
> > + }
> > +
> > /* Zero terminate */
> > i = 0;
> > if ( wrexact(io_fd, &i, sizeof(int)) )
> > diff -r 5f96f36feebd -r bdf8ce09160d tools/libxc/xg_save_restore.h
> > --- a/tools/libxc/xg_save_restore.h Mon Sep 06 10:54:51 2010 +0100
> > +++ b/tools/libxc/xg_save_restore.h Mon Sep 06 10:54:51 2010 +0100
> > @@ -131,6 +131,7 @@
> > #define XC_SAVE_ID_TMEM_EXTRA -6
> > #define XC_SAVE_ID_TSC_INFO -7
> > #define XC_SAVE_ID_HVM_CONSOLE_PFN -8 /* (HVM-only) */
> > +#define XC_SAVE_ID_LAST_CHECKPOINT -9 /* Commit to restoring after completion of current iteration. */
> >
> > /*
> > ** We process save/restore/migrate in batches of pages; the below
> >
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel