xen-devel

Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops

Andreas Olsowski writes ("[Xen-devel] slow live magration / xc_restore on xen4 
pvops"):
> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal 
> error: Error when reading batch size
> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal 
> error: error when buffering batch, finishing

These errors, and the slowness of migrations, are caused by changes
made to support Remus.  Previously, a migration would be regarded as
complete as soon as the final information including CPU states was
received at the migration target.  xc_domain_restore would return
immediately at that point.

Since the Remus patches, xc_domain_restore waits until it gets an IO
error, and also has a very short timeout which induces an IO error if
nothing is received within that time.  This is correct in the
Remus case but wrong in the normal case.

The code should be changed so that xc_domain_restore
 (a) takes an explicit parameter for the IO timeout, which
     should default to something much longer than the 100ms or so of
     the Remus case, and
 (b) gets told whether
    (i) it should return immediately after receiving the "tail"
        which contains the CPU state; or
    (ii) it should attempt to keep reading after receiving the "tail"
        and only return when the connection fails.
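
As a rough illustration of (a) and (b), here is a minimal, self-contained
sketch in C.  It is not libxc code: struct restore_opts, enum
restore_tail_mode, read_exact_timed, sketch_restore, the toy one-byte
record format ('P' for a page batch, 'T' for a tail) and the 60000ms
default are all hypothetical names and values chosen for the example.
The only thing taken from the discussion above is the shape of the
interface: an explicit IO timeout, plus a flag saying whether to return
after the first tail or to keep reading until the connection fails.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: how the restore loop should behave once a tail arrives. */
enum restore_tail_mode {
    RESTORE_RETURN_AFTER_TAIL,   /* (b)(i): ordinary migration */
    RESTORE_READ_UNTIL_FAILURE   /* (b)(ii): Remus-style checkpoint stream */
};

/* Hypothetical option block passed by the caller. */
struct restore_opts {
    int io_timeout_ms;                /* (a): explicit IO timeout */
    enum restore_tail_mode tail_mode; /* (b) */
};

#define RESTORE_DEFAULT_TIMEOUT_MS 60000 /* assumed "much longer" default */
#define REMUS_TIMEOUT_MS           100   /* roughly the Remus value */

/* Read exactly len bytes from fd.  Returns 0 on success, or -1 on EOF,
 * on a read/poll error, or if no data arrives within timeout_ms. */
static int read_exact_timed(int fd, void *buf, size_t len, int timeout_ms)
{
    char *p = buf;

    while (len > 0) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (rc <= 0)
            return -1;              /* timeout or poll failure */

        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;              /* EOF or read error */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Toy restore loop over one-byte records.  Returns the number of tails
 * received, or -1 if the stream failed before any complete tail. */
static int sketch_restore(int io_fd, const struct restore_opts *opts)
{
    int tails = 0;
    char rec;

    assert(opts != NULL);

    for (;;) {
        if (read_exact_timed(io_fd, &rec, 1, opts->io_timeout_ms) < 0) {
            /* In keep-reading mode, failure after a tail is the normal
             * way to finish; with no tail at all it is a real error. */
            return tails > 0 ? tails : -1;
        }
        if (rec != 'T')
            continue;               /* page batch: nothing to do here */

        tails++;
        if (opts->tail_mode == RESTORE_RETURN_AFTER_TAIL)
            return tails;           /* do not wait for an IO error */
    }
}
```

With RESTORE_RETURN_AFTER_TAIL the caller gets control back as soon as
the tail has been read, however long the timeout is; only a caller that
asks for RESTORE_READ_UNTIL_FAILURE pays for the short timeout and the
resulting IO error.
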

In case (b)(i), which should be the usual case, the behaviour
should be what we would get if changeset 20406:0f893b8f7c15 were
reverted.  The offending code is mostly this, from 20406:

+    // DPRINTF("Buffered checkpoint\n");
+
+    if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) {
+        ERROR("error when buffering batch, finishing\n");
+        goto finish;
+    }
+    memset(&tmptail, 0, sizeof(tmptail));
+    if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap,
+                     ext_vcpucontext) < 0 ) {
+        ERROR ("error buffering image tail, finishing");
+        goto finish;
+    }
+    tailbuf_free(&tailbuf);
+    memcpy(&tailbuf, &tmptail, sizeof(tailbuf));
+
+    goto loadpages;
+
+  finish:

Ian.
