Hi all,
The following small set of patches introduces an event-channel-based
mechanism for suspending guests during the final stage of live
migration. They were written in support of our Remus high availability
project, presented at NSDI this April. The full paper is available
here:
http://www.usenix.org/event/nsdi08/tech/cully.html
Because Remus takes checkpoints many times per second, it cannot
afford the tens of milliseconds currently spent in signalling between
the checkpoint process and the target domain (largely due to
xenstore). So this patch set uses event channels instead of xenstore
watches to perform this signalling, greatly reducing the amount of
time spent waiting for message delivery and process scheduling. It is
a revised version of the prototype patches originally submitted here:
http://lists.xensource.com/archives/html/xen-devel/2007-05/msg00276.html
This code is backwards-compatible with unmodified guest kernels (it
simply falls back to the current xenstore-based notification mechanism
for these guests).
I've added timestamps to the suspend_and_state function in
xc_domain_save, before and after the suspend callback. The measured
suspend times (5 runs, idle dom0, idle guest, one vcpu each for dom0
and domU, pinned to separate hyperthreads on a P4):
Old method: 84ms, 87ms, 92ms, 89ms, 92ms
New method: 1ms, 1ms, 1ms, 1ms, 1ms
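For anyone wanting to reproduce this kind of measurement, here is a
rough sketch of the approach in Python rather than the C used in
xc_domain_save. It is not the patch code: timed_call_ms and
fake_suspend are hypothetical names made up for illustration, and the
sleep only stands in for a real suspend callback. The two lists are
the figures quoted above.

```python
import time
from statistics import mean


def timed_call_ms(callback):
    """Run callback once; return (its result, elapsed wall-clock ms)."""
    start = time.perf_counter()
    result = callback()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_suspend():
    """Hypothetical stand-in for the suspend callback (assumption)."""
    time.sleep(0.001)
    return True


# Figures quoted in this mail, in milliseconds.
old_ms = [84, 87, 92, 89, 92]
new_ms = [1, 1, 1, 1, 1]

old_mean = mean(old_ms)
new_mean = mean(new_ms)
ratio = old_mean / new_mean

ok, elapsed = timed_call_ms(fake_suspend)
print(f"stand-in callback returned {ok} after {elapsed:.2f} ms")
print(f"old mean: {old_mean:.1f} ms, new mean: {new_mean:.1f} ms, "
      f"ratio: {ratio:.1f}x")
```

With the numbers above this works out to a mean of 88.8ms for the old
method against 1ms for the new one.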
Could this code be considered for the upcoming 3.3 release?
Thanks,
Brendan