I have run some experiments with Remus and ran into problems with its failover.
I set up dom0 and domU as below; the backup server is set up the same as the primary.
Ubuntu 9.10
Xen 4.0.1-rc2
kernel for dom0: 2.6.32.18
kernel for domU: 2.6.18.8
With an idle guest running on dom0, I run Remus on the primary server and then destroy the guest or kill Remus. In that case failover works, and the guest moves from the primary server to the backup server.
For the workload experiments, however, I run SPECweb or a kernel compile in the guest while the primary server runs Remus.
When the guest is destroyed or Remus is killed, the guest does not survive on the backup server, even though it was checkpointing before. While checkpointing, the guest was shown in the 'p' (paused) state on the backup server, but then it disappeared.
The error in xend.log on the backup server shows this message:
----
[XXXX-XX-XX 13:56:50 6038] ERROR (XendCheckpoint:357) /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 309, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
[XXXX-XX-XX 13:56:50 6038] ERROR (XendDomain:1175) Restore failed
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/xen/xend/XendDomain.py", line 1159, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/python2.6/site-packages/xen/xend/XendCheckpoint.py", line 358, in restore
    raise exn
XendError: /usr/lib/xen/bin/xc_restore 36 92 1 2 0 0 0 0 failed
----
It looks much the same as the earlier question from Shriram Rajagopalan:
http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00369.html
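In case it helps anyone compare their own logs against this one, here is a minimal Python sketch that lists the ERROR entries in a xend.log. The entry-header pattern is an assumption inferred only from the log lines quoted above, and `find_errors` is a hypothetical helper name, not part of xend.

```python
import re

# Assumed entry header, inferred from the quoted log lines:
#   [<date> <time> <pid>] LEVEL (Module:line) message
ENTRY = re.compile(
    r"^\[(?P<when>\S+ \S+) (?P<pid>\d+)\] (?P<level>[A-Z]+) "
    r"\((?P<module>\w+):(?P<line>\d+)\) (?P<msg>.*)$",
    re.MULTILINE,
)


def find_errors(log_text):
    """Return one dict per ERROR entry header found in log_text.

    Traceback lines that follow an entry do not match the header
    pattern and are skipped.
    """
    return [
        m.groupdict()
        for m in ENTRY.finditer(log_text)
        if m.group("level") == "ERROR"
    ]
```

It could be fed the contents of the backup server's xend.log, for example `find_errors(open(path).read())`, where `path` is wherever xend writes its log on that system. Each result carries the module and line number, such as XendCheckpoint:357, which makes it easier to check whether two reports fail at the same place.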
This error also seems to have appeared with Xen live migration in the past. Remus shares functions with live migration, and the error shows up in a Xen live migration function. Has anyone had a similar experience with either Remus or Xen live migration?
Has anyone found a cause or a solution for this? I would appreciate any help.
Thank you.
Kyungjin.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel