Thank you very much for your reply and help. I've tested Remus with xen-unstable-4.0 and the latest linux-2.6.18-xen.hg, and Remus seems to work well. The previous error seems to have been caused by an old version of linux-2.6.18-xen.hg, and I've fixed it.
I ran into a problem when trying to run Remus with two VMs on a single physical machine, like:
The first command runs correctly, while the second fails with the following messages. Can Remus provide fault tolerance for two VMs (on one physical machine) simultaneously?
ERROR Internal error: Can't create lock file for suspend event channel
WARNING: suspend event channel unavailable, falling back to slow xenstore signalling
Had 0 unexplained entries in p2m table
1: sent 64491, skipped 725, delta 3425ms, dom0 75%, target 75%, sent 617Mb/s, dirtied 10Mb/s 1085 pages
2: sent 1083, skipped 2, delta 43ms, dom0 100%, target 100%, sent 825Mb/s, dirtied 12Mb/s 16 pages
3: sent 15, skipped 1, Start last iteration
PROF: suspending at 1271404630.401318
installing buffer on imq0
RTNETLINK answers: File exists
ERROR Internal error: Suspend request failed
ERROR Internal error: Domain appears not to have suspended
Save exit rc=1
Traceback (most recent call last):
  File "/usr/bin/remus", line 359, in ?
    run(cfg)
  File "/usr/bin/remus", line 340, in run
    for buf in bufs:
  File "/usr/bin/remus", line 277, in postsuspend
    buf.postsuspend()
  File "/usr/bin/remus", line 159, in postsuspend
    self._setup()
  File "/usr/bin/remus", line 185, in _setup
    self.rth.talk(req.pack())
  File "usr/lib/python2.4/site-packages/xen/remus/netlink.py", line 314, in talk
IOError: error sending message
On Tue, Apr 6, 2010 at 5:30 AM, Brendan Cully <brendan@xxxxxxxxx> wrote: