Thanks. Your problem is definitely the difference in hardware
capabilities between the two boxes. I ran into the same thing when I
built my cluster. The short answer is that you must always start your
domain on the box whose caps are common to every box (the lowest
common denominator), which in your case is Server02.
The long answer is below.
Basically, when a domain starts it remembers the capabilities of the
host on which it started. You'll even see it in the kernel boot
messages. After that, any host with fewer capabilities will not run
the domain. The migration itself will complete, but as soon as the
destination host tries to start the domain, it will immediately die.
Take a look at the hw_caps line from both boxes:
Server01: bfebfbff:00100000:00000000:00000180:0000441d
Server02: 3febfbff:00000000:00000000:00000080
These are bit masks, written in hexadecimal. If we take the caps
strings and convert them to binary, we get:
Server01: 101111111110101111111011111111110000000000010000...
Server02: 001111111110101111111011111111110000000000000000...
They're actually much longer than that, but that's enough to get the
idea across. As you can see, even in the part shown Server01 has two
bits set that Server02 doesn't have. A domain started on Server01,
when migrated to Server02, will try to keep using the capabilities it
had on Server01, and because Server02 can't provide them, Server02
will immediately error and destroy the domain.
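
Here's a quick Python sketch of that conversion, in case you want to
check it yourself. It is only my own illustration, not anything Xen
ships. The function names are made up, and it assumes each
colon-separated word is a 32-bit hexadecimal mask and that a word
missing from the shorter string counts as all zeros.

```python
def caps_to_bits(hw_caps):
    """Turn a colon-separated hex hw_caps string into one 32-char binary string per word."""
    return [format(int(word, 16), "032b") for word in hw_caps.split(":")]


def extra_bits(a, b):
    """List (word index, bit number) pairs that are set in a but not in b.

    Assumption: words missing from b are treated as all zeros.
    """
    words_a = [int(w, 16) for w in a.split(":")]
    words_b = [int(w, 16) for w in b.split(":")]
    words_b += [0] * (len(words_a) - len(words_b))
    found = []
    for index, (x, y) in enumerate(zip(words_a, words_b)):
        diff = x & ~y  # bits present in a, absent in b
        for bit in range(31, -1, -1):
            if (diff >> bit) & 1:
                found.append((index, bit))
    return found


server01 = "bfebfbff:00100000:00000000:00000180:0000441d"
server02 = "3febfbff:00000000:00000000:00000080"

print(caps_to_bits(server01)[0])
print(caps_to_bits(server02)[0])
print(extra_bits(server01, server02))
print(extra_bits(server02, server01))
```

The first two entries of the Server01 list are (0, 31) and (1, 20),
which are the two bits visible in the binary strings above. The last
call returns an empty list, because Server02 has nothing that Server01
lacks.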
However, notice that all the capabilities that Server02 has, Server01
also has. So if you start a domain on Server02 and migrate it to
Server01, it will not error, because Server01 can do everything
Server02 can. And because the capabilities are remembered at boot
time, the domain will not notice that the host it's running on now has
more caps than it did before. So you can migrate it back to Server02
and it will work fine.
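
If that model is right, the rule boils down to a subset test on the
bit masks. Below is a minimal Python sketch of that test. It reflects
my understanding only, not how Xen actually decides, and the helper
names (parse_caps, is_subset, safe_start_hosts) are hypothetical. As
before, a missing word is assumed to be all zeros.

```python
def parse_caps(hw_caps):
    """Split a colon-separated hex hw_caps string into a list of integers."""
    return [int(word, 16) for word in hw_caps.split(":")]


def is_subset(start_caps, dest_caps):
    """True if every bit set on the starting host is also set on the destination."""
    start = parse_caps(start_caps)
    dest = parse_caps(dest_caps)
    dest += [0] * (len(start) - len(dest))  # assumption: missing words are zero
    return all((s & ~d) == 0 for s, d in zip(start, dest))


def safe_start_hosts(hosts):
    """Names of hosts whose caps are a subset of every host's caps in the pool."""
    return [name for name, caps in hosts.items()
            if all(is_subset(caps, other) for other in hosts.values())]


hosts = {
    "Server01": "bfebfbff:00100000:00000000:00000180:0000441d",
    "Server02": "3febfbff:00000000:00000000:00000080",
}

print(is_subset(hosts["Server02"], hosts["Server01"]))  # started on 02, moved to 01
print(is_subset(hosts["Server01"], hosts["Server02"]))  # started on 01, moved to 02
print(safe_start_hosts(hosts))
```

For these two boxes the only safe starting host it reports is
Server02, which matches what I saw in practice.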
This is how I believe it works, but I could certainly be wrong. If
there's someone with more knowledge about what the hw_caps are and the
way they affect PV domains, please share what you know.
-Jeff
On Jun 23, 2008, at 7:55 AM, Cody Jarrett wrote:
Sure, here is the info:
Server01:
root@localhost:~
$ xm info
host : localhost
release : 2.6.18-53.1.21.el5xen
version : #1 SMP Tue May 20 10:31:46 EDT 2008
machine : i686
nr_cpus : 2
nr_nodes : 1
sockets_per_node : 1
cores_per_socket : 1
threads_per_core : 2
cpu_mhz : 3000
hw_caps : bfebfbff:00100000:00000000:00000180:0000441d
total_memory : 502
free_memory : 2
xen_major : 3
xen_minor : 1
xen_extra : .0-53.1.21.el5
xen_caps : xen-3.0-x86_32p
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
cc_compile_by : mockbuild
cc_compile_domain :
cc_compile_date : Tue May 20 09:27:25 EDT 2008
xend_config_format : 2
Server02:
$ xm info
host : server02
release : 2.6.18-53.1.21.el5xen
version : #1 SMP Tue May 20 10:31:46 EDT 2008
machine : i686
nr_cpus : 1
nr_nodes : 1
sockets_per_node : 1
cores_per_socket : 1
threads_per_core : 1
cpu_mhz : 1694
hw_caps : 3febfbff:00000000:00000000:00000080
total_memory : 1023
free_memory : 128
xen_major : 3
xen_minor : 1
xen_extra : .0-53.1.21.el5
xen_caps : xen-3.0-x86_32p
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
cc_compile_by : mockbuild
cc_compile_domain :
cc_compile_date : Tue May 20 09:27:25 EDT 2008
xend_config_format : 2
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users