Re: [Xen-users] Network Issues on Migration
On Fri, Jan 09, 2009 at 02:17:34PM -0500, Wendell Dingus wrote:
>
> I've read and experimented extensively and being in desperate need of
> "finishing" this setup and getting it deployed live, would like to see if
> anyone has any suggestions on the last hangup we seem to have.
>
> Two SuperMicro 1U servers with dual quad-core CPUs and 16GB RAM each. CentOS
> 5.2 x86_64 and its Xen implementation. The only thing non-"stock" CentOS at
> this point is the Intel igb driver. The RHEL/CentOS igb driver appears to
> have a bug with DHCP over a bridged interface, which the latest driver
> downloaded straight from Intel cured for us.
>
> Anyway, both are attached to shared FC storage and are doing RHCS with both
> IP and disk-based quorum, plus CLVMD with a shared VG in which LVs are
> created as containers for VMs. That part is all working very well.
>
> Each DOM0 has 2 physical NICs, both bridged. Additionally we added virbr0 as
> a bridged per-DOM0 local network.
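
One generic sanity check here, since a guest's vif gets re-attached by bridge
name on the destination host after a migrate: the bridge names and membership
should be identical on both dom0s. Something like (bridge/interface names are
only examples):

  brctl show
  grep "network-script" /etc/xen/xend-config.sxp
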
>
> When any VM boots up it can ping and traceroute on any of its respective
> networks perfectly. Inbound/outbound data flow of any kind appears perfect as
> well. Once a VM is migrated or live-migrated to the other DOM0, though, the
> ability to ping or traceroute ceases. Sessions via ssh or httpd, either
> inbound or outbound, continue to work fine.
>
> When a VM boots I see this in dmesg:
> netfront: Initialising virtual ethernet driver.
> netfront: device eth0 has flipping receive path.
>
> I read something about a CRC problem and had each of them do "ethtool -K
> eth{n} tx off", but I don't think that was necessary in this instance; I've
> never seen any error messages about CRC errors. The problem and solution I
> followed were not described in much detail, and it was just an attempt to
> see if that helped.
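
For what it's worth, that ethtool workaround is about TX checksum offload
rather than CRC as such; besides running it inside the guest, it can also be
applied to the backend vif in dom0, e.g. (vif name is only an example):

  ethtool -K vif1.0 tx off
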
>
> The following was added to the end of /etc/sysctl.conf on both DOM0s only
> (per the excellent wiki article):
> net.ipv4.icmp_echo_ignore_broadcasts = 1
> net.ipv4.conf.all.accept_redirects = 0
> net.ipv4.conf.all.send_redirects = 0
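
Assuming those were just appended to the file, they can be re-applied on a
running dom0 without a reboot:

  sysctl -p
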
>
> The other oddity is that for a VM started on server1 and live-migrated to
> server2, a running ping only pauses a short while, then picks right back up
> and continues to succeed. Migrating it back to server1, or initially
> starting a VM on server2 and migrating it to server1, is where the "stuck"
> ping issue comes into play. We were very careful and documented everything
> as we installed both boxes, in an attempt to keep them as identical as
> possible. I fear this behavior proves that's not the case, though, ugh...
>
> After migrating from 2 to 1 and then trying a ping (and waiting a good long
> while before ctrl-c'ing it):
> PING 192.168.77.1 (192.168.77.1) 56(84) bytes of data.
> 64 bytes from 192.168.77.1: icmp_seq=1 ttl=64 time=0.000 ms
>
> --- 192.168.77.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.000/0.000/0.000/0.000 ms
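
It might also be worth checking where the guest's MAC ends up after the
migrate, both on the dom0 bridge and on the physical switch. A rough
diagnostic sketch, not a fix (bridge, interface and address are only
examples):

  # on the destination dom0: is the guest's MAC listed on the bridge?
  brctl showmacs xenbr0
  # from inside the guest: send unsolicited ARPs so upstream switches relearn
  arping -U -c 3 -I eth0 <guest-ip>
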
>
> Very strange... Additionally, a "service network restart" at this point
> results in all interfaces going down, loopback being reinitialized, and then
> it hangs trying to bring up eth0. I can ctrl-c it three times as it pauses
> on each interface, then run "ifconfig" and see all the IPs are still there.
> Still can't ping, but I can "telnet google.com 80", for instance. Odd...
>
> So anyway, any pointers or suggestions you might have would be greatly
> appreciated...
>
https://www.redhat.com/archives/rhelv5-announce/2008-October/msg00000.html
Some entries from the RHEL 5.3 beta changelog:
+ Timer problems after migration were fixed
+ Lengthy network outage after migrations was fixed
Dunno if that's what you're seeing..
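
You can check which builds you're currently on with (package names as on
CentOS 5):

  rpm -q xen kernel-xen
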
-- Pasi
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users