

[Xen-devel] lost gARP after live migration

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] lost gARP after live migration
From: Laszlo Ersek <lersek@xxxxxxxxxx>
Date: Tue, 28 Jun 2011 15:01:20 +0200
Cc: netdev@xxxxxxxxxxxxxxx, Paolo Bonzini <pbonzini@xxxxxxxxxx>
Delivery-date: Tue, 28 Jun 2011 06:05:26 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20110428 Fedora/3.1.10-1.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.10

with reference to RHBZ#713585:

It seems that when a RHEL-6.1 or F-15 Xen PV guest is live migrated, the gratuitous ARP packet is not forwarded to the affected "networking equipment". The netback vif is added to a routed bridge in the host(s), and external hosts are expected to be able to reach the guest at all times, no matter which Xen host it currently runs on.

I experimented a bit with tcpdump, and the gARP does appear on the netfront interface. It also appears on the host bridge if sufficient time passes between completing the xenbus handshake and sending the gARP.

When the guest queues e.g. three gARPs in rapid succession, a variable number of them get lost. (When all such packets disappear, the migrated guest becomes invisible to the outside world until it initiates network traffic on its own.)

When the guest waits for about half a second before sending (queueing), the very first gARP packet successfully appears on the host bridge.

I suspect it's a timing race against the netback vif being added to the host bridge. What would be a good countermeasure?

- Adding two modparams to xen-netfront (gARP requeue count & number of msecs to wait between queueing the gARPs).
- (Paolo's idea:) watching the "hotplug-status" xenstore node and sending a single gARP when the watch fires with "connected". This node belongs to the backend xenstore subtree, thus watching it from the guest doesn't please the architecture astronaut in me.
- Something else.
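
To make the first option concrete, here is a rough userspace approximation of "requeue count plus delay", runnable inside the guest. It is only a sketch under assumptions, not the xen-netfront change itself: it assumes Linux (AF_PACKET raw sockets, root privileges), and the names build_garp, send_garps, count and interval_ms are made up for illustration. The defaults of 3 packets and 500 msecs merely echo the numbers from my experiments above.

```python
import socket
import struct
import time

ETH_P_ARP = 0x0806
BROADCAST = b"\xff" * 6


def build_garp(mac: bytes, ip: str) -> bytes:
    """Build a gratuitous ARP request frame (sender IP == target IP)."""
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC as source, ARP ethertype.
    eth = BROADCAST + mac + struct.pack("!H", ETH_P_ARP)
    # ARP header: htype=Ethernet(1), ptype=IPv4, hlen=6, plen=4, op=request(1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac + ip_bytes            # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_bytes    # target hardware unset, target IP = own IP
    return eth + arp


def send_garps(ifname, mac, ip, count=3, interval_ms=500, send=None):
    """Send `count` gARPs, waiting `interval_ms` between consecutive packets.

    `send` is a hypothetical hook so the loop can be exercised without root;
    by default a raw AF_PACKET socket bound to `ifname` is used (Linux only).
    """
    frame = build_garp(mac, ip)
    sock = None
    if send is None:
        sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
        sock.bind((ifname, 0))
        send = sock.send
    try:
        for i in range(count):
            if i:
                time.sleep(interval_ms / 1000.0)
            send(frame)
    finally:
        if sock is not None:
            sock.close()
```

Example invocation (interface name, MAC and address are placeholders): send_garps("eth0", bytes.fromhex("001636000001"), "192.0.2.10"). Like the modparam idea, this only papers over the race by retrying; it does not know when the vif has actually joined the bridge.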

Sorry for the naivety / verbiage.

