On Sat, Apr 07, 2007 at 05:45:24PM +0200, Axel Thimm wrote:
> after some patching of the xen scripts (to properly `migrate' slaves
> over from bond0 to pbond0), I have xen and bonding working. But only
> for 20-25 seconds; after that the network throughput suddenly falls
> from, for example, 110MB/sec to 70-120KB/sec, i.e. by about a factor
> of a thousand. Stopping the network bridge restores the throughput,
> but again only after a short delay of 0.5-1 minute.
>
> Does that ring a bell? What can be the troublemaker and why does it
> appear with such a great delay? There is no hint in the logs on why
> the performance drops that dramatically.

To find where the packets get dropped, I watched ICMP traffic on
 o eth0, eth1 (the two slaves)
 o pbond0 (the physical bond of these two)
 o xenbr0
 o bond0, aka veth0
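
A per-interface check of this kind could be scripted roughly as follows. This is only a sketch, assuming tcpdump is installed and run as root; the helper name tcpdump_command and the optional packet count are illustrative, not the exact commands used here.

```python
# Interfaces named in this message, from the physical slaves inward.
INTERFACES = ["eth0", "eth1", "pbond0", "xenbr0", "bond0"]


def tcpdump_command(iface, count=None):
    """Build an argv list that captures ICMP traffic on one interface.

    -n disables name resolution, -i selects the interface, and the
    trailing "icmp" is the capture filter. -c (optional) stops the
    capture after `count` packets.
    """
    cmd = ["tcpdump", "-n", "-i", iface]
    if count is not None:
        cmd += ["-c", str(count)]
    cmd.append("icmp")
    return cmd


if __name__ == "__main__":
    # Print one command per interface; run each in its own terminal
    # while pinging the host from another machine.
    for iface in INTERFACES:
        print(" ".join(tcpdump_command(iface)))
```

Running one capture per interface side by side makes it easy to see at which point a given echo request stops showing up.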
While the network works well, the ICMP requests/replies can be seen on
all interfaces [1]. When the network breaks down to below 1% of its
bandwidth I can see the external ICMP requests reaching as far as
xenbr0. The virtual interface bond0 does not see the packets anymore.
So it looks like the bridge is losing the packets, even after they
have passed into the bridge through the bonded device. This makes it
even more mysterious, since if the issue were the combination of
bonding and bridging I would expect the packets to be dropped on the
incoming side of the bridge.
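
The reasoning above amounts to walking the receive path and noting the first interface that no longer sees the request. A small sketch of that logic, assuming the path order pbond0, xenbr0, bond0 and using made-up observation tables (the function name first_blind_hop is hypothetical; the slaves are left out because they do not show incoming traffic, see [1]):

```python
# Receive path for an external ICMP request, as described in this message.
PATH = ["pbond0", "xenbr0", "bond0"]


def first_blind_hop(path, seen):
    """Return the first interface on the path that did not see the packet.

    `seen` maps interface name -> True/False; interfaces missing from the
    mapping count as not having seen the packet. Returns None if every
    hop saw it.
    """
    for iface in path:
        if not seen.get(iface, False):
            return iface
    return None


# Illustrative observations only, mirroring the two states described above.
healthy = {"pbond0": True, "xenbr0": True, "bond0": True}
degraded = {"pbond0": True, "xenbr0": True, "bond0": False}

print(first_blind_hop(PATH, healthy))   # None
print(first_blind_hop(PATH, degraded))  # bond0
```

In the degraded case the first blind hop is bond0, i.e. the loss happens after the packet has already entered xenbr0.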
[1] Incoming traffic is not captured in general on the slaves, so eth0
and eth1 did not show the ICMP requests, only the outgoing ICMP
replies.
--
Axel.Thimm at ATrpms.net
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users