[Xen-devel] (repeatable) cross-domain networking failure
Summary:
I'm running into a situation where, after sending some UDP traffic between
two Xen domains (Domain 0 and Domain 1), the networking between the
domains fails. This failure is 100% repeatable.
In more detail:
I have two Xen domains. They run the kernels from the 2.0.3 release. (I've
run into the same problem with 2.0.1 as well.) Domain 0 has 5 physical
ethernet interfaces, and a virtual interface to Domain 1. Domain 1 has
just the virtual interface to Domain 0.
D0 is configured with IP address 192.168.0.1, and D1 with 192.168.1.1. The
netmask is set to 255.255.0.0.
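
As a quick sanity check on the addressing above, the two addresses can be confirmed to fall in the same subnet under that netmask, so D0 and D1 should reach each other directly (via ARP) rather than through a gateway. This is only an illustrative sketch using Python's standard ipaddress module; the variable names are made up for the example.

```python
import ipaddress

# Addresses and netmask as given in the post.
NETMASK = "255.255.0.0"
d0 = ipaddress.ip_interface(f"192.168.0.1/{NETMASK}")
d1 = ipaddress.ip_interface(f"192.168.1.1/{NETMASK}")

# Both interfaces belong to the same network if their network
# addresses (address AND netmask) are equal.
same_subnet = d0.network == d1.network
print(d0.network, d1.network, same_subnet)
```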
When I bring up D1, I can ping D1 from D0, ssh into D1, etc.
I then start a UDP server in D0, and a traffic generator in D1. After the
traffic generator sends its 128th packet, networking between the domains
fails. The 128th packet is received successfully by the UDP server, but no
later traffic arrives in D0. This includes UDP, TCP, ICMP, and ARP.
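
For anyone trying to reproduce this, a minimal sketch of a numbered UDP sender and a user-level receiver follows. It is not the server or traffic generator used above; the function names and the 64-byte payload size are assumptions chosen for illustration. Each datagram carries a sequence number, so the receiver can report exactly which packet was the last to arrive.

```python
import socket
import struct


def make_payload(seq, size=64):
    """Build a datagram: 4-byte big-endian sequence number, zero-padded."""
    return struct.pack("!I", seq).ljust(size, b"\0")


def send_numbered(dest, count, size=64):
    """Traffic generator (would run in D1): send `count` numbered datagrams."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        for seq in range(1, count + 1):
            s.sendto(make_payload(seq, size), dest)


def receive_numbered(sock, idle_timeout=2.0):
    """User-level server (would run in D0): collect sequence numbers
    until no datagram arrives for `idle_timeout` seconds."""
    sock.settimeout(idle_timeout)
    seen = []
    while True:
        try:
            data, _ = sock.recvfrom(2048)
        except socket.timeout:
            return seen
        seen.append(struct.unpack("!I", data[:4])[0])
```

To use it, bind a UDP socket in D0 to 192.168.0.1 on some port (9999 is a hypothetical choice) and call receive_numbered() on it, then call send_numbered(("192.168.0.1", 9999), 200) in D1. If the failure matches the description above, the last sequence number the receiver reports should be 128.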
Looking at the interrupt counts in /proc/interrupts, I see that D0 no
longer receives packets sent by D1. D1, however, does receive packets sent
by D0. (To be clear, the D0->D1 traffic is ICMP ping requests, unrelated to
the UDP traffic. There is no UDP traffic sent from D0 to D1.)
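
The interrupt-count comparison can be automated by snapshotting /proc/interrupts twice and reporting which lines changed. The sketch below assumes the usual Linux layout (a CPU header line, then "label: count(s) description" lines); the helper names are hypothetical, and which line corresponds to the virtual interface depends on the system.

```python
def parse_interrupts(text):
    """Map each interrupt label to (total count across CPUs, description)."""
    counts = {}
    for line in text.splitlines()[1:]:  # first line is the CPU header
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        total = 0
        # Leading numeric fields are the per-CPU counts.
        while fields and fields[0].isdigit():
            total += int(fields.pop(0))
        counts[label.strip()] = (total, " ".join(fields))
    return counts


def changed(before, after):
    """Return {label: delta} for interrupts whose count differs."""
    return {
        label: after[label][0] - before[label][0]
        for label in after
        if label in before and after[label][0] != before[label][0]
    }


def snapshot(path="/proc/interrupts"):
    """Read and parse the current interrupt counts."""
    with open(path) as f:
        return parse_interrupts(f.read())
```

Taking snapshot() in D0 before and after a few pings from D1, then passing both results to changed(), shows whether the count for the virtual interface's interrupt is still advancing.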
(I suspect the stuff in this paragraph doesn't matter, but include it for
completeness.) Eventually, D0's ARP cache entry for D1 expires. D0 ARPs
for D1, and D1 replies. But D0 never receives these replies. And
eventually, D1 stops replying to the ARPs entirely. (D1's sending behavior
is observed via tcpdump running in the console connection to D1.)
Note that the networking failure only occurs if the UDP packets are
delivered to a user-level process in D0. In particular, UDP traffic to
D0's kernel NFS server does not induce the failure. Nor does traffic sent
to D0 for which there is no user process to accept the packets. And
neither does traffic which is forwarded on to other hosts via NAT. (I
haven't tested the regular forwarding case.)
Also, for what it's worth, Domain 0's network connectivity on its other
interfaces (which are connected to the world at large) is unaffected.
Looking through the mailing list archive, I saw a prior bug that seemed
similar, but involved IP fragmentation. That is not the case here, as the
UDP packets sent by D1 are small (<100 bytes).
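
The arithmetic behind ruling out fragmentation can be sketched as follows. It assumes an IPv4 header without options (20 bytes), the 8-byte UDP header, and a 1500-byte MTU; the MTU is an assumption, since the post does not state the MTU of the virtual interface.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def needs_fragmentation(payload_len, mtu=1500):
    """True if a UDP datagram with this payload exceeds the MTU."""
    return IPV4_HEADER + UDP_HEADER + payload_len > mtu


# A payload under 100 bytes yields an IP packet of at most 127 bytes,
# far below the assumed MTU, so it is sent as a single packet.
print(needs_fragmentation(99))
```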
Any suggestions for debugging this?
Thanks,
mukesh
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/xen-devel