Maybe add some tracing to the backend driver -- it's possible the
backend isn't sending responses for those packets back to domU, and so
things seize up for a while. If no responses are being generated it is
because the backend thinks the packets are still in flight, so there
would be some bug-hunting to find out why that is.
-- Keir
>
> Summary:
>
> After sending some UDP traffic between two xen domains (Domain 0 and
> Domain 1) the networking between the domains fails. This failure is 100%
> repeatable.
>
> In more detail:
>
> I have two xen domains. They run the kernels from the 2.0.3 release. (I've
> run into the same problem with 2.0.1 as well.) Domain 0 has 5 physical
> ethernet interfaces, and a virtual interface to Domain 1. Domain 1 has just
> the virtual interface to Domain 0.
>
> D0 is configured with IP address 192.168.0.1, and D1 with 192.168.1.1. The
> netmask is set to 255.255.0.0.
>
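A quick sanity check of the addressing above, as a minimal sketch: with the
255.255.0.0 netmask, both addresses should fall in the same subnet, so the
different third octets are not a routing problem. The variable names are
illustrative and not taken from the report.

```python
import ipaddress

# Addresses and netmask as quoted in the report above.
netmask = "255.255.0.0"
d0 = ipaddress.ip_interface(f"192.168.0.1/{netmask}")
d1 = ipaddress.ip_interface(f"192.168.1.1/{netmask}")

# Two hosts can talk directly (no router) when their networks match.
same_subnet = d0.network == d1.network
print(d0.network, d1.network, same_subnet)
# prints: 192.168.0.0/16 192.168.0.0/16 True
```
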
> When I bring up D1, I can ping D1 from D0, ssh into D1, etc.
>
> I then start a UDP server in D0, and a traffic generator in D1. After the
> traffic generator sends its 128th packet, networking between the domains
> fails. The 128th packet is received successfully by the UDP server, but no
> later traffic arrives in D0. This includes UDP, TCP, ICMP, and ARP.
>
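For anyone trying to reproduce this, the setup above can be approximated with
a numbered-datagram sender and receiver. This is only a sketch: the report
does not say which tools were used, so the function names, the 64-byte
payload, and the loopback demo are all assumptions. To mirror the report,
run_sink would run in D0 and run_generator in D1, pointed at D0's address.

```python
import socket
import struct
import threading
import time


def run_sink(sock, expected, timeout=2.0):
    """Receive numbered datagrams on a bound socket; return the numbers seen."""
    sock.settimeout(timeout)
    seen = []
    try:
        while len(seen) < expected:
            data, _addr = sock.recvfrom(2048)
            (seq,) = struct.unpack("!I", data[:4])
            seen.append(seq)
    except socket.timeout:
        pass  # Nothing arrived for `timeout` seconds; give up.
    return seen


def run_generator(dest, count, payload_len=64, interval=0.0):
    """Send `count` small datagrams, each starting with a 32-bit sequence number."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for seq in range(1, count + 1):
            payload = struct.pack("!I", seq).ljust(payload_len, b"\0")
            sock.sendto(payload, dest)
            if interval:
                time.sleep(interval)
    finally:
        sock.close()


def last_contiguous(seen):
    """Return the largest n such that packets 1..n were all received."""
    got = set(seen)
    n = 0
    while n + 1 in got:
        n += 1
    return n


def loopback_demo(count):
    """Run sink and generator in one process over 127.0.0.1 (hypothetical demo)."""
    sink = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sink.bind(("127.0.0.1", 0))  # Port 0: let the OS pick a free port.
    dest = sink.getsockname()
    result = []
    worker = threading.Thread(target=lambda: result.extend(run_sink(sink, count)))
    worker.start()
    run_generator(dest, count)
    worker.join()
    sink.close()
    return result


if __name__ == "__main__":
    seen = loopback_demo(200)
    print("last contiguous sequence number:", last_contiguous(seen))
```

If the failure described above reproduces across the virtual interface, the
last contiguous sequence number reported by the sink should stop short of the
number of packets sent.
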
> Looking at the interrupt counts in /proc/interrupts, I see that D0 no longer
> receives packets sent by D1. D1, however, does receive packets sent by D0.
> (To be clear, D0->D1 traffic is ICMP ping requests, unrelated to the UDP
> traffic. There is no UDP traffic sent from D0 to D1.)
>
> (I suspect the stuff in this paragraph doesn't matter, but include it for
> completeness.) Eventually, D0's ARP cache entry for D1 expires. D0 ARPs for
> D1, and D1 replies. But D0 never receives these replies. And eventually, D1
> stops replying to the ARPs entirely. (D1's sending behavior is observed via
> tcpdump running in the console connection to D1.)
>
> Note that the networking failure only occurs if the UDP packets are delivered
> to a user-level process in D0. In particular, UDP traffic to D0's kernel NFS
> server does not induce the failure. Nor does traffic sent to D0 for which
> there is no user process to accept the packets. And neither does traffic
> which is forwarded on to other hosts via NAT. (I haven't tested the regular
> forwarding case.)
>
> Also, for what it's worth, Domain 0's network connectivity on its other
> interfaces (which are connected to the world at large) is unaffected.
>
> Looking through the mailing list archive, I saw a prior bug that seemed
> similar, but involved IP fragmentation. That is not the case here, as the UDP
> packets sent by D1 are small (<100 bytes).
>
> Any suggestions for debugging this?
>
> Thanks,
> mukesh
>
>
> -------------------------------------------------------
> The SF.Net email is sponsored by: Beat the post-holiday blues
> Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/xen-devel
>