RE: [Xen-users] Networking bug in xen 4.0.0 ?

Hello,


This looks similar to the infamous incorrect checksum error:
http://wiki.xensource.com/xenwiki/XenFaq#head-4ce9767df34fe1c9cf4f85f7e07cb10110eae9b7.

Is there anything in the Dom0 logs?
How recent is the 2.6.31.13 pvops dom0 kernel?

You can check for the wrong checksum problem with tcpdump from dom0 (tcpdump -nvvi 
interface). In some cases it can be solved with ethtool (ethtool -K eth0 tx 
off); in other cases one has to use another Dom0 kernel (the latest pvops 
2.6.32.x, or a patched version of the one in use).
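
For reference, this is roughly what "incorrect checksum" means when tcpdump 
reports it. The sketch below is only an illustration of the RFC 1071 Internet 
checksum that IP, ICMP, TCP and UDP use; the function names are made up for 
this example and are not part of tcpdump or Xen. With TX checksum offload 
enabled, the sender may leave the checksum field for a later stage to fill in, 
so a capture taken before that stage can show a value that does not verify.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def checksum_ok(packet: bytes) -> bool:
    """A message whose checksum field is correct sums to zero."""
    return internet_checksum(packet) == 0
```

A receiver that validates checksums would drop a packet for which 
checksum_ok() is false, which is why disabling TX offload with ethtool can 
help in the cases described above.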


Regards

Matej

-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx 
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of S.M.R. Mahdavian
Sent: Tuesday, July 13, 2010 9:04 AM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Networking bug in xen 4.0.0 ?

Hello all.

I have observed a strange behaviour in xen 4.0.0 networking. I compiled xen on
an Ubuntu 9.10 and created two virtual machines on it ("Router" and "Server")
with the following network configuration:




           Xen host  192.168.1.5

     ################################################
     #                                              #
     #                                              #
     #                                              #
     #      Server                                  #
     #    192.168.2.20               Router         #           Client
     #                         192.168.2.1          #         192.168.1.10
     #    ##########                   192.168.1.1  #
     #    #        #                                #         ##########
     #    #        #     ###     ##############     #         #        #
     #    #        #     # #     #            #     # host    #        #
     #    #        #     # ####### eth1       #     # eth0    #        #
     #    #        #     # #     #       eth0 ################# eth0   #
     #    #   eth0 ####### #     #            #     #         #        #
     #    #        #     # #     #            #     #         #        #
     #    #        #     ###     #            #     #         #        #
     #    #        #             ##############     #         #        #
     #    ##########    Linux                       #         ##########
     #                  Bridge                      #
     #                  (sw0)                       #
     #                                              #
     #                                              #
     #                                              #
     #                                              #
     ################################################




Both "Router" and "Server" are Ubuntu 9.04 with 2.6.24-27-xen kernel. The two
domU's have an old-style xen kernel, whereas dom0 has a new 2.6.31.13 pvops
kernel. As can be seen from the figure above, "Router" has two interfaces.
One interface is connected to the "host's" eth0 and the other one is
internally connected to "Server" via a created linux bridge named sw0. IP
forwarding is enabled on "Router" so that packets coming from the external
"Client" can be forwarded to "Server". "Client" is running Windows XP.

The strange behavior is that packets are forwarded correctly from "Client" to
"Server" when packet sizes are small, but if they are larger than a certain
size, then packets coming out of Router's eth1 do not reach the Server or the
linux bridge. In fact, tcpdump suggests that these packets do not make it to
vif1.1 (assuming that Router id number is 1). This happens only if packets
are *forwarded* by "Router". If packets are *generated* within Router, then
no problem is observed. Therefore pinging "Server" from "Client" using large
size packets fails, whereas pinging "Server" from "Router" is OK regardless
of the size of the packet.
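
The exact size limit (157 bytes of ping data, per the P.S. below) can be 
found by bisection rather than by trial and error. The following is a minimal 
sketch, assuming the failure is monotonic (every size up to the limit passes, 
every larger size fails). The name largest_passing_size and the probe 
callback are hypothetical; in practice probe would wrap a single ping of the 
given size and return whether a reply arrived.

```python
from typing import Callable


def largest_passing_size(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe is monotonic: sizes up to some limit pass and all larger
    sizes fail. Raises ValueError if even the smallest size fails.
    """
    if not probe(low):
        raise ValueError("smallest size already fails")
    while low < high:
        # Round up so the loop always makes progress when probe(mid) passes.
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

With a probe that behaves like the setup described here, 
largest_passing_size(probe, 1, 1472) would need about eleven pings to settle 
on the limit.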

I suspect that this is a bug in either xen 4.0.0 or the pvops kernel that
comes with it. Has anyone experienced something similar? Any recommendations?

Regards,
-- Mahdavian

P.S. The following shows some of the results from the experiment:

root@xen-host:~# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  3023     2     r-----     36.9
router                                       1   256     1     -b----      5.8
server                                       2   256     1     -b----      5.7

root@xen-host:~# brctl show
bridge name     bridge id               STP enabled     interfaces
eth0            8000.002354cd2c7f       no              peth0
                                                        vif1.0
sw0             8000.feffffffffff       no              vif1.1
                                                        vif2.0

# Ping from "Client" using packet size of 157 bytes (that's the limit!):
C:\Documents and Settings\Admin>ping -l 157 192.168.2.20

Pinging 192.168.2.20 with 157 bytes of data:

Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63
Reply from 192.168.2.20: bytes=157 time<1ms TTL=63


# Ping from "Client" using packet size of 158 bytes:
C:\Documents and Settings\Admin>ping -l 158 192.168.2.20

Pinging 192.168.2.20 with 158 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.


# tcpdump suggests that packets do leave the router interface:
root@router:~# tcpdump -n -v -i eth1
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
11:18:02.787099 IP (tos 0x0, ttl 127, id 560, offset 0, flags [none], proto 
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512, 
seq 6400, length 166
11:18:07.785175 arp who-has 192.168.2.20 tell 192.168.2.1
11:18:07.785280 arp reply 192.168.2.20 is-at 00:16:3e:1e:67:9f
11:18:08.043239 IP (tos 0x0, ttl 127, id 564, offset 0, flags [none], proto 
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512, 
seq 6656, length 166
11:18:13.516628 IP (tos 0x0, ttl 127, id 566, offset 0, flags [none], proto 
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512, 
seq 6912, length 166
11:18:19.003203 IP (tos 0x0, ttl 127, id 568, offset 0, flags [none], proto 
ICMP (1), length 186) 192.168.1.10 > 192.168.2.20: ICMP echo request, id 512, 
seq 7168, length 166

# tcpdump suggests that the link between the frontend and the backend interface
# is broken:
root@xen-host:~# tcpdump -n -i vif1.1
tcpdump: WARNING: vif1.1: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vif1.1, link-type EN10MB (Ethernet), capture size 96 bytes
11:18:07.552453 ARP, Request who-has 192.168.2.20 tell 192.168.2.1, length 28
11:18:07.552524 ARP, Reply 192.168.2.20 is-at 00:16:3e:1e:67:9f, length 28


# No problem is observed when packets are *generated* rather than *forwarded*
# by "Router".
root@router:~# ping -c 4 -s 1400 192.168.2.20
PING 192.168.2.20 (192.168.2.20) 1400(1428) bytes of data.
1408 bytes from 192.168.2.20: icmp_seq=1 ttl=64 time=0.885 ms
1408 bytes from 192.168.2.20: icmp_seq=2 ttl=64 time=0.068 ms
1408 bytes from 192.168.2.20: icmp_seq=3 ttl=64 time=0.066 ms
1408 bytes from 192.168.2.20: icmp_seq=4 ttl=64 time=0.068 ms
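
The sizes in the outputs above can be cross-checked with a little arithmetic. 
This is only a sketch, assuming an IPv4 header without options, an untagged 
Ethernet II frame and no FCS counted; the helper name echo_sizes is made up 
for this example.

```python
ETH_HEADER = 14   # Ethernet II header, no VLAN tag, FCS not counted
IPV4_HEADER = 20  # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo request/reply header


def echo_sizes(payload: int) -> dict:
    """Sizes (in bytes) of an ICMP echo carrying `payload` bytes of data."""
    icmp = payload + ICMP_HEADER
    ip = icmp + IPV4_HEADER
    return {"icmp": icmp, "ip": ip, "frame": ip + ETH_HEADER}
```

For the failing ping -l 158 this gives an ICMP length of 166 and an IP length 
of 186, matching the tcpdump lines on the router's eth1, and for ping -s 1400 
it gives the 1428 bytes reported by ping on "Router". Under the assumptions 
above, the last working size (157) corresponds to a 199-byte frame and the 
first failing size (158) to a 200-byte frame. Whether that boundary is 
significant is not established here.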


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users