WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] Jeremy's GIT-tree and network problems

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Jeremy's GIT-tree and network problems
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Mon, 29 Mar 2010 11:42:40 -0700
Cc: Adnan Misherfi <adnan.misherfi@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Delivery-date: Mon, 29 Mar 2010 11:46:14 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100329T1647.GA.2c139.stse@xxxxxxxxxxxxxxxxxxx>
References: <20100329T1647.GA.2c139.stse@xxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3
On 03/29/2010 08:40 AM, Stephan Seitz wrote:
Hi!

I have bridging problems with the Dom0 kernels from Jeremy’s tree. I wrote to xen-user about this (MSG-ID <20091220T1944.GA.ab998.stse@xxxxxxxxxxxxxxxxxxx>, 20 Dec 2009) but got no solution, so I am trying xen-devel this time.


My hardware setup:
A PC with two NICs (Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller) serves both as working environment (Dom0) and as firewall/proxy/DSL router (DomU).
The two NICs are bridged between Dom0 and DomU.
Bridge eth0, containing peth0 and vif1.0, has an IP address in both Dom0 and DomU. The DomU IP address is used as the gateway address in Dom0.
Bridge xenbr1, containing eth1 and vif1.1, has no IP address in either Dom0 or DomU and is only used to connect the DSL modem to DomU. The IP address is assigned to the PPP interface in DomU.
The Linux distribution is Debian/Testing (64-bit), with Xen version 3.4.2 in December and 3.4.3rc3 now. The kernels are always self-compiled.


My working setup:
Dom0 with kernel 2.6.29.5 with xen-patches-2.6.29-6.tar.bz2 and DomU with standard kernel 2.6.32.x (and the 2.6.29.5 xen kernel before). The hypervisor was 3.4.2 and is now 3.4.3rc3. Here everything works as expected: DomU acts as the firewall and correctly masquerades all internet traffic.


My non-working setup:
Dom0 with the PV-Ops kernel from Jeremy’s tree (I tried the following kernels: 2.6.31.5-00500-g34013be and 2.6.31.6-00696-g41a0695 (tested in December), and now from xen/stable the versions 2.6.32.10-02792-gf112549 and 2.6.32.10-02798-gd945b01). DomU kernel and hypervisor are the same as in the working setup.

What is working?
IP connectivity works between Dom0 and DomU and between DomU and the internet. Traffic from Dom0 to the internet works if DomU is used as a proxy (e.g. HTTP traffic through a squid in DomU).

What is not working?
Direct IP connection between Dom0 and the internet (tested with ping and „telnet <host> <port>”). If I trace in DomU, I see the packets leaving the ppp0 interface (correctly masqueraded), but I see no reply packets. If I trace in Dom0 on the bridge interfaces between the DSL modem and DomU (xenbr1, eth1, vif1.1, see hardware setup above), I no longer see these packets; I only see packets from traffic generated directly by DomU. The DomU configuration is unchanged between the working and non-working setups; only the Dom0 kernel differs.


So if anyone has an idea what this could be and how to fix it, I would be glad to hear it.


Further information:
The NIC and bridge drivers are the same in all kernels from 2.6.29.5 through 2.6.32.10:

osgiliath:~# ethtool -i eth1
driver: r8169
version: 2.3LK-NAPI
firmware-version:
bus-info: 0000:03:00.0
osgiliath:~# ethtool -i xenbr1
driver: bridge
version: 2.3
firmware-version: N/A
bus-info: N/A

The only difference in the output of „ethtool eth1” is additional information about „link partner advertised modes” in the 2.6.3x kernels.

„ethtool -k eth1” shows the error message „Cannot get device flags: Operation not supported” with the working Dom0 kernel. All other output is identical across all kernel versions:

osgiliath:~# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off

Switching rx-checksumming off does not help.

Have you tried carpet-bombing the ethtools: turn off everything on all the dom0 interfaces (both the bridge(s) and all the component interfaces) and all the domU interfaces? It does look like some kind of checksum problem (or perhaps other offload?).
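
A minimal sketch of such a sweep, assuming the Dom0 interface names from the description above (eth0, peth0, vif1.0, xenbr1, eth1, vif1.1) and the offload features listed in the `ethtool -k` output. The helper name `print_offload_off` is made up for illustration. It only prints the commands, one per interface and feature, so that a feature a given driver does not support fails on its own without blocking the others.

```shell
#!/bin/sh
# Dom0 interfaces named in the setup description (assumed to be the complete set).
IFACES="eth0 peth0 vif1.0 xenbr1 eth1 vif1.1"
# Short names accepted by `ethtool -K` for the features shown by `ethtool -k`.
FEATURES="rx tx sg tso ufo gso gro lro"

# Hypothetical helper: dry run that prints one ethtool command per
# interface and feature instead of executing it.
print_offload_off() {
    for ifc in $IFACES; do
        for feat in $FEATURES; do
            echo "ethtool -K $ifc $feat off"
        done
    done
}

print_offload_off
```

To apply the settings in Dom0, the output could be piped to `sh` as root. The same sweep would then be repeated inside DomU with that guest's own interface names, which are not listed in this thread.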

Fortunately it looks like this is going to get some systematic attention. I'd really like any reasonable (i.e., not inherently broken for other reasons) network setup to just work.

Thanks,
J
