This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Linux TCP Checksum offload limitations

To: "Alan Cox" <alan@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Linux TCP Checksum offload limitations
From: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
Date: Tue, 8 Apr 2008 11:01:40 +1000
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20080407000238.6e879cf9@xxxxxxxxxxxxxxxxx>
References: <AEC6C66638C05B468B556EA548C1A77D013DC2D9@trantor> <20080407000238.6e879cf9@xxxxxxxxxxxxxxxxx>
> On Fri, 4 Apr 2008 22:04:53 +1100
> "James Harper" <james.harper@xxxxxxxxxxxxxxxx> wrote:
> > Some versions of Windows appear to give the network adapter driver a
> > packet broken up into fairly small pieces, e.g.
> > Page 0: 14 bytes of Ethernet Header
> > Page 1: 20 bytes of IP Header
> > Page 2: 20 bytes of TCP Header
> > Page 3: 1460 bytes of TCP Data
> NDIS fragments are nothing to do with the wire side interface

An NDIS fragment is just a page of data...

> > Our best guess is that the Linux checksum offload code can't cope with
> > the way Windows is fragmenting the packets, but maybe Xen is somehow
> > involved in this...
> Unconnected with Linux, Xen bug. Xen is responsible for handling NDIS
> lists on the windows side and turning them into a single virtual
> packet

A single virtual network packet, as passed from Windows to Xen, consists
of one or more pages of data. When the first page contains at least the
complete Ethernet+IP+TCP header, everything works. When the first page
contains only the Ethernet header, the second page the IP header, the
third the TCP header, and subsequent pages the data, Linux refuses to
accept that 'csum_blank' is valid and drops the packet _after_ it leaves
the vif interface.

Just to elaborate on that, Xen successfully builds a packet out of the
pages, and I can definitely see the packet via a tcpdump on (say)
vif537.0, but it is dropped by Linux before it gets passed on to the
bridge. So Linux initially accepts the packet as valid.

Now, from looking at the code I can see that an skb can definitely
handle a packet with the data split across multiple pages, but my theory
is that the Linux checksum offload stuff can't handle having the packet
_header_ (Ethernet+IP+TCP) split across multiple pages.
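
To make the two layouts concrete, here is a minimal userspace sketch in C.
It is not Xen, Linux or NDIS code, and the names (struct frag,
pkt_byte_at, pkt_header_len, headers_in_first_frag) are made up for
illustration. It assumes an untagged Ethernet frame carrying IPv4 and
TCP, does not check the EtherType, and simply reports whether the whole
Ethernet+IP+TCP header sits in the first fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14u /* untagged Ethernet header (assumption: no VLAN tag) */

/* Hypothetical model of one page/fragment of a packet. */
struct frag {
    const uint8_t *data;
    size_t len;
};

/* Return the byte at absolute packet offset 'off', walking the fragment
 * list, or -1 if 'off' is past the end of the packet. */
static int pkt_byte_at(const struct frag *frags, size_t nfrags, size_t off)
{
    assert(frags != NULL || nfrags == 0);
    for (size_t i = 0; i < nfrags; i++) {
        if (off < frags[i].len)
            return frags[i].data[off];
        off -= frags[i].len;
    }
    return -1;
}

/* Length of the Ethernet+IPv4+TCP header, or 0 if it cannot be parsed. */
static size_t pkt_header_len(const struct frag *frags, size_t nfrags)
{
    /* First IP byte: version in the high nibble, IHL (32-bit words) in the low. */
    int b = pkt_byte_at(frags, nfrags, ETH_HDR_LEN);
    if (b < 0 || (b >> 4) != 4)
        return 0;
    size_t ip_len = (size_t)(b & 0x0f) * 4;
    if (ip_len < 20)
        return 0;

    /* IP protocol field is at offset 9 of the IP header; 6 means TCP. */
    if (pkt_byte_at(frags, nfrags, ETH_HDR_LEN + 9) != 6)
        return 0;

    /* TCP data offset (32-bit words) is the high nibble of TCP byte 12. */
    b = pkt_byte_at(frags, nfrags, ETH_HDR_LEN + ip_len + 12);
    if (b < 0)
        return 0;
    size_t tcp_len = (size_t)(b >> 4) * 4;
    if (tcp_len < 20)
        return 0;

    return ETH_HDR_LEN + ip_len + tcp_len;
}

/* 1 if the complete header is contiguous in the first fragment, else 0. */
static int headers_in_first_frag(const struct frag *frags, size_t nfrags)
{
    size_t need = pkt_header_len(frags, nfrags);
    return need != 0 && nfrags > 0 && frags[0].len >= need;
}
```

With the 14/20/20/1460 split quoted above, pkt_header_len() gives 54 and
headers_in_first_frag() gives 0; with the same 54 header bytes in one
leading fragment it gives 1.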

This gives me four possible truths...

1. Linux definitely requires that the first page in an skb consist of a
complete packet header, and this is a documented requirement but I
couldn't find it (i.e. it's a bug for my Windows PV drivers to give a
packet like this to Xen)

2. As above but it is not documented anywhere (eg it's a bug in the

3. Linux should handle the complete packet header being split across
multiple pages, but for some reason it doesn't, and it's never come up
before (i.e. it's a bug in the Linux csum offload code)

4. Something else I haven't thought of.

I guess I'm just looking for someone who knows about these things to say
that "yes, Linux should handle such a header split" or "no, Linux
doesn't handle this, fix your NDIS driver."

I have actually done the latter for now - the Windows PV drivers now
merge enough data together to guarantee that the entire Ethernet+IP+TCP
header is on a single page, but the extra copying adds overhead.
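
The merge can be sketched in userspace C as below. This is not the
actual PV driver code; struct pkt_frag and merge_leading_bytes are
hypothetical names, and the caller is assumed to already know how many
header bytes ('need') must be contiguous. The first 'need' bytes are
copied into one buffer and the remaining data is passed through by
reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one page/fragment of a packet. */
struct pkt_frag {
    const uint8_t *data;
    size_t len;
};

/* Copy the first 'need' bytes of the packet into 'buf' and build a new
 * fragment list in 'out' whose first entry is that buffer. The rest of
 * the packet is referenced, not copied. Returns the number of output
 * fragments, or 0 if the buffers are too small or the packet is shorter
 * than 'need'. */
static size_t merge_leading_bytes(const struct pkt_frag *in, size_t n_in,
                                  size_t need,
                                  uint8_t *buf, size_t buf_cap,
                                  struct pkt_frag *out, size_t out_cap)
{
    assert(in != NULL && buf != NULL && out != NULL);
    if (need == 0 || need > buf_cap || out_cap == 0)
        return 0;

    size_t copied = 0, i = 0, n_out = 1; /* out[0] is reserved for buf */

    /* Gather whole or partial fragments until 'need' bytes are copied. */
    while (i < n_in && copied < need) {
        size_t take = in[i].len;
        if (take > need - copied)
            take = need - copied;
        if (take > 0)
            memcpy(buf + copied, in[i].data, take);
        copied += take;
        if (take < in[i].len) {
            /* Fragment only partly consumed: keep its tail by reference. */
            if (n_out == out_cap)
                return 0;
            out[n_out].data = in[i].data + take;
            out[n_out].len = in[i].len - take;
            n_out++;
        }
        i++;
    }
    if (copied < need)
        return 0;

    out[0].data = buf;
    out[0].len = need;

    /* Pass the untouched fragments through unchanged. */
    for (; i < n_in; i++) {
        if (n_out == out_cap)
            return 0;
        out[n_out++] = in[i];
    }
    return n_out;
}
```

The overhead is the memcpy of the leading bytes for every packet, plus
the bookkeeping for the rebuilt fragment list.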


