[Xen-devel] Xen checksumming bug with IPsec ESP packets
We have a domU sending packets through DOM0 to an external
host machine. We have set up an IPsec tunnel between DOM0 and
the host machine, through which the packets between domU and
the host machine are routed. We switched off MAC bridging for
this purpose and configured all interfaces statically.
+ xen-unstable.hg from 7.28.2005.
+ Static IP configuration of DOM0 and one domU (NO MAC bridging).
+ IPsec enabled in DOM0 by editing linux-2.6.12-xen0/.config as
appropriate and recompiling.
+ IPsec using ESP in tunnel mode, configured to tunnel all traffic
between domU and a particular external host (running linux-2.6.13-rc3
with IPsec enabled).
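For context, a tunnel-mode ESP policy of this kind is typically expressed in setkey(8) syntax roughly as follows (DOMU_IP, DOM0_IP and REMOTE_IP are placeholders, not our actual addresses):

```
# setkey(8) SPD sketch -- placeholder addresses, not our real config
flush;
spdflush;

# Tunnel all traffic between domU and the external host through ESP,
# with DOM0 and the remote host as the tunnel endpoints.
spdadd DOMU_IP/32 REMOTE_IP/32 any -P out ipsec
    esp/tunnel/DOM0_IP-REMOTE_IP/require;
spdadd REMOTE_IP/32 DOMU_IP/32 any -P in ipsec
    esp/tunnel/REMOTE_IP-DOM0_IP/require;
```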
Code for a checksum optimization imposed by Xen (basically, don't
checksum packets between domU and DOM0 since there is no real wire on
which they can become garbled) is not placed correctly. As-is, ESP
packets encapsulating IP packets from domU will be silently dropped by
DOM0. Using tcpdump on DOM0, the packets from domU can be seen arriving
in DOM0, but no ESP packet leaves DOM0 for the external host. It turns
out that an ESP packet is being created, but it gets dropped in
net/core/dev.c:dev_queue_xmit() in the switch(skb->nh.iph->protocol)
statement. It gets dropped here because the protocol is IPPROTO_ESP,
and that switch statement can only handle IPPROTO_[TCP|UDP]. The errno
returned is -ENOMEM. Debugging would have been significantly easier
with a more specific errno.
Xen gives its virtual network interfaces in domU domains the
NETIF_F_IP_CSUM feature flag, which is defined in
include/linux/netdevice.h to mean the interface is capable only of
checksumming TCP/UDP over IPv4. The expectation is that one can then
get away with not checksumming TCP/UDP packets at all as they pass
between domU and DOM0. This looks to me like a common-case optimization
and saves CPU cycles. Some code is then inserted in
net/core/dev.c:dev_queue_xmit() on DOM0 which puts in the checksum for
packets that are actually going on to the rest of the world. This
manifested itself as a problem for us in two ways:
1. The code in net/core/dev.c:dev_queue_xmit() (activated when
skb->proto_csum_blank == 1) can only handle TCP and UDP packets destined
for the rest of the world. ESP packets activate the `default:` case in
the switch() statement and thus fail with the default errno in that case.
2. I changed net/core/dev.c:dev_queue_xmit() to allow ESP through
unmolested just because I was curious. The ESP packets then went all
the way from DOM0 to the external host, where they were decrypted. Once
the tunneled packet was exposed, it was dropped on the remote system
because it did not have a valid checksum. In other words, the logic in
DOM0 (the switch() statement in net/core/dev.c:dev_queue_xmit()) that is
supposed to insert the needed checksum into the original packet from
domU runs too late: by then the packet has already been encrypted and
encapsulated in ESP, so the inner checksum can no longer be filled in.
So the original packet from domU never got the checksum it needed, and
the ESP packet created in DOM0 wasn't allowed to leave because the
too-late code doesn't know how to handle ESP packets.
We fixed this by removing the addition of flag NETIF_F_IP_CSUM in
drivers/xen/netfront/netfront.c:create_netdev(). I believe this tells
the kernel to just always do the checksum in software. Thus, the broken
optimization for TCP/UDP packets gets bypassed.
That's why I posted this message... :-)