xen-devel

[Xen-devel] New CPUs, now get: NETDEV WATCHDOG: eth0: transmit timed out

To: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] New CPUs, now get: NETDEV WATCHDOG: eth0: transmit timed out
From: "Christopher S. Aker" <caker@xxxxxxxxxxxx>
Date: Mon, 11 Oct 2010 16:36:32 -0400
Delivery-date: Mon, 11 Oct 2010 13:37:27 -0700

At Linode we recently changed our server build spec from the Intel L5520 to the L5630. Nothing else has changed as far as we can tell, but with the new build we're seeing a new problem: permanent loss of networking after some time (measured in days) on these new machines, with:

NETDEV WATCHDOG: eth0: transmit timed out <-- kiss of network death

The link remains active on the switch, yet the NIC stops receiving any interrupts. No amount of prodding wakes it back up...
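
One way to confirm that interrupts have really stopped, rather than just slowed, is to sample /proc/interrupts twice and compare the counters for the NIC. The following is a minimal sketch and not part of the original report: the function names and the sampling interval are illustrative, and it assumes the usual /proc/interrupts layout (an IRQ label, a colon, one counter per CPU, then a description that contains the interface name).

```python
import time


def irq_totals(interrupts_text, device="eth0"):
    """Sum per-CPU counters for every IRQ line that mentions `device`."""
    totals = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or device not in rest:
            continue  # CPU header line, or an IRQ for another device
        count = 0
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller/description columns
            count += int(field)
        totals[label.strip()] = count
    return totals


def stalled_irqs(before, after):
    """Return IRQ labels whose total did not change between two samples."""
    return sorted(irq for irq, n in after.items() if before.get(irq) == n)


def sample_stalled(device="eth0", interval=10.0, path="/proc/interrupts"):
    """Read `path` twice, `interval` seconds apart, and report stalled IRQs."""
    with open(path) as f:
        before = irq_totals(f.read(), device)
    time.sleep(interval)
    with open(path) as f:
        after = irq_totals(f.read(), device)
    return stalled_irqs(before, after)
```

Calling sample_stalled("eth0") in dom0 while traffic is being sent at the interface would list the NIC's IRQ labels if nothing arrived during the interval. An idle but healthy NIC can also show no change, so the result only means something under load.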

Some data points:

2.6.18.8 @ 931 contains an older igb driver
2.6.18.8 @ 1038 contains newest igb driver (as of last week)

2.6.18.8 @ 931 works perfectly on all our equipment prior to L5630
2.6.18.8 @ 1038 times out on everything

Motherboard BIOS version is the same.
Upgrading BIOS on affected boxes has no effect.

A year or two back (after 931), I had to build a newer 2.6.18.8 for whatever reason and decided to include the newest igb drivers at that time. I eventually had to roll this back because the NICs started timing out. However, even our "good" build is timing out on the new spec machine.

These machines don't appear to exhibit the problem when running on bare metal.

dmesg:

http://theshore.net/~caker/xen/BUGS/nic-timeout/
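
For anyone going through saved dmesg output, a small script can pull out the watchdog lines. This is a rough sketch under stated assumptions: it matches only the message form quoted above ("NETDEV WATCHDOG: <iface>: transmit timed out"), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the message form quoted in this report; other kernels may word it differently.
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def watchdog_timeouts(log_text):
    """Count transmit-timeout watchdog messages per interface name."""
    return Counter(m.group(1) for m in WATCHDOG_RE.finditer(log_text))
```

Fed the text of a captured dmesg file, it returns a per-interface count, which makes it easy to see whether only eth0 is affected.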

What we're trying:

1) On an affected machine, swapping the L5630 back out for the L5520.
2) Moving from Xen 3.4.1 to Xen 3.4.4-rc1-pre.
3) Running Xen 3.4.4-rc1-pre along with a 2.6.32.23-g41a85de5 dom0.

This certainly looks like some strange incompatibility between Xen, dom0, and/or the NIC driver. The fact that no more interrupts are being delivered is suspicious.

I'd be grateful for any insight!

Thanks,
-Chris

