WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] Xen 3.4 & tg3 gigabit ethernet stalls

To: Pasi Kärkkäinen <pasik@xxxxxx>
Subject: Re: [Xen-devel] Xen 3.4 & tg3 gigabit ethernet stalls
From: René Bühlmann <buehlmann@xxxxxx>
Date: Thu, 11 Mar 2010 19:42:57 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 11 Mar 2010 10:44:08 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100310185115.GU1878@xxxxxxxxxxx>
References: <4B97E2C6.5060305@xxxxxx> <20100310185115.GU1878@xxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.23 (Windows/20090812)
Pasi Kärkkäinen wrote:
> On Wed, Mar 10, 2010 at 07:19:50PM +0100, René Bühlmann wrote:
>> Hi all,
>>
>> Some weeks ago, I upgraded my Xen 3.3 installation to Xen 3.4. At the same time, I upgraded the Dom0 kernel to 2.6.31.2, which also made the interface switch from 100 Mbit to 1 Gbit. Since then, the tg3 interface stalls about once a week and I need to reboot Xen. The problem looks similar to http://lists.xensource.com/archives/html/xen-devel/2009-07/msg00139.html but the solution there (adding cpuidle=0 cpufreq=none to the kernel parameters) did not help for me.
>>
>> Does anyone have a solution or workaround for this, or know how I could debug the problem?
>
> Have you monitored the recent tg3 driver changes in 2.6.32.x and/or 2.6.33? I remember seeing some patches and discussion about tg3. Maybe it's a tg3 driver bug?

I went through the tg3 parts of the kernel changelog. I'm not sure whether these patches are related to my problem. I will try 2.6.33 as soon as pv_ops gets ported to it.

> Are you using a pv_ops dom0 kernel, or a kernel with forward-ported patches?

I'm using the pv_ops dom0 kernel from Jeremy's stable branch.

> -- Pasi

Thanks
René

Here is the dmesg output:

WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x23e/0x250()
Hardware name: ProLiant ML110 G4
NETDEV WATCHDOG: peth1 (tg3): transmit queue 0 timed out
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.31.6 #2
Call Trace:
<IRQ>  [<ffffffff8143bbbe>] ? dev_watchdog+0x23e/0x250
[<ffffffff8143bbbe>] ? dev_watchdog+0x23e/0x250
[<ffffffff81045054>] ? warn_slowpath_common+0x74/0xd0
[<ffffffff814c1fdf>] ? br_handle_frame_finish+0x14f/0x190
[<ffffffff81045111>] ? warn_slowpath_fmt+0x51/0x60
[<ffffffff8100f10f>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff81503f4c>] ? _spin_unlock_irqrestore+0xc/0x10
[<ffffffff8104150f>] ? try_to_wake_up+0xbf/0x1e0
[<ffffffff8100f009>] ? xen_clocksource_get_cycles+0x9/0x20
[<ffffffff81218111>] ? strlcpy+0x41/0x50
[<ffffffff81426e6b>] ? netdev_drivername+0x3b/0x40
[<ffffffff8143bbbe>] ? dev_watchdog+0x23e/0x250
[<ffffffff8100e9a9>] ? xen_force_evtchn_callback+0x9/0x10
[<ffffffff8100f122>] ? check_events+0x12/0x20
[<ffffffff8143b980>] ? dev_watchdog+0x0/0x250
[<ffffffff8104f1ac>] ? run_timer_softirq+0x13c/0x210
[<ffffffff8104a935>] ? __do_softirq+0xa5/0x140
[<ffffffff810141ac>] ? call_softirq+0x1c/0x30
[<ffffffff8101613d>] ? do_softirq+0x4d/0x90
[<ffffffff812740be>] ? xen_evtchn_do_upcall+0x14e/0x1d0
[<ffffffff810141fe>] ? xen_do_hypervisor_callback+0x1e/0x30
<EOI>  [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1010
[<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1010
[<ffffffff8100ea3c>] ? xen_safe_halt+0xc/0x20
[<ffffffff8100b825>] ? xen_idle+0x25/0x50
[<ffffffff810122a6>] ? cpu_idle+0x66/0xa0
[<ffffffff818208df>] ? start_kernel+0x2e6/0x328
[<ffffffff818229d5>] ? xen_start_kernel+0x65c/0x6dc
---[ end trace 8108a21093ed2967 ]---
tg3: peth1: transmit timed out, resetting
tg3: DEBUG: MAC_TX_STATUS[ffffffff] MAC_RX_STATUS[ffffffff]
tg3: DEBUG: RDMAC_STATUS[ffffffff] WDMAC_STATUS[ffffffff]
tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2
tg3: tg3_stop_block timed out, ofs=2000 enable_bit=2
tg3: tg3_stop_block timed out, ofs=2400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=2800 enable_bit=2
tg3: tg3_stop_block timed out, ofs=3000 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1400 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1800 enable_bit=2
tg3: tg3_stop_block timed out, ofs=c00 enable_bit=2
tg3: tg3_stop_block timed out, ofs=4800 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1000 enable_bit=2
tg3: tg3_stop_block timed out, ofs=1c00 enable_bit=2
tg3: tg3_abort_hw timed out for peth1, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
tg3: tg3_stop_block timed out, ofs=3c00 enable_bit=2
tg3: tg3_stop_block timed out, ofs=4c00 enable_bit=2
tg3: peth1: No firmware running.
tg3: tg3_abort_hw timed out for peth1, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
tg3: peth1: Link is down.
xenbr1: port 1(peth1) entering disabled state
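
For anyone comparing logs from several stalls, a small script can pull the key symptoms out of dmesg text like the one above: the watchdog timeout, the status registers that read back as ffffffff, and the offsets reported by tg3_stop_block. This is only a rough sketch. The function name summarize_tg3_log and the regular expressions are my own and are matched against the messages shown in this mail, not against any documented log format. Note also that treating all-ones register values as a sign that the NIC no longer answers register reads is a common interpretation, not something confirmed in this thread.

```python
import re

# Patterns written against the log lines quoted in this mail (an assumption,
# not a stable kernel interface; other kernel versions may word them differently).
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (\S+) \((\w+)\): transmit queue (\d+) timed out"
)
REGISTER_RE = re.compile(r"\b([A-Z_0-9]+)\[([0-9a-f]{8})\]")
STOP_BLOCK_RE = re.compile(r"tg3_stop_block timed out, ofs=([0-9a-f]+)")


def summarize_tg3_log(text):
    """Collect watchdog timeouts, all-ones registers and stop_block offsets."""
    summary = {
        "watchdog": [],            # (interface, driver, queue) tuples
        "all_ones_registers": [],  # register names that read back ffffffff
        "stop_block_offsets": [],  # block offsets as integers
    }
    for line in text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match:
            summary["watchdog"].append(
                (match.group(1), match.group(2), int(match.group(3)))
            )
        # Only look at the driver's DEBUG dumps, not at call-trace addresses.
        if "tg3: DEBUG:" in line:
            for name, value in REGISTER_RE.findall(line):
                if value == "ffffffff":
                    summary["all_ones_registers"].append(name)
        match = STOP_BLOCK_RE.search(line)
        if match:
            summary["stop_block_offsets"].append(int(match.group(1), 16))
    return summary


# A few lines copied from the dmesg output in this mail.
SAMPLE = """\
NETDEV WATCHDOG: peth1 (tg3): transmit queue 0 timed out
tg3: DEBUG: MAC_TX_STATUS[ffffffff] MAC_RX_STATUS[ffffffff]
tg3: DEBUG: RDMAC_STATUS[ffffffff] WDMAC_STATUS[ffffffff]
tg3: tg3_stop_block timed out, ofs=2c00 enable_bit=2
"""

if __name__ == "__main__":
    print(summarize_tg3_log(SAMPLE))
```

To run it on a live system, replace SAMPLE with the text read from standard input (sys.stdin.read()) and pipe the dmesg output into the script.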




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel




