Hi,
After the machine has been up for a while (most recently 4 days), I see the following in my logs:
Mar 13 15:36:45 xenserver kernel: NETDEV WATCHDOG: peth0: transmit timed out
This message will repeat every half hour or so until the dom0 is
rebooted. During that time the ethernet device and the domUs bridged
to it lose their network connectivity. I am not aware of any
particularly high network traffic or machine activity that coincides
with this.
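To check whether the timeouts really recur about every half hour, the syslog could be scanned for the watchdog line and the gaps between occurrences measured. The following is only a minimal sketch: the function names are made up for illustration, the year has to be supplied by the caller because syslog timestamps omit it, and the second and third sample lines are hypothetical (only the first is taken from my logs).

```python
import re
from datetime import datetime

# Matches lines such as:
# Mar 13 15:36:45 xenserver kernel: NETDEV WATCHDOG: peth0: transmit timed out
WATCHDOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"NETDEV WATCHDOG: (?P<dev>\S+): transmit timed out"
)


def watchdog_events(lines, year=2008):
    """Return a list of (timestamp, device) for every watchdog timeout line."""
    events = []
    for line in lines:
        m = WATCHDOG_RE.match(line)
        if m:
            # syslog timestamps carry no year, so the caller provides one (assumption)
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            events.append((ts, m.group("dev")))
    return events


def gaps_in_minutes(events):
    """Return the time between consecutive timeouts, in minutes."""
    times = [ts for ts, _ in events]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


# First line is from the log above; the other two are hypothetical sample data.
sample = [
    "Mar 13 15:36:45 xenserver kernel: NETDEV WATCHDOG: peth0: transmit timed out",
    "Mar 13 15:40:00 xenserver kernel: some unrelated message",
    "Mar 13 16:06:45 xenserver kernel: NETDEV WATCHDOG: peth0: transmit timed out",
]
print(gaps_in_minutes(watchdog_events(sample)))
```

In practice the lines would come from the syslog file rather than a hard-coded list; its location depends on the syslog configuration.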
Googling turned up the messages quoted below, none of which received a
response. Any suggestions would be greatly appreciated.
I am running xen 3.2 with the stock xen 2.6.18 kernel, both built from
source, on Debian Etch. The NIC in question is a D-Link System Inc
DGE-530T Gigabit Ethernet Adapter (rev 11) using the skge driver and
attached to a gigabit switch. Two identical NICs are passed to a domU
and are working without any problems.
xm dmesg:
__ __ _____ ____ ___
\ \/ /___ _ __ |___ / |___ \ / _ \
\ // _ \ '_ \ |_ \ __) || | | |
/ \ __/ | | | ___) | / __/ | |_| |
/_/\_\___|_| |_| |____(_)_____(_)___/
(XEN) Xen version 3.2.0 (root@xxxxxxxxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) Tue Feb 5 00:56:40 PST 2008
(XEN) Latest ChangeSet: unavailable
(XEN) Command line: dom0_mem=512M noapic acpi=off
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN) Found 3 MBR signatures
(XEN) Found 3 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009fc00 (usable)
(XEN) 000000000009fc00 - 00000000000a0000 (reserved)
(XEN) 00000000000f0000 - 0000000000100000 (reserved)
(XEN) 0000000000100000 - 00000000c0000000 (usable)
(XEN) 00000000fec00000 - 00000000fec01000 (reserved)
(XEN) 00000000fee00000 - 00000000fee01000 (reserved)
(XEN) 00000000fff80000 - 0000000100000000 (reserved)
(XEN) System RAM: 3071MB (3145340kB)
(XEN) Xen heap: 9MB (10052kB)
(XEN) Domain heap initialised: DMA width 32 bits
(XEN) PAE enabled, limit: 16 GB
(XEN) Local APIC disabled by BIOS -- you can enable it with "lapic"
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 1150.080 MHz processor.
(XEN) CPU: CLK_CTL MSR was 6003d22f. Reprogramming to 2003d22f
(XEN) CPU0: AMD Athlon(tm) stepping 00
(XEN) Platform timer overflows in 2 jiffies.
(XEN) Platform timer is 1.193MHz PIT
(XEN) Brought up 1 CPUs
(XEN) xenoprof: Initialization failed. No APIC
(XEN) AMD IOMMU: Disabled
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 32-bit, PAE, lsb
(XEN) Dom0 kernel: 32-bit, PAE, lsb, paddr 0xc0100000 -> 0xc039eb54
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 000000003e000000->000000003f000000 (126976 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: c0100000->c039eb54
(XEN) Init. ramdisk: c039f000->c0ee6e00
(XEN) Phys-Mach map: c0ee7000->c0f67000
(XEN) Start info: c0f67000->c0f67474
(XEN) Page tables: c0f68000->c0f75000
(XEN) Boot stack: c0f75000->c0f76000
(XEN) TOTAL: c0000000->c1000000
(XEN) ENTRY ADDRESS: c0100000
(XEN) Dom0 has maximum 1 VCPUs
(XEN) Initrd len 0xb47e00, start at 0xc039f000
(XEN) Scrubbing Free RAM: ........................done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 96kB init memory.
(XEN) spurious 8259A interrupt: IRQ7.
cat /proc/interrupts:
CPU0
1: 30 Phys-irq i8042
11: 49430 Phys-irq skge
14: 66353 Phys-irq ide0
15: 157395 Phys-irq ide1
256: 3341511 Dynamic-irq timer0
257: 0 Dynamic-irq resched0
258: 0 Dynamic-irq callfunc0
259: 12841 Dynamic-irq xenbus
260: 0 Dynamic-irq console
261: 367 Dynamic-irq pciback
262: 24 Dynamic-irq blkif-backend
263: 61 Dynamic-irq blkif-backend
264: 1061 Dynamic-irq blkif-backend
265: 2636 Dynamic-irq vif5.0
266: 2873 Dynamic-irq vif5.1
267: 419 Dynamic-irq pciback
268: 2095 Dynamic-irq blkif-backend
269: 147 Dynamic-irq blkif-backend
270: 2666 Dynamic-irq blkif-backend
271: 28 Dynamic-irq blkif-backend
272: 9175 Dynamic-irq blkif-backend
273: 21222 Dynamic-irq vif2.0
274: 721 Dynamic-irq blkif-backend
275: 5314 Dynamic-irq blkif-backend
276: 24 Dynamic-irq blkif-backend
277: 52 Dynamic-irq blkif-backend
278: 15 Dynamic-irq blkif-backend
279: 35 Dynamic-irq blkif-backend
280: 571 Dynamic-irq vif3.0
281: 3152 Dynamic-irq blkif-backend
282: 512 Dynamic-irq blkif-backend
283: 6245 Dynamic-irq blkif-backend
284: 33 Dynamic-irq blkif-backend
285: 24 Dynamic-irq blkif-backend
286: 27576 Dynamic-irq blkif-backend
287: 22326 Dynamic-irq vif4.0
NMI: 0
LOC: 0
ERR: 0
MIS: 0
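One thing that could be checked during a stall is whether the skge interrupt counter still advances between two snapshots of /proc/interrupts. The sketch below only parses the format shown above; the function names are hypothetical, the sample text is a shortened copy of the output above, and whether a flat counter actually points to lost interrupts is an assumption, not something established here.

```python
def parse_interrupts(text):
    """Map each handler name to its interrupt count, summed over CPUs and IRQs."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 [CPU1 ...]
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        per_cpu = fields[:ncpus]
        # Skip rows that do not carry one numeric count per CPU.
        if len(per_cpu) < ncpus or not all(f.isdigit() for f in per_cpu):
            continue
        # Use the last token as the name (e.g. "skge"); rows such as "NMI: 0"
        # have no name column, so fall back to the row label.
        name = fields[-1] if len(fields) > ncpus else label.strip()
        counts[name] = counts.get(name, 0) + sum(int(f) for f in per_cpu)
    return counts


def irq_delta(before, after, name):
    """Return how many interrupts `name` received between two snapshots."""
    return after.get(name, 0) - before.get(name, 0)


# Shortened copy of the output above, used as sample input.
snapshot = """CPU0
11: 49430 Phys-irq skge
14: 66353 Phys-irq ide0
262: 24 Dynamic-irq blkif-backend
263: 61 Dynamic-irq blkif-backend
NMI: 0
"""
counts = parse_interrupts(snapshot)
print(counts["skge"])
```

Taking one snapshot before and one a few seconds into a stall, then calling irq_delta(before, after, "skge"), would show whether the NIC is still raising interrupts at that point.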
On 28/11/2007, Erik Logtenberg <erik@xxxxxxxxxxxxx> wrote:
> Hi,
>
> I have a problem with networking under XEN, I hope someone can help me
> out. The problem is that after some time (a day or so) the network
> suddenly stops working, and it takes some waiting and/or a reboot to fix
> this.
>
> I'm running XEN 3.1.0-rc7 on an Intel Core2 (x86_64). It's a Fedora 8
> system, with the following packages:
>
> o Xen version 3.1.0-rc7-2950.fc8 (kojibuilder@(none)) (gcc version 4.1.2
> 20070925 (Red Hat 4.1.2-32)) Tue Oct 23 12:21:08 EDT 2007
> o xen-3.1.0-13.fc8
> o kernel-xen-2.6.21-2950.fc8
>
> The error message the kernel gives me is the following:
> Nov 16 14:38:31 xen2 kernel: NETDEV WATCHDOG: peth0: transmit timed out
> Nov 16 14:38:31 xen2 kernel: sky2 peth0: tx timeout
> Nov 16 14:38:31 xen2 kernel: sky2 peth0: disabling interface
> Nov 16 14:38:31 xen2 kernel: sky2 peth0: enabling interface
> Nov 16 14:38:31 xen2 kernel: sky2 peth0: ram buffer 48K
> Nov 16 14:38:31 xen2 kernel: eth0: port 1(peth0) entering disabled state
>
> When I googled around, searching for these error messages, I found three
> previous e-mails to this xen-users list, reporting exactly the same
> problem. Unfortunately, none of these messages seems to have received a
> reply; at least I wasn't able to find any.
>
> Below I have pasted the previous mails from Markus Goldstein (December 2006),
> Brock Palen (January 2007) and Ian Tobin (August 2007) for more
> information on this subject.
> I hope someone has an idea what could be done to solve this problem,
> knows whether this issue is already being taken care of, or has any other
> insight that might help.
>
> Kind regards,
>
> Erik Logtenberg.
>
>
>
> On December 22, 2006, Markus Goldstein wrote: "Debian Etch and nvidia
> chipset trouble (Kernel bug)"
> > Hi all,
> >
> > I have a problem running Xen 3.0.3-1 on Debian Etch (amd64).
> > Packages installed:
> >
> > linux-image-2.6.18-3-xen-amd64 2.6.18-7
> > linux-modules-2.6.18-3-xen-amd64 2.6.18-7
> > xen-hypervisor-3.0.3-1-amd64 3.0.3-0-2
> > xen-ioemu-3.0.3-1 3.0.3-0-2
> > xen-linux-system-2.6.18-3-xen-amd64 2.6.18-7
> >
> > I have a nvidia chipset and I am using the onboard gigabit ethernet
> > controller.
> >
> > From time to time, the networking hangs and gives the output
> >
> > Dec 21 19:09:40 xen kernel: NETDEV WATCHDOG: peth0: transmit timed out
> > Dec 21 19:09:40 xen kernel: peth0: Got tx_timeout. irq: 00000000
> > Dec 21 19:09:40 xen kernel: peth0: Ring at 4923c000: next 25635708 nic 25635452
> > Dec 21 19:09:40 xen kernel: peth0: Dumping tx registers
> > (full output below)
> >
> > After rebooting the machine, I get a Kernel Bug:
> > Dec 21 19:53:18 xen kernel: Kernel BUG at drivers/xen/core/evtchn.c:481
> > (full output below)
> >
> > After waiting a couple of hours and then rebooting the machine,
> > everything works fine again for a certain time until the net hangs
> > again.
> >
> > I am not quite sure what causes this or how to debug it.
> >
> > Any help is really appreciated.
> >
> > Thanks,
> >
> > Markus.
>
>
>
> On January 7, 2007, Brock Palen wrote: "more xen network problems"
> > > Hello again, I put in pci network cards:
> > >
> > > National Semiconductor Corporation DP83820
> >
> > OK, I have made progress: the problem listed below goes away when
> > using an old 3Com PCI card. So it looks like the ns83820 module has
> > issues with Linux bridging. Is there a wiki page listing networking
> > setups and hardware known to work? The system works fine now (Dell
> > PowerEdge 440SC). Just the internal networking is broken with Xen, so
> > you will need to add your own working networking.
> >
> > Other than that, is there a way to tell xend, when it starts and
> > creates a bridge, to use eth1 and not eth0 for the bridge? eth0
> > (the ns83820) will be used as a crossover between the two boxes for
> > drbd. It works just fine if you don't create a bridge on that device.
> >
> > Brock
> >
> > > They get addresses and make a /dev/eth0 allowing network access. I
> > > had no luck making the bcm57xx work.
> > >
> > > I now have a new problem: when I turned on xend, networking no longer
> > > works, the output from the 'route' command is very slow to appear,
> > > and I see the following in the logs. The system this is replacing is
> > > an old xen-2.07 box, so I'm not familiar with peth.
> > >
> > > Jan 7 17:33:11 xen1 kernel: NETDEV WATCHDOG: peth0: transmit timed out
> > > Jan 7 17:33:11 xen1 kernel: peth0: tx_timeout: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:11 xen1 kernel: peth0: after: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:12 xen1 kernel: peth0: tx_timeout: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:12 xen1 kernel: peth0: after: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:14 xen1 kernel: peth0: tx_timeout: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:14 xen1 kernel: peth0: after: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:16 xen1 kernel: peth0: tx_timeout: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > > Jan 7 17:33:16 xen1 kernel: peth0: after: tx_done_idx=10 free_idx=1 cmdsts=8000002a
> > >
> > > It's filling up my logs and filling up dmesg. I found some mentions
> > > of this when I googled the archives, but no solutions. Does anyone
> > > have any ideas?
> > >
> > > Brock Palen
>
>
>
> On August 16, 2007, Ian Tobin wrote: "annoying network problem"
> > Hello,
> >
> > We are having an issue with Xen networking whereby, some time after
> > the server has booted, we start seeing these messages in the syslog:
> >
> > kernel: NETDEV WATCHDOG: peth0: transmit timed out
> >
> > then all networking stops responding and the only way to solve it
> > is to reboot the server.
> >
> > I have looked up and down on the web and some have mentioned putting
> > pci=noacpi in the grub boot file, but this has no effect.
> >
> > The network card is Compaq Computer Corporation Netelligent 10/100
> > TX PCI
> >
> > Has anyone got any suggestions or workarounds for this?
> >
> > Any help is much appreciated
> >
> > Thanks
> >
> > Ian
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
>