xen-users

Re: [Xen-users] dom0 networking getting screwd at random periods

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] dom0 networking getting screwd at random periods
From: Martins Lazdans <marrtins@xxxxxxxx>
Date: Fri, 17 Jun 2011 14:29:40 +0300
It seems the troublemaker was the NICs after all. I put a Broadcom dual-gigabit card into the server, and it has now run for 36 hours with no problems at all. Before that I was using the built-in NVIDIA NICs (forcedeth driver). It's strange, though: why did restarting the bridges help?

Todd Deshane wrote:
On Wed, Jun 15, 2011 at 5:26 AM, Martins Lazdans <marrtins@xxxxxxxx> wrote:
Does anyone have a hint about where to dig? For now I have written a script
that pings the gateway and, on timeouts, restarts the bridge and reattaches
the domUs to it. At the moment the bridge stops working about every 3
minutes. Can this be hardware related?


You may want to consider setting up the bridges manually with the
CentOS networking scripts.

This is actually the default way of doing things starting with Xen 4.1
http://wiki.xensource.com/xenwiki/HostConfiguration/Networking

The standard bridging built into CentOS may end up being more stable for you.
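
As a rough illustration of what "standard bridging" with the CentOS networking scripts can look like, here is a minimal sketch. The bridge name br0 and the addresses are placeholders (assumptions, not taken from this thread), and the two files are shown together in one block for brevity:

```sh
# Sketch only: br0, the IP address and the netmask are placeholder values.

# --- /etc/sysconfig/network-scripts/ifcfg-eth0 (physical NIC, no IP of its own)
DEVICE=eth0
ONBOOT=yes
BRIDGE=br0

# --- /etc/sysconfig/network-scripts/ifcfg-br0 (bridge carrying dom0's IP)
DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=192.0.2.10
NETMASK=255.255.255.0
ONBOOT=yes
DELAY=0
```

With a setup like this the bridge is created by the distribution at boot, so xend would no longer need to run its own network-script, and each domU config would name the bridge explicitly (for example bridge=br0 in its vif line).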

Hope that helps.

Thanks,
Todd

One guy replied to me directly having the same problem on CentOS (I guess it
was CentOS).

Martins Lazdans wrote:
Hello!

I'm having a problem with dom0 (running Debian lenny, latest patches
applied): networking stops after random periods of time, once after
18 hours and once after 5 hours. dom0 has two NICs (peth0, peth1), two subnets
(a, b), and two bridges configured (eth0, eth1; each subnet resides on a
different NIC). dom0 has each NIC configured with a dedicated IP from the
corresponding subnet. So eth0 serves subnet a, while eth1 serves subnet
b.

In xend-config.sxp:
(network-script network-bridge-wrapper) and network-bridge-wrapper is
just:

#!/bin/sh
/etc/xen/scripts/network-bridge "$@" netdev=eth0
/etc/xen/scripts/network-bridge "$@" netdev=eth1

After random periods of time, eth0 goes down along with the networking of
all domUs attached to it. I'm unable to connect to or ping dom0, on either
peth0 or peth1, or any of the domUs attached to that bridge.

However, domUs using the eth1 bridge keep working just fine. I can SSH to
such a domU and then SSH to dom0 on the peth1 IP address (remember, I could
not do that from "outside").

After I do `/etc/xen/scripts/network-bridge stop` and then start, dom0
networking comes back. I just do `brctl addif eth0 vifX.0` to attach domUs
back and all works fine till next such event.
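
The recovery steps described above (ping the gateway, restart the bridge, reattach the domU vifs) could be automated roughly as follows. This is a minimal sketch, not the script actually used: the gateway address, the function names, the 30-second interval, and the use of /sys/class/net/<bridge>/brif to remember which vifs were attached are all assumptions made for illustration, and the network-bridge call mirrors the wrapper shown earlier.

```sh
#!/bin/sh
# Hypothetical watchdog sketch. All names and values below are assumptions.
GATEWAY="${GATEWAY:-192.0.2.1}"      # placeholder gateway address
BRIDGE="${BRIDGE:-eth0}"             # bridge to watch
NETWORK_BRIDGE="${NETWORK_BRIDGE:-/etc/xen/scripts/network-bridge}"
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Succeeds if the gateway answers at least one of three pings.
gateway_alive() {
    ping -c 3 -W 2 "$GATEWAY" >/dev/null 2>&1
}

# Print the vif interfaces currently enslaved to the bridge, one per line.
bridge_vifs() {
    for path in "$SYSFS_NET/$BRIDGE/brif"/vif*; do
        if [ -e "$path" ]; then
            basename "$path"
        fi
    done
}

# Remember the attached vifs, restart the bridge, then reattach them.
restart_bridge() {
    vifs=$(bridge_vifs)
    "$NETWORK_BRIDGE" stop netdev="$BRIDGE"
    "$NETWORK_BRIDGE" start netdev="$BRIDGE"
    for vif in $vifs; do
        brctl addif "$BRIDGE" "$vif"
    done
}

# One check: do nothing while the gateway answers, otherwise recover.
check_once() {
    if gateway_alive; then
        return 0
    fi
    restart_bridge
}

# Loop only when started as "<script> run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    while :; do
        check_once
        sleep 30
    done
fi
```

Recording the bridge members before the restart avoids attaching vifs that belong to the other bridge (eth1) to eth0 by mistake.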

Has anyone experienced such a thing? I've been running this server for some
three years without problems. The problem started around the 31st of May.
Maybe it has something to do with one of the latest updates?

There are no network spikes, no high load average, nothing in the logfiles,
nothing in xm dmesg, no domU restarts. I've been searching Google for some
days now, and nothing comes up.

Basically, I've got the same issue as described here:
http://lists.xensource.com/archives/html/xen-users/2008-03/msg00609.html

Btw, I've had a very similar issue twice with another server: different
hardware, different data center, CentOS 5.6, Xen 3.4.2. However, that server
was using only one NIC, so I'm not sure whether it was exactly the same
problem. I ran into it twice, about two months ago. There have been no
errors since then, which made me believe the problem was solved by some
OS updates. Back then I thought it was a hardware error, as the server had
been deployed only 3 months before the incident.

My config is below.

Many thanks!

# uname -a
Linux doom 2.6.26-2-xen-amd64 #1 SMP Tue Jan 25 06:13:50 UTC 2011 x86_64 GNU/Linux

# xm info
host                   : doom
version                : #1 SMP Tue Jan 25 06:13:50 UTC 2011
machine                : x86_64
nr_cpus                : 8
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 2200
hw_caps                : 178bf3ff:efd3fbff:00000000:00000110:00802001:00000000:000007ff
total_memory           : 32767
free_memory            : 2116
node_to_cpu            : node0:0-7
xen_major              : 3
xen_minor              : 2
xen_extra              : -1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.3.1 (Debian 4.3.1-2)
cc_compile_by          : waldi
cc_compile_domain      : debian.org
cc_compile_date        : Sat Jun 28 09:32:18 UTC 2008
xend_config_format     : 4


--
Martins Lazdans

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users





--
Martins Lazdans
