xen-users

Re: [Xen-users] domU network interface half-dies regularly

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] domU network interface half-dies regularly
From: Daniel De Marco <ddm@xxxxxxxxxxxxxxx>
Date: Tue, 16 Mar 2010 13:20:08 -0400
Cc: Mariusz Mazur <mmazur@xxxxxxxxx>
Delivery-date: Tue, 16 Mar 2010 10:21:19 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201003151207.05788.mmazur@xxxxxxxxx>
Mail-followup-to: xen-users@xxxxxxxxxxxxxxxxxxx, Mariusz Mazur <mmazur@xxxxxxxxx>
References: <201003151207.05788.mmazur@xxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.19 (2009-01-05)

I had the same problem yesterday. One of the domUs running on a server
had the same symptoms: the TX counter stopped while the RX one kept
increasing normally.
I'm running CentOS 5.3 with 2.6.18-92.1.22.el5xen on the dom0 and CentOS
5.2 with 2.6.18-92.1.22.el5xen on the domU.

Rebooting the domU solves the problem, but it isn't an attractive
solution...
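
Since the symptom is so specific (TX frozen while RX keeps climbing), it can at least be detected from inside the domU before users notice. What follows is only a rough sketch, not something I have run on these machines. It assumes Python 3 is available in the guest and that the kernel exposes per-interface counters under /sys/class/net/<iface>/statistics. The function names and the interval/strikes defaults are made up for illustration. A guest that legitimately sends nothing for a while would also trip the check, which is why it waits for several consecutive stalled intervals.

```python
import os
import time


def read_counters(iface, root="/sys/class/net"):
    """Return (rx_packets, tx_packets) for iface, read from Linux sysfs."""
    stats = os.path.join(root, iface, "statistics")
    values = []
    for name in ("rx_packets", "tx_packets"):
        with open(os.path.join(stats, name)) as f:
            values.append(int(f.read().strip()))
    return values[0], values[1]


def tx_stalled(before, after):
    """True when RX advanced between two samples but TX did not move."""
    rx0, tx0 = before
    rx1, tx1 = after
    return rx1 > rx0 and tx1 == tx0


def watch(iface, interval=60, strikes=3):
    """Block until TX has been stalled for `strikes` consecutive intervals.

    The defaults are arbitrary placeholders, not tuned values.
    """
    count = 0
    prev = read_counters(iface)
    while count < strikes:
        time.sleep(interval)
        cur = read_counters(iface)
        # Reset the streak as soon as TX moves again.
        count = count + 1 if tx_stalled(prev, cur) else 0
        prev = cur
    return prev
```

Calling watch("eth0") (the interface name is a placeholder) returns the last counter pair once the stall condition has held for three samples in a row. At that point a wrapper could log the event or alert someone over the interface that still works.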

Daniel.

* Mariusz Mazur <mmazur@xxxxxxxxx> [03/15/2010 07:10]:
> I'm trying to figure out how to debug this. Any suggestions would be 
> appreciated.
> 
> Every once in a while a random domU on a random xen server of ours has its 
> network interface die. I've recently figured out what the exact symptoms are: 
> TX count on that interface (as seen from inside the domU) stops increasing. 
> There's no way of actually sending anything from within the domU. Even arp 
> packets aren't sent. Everything works fine with receiving packets however.
> 
> Of the things I did check:
> - Doing an ip link set down/up on both dom0/domU doesn't do anything.
> - Removing/reattaching the dom0 interface from/to its bridge doesn't help.
> - It's interface-specific. I'm currently logged onto a domU that has one of 
> its net interfaces half-dead as described, but the other perfectly functional.
> - Interestingly, the problem prevents "xm save" from working. It times out 
> without anything getting written to disk (except a kilobyte or so of, I'm 
> guessing, some headers).
> - I'm seeing this problem across:
>   - 2.6.18 xen.org dom0 3.3.X and 3.4.X
>   - xen.org hypervisor 3.3.X and 3.4.X
>   - domU xen.org 2.6.18.8_xen3.3.0U
>   - kernel.org 2.6.29.6 (pvops)
>   - A few different machines from different vendors.
> - Nothing in dom0/domU kernel logs.
> 
> Whatever the cause is, I seriously doubt it's domU's fault, considering I'm 
> seeing the problem on both xen.org and kernel.org domU kernels. I also don't 
> know what the trigger is (plus, those are production systems), so enabling a 
> bunch of DEBUG prints in xen isn't much of an option.
> 
> Any suggestions/hints on where to look next? I'm guessing there are ways of 
> inspecting various network code structures.
> 
> --mmazur
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users