WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] network dropouts

To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] network dropouts
From: Jan Kundrát <jan.kundrat@xxxxxx>
Date: Fri, 17 Dec 2004 19:37:51 +0100
Cc: Ian Pratt <Ian.Pratt@xxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 17 Dec 2004 18:39:44 +0000
Envelope-to: xen+James.Bulpin@xxxxxxxxxxxx
In-reply-to: <E1CeAyG-0007I7-00@xxxxxxxxxxxxxxxxx>
List-archive: <http://sourceforge.net/mailarchive/forum.php?forum=xen-devel>
List-help: <mailto:xen-devel-request@lists.sourceforge.net?subject=help>
List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
List-post: <mailto:xen-devel@lists.sourceforge.net>
List-subscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=subscribe>
List-unsubscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=unsubscribe>
References: <E1CeAyG-0007I7-00@xxxxxxxxxxxxxxxxx>
Sender: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 0.9 (X11/20041130)

Keir Fraser wrote:
>> no, I'm setting it manually via `ifconfig` in init script. Notice the following lines in kernel log:
>>
>> Dec 12 20:33:23 zirafa xen-br0: port 1(eth0) entering disabled state
>> Dec 12 20:34:23 zirafa xen-br0: port 1(eth0) entering learning state
>>
>> Why is eth0 (the only physical device in xen-br0) getting into disabled state?
>
> My guess would be that the bridge code is receiving carrier-change
> events from eth0. This causes it to put eth0 in disabled state for a
> while.
> One way to check this would be to add some printk()'s to
> net/bridge/br_notify.c and see whether you are getting NETDEV_CHANGE
> or NETDEV_DOWN events. If so, it may be that your physical connection,
> or your router/switch/hub, is a bit dodgy.
>
> None of the other paths via which the interface may get disabled seem
> very likely to occur, but we can look at those if it doesn't appear
> that you are getting NETDEV events.

Hi, I finally managed to get it. Here is my patch to net/bridge/br_notify.c:

--- br_notify.c.original        2004-11-14 15:31:58.000000000 +0100
+++ br_notify.c 2004-12-17 13:57:19.000000000 +0100
@@ -45,37 +45,58 @@
        switch (event) {
        case NETDEV_CHANGEMTU:
                dev_set_mtu(br->dev, br_min_mtu(br));
+               printk("NETDEV_CHANGEMTU\n");
                break;

        case NETDEV_CHANGEADDR:
                br_fdb_changeaddr(p, dev->dev_addr);
                br_stp_recalculate_bridge_id(br);
+               printk("NETDEV_CHANGEADDR\n");
                break;

        case NETDEV_CHANGE:     /* device is up but carrier changed */
-               if (!(br->dev->flags & IFF_UP))
+               printk("NETDEV_CHANGE ");
+               if (!(br->dev->flags & IFF_UP)) {
+                       printk("- !(br->dev->flags & IFF_UP)==true\n");
                        break;
+               }

                if (netif_carrier_ok(dev)) {
-                       if (p->state == BR_STATE_DISABLED)
+                       printk("- netif_carrier_ok(dev)==true ");
+                       if (p->state == BR_STATE_DISABLED) {
+                               printk("- calling br_stp_enable_port(p)");
                                br_stp_enable_port(p);
+                       }
                } else {
-                       if (p->state != BR_STATE_DISABLED)
+                       printk("- netif_carrier_ok(dev)!=true ");
+                       if (p->state != BR_STATE_DISABLED) {
+                               printk(" - calling br_stp_disable_port(p)");
                                br_stp_disable_port(p);
+                       }
                }
+               printk("\n");
                break;

        case NETDEV_DOWN:
-               if (br->dev->flags & IFF_UP)
+               printk("NETDEV_DOWN ");
+               if (br->dev->flags & IFF_UP) {
+                       printk("- calling br_stp_disable_port(p)");
                        br_stp_disable_port(p);
+               }
+               printk("\n");
                break;

        case NETDEV_UP:
-               if (netif_carrier_ok(dev) && (br->dev->flags & IFF_UP))
+               printk("NETDEV_UP ");
+               if (netif_carrier_ok(dev) && (br->dev->flags & IFF_UP)) {
+                       printk("- calling br_stp_enable_port(p)");
                        br_stp_enable_port(p);
+               }
+               printk("\n");
                break;

        case NETDEV_UNREGISTER:
+               printk("NETDEV_UNREGISTER\n");
                spin_unlock_bh(&br->lock);
                br_del_if(br, dev);
                goto done;

And here are the messages from syslog (obviously my patch is wrong, as it's running several messages together because of a missing "\n"):

Dec 17 18:37:02 zirafa NETDEV_CHANGE - netif_carrier_ok(dev)!=true - calling br_stp_disable_port(p)<6>xen-br0: port 1(eth0) entering disabled state
Dec 17 18:37:02 zirafa
Dec 17 18:38:02 zirafa NETDEV_CHANGE - netif_carrier_ok(dev)==true - calling br_stp_enable_port(p)<6>xen-br0: port 1(eth0) entering learning state
Dec 17 18:38:02 zirafa
Dec 17 18:38:02 zirafa xen-br0: topology change detected, propagating
Dec 17 18:38:02 zirafa xen-br0: port 1(eth0) entering forwarding state
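The fused lines above come from the order of the calls: the trace text is printed without a trailing newline, then br_stp_disable_port()/br_stp_enable_port() logs its own KERN_INFO message (the "<6>" in the middle of the line), and only afterwards does the patch print "\n". One way to avoid this is to build the whole trace line first and emit it in a single call before calling into the STP code. The following is only a userspace sketch of that idea; `format_change_trace` and its parameters are hypothetical names that exist neither in the patch nor in the kernel.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of br_notify.c): builds the complete
 * NETDEV_CHANGE trace line, including the trailing '\n', in one buffer.
 * The caller would print the buffer with a single printk()/printf()
 * BEFORE calling br_stp_enable_port()/br_stp_disable_port(), so that
 * the bridge's own log message starts on a fresh line.
 *
 * Returns the number of characters snprintf() would have written.
 */
int format_change_trace(char *buf, size_t len,
                        int bridge_up, int carrier_ok, int port_disabled)
{
    assert(buf != NULL && len > 0);

    /* Bridge device is down: the notifier does nothing further. */
    if (!bridge_up)
        return snprintf(buf, len, "NETDEV_CHANGE - bridge is down\n");

    /* Carrier present: the port is re-enabled only if it was disabled. */
    if (carrier_ok)
        return snprintf(buf, len, "NETDEV_CHANGE - carrier ok%s\n",
                        port_disabled ? " - calling br_stp_enable_port(p)" : "");

    /* Carrier lost: the port is disabled unless it already is. */
    return snprintf(buf, len, "NETDEV_CHANGE - no carrier%s\n",
                    port_disabled ? "" : " - calling br_stp_disable_port(p)");
}
```

In the notifier this would amount to filling a small on-stack buffer from `br->dev->flags & IFF_UP`, `netif_carrier_ok(dev)` and `p->state == BR_STATE_DISABLED`, printing it once, and then making the STP call.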

However, `ifconfig` says 0 for carrier:

eth0      Link encap:Ethernet  HWaddr 00:10:4B:B6:BD:0E
          inet addr:10.18.6.60  Bcast:10.18.6.63  Mask:255.255.255.192
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3382 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3164 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:312112 (304.7 Kb)  TX bytes:334955 (327.1 Kb)
          Interrupt:11 Base address:0xe400

xen-br0   Link encap:Ethernet  HWaddr 00:10:4B:B6:BD:0E
          inet addr:10.18.6.60  Bcast:10.18.6.63  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3186 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2971 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:246738 (240.9 Kb)  TX bytes:307288 (300.0 Kb)

The Xen box has a 3c590 card, connected by a 1 m Cat5 patch cable to a TP-Link TL-SL3210 manageable switch (8 x 100 Mbps ports, 1 x 1 Gbps port and 1 GBIC slot). The switch was quite expensive, so I don't think it's such a piece of s*** :-), but I might be mistaken (I hope I am not).

A possible workaround would be to set up routing between domain0 and the other domains instead of bridging, but that won't be a simple task.

TIA,
jkt

--
cd /local/pub && more beer > /dev/mouth



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/xen-devel