Reposting a message that did not go through to the list last night.
Steve Timm
--
------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525
timm@xxxxxxxx http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
---------- Forwarded message ----------
Date: Thu, 17 Apr 2008 22:35:42 -0500 (CDT)
From: Steven Timm <timm@xxxxxxxx>
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: help--dom0 network goes unpingable when xend starts
Six more things about the situation below:
1) The network guys have checked out the switch and given us a clean
bill of health.
2) I see this problem on 3 identical machines; each time, I lose
the network when I start up xend.
3) Dell PowerEdge 2950 (which works) as compared to Dell PowerEdge 1950
(which doesn't): one possible difference is that there is a 2nd
MAC address associated with the port which is normally eth0
on the PE 1950, used for the IPMI controller. But I disabled this,
and also switched the cable to the other port, which doesn't have
the extra MAC. Same problem as before.
4) I reported below that the domUs can ping the dom0. It turns out
that the dom0 can ping the domUs too.
5) netstat -a on the non-working machine doesn't show all the bridges
that normally show up.
6) There's a 169.254.0.0 route of unknown origin on this machine
on the same interface that I'm trying to use.
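For point 6, a minimal Python sketch of how one might flag that route automatically. It is not part of the original setup: the function name `link_local_routes` is hypothetical, and the sample table is the `netstat -nNr` output quoted further down in this message (taken before xend starts), so it only illustrates the check.

```python
import ipaddress

# Sample copied from the `netstat -nNr` output quoted later in this message.
ROUTES = """\
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.167.0   0.0.0.0         255.255.255.0   U         0 0          0 eth1
131.225.166.0   0.0.0.0         255.255.254.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
0.0.0.0         131.225.167.200 0.0.0.0         UG        0 0          0 eth0
"""

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def link_local_routes(table_text):
    """Return (network, iface) pairs for routes inside 169.254.0.0/16."""
    hits = []
    for line in table_text.splitlines():
        fields = line.split()
        # A route line has exactly 8 columns; skip anything else.
        if len(fields) != 8:
            continue
        dest, _gateway, mask, *_rest, iface = fields
        try:
            net = ipaddress.ip_network(f"{dest}/{mask}", strict=False)
        except ValueError:
            # The column header line ends up here.
            continue
        if net.subnet_of(LINK_LOCAL):
            hits.append((str(net), iface))
    return hits


if __name__ == "__main__":
    for net, iface in link_local_routes(ROUTES):
        print(f"link-local route {net} on {iface}")
```

Running the same function on the routing table captured after xend starts would show whether the route has moved onto the interface in use.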
Any ideas on what to strace or what to tcpdump?
Steve
On Thu, 17 Apr 2008, Steven Timm wrote:
I installed 64-bit xen 3.1.0 (from xensource.com tarballs) on
three new machines today, using a configuration setup that I've
used successfully many times before. However, I encountered a
new problem.
These are Dell PowerEdge 1950 servers, by the way.
From lspci
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
08:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
From dmesg
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.4.44 (August 10, 2006)
ACPI: PCI Interrupt 0000:08:00.0[A] -> GSI 16 (level, low) -> IRQ 16
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f4000000, IRQ 16, node addr 0019b9ec40ba
ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 16 (level, low) -> IRQ 16
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 16, node addr 0019b9ec40b8
---------------
Note that the Xen 2.6.18 kernel assigns eth0 to the opposite MAC address
from the one the Red Hat (non-Xen) kernel picks. This is
undone by ifcfg-eth0.
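To make that swap concrete, here is a small Python sketch that compares the names the driver reports at probe time with the names `ifconfig` shows afterwards. The helper names (`probe_names`, `final_names`, `renamed`) are made up for illustration; the sample strings are the two driver lines above and the two HWaddr lines from the `ifconfig` output below.

```python
import re

# Driver probe lines, as quoted above.
DMESG = """\
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f4000000, IRQ 16, node addr 0019b9ec40ba
eth1: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 16, node addr 0019b9ec40b8
"""

# HWaddr lines from the ifconfig output taken before xend starts.
IFCONFIG = """\
eth0 Link encap:Ethernet HWaddr 00:19:B9:EC:40:B8
eth1 Link encap:Ethernet HWaddr 00:19:B9:EC:40:BA
"""


def normalize(mac):
    """Lower-case a MAC address and drop any separators."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


def probe_names(dmesg_text):
    """Map MAC -> interface name as reported by the driver at probe time."""
    pattern = r"^(eth\d+):.*node addr ([0-9a-fA-F]{12})"
    return {normalize(m.group(2)): m.group(1)
            for m in re.finditer(pattern, dmesg_text, re.M)}


def final_names(ifconfig_text):
    """Map MAC -> interface name as shown by ifconfig."""
    pattern = r"^(\S+)\s+Link encap:Ethernet\s+HWaddr ([0-9A-Fa-f:]{17})"
    return {normalize(m.group(2)): m.group(1)
            for m in re.finditer(pattern, ifconfig_text, re.M)}


def renamed(dmesg_text, ifconfig_text):
    """Return {mac: (probe_name, final_name)} for MACs whose name changed."""
    probe = probe_names(dmesg_text)
    final = final_names(ifconfig_text)
    return {mac: (probe[mac], final[mac])
            for mac in probe
            if mac in final and probe[mac] != final[mac]}


if __name__ == "__main__":
    for mac, (before, after) in sorted(renamed(DMESG, IFCONFIG).items()):
        print(f"{mac}: probed as {before}, configured as {after}")
```

On the sample data both MACs come back as renamed, which matches the swap described in the note.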
When the xen kernel boots, before xend starts, I can see the outside network
just fine.
[root@fnpcsrv3 xen]# netstat -nNr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.167.0   0.0.0.0         255.255.255.0   U         0 0          0 eth1
131.225.166.0   0.0.0.0         255.255.254.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
0.0.0.0         131.225.167.200 0.0.0.0         UG        0 0          0 eth0
[root@fnpcsrv3 xen]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:19:B9:EC:40:B8
inet addr:131.225.166.97 Bcast:131.225.167.255 Mask:255.255.254.0
inet6 addr: fe80::219:b9ff:feec:40b8/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:591697 errors:0 dropped:0 overruns:0 frame:0
TX packets:3060 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:38067586 (36.3 MiB) TX bytes:395536 (386.2 KiB)
Interrupt:16 Memory:f8000000-f8011100
eth1 Link encap:Ethernet HWaddr 00:19:B9:EC:40:BA
inet addr:192.168.167.3 Bcast:192.168.167.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:16 Memory:f4000000-f4011100
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:192 errors:0 dropped:0 overruns:0 frame:0
TX packets:192 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:15314 (14.9 KiB) TX bytes:15314 (14.9 KiB)
--------------------------------------------------------------
Now here's ifconfig from an identical system once xend is turned on:
[root@fnpcsrv5 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:19:B9:EC:4A:21
inet addr:131.225.166.100 Bcast:131.225.167.255 Mask:255.255.254.0
inet6 addr: fe80::219:b9ff:feec:4a21/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:508292 errors:0 dropped:0 overruns:0 frame:0
TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:30786266 (29.3 MiB) TX bytes:1658 (1.6 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:21 errors:0 dropped:0 overruns:0 frame:0
TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1916 (1.8 KiB) TX bytes:1916 (1.8 KiB)
peth0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
RX packets:523679 errors:0 dropped:0 overruns:0 frame:0
TX packets:15964 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:33836052 (32.2 MiB) TX bytes:1132609 (1.0 MiB)
Interrupt:16 Memory:f4000000-f4011100
vif0.0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
RX packets:33 errors:0 dropped:0 overruns:0 frame:0
TX packets:508293 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1658 (1.6 KiB) TX bytes:30786336 (29.3 MiB)
vif1.0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
RX packets:7848 errors:0 dropped:0 overruns:0 frame:0
TX packets:499340 errors:0 dropped:159 overruns:0 carrier:0
collisions:0 txqueuelen:32
RX bytes:347417 (339.2 KiB) TX bytes:30239848 (28.8 MiB)
vif2.0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
RX packets:7867 errors:0 dropped:0 overruns:0 frame:0
TX packets:496186 errors:0 dropped:191 overruns:0 carrier:0
collisions:0 txqueuelen:32
RX bytes:346478 (338.3 KiB) TX bytes:30050363 (28.6 MiB)
xenbr0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:508099 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:23650570 (22.5 MiB) TX bytes:90 (90.0 b)
xenbr1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:468 (468.0 b)
------------------------------------------------------------------------
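The packet counters in that output are easier to compare side by side. Below is a rough Python sketch (the function name `packet_counters` is hypothetical) that pulls the RX/TX packet counts per interface out of `ifconfig`-style text. The sample is a trimmed copy of three of the interfaces above. In it, eth0's TX count equals vif0.0's RX count, which would be consistent with the two being opposite ends of one virtual pair, though that reading is an assumption on my part.

```python
import re

# Trimmed copy of three interfaces from the ifconfig output above.
SAMPLE = """\
eth0 Link encap:Ethernet HWaddr 00:19:B9:EC:4A:21
RX packets:508292 errors:0 dropped:0 overruns:0 frame:0
TX packets:33 errors:0 dropped:0 overruns:0 carrier:0
peth0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
RX packets:523679 errors:0 dropped:0 overruns:0 frame:0
TX packets:15964 errors:0 dropped:0 overruns:0 carrier:0
vif0.0 Link encap:Ethernet HWaddr FE:FF:FF:FF:FF:FF
RX packets:33 errors:0 dropped:0 overruns:0 frame:0
TX packets:508293 errors:0 dropped:0 overruns:0 carrier:0
"""


def packet_counters(text):
    """Return {iface: {"RX": n, "TX": n}} parsed from ifconfig-style text."""
    counters = {}
    current = None
    for line in text.splitlines():
        # A line such as "eth0 Link encap:Ethernet ..." starts a new interface.
        head = re.match(r"(\S+)\s+Link encap:", line)
        if head:
            current = head.group(1)
            counters[current] = {}
            continue
        # Counter lines look like "RX packets:508292 errors:0 ...".
        count = re.match(r"\s*(RX|TX) packets:(\d+)", line)
        if count and current is not None:
            counters[current][count.group(1)] = int(count.group(2))
    return counters


if __name__ == "__main__":
    for iface, c in packet_counters(SAMPLE).items():
        print(f"{iface:8s} RX={c.get('RX', 0):>8d} TX={c.get('TX', 0):>8d}")
```

Feeding it the full output from a working and a non-working machine would make it easy to see where the counters diverge.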
As part of the debugging I dialed back my configuration, which normally
has a xenbr0 and a xenbr1, to use just a xenbr0 and have just one
network interface on each domU and on the dom0. Nevertheless the
problem is the same and I can't seem to get rid of the xenbr1.
I note that when xend starts there is a delay of about 10-15
seconds while it tries to bring up the second bridge.
Now here is the really strange part. While logged in on the
console of the dom0, I can go ahead and start xen domU's, and they
go ahead and boot up normally and can see the outside network fine.
[root@fnpcsrv5 ~]# xm list
Name ID Mem VCPUs State Time(s)
Domain-0 0 1953 8 r----- 127.3
fnpc5x1 1 6000 4 -b---- 23.1
fnpc5x4 2 2000 1 -b---- 20.6
[root@fnpcsrv5 ~]#
Oh, and by the way, dom0 is pingable from the domU's although
it cannot be seen from the outside net.
What should I be looking at?
Steve Timm