Re: [Xen-users] domU network has sleeping sickness
Steven Timm wrote:
> I've seen the same problem with my Xen 3.1.0 setup. What
> the Xen gurus are telling us is that this is a symptom of Xen dom0
> being busy and not servicing the network interrupts of the domUs
> promptly. Their advice to us was to shift an application that
> had been running on dom0 to another Xen instance to see if that
> would help. We are in the process of implementing that solution now.
>
There is nothing running on my dom0s. Their only purpose is managing
the domUs.
One of the problematic Xen hosts actually does have load on its three
domUs, which serve continuous-build systems. But another sleepy
Xen host with five domUs is more or less in a pre-production state and
idling.
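
To double-check that dom0 really is idle, I watch it with xentop (the
invocations below are just how I happen to run it):

  xentop -d 2      # live per-domain CPU/network/VBD view, 2 s refresh
  xentop -b -i 1   # one batch sample, handy for logging over time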
> By the way, my system (Dell PowerEdge 2950) has got Broadcom
> onboard network cards, not Intel e1000, so it is unlikely that
> it is a network-driver-specific issue.
>
> During these episodes of lost network connectivity, by the way,
> it was not unusual to see the following kernel dump in dom0:
>
I don't find anything helpful or suspicious in any log, but maybe I'm
missing it.
On dom0 I'm looking at dmesg, messages, warn, xend-debug.log, xend.log
and xen-hotplug.log, and in the domU at dmesg, messages and warn.
But after the bootup process there is more or less nothing of
importance logged.
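
For reference, these are roughly the commands I run on dom0 (log paths
as on my openSUSE installs; they may differ on other distros):

  dmesg | tail -n 50
  tail -f /var/log/messages /var/log/warn
  tail -f /var/log/xen/xend.log /var/log/xen/xend-debug.log \
      /var/log/xen/xen-hotplug.log

plus the same dmesg/messages/warn check inside each domU.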
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: Call Trace:
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff80258898>] handle_IRQ_event+0x4b/0x93
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8025897e>] __do_IRQ+0x9e/0x100
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020cc97>] do_IRQ+0x63/0x71
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> 2008-02-05T18:35:16-06:00 s_sys@xxxxxxxxxxxxxxxxxxx kernel: <EOI>
>
> or
>
> Feb 25 10:32:39 fermigrid6 kernel: BUG: soft lockup detected on CPU#0!
> Feb 25 10:32:39 fermigrid6 kernel:
> Feb 25 10:32:39 fermigrid6 kernel: Call Trace:
> Feb 25 10:32:39 fermigrid6 kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80258898>] handle_IRQ_event+0x4b/0x93
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8025897e>] __do_IRQ+0x9e/0x100
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020cc97>] do_IRQ+0x63/0x71
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
> Feb 25 10:32:39 fermigrid6 kernel: <EOI> [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8034b258>] force_evtchn_callback+0xa/0xb
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff803f2272>] thread_return+0xdf/0x119
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020622a>] hypercall_page+0x22a/0x1000
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80228a25>] __cond_resched+0x1c/0x44
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff803f25df>] cond_resched+0x37/0x42
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff802343c4>] ksoftirqd+0x0/0xbf
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff80234432>] ksoftirqd+0x6e/0xbf
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff802422d7>] kthread+0xc8/0xf1
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020ae1c>] child_rip+0xa/0x12
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8024220f>] kthread+0x0/0xf1
> Feb 25 10:32:39 fermigrid6 kernel: [<ffffffff8020ae12>] child_rip+0x0/0x12
>
> ----------------
>
> One of our dom0s was running an LVS server; the other one, on
> identical hardware, was not. We moved the LVS server from one to the
> other, and the network problems and kernel panics followed it.
>
> Steve Timm
>
> On Mon, 3 Mar 2008, Marc Teichgraeber wrote:
>
>> Hi all,
>>
>> I have a strange network problem with some domUs on three Xen hosts.
>> They are losing their network connectivity. I use bridged networking.
>> * It happens randomly and can occur right after bootup of the domU
>> or any time later.
>> * The domU is not reachable from another host on the LAN.
>> * The domU is always reachable from the dom0 (ssh, ping).
>> * I can 'repair' the connection by attaching to the console and
>> pinging out from the domU. First nothing happens, then the machine
>> gets its network back. (That's also my workaround for the moment:
>> pinging all the time from the console; see the snippet after this
>> list.)
>> * Pinging from another host at the same time helps too.
>> * It can happen that I ping continuously from one host while another
>> host gets only every 10th packet or so back.
>> * The interfaces can come back from their sleep on their own.
>> * When the network has fallen asleep, ssh to the domU from another
>> host hangs; it does not come back with "no route to host" or similar.
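>>
>> The keep-alive hack, run on the domU console, is roughly this
>> (192.168.1.1 stands in for our gateway; any box that is reliably up
>> on the LAN should do):
>>
>>   # ping the gateway forever so the interface never falls asleep
>>   while true; do ping -c 1 192.168.1.1 >/dev/null 2>&1; sleep 5; done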
>>
>> I'm suspicious of the network controllers; they are the same on all
>> hosts: "Intel Corporation 80003ES2LAN Gigabit Ethernet Controller
>> (Copper)" (lspci), some kind of "Intel® PRO/1000 EB Network Connection
>> with I/O Acceleration" (Intel website). I've tried the latest e1000
>> driver from Intel, but it didn't help.
>> I've checked all MAC addresses; they are unique, as are the IP
>> addresses.
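>>
>> The check was roughly this, run on each dom0 (xenbr0 is the default
>> bridge name here; adjust for your setup):
>>
>>   xm network-list <domU-name>   # MAC per vif as xend configured it
>>   brctl showmacs xenbr0         # MACs the bridge has actually learned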
>>
>> Any ideas are welcome :)
>>
>> -------------------------------------------------------------------------
>>
>> "xm info" from host1, openSUSE 10.2 (X86-64):
>>
>> release : 2.6.18.8-0.9-xen
>> version : #1 SMP Sun Feb 10 22:48:05 UTC 2008
>> machine : x86_64
>> nr_cpus : 4
>> nr_nodes : 1
>> sockets_per_node : 2
>> cores_per_socket : 2
>> threads_per_core : 1
>> cpu_mhz : 2327
>> hw_caps : bfebfbff:20100800:00000000:00000140:0004e3bd:00000000:00000001
>> total_memory : 32766
>> free_memory : 21607
>> max_free_memory : 21607
>> max_para_memory : 21603
>> max_hvm_memory : 21544
>> xen_major : 3
>> xen_minor : 0
>> xen_extra : .3_11774-23
>> xen_caps : xen-3.0-x86_64
>> xen_pagesize : 4096
>> platform_params : virt_start=0xffff800000000000
>> xen_changeset : 11774
>> cc_compiler : gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)
>> cc_compile_by : abuild
>> cc_compile_domain : suse.de
>> cc_compile_date : Thu Jan 10 21:22:54 UTC 2008
>> xend_config_format : 2
>> -------------------------------------------------------------------------
>>
>> "xm info" output on host2, openSUSE 10.3 (X86-64)
>>
>> release : 2.6.22.13-0.3-xen
>> version : #1 SMP 2007/11/19 15:02:58 UTC
>> machine : x86_64
>> nr_cpus : 8
>> nr_nodes : 1
>> sockets_per_node : 2
>> cores_per_socket : 4
>> threads_per_core : 1
>> cpu_mhz : 3000
>> hw_caps : bfebfbff:20100800:00000000:00000140:0004e3bd:00000000:00000001
>> total_memory : 16382
>> free_memory : 591
>> max_free_memory : 591
>> max_para_memory : 587
>> max_hvm_memory : 577
>> xen_major : 3
>> xen_minor : 1
>> xen_extra : .0_15042-51
>> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p
>> xen_scheduler : credit
>> xen_pagesize : 4096
>> platform_params : virt_start=0xffff800000000000
>> xen_changeset : 15042
>> cc_compiler : gcc version 4.2.1 (SUSE Linux)
>> cc_compile_by : abuild
>> cc_compile_domain : suse.de
>> cc_compile_date : Tue Sep 25 21:16:06 UTC 2007
>> xend_config_format : 4
>>
>>
>
--
--------------------------------
Marc Teichgraeber
Systemadministrator
Systemadministration
neofonie GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 185
fax: +49.30 24627 120
marc.teichgraeber@xxxxxxxxxxx
http://www.neofonie.de
Handelsregister
Berlin-Charlottenburg: HRB 67460
Geschaeftsfuehrung
Helmut Hoffer von Ankershoffen
Nurhan Yildirim
--------------------------------
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users