WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

Re: [Xen-users] xen kernel errors

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] xen kernel errors
From: John McMonagle <johnm@xxxxxxxxxxx>
Date: Mon, 25 Jul 2011 07:10:25 -0500
Delivery-date: Mon, 25 Jul 2011 05:11:49 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201107221609.39312.johnm@xxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Organization: Advocap Inc
References: <201107221609.39312.johnm@xxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.13.5 (Linux/2.6.32-5-amd64; KDE/4.4.5; x86_64; ; )
Still having the problem.

Concerning the igb error, I have tried the following, one at a time:
New igb driver from the Intel site.
Kernel parameter pcie_aspm=off
ethtool -K eth0 tx off (on dom0)
ethtool -K eth0 gro off (on dom0)

Concerning the enlighten.c error: while it does not seem to do anything, it
concerns me. If it relates to power management, maybe it's the real cause of
the problems.

I got the Debian kernel source, and line 726 is the line marked with *****

static void xen_clts(void)
{
        struct multicall_space mcs;  /* ***** line 726 */

        mcs = xen_mc_entry(0);

        MULTI_fpu_taskswitch(mcs.mc, 0);

        xen_mc_issue(PARAVIRT_LAZY_CPU);
}

Means nothing to me :-(

It has never died while running iperf between dom0 or a domU and an external host.
It has never died during a network backup.

It usually takes at least a few hours to fail, and it has never made it a full day while running a domU.
Any ideas?
I'm pretty much down to trying different network cards.
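
To compare failure times across runs, it may help to pull the watchdog timeouts out of the kernel log. The following is only a sketch: the function name watchdog_timeouts is made up for illustration, and it assumes the log lines look like the "NETDEV WATCHDOG" message quoted below. It reads kernel log text on stdin and prints the timestamp, interface, driver, and queue number for each timeout.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Xen or igb tooling).
# Reads kernel log text on stdin and prints one line per transmit timeout:
#   <seconds-since-boot> <interface> <driver> <queue>
watchdog_timeouts() {
    sed -n 's/.*\[ *\([0-9.]*\)] NETDEV WATCHDOG: \([^ ]*\) (\([^)]*\)): transmit queue \([0-9]*\) timed out.*/\1 \2 \3 \4/p'
}

# Example use (assumes dmesg is readable by the current user):
#   dmesg | watchdog_timeouts
```

Run against a saved log from each attempt, this shows how long the box stayed up before the first timeout and whether it is always the same queue.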

John




On Friday, July 22, 2011 04:09:39 pm John McMonagle wrote:
> I have a new AMD 6100-based server.
> http://www.supermicro.com/Aplus/system/2U/2022/AS-2022G-URF.cfm
> 
> I'm seeing two errors.
> During boot I get this:
> 
> [    0.004823] ------------[ cut here ]------------
> [    0.004833] WARNING: at /build/buildd-linux-2.6_2.6.32-35-amd64-aZSlKL/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/enlighten.c:726 init_hw_perf_events+0x32d/0x3cd()
> [    0.004838] Hardware name: H8DGU
> [    0.004841] Modules linked in:
> [    0.004847] Pid: 0, comm: swapper Not tainted 2.6.32-5-xen-amd64 #1
> [    0.004850] Call Trace:
> [    0.004857]  [<ffffffff81510efc>] ? init_hw_perf_events+0x32d/0x3cd
> [    0.004862]  [<ffffffff81510efc>] ? init_hw_perf_events+0x32d/0x3cd
> [    0.004870]  [<ffffffff8104ef00>] ? warn_slowpath_common+0x77/0xa3
> [    0.004875]  [<ffffffff81510efc>] ? init_hw_perf_events+0x32d/0x3cd
> [    0.004881]  [<ffffffff813044dc>] ? identify_cpu+0x2f7/0x300
> [    0.004888]  [<ffffffff8100eccf>] ? xen_restore_fl_direct_end+0x0/0x1
> [    0.004895]  [<ffffffff810e81d5>] ? kmem_cache_alloc+0x8c/0xf0
> [    0.004900]  [<ffffffff81510a16>] ? identify_boot_cpu+0x15/0x3e
> [    0.004904]  [<ffffffff81510baa>] ? check_bugs+0x9/0x2e
> [    0.004910]  [<ffffffff81509cce>] ? start_kernel+0x3cd/0x3e8
> [    0.004915]  [<ffffffff8150bc93>] ? xen_start_kernel+0x586/0x58a
> [    0.004926] ---[ end trace a7919e7f17c0a725 ]---
> [    0.004930] ... version:                0
> [    0.004932] ... bit width:              48
> [    0.004935] ... generic registers:      4
> [    0.004938] ... value mask:             0000ffffffffffff
> [    0.004940] ... max period:             00007fffffffffff
> [    0.004943] ... fixed-purpose events:   0
> [    0.004946] ... event mask:             000000000000000f
> 
> Have not noticed any particular problems.
> What is it?
> What can be done?
> 
> The next one may not be Xen-related, but I have only had the problem after
> running a domU. After a while I get a kernel error and networking stops.
> This is the error:
> [ 1411.813376] ------------[ cut here ]------------
> [ 1411.813398] WARNING: at /build/buildd-linux-2.6_2.6.32-35-amd64-aZSlKL/linux-2.6-2.6.32/debian/build/source_amd64_xen/net/sched/sch_generic.c:261 dev_watchdog+0xe2/0x194()
> [ 1411.813410] Hardware name: H8DGU
> [ 1411.813417] NETDEV WATCHDOG: peth0 (igb): transmit queue 1 timed out
> [ 1411.813424] Modules linked in: xt_physdev iptable_filter tun ip_tables x_tables bridge stp sg sr_mod cdrom xfs exportfs ipmi_si ipmi_devintf ipmi_watchdog ipmi_msghandler xen_evtchn blktap xenfs loop snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr psmouse joydev evdev serio_raw i2c_piix4 edac_core k10temp edac_mce_amd i2c_core processor button acpi_processor ext4 mbcache jbd2 crc16 usbhid hid dm_mod raid1 md_mod sd_mod crc_t10dif ata_generic usb_storage pata_atiixp ahci ohci_hcd libata ehci_hcd usbcore nls_base scsi_mod igb dca thermal thermal_sys [last unloaded: scsi_wait_scan]
> [ 1411.813656] Pid: 4, comm: ksoftirqd/0 Tainted: G        W 2.6.32-5-xen-amd64 #1
> [ 1411.813664] Call Trace:
> [ 1411.813671]  <IRQ>  [<ffffffff81272e42>] ? dev_watchdog+0xe2/0x194
> [ 1411.813697]  [<ffffffff81272e42>] ? dev_watchdog+0xe2/0x194
> [ 1411.813711]  [<ffffffff8104ef00>] ? warn_slowpath_common+0x77/0xa3
> [ 1411.813724]  [<ffffffff81272d60>] ? dev_watchdog+0x0/0x194
> [ 1411.813736]  [<ffffffff8104ef88>] ? warn_slowpath_fmt+0x51/0x59
> [ 1411.813751]  [<ffffffff8130d42a>] ? _spin_unlock_irqrestore+0xd/0xe
> [ 1411.813762]  [<ffffffff8104b41e>] ? try_to_wake_up+0x289/0x29b
> [ 1411.813778]  [<ffffffff81272d34>] ? netif_tx_lock+0x3d/0x69
> [ 1411.813791]  [<ffffffff8125d7da>] ? netdev_drivername+0x3b/0x40
> [ 1411.813803]  [<ffffffff81272e42>] ? dev_watchdog+0xe2/0x194
> [ 1411.813816]  [<ffffffff8100ece2>] ? check_events+0x12/0x20
> [ 1411.813827]  [<ffffffff81040e42>] ? check_preempt_wakeup+0x0/0x268
> [ 1411.813841]  [<ffffffff8105b5ef>] ? run_timer_softirq+0x1c9/0x268
> [ 1411.813855]  [<ffffffff81054c9b>] ? __do_softirq+0xdd/0x1a6
> [ 1411.813867]  [<ffffffff81012cac>] ? call_softirq+0x1c/0x30
> [ 1411.813873]  <EOI>  [<ffffffff8101422b>] ? do_softirq+0x3f/0x7c
> [ 1411.813893]  [<ffffffff810548c2>] ? ksoftirqd+0x5f/0xd3
> [ 1411.813905]  [<ffffffff81054863>] ? ksoftirqd+0x0/0xd3
> [ 1411.813915]  [<ffffffff81065c39>] ? kthread+0x79/0x81
> [ 1411.813926]  [<ffffffff81012baa>] ? child_rip+0xa/0x20
> [ 1411.813937]  [<ffffffff81011d61>] ? int_ret_from_sys_call+0x7/0x1b
> [ 1411.813948]  [<ffffffff8101251d>] ? retint_restore_args+0x5/0x6
> [ 1411.813958]  [<ffffffff81012ba0>] ? child_rip+0x0/0x20
> [ 1411.813966] ---[ end trace a7919e7f17c0a727 ]---
> [ 1412.052253] eth0: port 1(peth0) entering disabled state
> [ 1635.796207] frontend_changed: backend/vbd/3/768: prepare for reconnect
> [ 1647.137513] eth0: port 3(vif3.0) entering disabled state
> [ 1647.157527] eth0: port 3(vif3.0) entering disabled state
>  Kernel logging (proc) stopped.
> 
> In this case dom0 locked up. Sometimes just networking stops, and
> sometimes networking recovers.
> 
> I downloaded the new igb driver source from Intel and installed it.
> It's running OK so far but it has not been long enough to tell.
> 
> Any ideas?
> 
> John
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users

