To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Error: Device 0 (vif) could not be connected. Hotplug scripts not working
From: Helmut Wieser <helmut.wieser@xxxxxxxx>
Date: Wed, 02 Jun 2010 09:29:14 +0200

OK, now this is starting to get interesting. I previously had xen-netback statically compiled into the kernel.
Statically linked drivers are hard to debug, so I changed it to build as a module. And lo and behold, the kernel oops disappeared.

It is not a stable solution though. Sometimes I still get the same oops, so it seems to be a race condition.
I'm running 2.6.32.14 from Jeremy's xen/stable-2.6.32.x.


On 01.06.2010 11:53, Helmut Wieser wrote:
Doesn't seem to make a difference.
I even downgraded to udevd 141, no change.

I found my problem here, and applied the patch from http://lists.xensource.com/archives/html/xen-devel/2010-05/msg01462.html
But as it's incomplete it didn't help me with my configuration.
I even tried to compile 2.6.32.14 and still have the same issue.

This is the relevant part of my drivers/xen/netback/netbus.c:
static int netback_uevent(struct xenbus_device *xdev, struct kobj_uevent_env *env)
{
        struct backend_info *be;
        struct xen_netif *netif;
        char *val;

        DPRINTK("netback_uevent");

        be = dev_get_drvdata(&xdev->dev);
        if (!be)
                return 0;
        netif = be->netif;

        val = xenbus_read(XBT_NIL, xdev->nodename, "script", NULL);
        if (IS_ERR(val)) {
                int err = PTR_ERR(val);
                xenbus_dev_fatal(xdev, err, "reading script");
                return err;
        }
        else {
                if (add_uevent_var(env, "script=%s", val)) {
                        kfree(val);
                        return -ENOMEM;
                }
                kfree(val);
        }

        if (add_uevent_var(env, "vif=%s", netif->dev->name))
                return -ENOMEM;

        return 0;
}

This is the dmesg output when I start an HVM domU for the first time:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
IP: [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
PGD 1e2bc067 PUD 1dd03067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/vif-1-0/uevent
CPU 7
Modules linked in: bridge stp llc ipv6 xen_netfront firewire_sbp2 snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd tpm_tis soundcore tpm serio_raw snd_page_alloc pcspkr tpm_bios wmi firewire_ohci usb_storage firewire_core crc_itu_t tg3 floppy [last unloaded: scsi_wait_scan]
Pid: 2141, comm: udevd Not tainted 2.6.32.14 #6 HP Z600 Workstation
RIP: e030:[<ffffffff8123610a>]  [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
RSP: e02b:ffff88001d21fda8  EFLAGS: 00010246
RAX: 00200000000000c1 RBX: ffff88001cccde00 RCX: 0000000000800046
RDX: ffff88001d7e3b00 RSI: ffffea00006739a8 RDI: 00200000000002c0
RBP: ffff88001d21fdc8 R08: 0000000000000000 R09: ffffffff815c7cf0
R10: ffff88001e292904 R11: ffff88001e292154 R12: ffff88001e292000
R13: 0000000000000000 R14: ffff88001d7e3b80 R15: ffff88001e7be000
FS:  00007f7591154790(0000) GS:ffff880002ca2000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000110 CR3: 000000001d240000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 2141, threadinfo ffff88001d21e000, task ffff88001e6a16e0)
Stack:
 ffff88001cccde40 ffff88001e292000 ffff88001cccde00 ffffffff815fe0e8
<0> ffff88001d21fdf8 ffffffff8122badc ffff88001cccde40 ffff88001e292000
<0> ffff88001fc49f60 ffff88001cccde50 ffff88001d21fe28 ffffffff81266772
Call Trace:
 [<ffffffff8122badc>] xenbus_uevent_backend+0x90/0xab
 [<ffffffff81266772>] dev_uevent+0x102/0x146
 [<ffffffff81267459>] show_uevent+0x81/0xd8
 [<ffffffff81266434>] dev_attr_show+0x22/0x49
 [<ffffffff810a0e41>] ? __get_free_pages+0x9/0x46
 [<ffffffff8112561c>] sysfs_read_file+0xac/0x12e
 [<ffffffff810d459f>] vfs_read+0xa6/0x103
 [<ffffffff810d46b2>] sys_read+0x45/0x69
 [<ffffffff81012a82>] system_call_fastpath+0x16/0x1b
Code: c6 79 48 53 81 31 c0 4c 89 e7 e8 ea 03 f8 ff 85 c0 74 10 4c 89 f7 41 bc f4 ff ff ff e8 99 3e e9 ff eb 2d 4c 89 f7 e8 8f 3e e9 ff <49> 8b 95 10 01 00 00 4c 89 e7 31 c0 48 c7 c6 83 48 53 81 41 bc
RIP  [<ffffffff8123610a>] netback_uevent+0x8e/0xbf
 RSP <ffff88001d21fda8>
CR2: 0000000000000110
---[ end trace 4f88c9bf70342ee1 ]---

I don't get it, because the patch is supposed to prevent NULL pointer dereferences. Either xdev itself is corrupt, or it is returning corrupt data.
I'm stumped.
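
[Editor's sketch, not part of the original message.] One way to narrow down a NULL dereference like this is to decode the faulting instruction from the "Code:" line and compare its base register with the register dump. The Python below is a minimal sketch under stated assumptions: it only understands one instruction form (a REX.W `mov` load with a 32-bit displacement, opcode 0x8b), and the helper names `faulting_bytes`, `decode_mov_load` and `register_value` are hypothetical, not taken from any kernel tool.

```python
import re

# x86-64 general-purpose registers in hardware encoding order.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onward in an oops 'Code:' line."""
    tokens = code_line.split()
    if tokens and tokens[0] == "Code:":
        tokens = tokens[1:]
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            tail = [tok[1:-1]] + tokens[i + 1:]
            return bytes(int(t, 16) for t in tail)
    raise ValueError("no <xx> marker found")


def decode_mov_load(insn):
    """Decode only 'REX.W mov r64, [base + disp32]' (opcode 0x8b, mod=10)."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if (rex & 0xF0) != 0x40 or not (rex & 0x08) or opcode != 0x8B:
        raise ValueError("not a REX.W 0x8b load")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2:
        raise ValueError("only disp32 addressing is handled")
    pos = 3
    if rm == 4:  # a SIB byte follows the ModRM byte
        sib = insn[pos]
        pos += 1
        if ((sib >> 3) & 7) != 4 or (rex & 0x02):
            raise ValueError("indexed addressing is not handled")
        rm = sib & 7
    dest = REGS[reg | ((rex & 0x04) << 1)]   # REX.R extends the reg field
    base = REGS[rm | ((rex & 0x01) << 3)]    # REX.B extends the base field
    disp = int.from_bytes(insn[pos:pos + 4], "little", signed=True)
    return dest, base, disp


def register_value(dump, name):
    """Look up a 64-bit register value such as 'R13: 0000...' in an oops dump."""
    match = re.search(r"\b%s: ([0-9a-f]{16})" % name.upper(), dump)
    if match is None:
        raise KeyError(name)
    return int(match.group(1), 16)


# Tail of the "Code:" line from the 2.6.32.14 oops above.
code = "Code: 4c 89 f7 e8 8f 3e e9 ff <49> 8b 95 10 01 00 00 4c 89 e7"
dest, base, disp = decode_mov_load(faulting_bytes(code))
print(dest, base, hex(disp))  # prints: rdx r13 0x110
```

In the 2.6.32.14 dump above, R13 is 0 and CR2 is 0x110, so the faulting load reads offset 0x110 from a NULL base. That would be consistent with `be->netif` being NULL when `netif->dev` is evaluated (the code checks `be` but not `netif`), though this is an inference from the dump and not something confirmed in the thread. The same decoder handles the SIB-encoded form in the 2.6.31.13 oops quoted below (`<49> 8b 94 24 10 01 00 00`, base r12, where R12 is also 0).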


On 01.06.2010 09:03, Helmut Wieser wrote:
No joy. I couldn't find out what CONFIG_XEN_SYSFS does, but it doesn't seem to be part of 2.6.31.13. I set all the other options apart from wireless that you suggested.

I'll try to use Zhang Enming's kernel config next.

Oh, and of course I get the infamous oops from bug 1612; I just never noticed it because my console doesn't work with gfx passthru.
Here's the output:

[  167.571125]   alloc irq_desc for 826 on node 0
[  167.571131]   alloc kstat_irqs on node 0
[  167.724755] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
[  167.724943] IP: [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[  167.725066] PGD 1d5e6067 PUD 1ddd2067 PMD 0
[  167.725296] Oops: 0000 [#1] SMP
[  167.725472] last sysfs file: /sys/devices/vif-1-0/uevent
[  167.725544] CPU 2
[  167.725653] Modules linked in: bridge stp xenfs blktap pci_hotplug xen_blkfront xen_netfront xen_evtchn loop firewire_sbp2 snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep usbhid snd_pcm snd_timer snd hid wmi processor soundcore acpi_processor pcspkr psmouse snd_page_alloc serio_raw button evdev ext3 jbd mbcache usb_storage sr_mod sd_mod crc_t10dif cdrom tg3 firewire_ohci floppy thermal ahci firewire_core libphy libata thermal_sys crc_itu_t scsi_mod uhci_hcd ehci_hcd usbcore nls_base [last unloaded: scsi_wait_scan]
[  167.728264] Pid: 1955, comm: udevd Not tainted 2.6.31.13 #3 HP Z600 Workstation
[  167.728356] RIP: e030:[<ffffffff8124ad7c>]  [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[  167.728497] RSP: e02b:ffff880002095d98  EFLAGS: 00010246
[  167.728569] RAX: 0000000000000000 RBX: ffff88000231ea00 RCX: 000000000080007c
[  167.728644] RDX: ffff88001d3ab440 RSI: 00000000a3c9a148 RDI: 01000000000002c0
[  167.728722] RBP: ffff88000244e000 R08: 0000000000000000 R09: 0000000000000000
[  167.728797] R10: ffffffff8100eddf R11: 00000000a3c9a148 R12: 0000000000000000
[  167.728873] R13: ffff88001d3ab680 R14: ffff88001d5b9000 R15: ffffffff81502b30
[  167.728955] FS:  00007fdb8f1c5790(0000) GS:ffffc9000002e000(0000) knlGS:0000000000000000
[  167.729051] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  167.729128] CR2: 0000000000000110 CR3: 000000001dcfc000 CR4: 0000000000002660
[  167.729212] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  167.729303] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  167.729405] Process udevd (pid: 1955, threadinfo ffff880002094000, task ffff88001d5206c0)
[  167.729525] Stack:
[  167.729609]  ffffffff814608a6 00000000a3c9a148 0000000000000002 ffff88000231ea40
[  167.729835] <0> ffff88000244e000 ffff88000231ea50 ffff88000231ea50 ffffffff81283264
[  167.730179] <0> 00007fdb00000010 00000000a3c9a148 ffff880002095f50 0000000000000000
[  167.730566] Call Trace:
[  167.730639]  [<ffffffff81283264>] ? dev_uevent+0x1a2/0x207
[  167.730714]  [<ffffffff81284725>] ? show_uevent+0x92/0xfd
[  167.730790]  [<ffffffff81282e8b>] ? dev_attr_show+0x2e/0x6b
[  167.730873]  [<ffffffff810ce1b0>] ? get_zeroed_page+0x21/0x76
[  167.730957]  [<ffffffff811647fb>] ? sysfs_read_file+0xbb/0x156
[  167.731051]  [<ffffffff8100e301>] ? xen_force_evtchn_callback+0x1d/0x37
[  167.731133]  [<ffffffff8100edf2>] ? check_events+0x12/0x20
[  167.731209]  [<ffffffff81105f91>] ? vfs_read+0xb1/0x123
[  167.731285]  [<ffffffff8100eddf>] ? xen_restore_fl_direct_end+0x0/0x1
[  167.731366]  [<ffffffff81384013>] ? _spin_unlock_irqrestore+0x24/0x3e
[  167.731445]  [<ffffffff811060eb>] ? sys_read+0x55/0x90
[  167.731527]  [<ffffffff81013e42>] ? system_call_fastpath+0x16/0x1b
[  167.731619] Code: c7 c6 33 1f 45 81 31 c0 48 89 ef e8 17 f0 f7 ff 85 c0 74 0f 4c 89 ef bd f4 ff ff ff e8 66 3e eb ff eb 2b 4c 89 ef e8 5c 3e eb ff <49> 8b 94 24 10 01 00 00 48 89 ef 31 c0 48 c7 c6 3d 1f 45 81 e8
[  167.734689] RIP  [<ffffffff8124ad7c>] netback_uevent+0x90/0xd5
[  167.734818]  RSP <ffff880002095d98>
[  167.734890] CR2: 0000000000000110
[  167.734966] ---[ end trace d844f79248755c84 ]---
[  167.927689] device vif1.0 entered promiscuous mode
[  167.938021] eth0: port 2(vif1.0) entering forwarding state
[  168.100416] ip_tables: (C) 2000-2006 Netfilter Core Team
[  168.316187] nf_conntrack version 0.5.0 (4044 buckets, 16176 max)
[  168.316778] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
[  168.316873] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
[  168.316966] sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
[  168.427110] physdev match: using --physdev-out in the OUTPUT, FORWARD and POSTROUTING chains for non-bridged traffic is not supported anymore.
[  168.620471] tun: Universal TUN/TAP device driver, 1.6
[  168.620550] tun: (C) 1999-2004 Max Krasnyansky <maxk@xxxxxxxxxxxx>
[  168.668587] device tap1.0 entered promiscuous mode
[  168.668694] eth0: port 3(tap1.0) entering forwarding state
[  168.688096]   alloc irq_desc for 825 on node 0
[  168.688170]   alloc kstat_irqs on node 0

But the machine comes up for the first time and everything seems to be working fine.

I use udevd 151.


On 31.05.2010 16:27, Niels Dettenbach wrote:
On Monday, 31 May 2010, 16:13:14, Helmut Wieser wrote:
  
No, this doesn't help.
I'm currently trying to ditch the Debian kernel and compile one of
Jeremy's kernels with a config close to the one from Debian.
    
...you may try this:

 1.) make sure udev is <= 151 (I use 141 currently)


 2.) set the following in your Xen kernel config (if not already set):

CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_WIRELESS_EXT_SYSFS=y    *
CONFIG_GPIO_SYSFS=y  *
CONFIG_VIDEO_PVRUSB2_SYSFS=y
CONFIG_RTC_INTF_SYSFS=y
CONFIG_XEN_SYSFS=y
CONFIG_SYSFS=y

(* only if applies to your hardware)

(not sure if it's optimal, but it seems to work for me with 3.4x and 4.x)

=> reboot


 3.) run
	mount -t sysfs sys /sys

=> if sysfs is already mounted, you might try unmounting it before this step

I have a line
sys                     /sys            sysfs           auto 0 0

in my fstab which seems to help...
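
[Editor's sketch, not part of the original message.] The checks in steps 2 and 3 above can be scripted. The Python below is a minimal sketch, assuming the kernel config is available as plain text (its location varies by distribution) and that mount information comes from `/proc/mounts`-style text. The function names `missing_kconfig` and `sysfs_mounted` are hypothetical.

```python
def missing_kconfig(config_text, wanted):
    """Return the options from `wanted` that are not set to 'y' in a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in wanted if opt not in enabled]


def sysfs_mounted(mounts_text, mountpoint="/sys"):
    """Check /proc/mounts-style text for a sysfs filesystem at `mountpoint`."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[1] == mountpoint and fields[2] == "sysfs":
            return True
    return False
```

For example, passing the contents of the kernel's `.config` and the option names listed in step 2 to `missing_kconfig` returns the ones still unset, and `sysfs_mounted(open("/proc/mounts").read())` reports whether step 3 is already in effect. An option that does not exist in a given kernel version (as reported for CONFIG_XEN_SYSFS later in the thread) simply shows up as missing.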

Maybe this is largely unnecessary, but it seems to help me, so please don't hit me...
;)

Another thing: you might still have fragments of your (too new) udev config
left over from before downgrading.

I'm working with Gentoo, which compiles things the way I want, so I don't have
a full picture of what your distribution and its package management handle well,
and what not, with your (udev) configs...


Maybe this helps,


Niels.

_______________________________________________ Xen-users mailing list Xen-users@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-users