xen-devel

Re: [Xen-devel] vbd devices stuck in Initialising/InitWait

To: Ewan Mellor <ewan@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] vbd devices stuck in Initialising/InitWait
From: "Christopher S. Aker" <caker@xxxxxxxxxxxx>
Date: Wed, 29 Mar 2006 17:15:25 -0600
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 29 Mar 2006 23:17:20 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20060329085320.GA31336@xxxxxxxxxxxxxxxxxxxxxx>
References: <4429C801.7070900@xxxxxxxxxxxx> <20060329085320.GA31336@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5 (Windows/20051201)

Ewan Mellor wrote:
> That sounds to me like the guest kernel is crashing or deadlocking.  Do you
> see anything on the guest console?  Perhaps you could put #define DEBUG 1 at
> the top of blkfront/block.h and xenbus/xenbus_probe.c for your guest kernel,
> and try and figure out where it is getting stuck.

The domains aren't crashing: they will boot, provided the root device isn't among the missing ones.

I've added DEBUG and a bunch of printk's to talk_to_backend() and xenbus_switch_state(). The outputs are here:

http://www.theshore.net/~caker/xen/InitWait/dmesg-working.txt
http://www.theshore.net/~caker/xen/InitWait/dmesg-not_working.txt

At the bottom of both of those files, I've pasted in just the debugging messages in order.
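
For readers who have not used this kind of compile-time DEBUG switch, here is a rough userspace sketch of the idea. It is not the code from blkfront/block.h or xenbus_probe.c: the names debug_log, debug_lines and traced_switch are hypothetical, and fprintf stands in for printk. It only illustrates how a DPRINTK-style macro can prefix each message with the function name and line number, and how it compiles away when DEBUG is not defined.

```c
#include <stdarg.h>
#include <stdio.h>

#define DEBUG 1   /* remove this line to compile the trace output away */

static int debug_lines;   /* hypothetical counter, only here to make the sketch checkable */

/* Hypothetical helper: prints "func:line message" to stderr. */
static void debug_log(const char *func, int line, const char *fmt, ...)
{
    va_list ap;

    fprintf(stderr, "%s:%d ", func, line);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    debug_lines++;
}

#ifdef DEBUG
#define DPRINTK(...) debug_log(__func__, __LINE__, __VA_ARGS__)
#else
#define DPRINTK(...) ((void)0)
#endif

/* Hypothetical traced function, loosely shaped like the messages in the dmesg files. */
static void traced_switch(const char *nodename, int state)
{
    DPRINTK("nodename=%s state=%d - entering", nodename, state);
    /* the real work would happen here */
    DPRINTK("nodename=%s state=%d - finished", nodename, state);
}
```

Calling traced_switch("device/vbd/770", 3) would emit one "entering" and one "finished" line, each tagged with the calling function and line number.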

There are two main differences:

First difference: when devices are missing, talk_to_backend() calls xenbus_switch_state() twice for the same vbd, and on the second call xenbus_switch_state() skips writing the identical value to xenstore (as it is supposed to). Why the duplicate calls?
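
To make the "skips the identical write" behaviour concrete, here is a minimal userspace model of it. This is a sketch under assumptions, not the kernel's xenbus_switch_state(): struct model_dev, store_writes and model_switch_state are hypothetical names, and the xenstore write is reduced to a counter.

```c
#include <stdio.h>

/* Hypothetical stand-in for the frontend device; not the kernel's struct xenbus_device. */
struct model_dev {
    const char *nodename;
    int state;          /* last state this end recorded for itself */
    int store_writes;   /* counts simulated xenstore writes */
};

/*
 * Returns 1 if a (simulated) xenstore write was issued,
 * 0 if the call was skipped because the state is unchanged.
 */
static int model_switch_state(struct model_dev *dev, int state)
{
    if (state == dev->state)
        return 0;           /* corresponds to the "state == dev->state" message */

    dev->store_writes++;    /* stands in for the write to <nodename>/state */
    dev->state = state;
    return 1;
}
```

With a device that starts in state 1, two back-to-back calls with state 3 produce exactly one write. So in this model the duplicate call is harmless by itself; the open question is why it is made at all.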

Second difference: even though there were two calls to xenbus_switch_state() to set the state to 3, later on xenbus_probe only detects the state as 2.

talk_to_backend - about to call xenbus_switch_state
xenbus_switch_state() nodename=device/vbd/770 state=3 - entering
xenbus_switch_state() nodename=device/vbd/770 state=3 - finished
talk_to_backend - about to call xenbus_switch_state
xenbus_switch_state() nodename=device/vbd/770 state=3 - entering
xenbus_switch_state() nodename=device/vbd/770 state=3 - state == dev->state

but then:
xenbus_probe (otherend_changed:302) state is 2, /local/domain/0/backend/vbd/151/770/state, /local/domain/0/backend/vbd/151/770/state.
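
For reference when reading the numbers in these messages, here is a small sketch that maps the numeric states to names. The values follow the XenbusState enumeration in Xen's public io/xenbus.h as far as I recall it; the enum tag xenbus_state_model and the helper state_name are hypothetical and exist only for this illustration. Note also that the path printed by otherend_changed is under /local/domain/0/backend/, so, if I read the message correctly, the 2 may be the backend's state rather than the frontend's own node.

```c
#include <stdio.h>

/* Values assumed to match XenbusState in Xen's public io/xenbus.h. */
enum xenbus_state_model {
    XenbusStateUnknown      = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait     = 2,
    XenbusStateInitialised  = 3,
    XenbusStateConnected    = 4,
    XenbusStateClosing      = 5,
    XenbusStateClosed       = 6
};

/* Hypothetical helper: turn a numeric state from a log line into a readable name. */
static const char *state_name(int state)
{
    switch (state) {
    case XenbusStateUnknown:      return "Unknown";
    case XenbusStateInitialising: return "Initialising";
    case XenbusStateInitWait:     return "InitWait";
    case XenbusStateInitialised:  return "Initialised";
    case XenbusStateConnected:    return "Connected";
    case XenbusStateClosing:      return "Closing";
    case XenbusStateClosed:       return "Closed";
    default:                      return "(unrecognised)";
    }
}
```

Read this way, the frontend messages show a switch to Initialised (3), while the otherend_changed message reports InitWait (2).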

So, is this a deadlock or locking issue like you suspected?

Thanks,
-Chris
