xen-devel

Re: [Xen-devel] xend falls over *a lot* in past 2 weeks

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] xend falls over *a lot* in past 2 weeks
From: Dan Smith <danms@xxxxxxxxxx>
Date: Tue, 13 Sep 2005 13:03:04 -0700
Cc: Sean Dague <sean@xxxxxxxxx>

SD> Any thoughts on why this might be the case?

As far as I can tell, the problem is caused by stale state kept in
xend.  In my testing, I repeatedly create and destroy a domain with
the same name.  Sometimes a destroy operation marks the domain in
XenDomainDict as "terminated" but doesn't actually remove it.  xend
then allows another domain with the same name to be created, which
corrupts its internal domain list.  Next, the create routines in xend
try to unpause the domain by name, but the lookup turns up the old
domain's record, and therefore the old domid.  The unpause routine
then calls libxc to unpause the old domid, which isn't found, so
ESRCH ("No such process") is returned.
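
To make the sequence above concrete, here is a toy Python model of it.
This is not xend's code: ToyXend, ToyHypervisor, and all of their
methods are hypothetical names invented for illustration.  Only the
"marked terminated but not removed" behaviour, the lookup by name, and
the ESRCH result are taken from the description above.

```python
import errno


class ToyHypervisor:
    """Hypothetical stand-in for libxc: it only tracks live domids."""

    def __init__(self):
        self.live = set()
        self.next_domid = 1

    def create(self):
        domid = self.next_domid
        self.next_domid += 1
        self.live.add(domid)
        return domid

    def destroy(self, domid):
        self.live.discard(domid)

    def unpause(self, domid):
        # Unpausing a domid that is not live fails with ESRCH.
        if domid not in self.live:
            raise OSError(errno.ESRCH, "No such process")


class ToyXend:
    """Hypothetical model of the daemon's list of domain records."""

    def __init__(self, hv):
        self.hv = hv
        self.records = []  # ordered list of dicts: name, domid, state

    def lookup(self, name):
        # Returns the first record with a matching name, stale or not.
        for rec in self.records:
            if rec["name"] == name:
                return rec
        return None

    def destroy(self, name):
        rec = self.lookup(name)
        self.hv.destroy(rec["domid"])
        rec["state"] = "terminated"  # modelled bug: marked, never removed

    def create(self, name):
        domid = self.hv.create()
        self.records.append({"name": name, "domid": domid, "state": "paused"})
        # Modelled bug: the unpause step resolves the domid by name,
        # so a stale record with the same name wins the lookup.
        self.hv.unpause(self.lookup(name)["domid"])


def reproduce():
    """Create, destroy, and re-create a domain with the same name."""
    xend = ToyXend(ToyHypervisor())
    xend.create("test")
    xend.destroy("test")
    try:
        xend.create("test")
    except OSError as exc:
        return exc.errno
    return None
```

In this model the second create registers a new record but unpauses
the domid from the stale one, so reproduce() returns errno.ESRCH.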

It seems to me that there are (at least) two problems here:

1. The domain objects in xend's list sometimes seem to stick around
   longer than they should after a destroy operation.

2. Xend will create a duplicate domain if asked, and therefore will
   corrupt its own internal list.

I'm testing a patch right now that will cause xend to do a quick
sanity check before creating a domain to make sure that the list does
not currently contain a domain object of the same name.
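
The patch itself is not included in this message, so the following is
only a guess at the shape such a check could take, written as a
self-contained Python sketch.  DuplicateDomainError, check_name_unused,
create_domain, and the dict-based record layout are all assumptions
made for illustration, not names from xend.

```python
class DuplicateDomainError(Exception):
    """Raised when a record with the requested name already exists."""


def check_name_unused(records, name):
    """Refuse to create a domain whose name is already in the list.

    `records` is any iterable of dicts with a "name" key.  A record
    counts even if it is marked "terminated", because a stale entry is
    exactly the case that leads to the wrong domid being looked up.
    """
    for rec in records:
        if rec["name"] == name:
            raise DuplicateDomainError(
                "domain %r already in list (state=%s)"
                % (name, rec.get("state", "unknown"))
            )


def create_domain(records, name, domid):
    """Add a record for a new domain, but only after the sanity check."""
    check_name_unused(records, name)
    records.append({"name": name, "domid": domid, "state": "paused"})
    return records[-1]
```

A check like this only addresses problem 2.  It turns silent list
corruption into an explicit error, but a record that lingers after a
destroy (problem 1) would still block re-creation until it is removed.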

-- 
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@xxxxxxxxxx
