WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

[Xen-devel] A fix for the xend restart problems (2.0.x)

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] A fix for the xend restart problems (2.0.x)
From: Jed Davis <jdev@xxxxxxxxx>
Date: Fri, 19 Aug 2005 22:06:46 -0400

The basic problem, which judging from the list archives I'm not the
only one running into: the first time xend is restarted while any
guests are running, it immediately dies with an exception along the
lines of "Invalid backend domain" after destroying one of the domUs.
Further attempts to restart it fail with "Failed to map domain control
interface" -- unless the dom0 kernel is NetBSD built with DIAGNOSTIC,
in which case it panics.

After far too much time assuming this was a NetBSD-specific problem, I
eventually tracked it down in xend, and have this patch, which
probably isn't the Right solution, but nonetheless works:

--- tools/python/xen/xend/XendDomain.py.orig    2005-08-13 01:54:56.000000000 -0400
+++ tools/python/xen/xend/XendDomain.py 2005-08-13 01:55:17.000000000 -0400
@@ -147,7 +147,10 @@
             domid = str(d['dom'])
             doms[domid] = d
         dlist = []
-        for config in self.domain_db.values():
+        domkeys = map(int, self.domain_db.keys())
+        domkeys.sort()
+        for domkey in domkeys:
+            config = self.domain_db.get(str(domkey))
             domid = str(sxp.child_value(config, 'id'))
             if domid in doms:
                 d_dom = self._new_domain(config, doms[domid])

This change in traversal order avoids the exception shown below.  That
exception occurs while a domU's info is being reconstructed: its
devices' backend domain (here, dom0) is looked up, but doesn't appear
to exist yet, because it hasn't yet been restored from the state files
(or by querying the hypervisor, for that matter).  I assume xend then
tries to destroy the domain because this code path is shared with a
domain's actual creation.  The idea of the above patch, then, is to
restore the domains' state in the same order as they were created, by
walking the saved entries in ascending numeric order of their keys,
which should put dom0 first.
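
To make the ordering argument concrete, here is a toy model, not xend's
real code.  Registry, restore_all and the config dicts are hypothetical
stand-ins.  It assumes the saved-state keys are domain IDs stored as
strings, and that IDs are handed out in increasing order, which is what
the patch relies on.  It also shows why the keys are converted with
int() before sorting.

```python
# Toy model only: Registry, restore_all() and the config dicts are
# hypothetical stand-ins, not xend's real classes or saved-state format.


class InvalidDomain(Exception):
    pass


class Registry:
    def __init__(self):
        self.restored = {}  # domain ID (int) -> config

    def lookup(self, domid):
        if domid not in self.restored:
            raise InvalidDomain("invalid domain:%d" % domid)
        return self.restored[domid]

    def restore(self, domid, config):
        # A guest's devices need their backend domain to be known already.
        backend = config.get("backend")
        if backend is not None:
            self.lookup(backend)
        self.restored[domid] = config


def restore_all(saved, order):
    reg = Registry()
    for key in order:
        reg.restore(int(key), saved[key])
    return reg


# Keys are strings here, mirroring the str()/int() conversions in the patch.
saved = {
    "10": {"backend": 0},
    "2": {"backend": 0},
    "0": {},  # dom0 has no backend of its own
}

# An order that reaches a guest before dom0 fails on the backend lookup.
try:
    restore_all(saved, ["10", "2", "0"])
    failed = False
    message = ""
except InvalidDomain as exc:
    failed = True
    message = str(exc)

# Numeric order: dom0 first, then guests by ascending ID.
numeric_order = sorted(saved, key=int)
reg = restore_all(saved, numeric_order)

# Sorting the strings directly would put "10" before "2".
string_order = sorted(saved)
```

With only dom0 as a backend, either sort happens to work, since "0"
sorts first both ways.  The numeric sort matters once some other domain
acts as a backend for a guest created after it.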

This is the trace of the exception in question -- normally it gets
caught partway up and the "invalid backend domain" exception is thrown
from there, but I commented out the try/except so I could see that
first exception:

Traceback (most recent call last):
  File "/usr/local/sbin/xend", line 121, in ?
    sys.exit(main())
  File "/usr/local/sbin/xend", line 107, in main
    return daemon.start()
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDaemon.py", line 525, in start
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDaemon.py", line 615, in run
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvServer.py", line 47, in create
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvRoot.py", line 29, in __init__
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDir.py", line 69, in get
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDir.py", line 39, in getobj
  File "/pkg/xentools-2.0.6/usr/lib/python/xen/xend/server/SrvDomainDir.py", line 25, in __init__
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 800, in instance
    inst = XendDomain()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 65, in __init__
    self.initial_refresh()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 154, in initial_refresh
    d_dom = self._new_domain(config, doms[domid])
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 189, in _new_domain
    deferred = XendDomainInfo.vm_recreate(savedinfo, info)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 218, in vm_recreate
    d = vm.construct(config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 456, in construct
    deferred = self.configure()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 975, in configure
    d = self.create_devices()
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 803, in create_devices
    v = dev_handler(self, dev, dev_index)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1110, in vm_dev_vif
    defer = ctrl.attachDevice(vif, val, recreate=recreate)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 423, in attachDevice
    dev = self.addDevice(vif, config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 400, in addDevice
    dev = NetDev(vif, self, config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 105, in __init__
    self.configure(config)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/server/netif.py", line 150, in configure
    self.backendDomain = int(xd.domain_lookup(sxp.child_value(config, 'backend', '0')).id)
  File "/usr/local/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 430, in domain_lookup
    raise XendError('invalid domain:' + name)
xen.xend.XendError.XendError: invalid domain:0
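
For what it's worth, the masking described before the trace (a
try/except partway up that replaces the first exception with a
different one) can be reproduced in a few lines.  This is only a sketch
with made-up names (DaemonError, lookup_backend, configure_masked); it
is not how xend is actually structured.  It shows one way to keep the
original traceback visible without commenting out the handler: record
it with traceback.format_exc() before raising the replacement.

```python
import traceback


class DaemonError(Exception):
    """Hypothetical stand-in for a daemon-level error type."""


def lookup_backend(name):
    # Stand-in for the failing lookup at the bottom of the trace.
    raise KeyError("invalid domain:" + name)


def configure_masked(log):
    try:
        lookup_backend("0")
    except Exception:
        # Record the original traceback before replacing the exception,
        # so the root cause is not lost.
        log.append(traceback.format_exc())
        raise DaemonError("invalid backend domain")


log = []
try:
    configure_masked(log)
    outer = None
except DaemonError as exc:
    outer = str(exc)
```

The caller still sees only the replacement error, but log[0] holds the
text of the first traceback, including the frame where the lookup
failed.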



-- 
(let ((C call-with-current-continuation)) (apply (lambda (x y) (x y)) (map
((lambda (r) ((C C) (lambda (s) (r (lambda l (apply (s s) l))))))  (lambda
(f) (lambda (l) (if (null? l) C (lambda (k) (display (car l)) ((f (cdr l))
(C k)))))))    '((#\J #\d #\D #\v #\s) (#\e #\space #\a #\i #\newline)))))

