xen-devel
Re: [Xen-devel] xend failure, restart doesn't work 
OK, here's some more info:
 Currently there are multiple domains up and running perfectly. I'm on
 dom0.
 
 wilde root 0 #  xm list
 (111, 'Connection refused')
 Error: Error connecting to xend, is xend running?
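
Error 111 is ECONNREFUSED on Linux, meaning nothing is listening on the port xm tries to reach. A minimal sketch of the same check from Python follows. The helper name `probe_tcp` is hypothetical, and port 8000 in the usage comment is only an assumption about the xend port on this setup; substitute whatever the local xend configuration uses.

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try a TCP connect and report 'open', 'refused', or 'error: <reason>'."""
    try:
        # create_connection resolves the host and connects, honouring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition that xm reports as (111, 'Connection refused').
        return "refused"
    except OSError as exc:
        # Timeouts, unreachable hosts, name resolution failures, and so on.
        return "error: %s" % exc


# Example call (port 8000 is an assumed value, not taken from this thread):
# print(probe_tcp("localhost", 8000))
```

A result of "refused" only says that no listener exists; it does not say why xend failed to start.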
 
 check if it's running:
 
 wilde root 0 #  ps |grep xend
 root     13347  0.0  0.3   1436   404 pts/0    D+   11:57   0:00 grep xend
 wilde root 0 #  ps |grep xfrd
 root     26122  0.0  0.3   3252   464 ?        S    Mar16   0:00 xfrd
 
 
 cleared the logs:
 wilde root 0 #  cat /var/log/xend.log /var/log/xend-debug.log
 wilde root 0 #
 
 stopping xend (just to make sure):
 wilde root 0 #  /etc/init.d/xend stop
 wilde root 0 #  cat /var/log/xend.log /var/log/xend-debug.log
 wilde root 0 #
 
 starting xend:
 
 wilde root 0 #  /etc/init.d/xend start
 .........
 wilde root 3 #
 
 (exit status 3, after a time-out of about 5 seconds)
 
 
 see what is running:
 
 
 wilde root 0 #  ps |grep xend
 root     13469  0.0  0.3   1436   472 pts/0    R+   12:03   0:00 grep xend
 wilde root 0 #  ps |grep xfr
 root     13420  0.0  0.6   3048   792 ?        S    12:02   0:00 xfrd
 root     13471  0.0  0.3   1436   472 pts/0    R+   12:04   0:00 grep xfr
 wilde root 0 #  ps |grep xcs
 root     13477  0.0  0.3   1436   472 pts/0    R+   12:04   0:00 grep xcs
 
 So only xfrd has started. Let's see what is in the logs:
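
The grep checks above also match the grep process itself, which makes the output harder to read. A rough sketch of the same check that matches on exact command names instead is shown here. It assumes the input comes from `ps -eo pid,comm`; the function names are hypothetical, and a daemon launched as `python /usr/sbin/xend` may show up as `python` rather than `xend`, so treat an empty result with some caution.

```python
import subprocess


def find_daemons(ps_output, names=("xend", "xfrd", "xcs")):
    """Map each daemon name to the list of PIDs whose command name matches exactly."""
    found = {name: [] for name in names}
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        # Skip the header line and anything that does not look like "PID COMMAND".
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        pid, comm = int(fields[0]), fields[1].strip()
        if comm in found:
            found[comm].append(pid)
    return found


def list_processes():
    """Return the raw text of `ps -eo pid,comm` (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Example with output shaped like the session above:
sample = """  PID COMMAND
13420 xfrd
13469 grep
"""
status = find_daemons(sample)
```

With the sample text, only xfrd has a PID listed; xend and xcs map to empty lists, matching what the grep output shows.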
 
 wilde root 1 #  cat /var/log/xend.log
 [2005-03-25 12:02:45 xend] INFO (SrvDaemon:610) Xend Daemon started
 
 ( i wish :S )
 
 wilde root 0 #  cat /var/log/xend-debug.log
 network start bridge=xen-br0 netdev=eth0 antispoof=yes
 Traceback (most recent call last):
   File "/usr/sbin/xend", line 121, in ?
     sys.exit(main())
   File "/usr/sbin/xend", line 107, in main
     return daemon.start()
   File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 525, in start
     self.run()
   File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 615, in run
     SrvServer.create(bridge=1)
   File "/usr/lib/python/xen/xend/server/SrvServer.py", line 47, in create
     xend = SrvRoot()
   File "/usr/lib/python/xen/xend/server/SrvRoot.py", line 29, in __init__
     self.get(name)
   File "/usr/lib/python/xen/xend/server/SrvDir.py", line 69, in get
     val = val.getobj()
   File "/usr/lib/python/xen/xend/server/SrvDir.py", line 39, in getobj
     self.obj = klassobj()
   File "/usr/lib/python/xen/xend/server/SrvDomainDir.py", line 25, in __init__
     self.xd = XendDomain.instance()
   File "/usr/lib/python/xen/xend/XendDomain.py", line 798, in instance
     inst = XendDomain()
   File "/usr/lib/python/xen/xend/XendDomain.py", line 65, in __init__
     self.initial_refresh()
   File "/usr/lib/python/xen/xend/XendDomain.py", line 153, in initial_refresh
     d_dom = self._new_domain(config, doms[domid])
   File "/usr/lib/python/xen/xend/XendDomain.py", line 188, in _new_domain
     deferred = XendDomainInfo.vm_recreate(savedinfo, info)
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 218, in vm_recreate
     d = vm.construct(config)
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 453, in construct
     self.construct_image()
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 480, in construct_image
     image_handler(self, image)
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 1065, in vm_image_linux
     vm.create_domain("linux", kernel, ramdisk, cmdline)
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 757, in create_domain
     self.create_channel()
   File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 782, in create_channel
     remote_port=remote)
   File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 660, in createDomChannel
     remote_port=remote_port)
   File "/usr/lib/python/xen/xend/server/channel.py", line 59, in domChannel
     remote_port=remote_port)
   File "/usr/lib/python/xen/xend/server/channel.py", line 229, in __init__
     remote_port=remote_port)
   File "/usr/lib/python/xen/xend/server/channel.py", line 113, in createPort
     remote_port=int(remote_port))
 xen.lowlevel.xu.PortError: Failed to map domain control interface
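
For a traceback this long, the two useful pieces are the innermost frame and the final error line. A small sketch that pulls both out of the text of xend-debug.log follows. The function name is hypothetical, and it assumes the traceback is the last thing in the text it is given; it also tolerates function names that a mailer has wrapped onto the next line.

```python
import re

# Matches: File "<path>", line <n>, in <function>
# \s+ also spans a newline, so a wrapped function name is still captured.
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in\s+(?P<func>\S+)'
)


def summarize_traceback(text):
    """Return (innermost_frame, error_line) for a Python traceback in text.

    innermost_frame is a (path, line_number, function) tuple, or None if no
    frame was found. error_line is the last non-empty line, or None.
    """
    frames = [
        (m.group("path"), int(m.group("line")), m.group("func"))
        for m in FRAME_RE.finditer(text)
    ]
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    innermost = frames[-1] if frames else None
    error = lines[-1] if lines else None
    return innermost, error


# The tail of the traceback above, including mail-style line wrapping:
sample = (
    "Traceback (most recent call last):\n"
    ' File "/usr/lib/python/xen/xend/server/channel.py", line 229, in\n'
    "__init__\n"
    " remote_port=remote_port)\n"
    ' File "/usr/lib/python/xen/xend/server/channel.py", line 113, in\n'
    "createPort\n"
    " remote_port=int(remote_port))\n"
    " xen.lowlevel.xu.PortError: Failed to map domain control interface\n"
)
frame, error = summarize_traceback(sample)
```

Here the innermost frame is createPort in channel.py, which is where the PortError is raised while xend tries to recreate a channel to an already running domain.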
 
 So there is the problem, I guess. But I don't know what it means or how
 I should fix it. Any ideas?
 
 Ian Pratt wrote:

 > > Hi,
 > > Every now and then xend seems to fail; the domains keep running but
 > > control is lost completely.
 > > A restart does not seem possible. The logfile says xend restarts, but the
 > > only thing that restarts is xfrd, binding on port 8002.
 > > The only way to get control back seems to be to reboot the machine,
 > > rebooting all of the client domains too, which is a *bad thing* for our
 > > clients.
 > > Is there any way to get control back?
 > > Is anyone working on the problem? Is there a patch or even a
 > > known cause?
 > > Are there any steps I can take (or NOT take) to prevent this
 > > from happening?
 >
 > You could try 'xend stop' and then kill 'xcs' manually, then
 > 'xend start'.
 > You'll need to give us more help to debug the actual problem you're
 > experiencing.
 > Ian
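
The sequence Ian suggests (stop xend, kill xcs, start xend) can be sketched as a small script. This is only an illustration: the function names are hypothetical, using `pkill -x` to kill xcs is an assumption (the suggestion only says to kill it manually), and the init script path is the one used in the session above. It defaults to a dry run that prints the commands without executing anything.

```python
import subprocess


def recovery_commands(init_script="/etc/init.d/xend", xcs_name="xcs"):
    """Build the stop / kill xcs / start sequence as argument lists."""
    return [
        [init_script, "stop"],
        ["pkill", "-x", xcs_name],  # assumed way of killing xcs by exact name
        [init_script, "start"],
    ]


def run_recovery(commands, dry_run=True):
    """Run each command in order and collect (command, exit_status) pairs.

    With dry_run=True nothing is executed and every exit status is None.
    """
    results = []
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
            results.append((cmd, None))
            continue
        proc = subprocess.run(cmd)
        results.append((cmd, proc.returncode))
    return results


# Dry run: prints the three commands and executes nothing.
plan = run_recovery(recovery_commands())
```

Note that pkill exits with status 1 when no process matches, so a non-zero status on the middle step just means xcs was not running.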
   