   
 
 
 
   
 

xen-devel

Re: [Xen-devel] xend failure, restart doesn't work

To: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] xend failure, restart doesn't work
From: Marius Karthaus <Marius@xxxxxxxxxxx>
Date: Fri, 25 Mar 2005 12:10:49 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxxxx, ian.pratt@xxxxxxxxxxxx
Delivery-date: Fri, 25 Mar 2005 11:11:48 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <A95E2296287EAD4EB592B5DEEFCE0E9D1E387C@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-archive: <http://sourceforge.net/mailarchive/forum.php?forum=xen-devel>
List-help: <mailto:xen-devel-request@lists.sourceforge.net?subject=help>
List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
List-post: <mailto:xen-devel@lists.sourceforge.net>
List-subscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=subscribe>
List-unsubscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=unsubscribe>
References: <A95E2296287EAD4EB592B5DEEFCE0E9D1E387C@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
OK, here's some more info:

Currently there are multiple domains up and running perfectly. I'm on dom0.

wilde root 0 #  xm list
(111, 'Connection refused')
Error: Error connecting to xend, is xend running?
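
The 111 in that error is the Linux errno value for ECONNREFUSED, meaning nothing is accepting connections on the port xm tries to reach. Below is a minimal Python sketch of the same check done by hand. The helper name `port_accepts_connections` is hypothetical, and the port xend listens on is not stated in this thread, so pass whatever your xend configuration uses.

```python
import socket


def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connect to (host, port) succeeds, else False."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success or an errno value on failure,
        # e.g. errno.ECONNREFUSED (111 on Linux) when nothing is listening.
        return sock.connect_ex((host, port)) == 0
```

For example, `port_accepts_connections("127.0.0.1", 8000)` would probe port 8000 on the local host; the value 8000 is only an assumed example here.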


check if it's running:

wilde root 0 #  ps |grep xend
root     13347  0.0  0.3   1436   404 pts/0    D+   11:57   0:00 grep xend
wilde root 0 #  ps |grep xfrd
root     26122  0.0  0.3   3252   464 ?        S    Mar16   0:00 xfrd
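
Note that the only "xend" match above is the grep command itself, so xend is not running, while xfrd is. A small Python sketch of a filter that ignores the grep entry is below. The helper name `find_process_lines` is hypothetical, and it assumes the 11-column `ps aux`-style layout seen in the output above.

```python
import os


def find_process_lines(ps_output, name):
    """Return ps lines whose command mentions `name`, skipping grep itself."""
    matches = []
    for line in ps_output.splitlines():
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # header, blank, or otherwise malformed line
        words = fields[10].split()
        if os.path.basename(words[0]) == "grep":
            continue  # the grep process that is doing the searching
        # Compare basenames so "python /usr/sbin/xend start" matches "xend".
        if any(os.path.basename(word) == name for word in words):
            matches.append(line)
    return matches
```

Fed the two listings above, this returns no lines for "xend" and the single xfrd line for "xfrd".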


cleared the logs:
wilde root 0 #  cat /var/log/xend.log /var/log/xend-debug.log
wilde root 0 # 

stopping xend (just to make sure):
wilde root 0 #  /etc/init.d/xend stop
wilde root 0 #  cat /var/log/xend.log /var/log/xend-debug.log
wilde root 0 # 

starting xend:

wilde root 0 #  /etc/init.d/xend start
.........
wilde root 3 # 

(exit status 3, after a timeout of about 5 seconds)


See what is running:


wilde root 0 #  ps |grep xend
root     13469  0.0  0.3   1436   472 pts/0    R+   12:03   0:00 grep xend
wilde root 0 #  ps |grep xfr
root     13420  0.0  0.6   3048   792 ?        S    12:02   0:00 xfrd
root     13471  0.0  0.3   1436   472 pts/0    R+   12:04   0:00 grep xfr
wilde root 0 #  ps |grep xcs
root     13477  0.0  0.3   1436   472 pts/0    R+   12:04   0:00 grep xcs

So only xfrd has started. Let's see what is in the logs:

wilde root 1 #  cat /var/log/xend.log                        
[2005-03-25 12:02:45 xend] INFO (SrvDaemon:610) Xend Daemon started

(I wish :S)

wilde root 0 #  cat /var/log/xend-debug.log
network start bridge=xen-br0 netdev=eth0 antispoof=yes
Traceback (most recent call last):
  File "/usr/sbin/xend", line 121, in ?
    sys.exit(main())
  File "/usr/sbin/xend", line 107, in main
    return daemon.start()
  File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 525, in start
    self.run()
  File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 615, in run
    SrvServer.create(bridge=1)
  File "/usr/lib/python/xen/xend/server/SrvServer.py", line 47, in create
    xend = SrvRoot()
  File "/usr/lib/python/xen/xend/server/SrvRoot.py", line 29, in __init__
    self.get(name)
  File "/usr/lib/python/xen/xend/server/SrvDir.py", line 69, in get
    val = val.getobj()
  File "/usr/lib/python/xen/xend/server/SrvDir.py", line 39, in getobj
    self.obj = klassobj()
  File "/usr/lib/python/xen/xend/server/SrvDomainDir.py", line 25, in __init__
    self.xd = XendDomain.instance()
  File "/usr/lib/python/xen/xend/XendDomain.py", line 798, in instance
    inst = XendDomain()
  File "/usr/lib/python/xen/xend/XendDomain.py", line 65, in __init__
    self.initial_refresh()
  File "/usr/lib/python/xen/xend/XendDomain.py", line 153, in initial_refresh
    d_dom = self._new_domain(config, doms[domid])
  File "/usr/lib/python/xen/xend/XendDomain.py", line 188, in _new_domain
    deferred = XendDomainInfo.vm_recreate(savedinfo, info)
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 218, in vm_recreate
    d = vm.construct(config)
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 453, in construct
    self.construct_image()
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 480, in construct_image
    image_handler(self, image)
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 1065, in vm_image_linux
    vm.create_domain("linux", kernel, ramdisk, cmdline)
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 757, in create_domain
    self.create_channel()
  File "/usr/lib/python/xen/xend/XendDomainInfo.py", line 782, in create_channel
    remote_port=remote)
  File "/usr/lib/python/xen/xend/server/SrvDaemon.py", line 660, in createDomChannel
    remote_port=remote_port)
  File "/usr/lib/python/xen/xend/server/channel.py", line 59, in domChannel
    remote_port=remote_port)
  File "/usr/lib/python/xen/xend/server/channel.py", line 229, in __init__
    remote_port=remote_port)
  File "/usr/lib/python/xen/xend/server/channel.py", line 113, in createPort
    remote_port=int(remote_port))
xen.lowlevel.xu.PortError: Failed to map domain control interface

So there is the problem, I guess. But I don't know what it means or how I should fix it. Any ideas?

Ian Pratt wrote:
 
  
Hi,
Every now and then xend seems to fail; the domains keep running but 
control is lost completely.
A restart does not seem possible. The logfile says xend restarts, but the 
only thing that restarts is xfrd, binding on port 8002.
The only way to get control back seems to be to reboot the machine, 
which reboots all of the client domains too, which is a *bad thing* for our 
clients.
Is there any way to get control back?
Is anyone working on the problem? Is there a patch or even a 
known cause?
Are there any steps I can take (or NOT take) to prevent this 
from happening?
    

You could try 'xend stop' and then kill 'xcs' manually, then 
'xend start'.
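
The sequence suggested above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and using `pkill -x xcs` to kill xcs is an assumption (the advice only says to kill it manually). It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess


def recovery_commands():
    """Commands for the suggested sequence: stop xend, kill xcs, start xend."""
    return [
        ["xend", "stop"],
        ["pkill", "-x", "xcs"],  # assumption: any way of killing xcs would do
        ["xend", "start"],
    ]


def run_recovery(dry_run=True):
    """Run (or, by default, just print) each command; return (cmd, status) pairs."""
    results = []
    for cmd in recovery_commands():
        if dry_run:
            print(" ".join(cmd))
            results.append((cmd, None))
        else:
            # subprocess.call returns the command's exit status.
            results.append((cmd, subprocess.call(cmd)))
    return results
```

Calling `run_recovery()` only prints the three commands; `run_recovery(dry_run=False)` would execute them as root on dom0.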

You'll need to give us more help to debug the actual problem you're
experiencing.

Ian
  
