WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] [PATCH] xend: Sleep before sending SIGKILL to device model

To: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH] xend: Sleep before sending SIGKILL to device model
From: Yosuke Iwamatsu <y-iwamatsu@xxxxxxxxxxxxx>
Date: Thu, 29 Jan 2009 17:40:49 +0900
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 29 Jan 2009 00:43:59 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <18816.15891.759792.425858@xxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <49801B68.5060700@xxxxxxxxxxxxx> <18816.15891.759792.425858@xxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.19 (Windows/20081209)
Ian Jackson wrote:
The code already has a timeout to forcibly kill the device model after
(I think) 10 seconds.  Surely we should reuse that code path (and the
same timeout value) ?

Restarting xend is not a usual thing to do and I think it's OK if
shutting down a domain started by a previous xend involves waiting for
such a longer timeout.  It's better to err on the side of safety.

O.K. Attached is a revised patch which reuses the existing code path.
10 seconds seems to me a bit too long, but I can agree we had better
keep on the safe side.

Also, your patch was:
  Content-Type: all/allfiles;
This is not a recognised content type and prevented both of my
mailreaders from displaying it to me.  Can you please fix your MUA ?

Sorry for the inconvenience.
This time your mail client can recognize it, I think.

-- Yosuke

Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@xxxxxxxxxxxxx>



diff -r 39517e863cc8 tools/python/xen/xend/image.py
--- a/tools/python/xen/xend/image.py    Mon Jan 26 16:19:42 2009 +0000
+++ b/tools/python/xen/xend/image.py    Thu Jan 29 17:30:20 2009 +0900
@@ -558,24 +558,30 @@
                     os.kill(self.pid, signal.SIGHUP)
                 except OSError, exn:
                     log.exception(exn)
-                try:
-                    # Try to reap the child every 100ms for 10s. Then SIGKILL 
it.
-                    for i in xrange(100):
+                # Try to reap the child every 100ms for 10s. Then SIGKILL it.
+                for i in xrange(100):
+                    try:
                         (p, rv) = os.waitpid(self.pid, os.WNOHANG)
                         if p == self.pid:
                             break
-                        time.sleep(0.1)
-                    else:
-                        log.warning("DeviceModel %d took more than 10s "
-                                    "to terminate: sending SIGKILL" % self.pid)
+                    except OSError:
+                        # This is expected if Xend has been restarted within
+                        # the life of this domain.  In this case, we can kill
+                        # the process, but we can't wait for it because it's
+                        # not our child. We continue this loop, and after it is
+                        # terminated make really sure the process is going away
+                        # (SIGKILL).
+                        pass
+                    time.sleep(0.1)
+                else:
+                    log.warning("DeviceModel %d took more than 10s "
+                                "to terminate: sending SIGKILL" % self.pid)
+                    try:
                         os.kill(self.pid, signal.SIGKILL)
                         os.waitpid(self.pid, 0)
-                except OSError, exn:
-                    # This is expected if Xend has been restarted within the
-                    # life of this domain.  In this case, we can kill the 
process,
-                    # but we can't wait for it because it's not our child.
-                    # We just make really sure it's going away (SIGKILL) first.
-                    os.kill(self.pid, signal.SIGKILL)
+                    except OSError:
+                        # This happens if the process doesn't exist.
+                        pass
                 state = xstransact.Remove("/local/domain/0/device-model/%i"
                                           % self.vm.getDomid())
             finally:

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel