[Xen-changelog] [xen-unstable] xenstored: Recover from corrupt tdb on reboot

To: xen-changelog@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-changelog] [xen-unstable] xenstored: Recover from corrupt tdb on reboot
From: Xen patchbot-unstable <patchbot-unstable@xxxxxxxxxxxxxxxxxxx>
Date: Fri, 09 Nov 2007 04:20:47 -0800
Delivery-date: Fri, 09 Nov 2007 04:36:59 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-changelog-request@lists.xensource.com?subject=help>
List-id: BK change log <xen-changelog.lists.xensource.com>
List-post: <mailto:xen-changelog@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=unsubscribe>
Reply-to: xen-devel@xxxxxxxxxxxxxxxxxxx
Sender: xen-changelog-bounces@xxxxxxxxxxxxxxxxxxx
# HG changeset patch
# User Keir Fraser <keir@xxxxxxxxxxxxx>
# Date 1194342044 0
# Node ID 468a30d74bd6ee7c13aa72f4f5626c5649141019
# Parent  ed20c4232e16d1bf4e346deb02ca6a1a6271d5b4
xenstored: Recover from corrupt tdb on reboot

Xen cannot work when xenstored's tdb is corrupt.  When that happens
somehow (and we've seen it happen), even reboot doesn't recover from
it.  It could: there is no state in tdb that needs to be persisted
across reboots.

This patch arranges for tdb to be removed before xenstored is started,
provided xenstored is not already running.  This is safe, because:

* xenstored cannot be restarted.  If it dies, Xen's screwed until

* /usr/sbin/xend always starts xenstored anyway.

* xenstored locks its pid-file (see write_pidfile() in
  tools/xenstore/xenstored_core.c), and refuses to start when it

* My patch makes /usr/sbin/xend remove tdb iff it can lock the
  pid-file.  In other words, it removes tdb only when xenstored is not
  running, and holds the lock so that xenstored cannot start until the
  removal is done.

  Bonus fix: it also removes stale copies of the tdb xenstored tends
  to leave behind when it exits uncleanly.

Signed-off-by: Markus Armbruster <armbru@xxxxxxxxxx>
 tools/misc/xend |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+)

diff -r ed20c4232e16 -r 468a30d74bd6 tools/misc/xend
--- a/tools/misc/xend   Tue Nov 06 09:39:25 2007 +0000
+++ b/tools/misc/xend   Tue Nov 06 09:40:44 2007 +0000
@@ -23,6 +23,8 @@
    On Solaris, the daemons are SMF managed, and you should not attempt
    to start xend by hand.
+import fcntl
+import glob
 import os
 import os.path
 import sys
@@ -76,6 +78,23 @@ def check_user():
         raise CheckError("invalid user")
 def start_xenstored():
+    pidfname = "/var/run/xenstore.pid"
+    try:
+        f = open(pidfname, "a")
+        try:
+            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
+            rootdir = os.getenv("XENSTORED_ROOTDIR") or "/var/lib/xenstored"
+            for i in glob.glob(rootdir + "/tdb*"):
+                try:
+                    os.unlink(i)
+                except:
+                    pass
+            os.unlink(pidfname)
+        except:
+            pass
+        f.close()
+    except:
+        pass
     cmd = "xenstored --pid-file /var/run/xenstore.pid"
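
For readers who want to try the lock-then-remove idea outside of xend, here
is a minimal standalone sketch in Python 3.  It is not part of the changeset:
the function name remove_stale_tdb and its return value are hypothetical, and
the pid-file and tdb directory are passed in as parameters instead of being
hard-coded.  It assumes, as the commit message states, that a running
xenstored holds a lock on its pid-file.

```python
import fcntl
import glob
import os


def remove_stale_tdb(pidfile, rootdir):
    """Delete rootdir/tdb* only if an exclusive lock on pidfile can be taken.

    Returns the list of removed paths, or None when the lock is held by
    another process (taken here to mean that xenstored is running).
    Hypothetical helper for illustration; not the code in the patch.
    """
    # Append mode creates the file if it is missing and never truncates it.
    with open(pidfile, "a") as f:
        try:
            # Non-blocking: fail immediately if someone else holds the lock.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return None
        removed = []
        for path in glob.glob(os.path.join(rootdir, "tdb*")):
            try:
                os.unlink(path)
                removed.append(path)
            except OSError:
                pass  # best effort, as in the patch
        # Remove the pid-file while the lock is still held; the lock itself
        # is released when the file is closed at the end of the with block.
        os.unlink(pidfile)
        return removed
```

Unlike the patch, the sketch catches only OSError and reports what it did,
which makes the "lock held" case distinguishable from a successful cleanup.
A caller would invoke it as remove_stale_tdb("/var/run/xenstore.pid",
"/var/lib/xenstored") before starting the daemon.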
