WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: [PATCH 1 of 2] Fix invalid memory access in OCaml mmap l

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Re: [PATCH 1 of 2] Fix invalid memory access in OCaml mmap library (to play nicely with the GC)
From: Zheng Li <zheng.li@xxxxxxxxxxxxx>
Date: Fri, 30 Sep 2011 18:45:54 +0100
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
Delivery-date: Fri, 30 Sep 2011 10:47:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1317400445.26672.314.camel@xxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <patchbomb.1317397198@eta> <a980cfb2e4da7ce1780f.1317397199@eta> <1317400445.26672.314.camel@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20110927 Thunderbird/7.0
On 30/09/2011 17:34, Ian Campbell wrote:
On Fri, 2011-09-30 at 16:39 +0100, Zheng Li wrote:
This was a bug reported by Roberto Di Cosmo. When he tried to reuse
the mmap library for his own project, Mmap.read occasionally got
different result when reading from the same map. This turned out to be
a bug in the binding, where a C pointer was created pointing to a
OCaml value, and the OCaml value was subsequently moved around by the
GC after memory allocation and hence invalidated the C pointer. This
patch removes the indirection of C pointer and uses OCaml macro to
access values directly.

I was initially quite confused about how the value of a parameter to a C
function could change while that function was running but I see now that
the CAMLparam* stuff apparently enables that behaviour by stashing the
address of the parameters so if you ever call back into the runtime (and
presumably asynchronously if you are multithreaded) the GC can indeed
move stuff around.

I think the issue itself is not related to multi-thread though. In the original 
version of the stub_mmap_read,

        intf = GET_C_STRUCT(interface);

"interface" was passed to the C function as an OCaml value registered in the OCaml runtime, underline it was just a 
pointer pointing to a block in the OCaml heap. It was copied and casted to the "intf" on the C side, hence 
"intf" was initially pointing to the same address as "interface". Unfortunately, before we made any use of 
"intf", we first called

        data = caml_alloc_string(c_len);

which called back to the OCaml runtime and allocated memory in the OCaml heap. Since caml_alloc_string was a 
function defined by OCaml runtime, it deliberately called quite a few GC routines under the hood which can 
possibly move any values in the OCaml heap. OCaml runtime would certainly update the "interface" 
pointer accordingly since it was registered with the runtime. But "intf" was a hard copy on the C 
side that OCaml had no idea about. So "intf" still pointed to the old address which was no longer 
valid. Despite the back and forth between C and OCaml territories, all was single threaded here.

The proposed patch only appears to fix the issue if there is some higher
level locking which prevents another thread from triggering a gc while
this function is running. IIRC there is a single global lock which
covers all C code called from ocaml, is that right?

OCaml runtime has a global lock that will only allow one thread to execute it 
at a time and the GC is single threaded as well. This ensures the safety but 
unfortunately introduces the bottleneck. On the other hand, for threads that 
don't need to access memory on the OCaml side for a while (e.g. long running 
pure C routines invoked from OCaml side), they can run in parallel by using 
caml_{enter|leaving}_blocking_section primitives to give up and then retain the 
rights to enter OCaml runtime.

Cheers
--
Zheng
Only Mmap.read function had this problem. The other functions, despite
having the same code style, didn't have memory allocation involved
hence wouldn't intrigue such an error. I've changed all of them to the
safer style for future proof. Directly casting OCaml value's *data
block* (rather than the value itself) as a C pointer is not a common
practice either, but I'll leave it as it is.

The bug hadn't occured on XenServer because XenServer didn't make use
of the Mmap.read function (except in one place for debugging). In
XenServer, most mmap operations were going through another pair of
separately implemented functions (Xs_ring.read/write).


Signed-off-by: Zheng Li<dev@xxxxxxxx>


  tools/ocaml/libs/mmap/mmap_stubs.c |  24 +++++++++---------------
  1 files changed, 9 insertions(+), 15 deletions(-)


----
diff --git a/tools/ocaml/libs/mmap/mmap_stubs.c 
b/tools/ocaml/libs/mmap/mmap_stubs.c
--- a/tools/ocaml/libs/mmap/mmap_stubs.c
+++ b/tools/ocaml/libs/mmap/mmap_stubs.c
@@ -71,12 +71,10 @@ CAMLprim value stub_mmap_init(value fd,
  CAMLprim value stub_mmap_final(value interface)
  {
        CAMLparam1(interface);
-       struct mmap_interface *intf;

-       intf = GET_C_STRUCT(interface);
-       if (intf->addr != MAP_FAILED)
-               munmap(intf->addr, intf->len);
-       intf->addr = MAP_FAILED;
+       if (GET_C_STRUCT(interface)->addr != MAP_FAILED)
+               munmap(GET_C_STRUCT(interface)->addr, 
GET_C_STRUCT(interface)->len);
+       GET_C_STRUCT(interface)->addr = MAP_FAILED;

        CAMLreturn(Val_unit);
  }
@@ -85,21 +83,19 @@ CAMLprim value stub_mmap_read(value inte
  {
        CAMLparam3(interface, start, len);
        CAMLlocal1(data);
-       struct mmap_interface *intf;
        int c_start;
        int c_len;

        c_start = Int_val(start);
        c_len = Int_val(len);
-       intf = GET_C_STRUCT(interface);

-       if (c_start>  intf->len)
+       if (c_start>  GET_C_STRUCT(interface)->len)
                caml_invalid_argument("start invalid");
-       if (c_start + c_len>  intf->len)
+       if (c_start + c_len>  GET_C_STRUCT(interface)->len)
                caml_invalid_argument("len invalid");

        data = caml_alloc_string(c_len);
-       memcpy((char *) data, intf->addr + c_start, c_len);
+       memcpy((char *) data, GET_C_STRUCT(interface)->addr + c_start, c_len);

        CAMLreturn(data);
  }
@@ -108,20 +104,18 @@ CAMLprim value stub_mmap_write(value int
                                 value start, value len)
  {
        CAMLparam4(interface, data, start, len);
-       struct mmap_interface *intf;
        int c_start;
        int c_len;

        c_start = Int_val(start);
        c_len = Int_val(len);
-       intf = GET_C_STRUCT(interface);

-       if (c_start>  intf->len)
+       if (c_start>  GET_C_STRUCT(interface)->len)
                caml_invalid_argument("start invalid");
-       if (c_start + c_len>  intf->len)
+       if (c_start + c_len>  GET_C_STRUCT(interface)->len)
                caml_invalid_argument("len invalid");

-       memcpy(intf->addr + c_start, (char *) data, c_len);
+       memcpy(GET_C_STRUCT(interface)->addr + c_start, (char *) data, c_len);

        CAMLreturn(Val_unit);
  }



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel