On Wed, 2011-09-21 at 16:02 +0100, Daniel De Graaf wrote:
> On 09/21/2011 06:03 AM, Ian Campbell wrote:
> > On Mon, 2011-09-19 at 23:43 +0100, Daniel De Graaf wrote:
> >> @@ -116,4 +116,35 @@ struct ioctl_gntdev_set_max_grants {
> >> uint32_t count;
> >> };
> >>
> >> +/*
> >> + * Sets up an unmap notification within the page, so that the other side
> >> can do
> >> + * cleanup if this side crashes. Required to implement cross-domain robust
> >> + * mutexes or close notification on communication channels.
> >> + *
> >> + * Each mapped page only supports one notification; multiple calls
> >> referring to
> >> + * the same page overwrite the previous notification. You must clear the
> >> + * notification prior to the IOCTL_GNTALLOC_DEALLOC_GREF if you do not
> >> want it
> >> + * to occur.
> >> + */
> >> +#define IOCTL_GNTDEV_SET_UNMAP_NOTIFY \
> >> +_IOC(_IOC_NONE, 'G', 7, sizeof(struct ioctl_gntdev_unmap_notify))
> >> +struct ioctl_gntdev_unmap_notify {
> >> + /* IN parameters */
> >> + /* Offset in the file descriptor for a byte within the page (same as
> >> + * used in mmap).
> >
> > I'm probably being thick but I don't understand what this means, i.e.
> > what this thing is relative to.
>
> This is an offset that acts like a byte offset in the /dev/xen/gntdev device.
> Conceptually, if /dev/xen/evtchn implemented pwrite() then this offset would
> be the offset you would pass to modify your target byte.
>
> For example, if you use gntdev to map two pages, the first will be at offset
> 0 and the second at 4096; you would pass 4098 here to indicate the third byte
> of the second page. The offsets (0, 4096) are returned by the map-grant
> ioctls.
Hmm. I think I was confused because it was an offset into the file
rather than, say, a virtual address.
When I map a page how do I know what the offset of it is wrt the file
descriptor? DO I just have to remember how many pages I mapped an *=
4096?
>
> >> If using UNMAP_NOTIFY_CLEAR_BYTE, this is the byte to
> >> + * be cleared. Otherwise, it can be any byte in the page whose
> >> + * notification we are adjusting.
> >> + */
> >> + uint64_t index;
> >> + /* Action(s) to take on unmap */
> >> + uint32_t action;
> >> + /* Event channel to notify */
> >> + uint32_t event_channel_port;
> >
> > evtchn_port_t ?
>
> Using that would require an include dependency on event_channel.h which is
> not exposed as a userspace-visible header. Linux's evtchn.h also does not
> use evtchn_port_t (it uses unsigned int).
>
> Since this is just a direct copy of the header from the linux source tree,
> any changes really need to happen there first.
OK, that's fine as it is then.
>
> >> +};
> >> +
> >> +/* Clear (set to zero) the byte specified by index */
> >> +#define UNMAP_NOTIFY_CLEAR_BYTE 0x1
> >> +/* Send an interrupt on the indicated event channel */
> >> +#define UNMAP_NOTIFY_SEND_EVENT 0x2
> >> +
> >> #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
> >> diff --git a/tools/libxc/xc_gnttab.c b/tools/libxc/xc_gnttab.c
> >> index 4f55fce..3d3c63b 100644
> >> --- a/tools/libxc/xc_gnttab.c
> >> +++ b/tools/libxc/xc_gnttab.c
> >> @@ -18,6 +18,7 @@
> >> */
> >>
> >> #include "xc_private.h"
> >> +#include <errno.h>
> >>
> >> int xc_gnttab_op(xc_interface *xch, int cmd, void * op, int op_size, int
> >> count)
> >> {
> >> @@ -174,6 +175,28 @@ void *xc_gnttab_map_domain_grant_refs(xc_gnttab *xcg,
> >> count, domid, refs,
> >> prot);
> >> }
> >>
> >> +void *xc_gnttab_map_grant_ref_notify(xc_gnttab *xcg,
> >> + uint32_t domid,
> >> + uint32_t ref,
> >> + uint32_t notify_offset,
> >> + evtchn_port_t notify_port,
> >> + int *notify_result)
> >> +{
> >> + if (xcg->ops->u.gnttab.map_grant_ref_notify)
> >> + return xcg->ops->u.gnttab.map_grant_ref_notify(xcg,
> >> xcg->ops_handle,
> >> + domid, ref, notify_offset, notify_port, notify_result);
> >> + else {
> >> + void* area = xcg->ops->u.gnttab.map_grant_ref(xcg,
> >> xcg->ops_handle,
> >> + domid, ref, PROT_READ|PROT_WRITE);
> >> + if (area && notify_result) {
> >> + *notify_result = -1;
> >> + errno = ENOSYS;
> >> + }
> >> + return area;
> >> + }
> >
> > I think the new public interface is fine but do we really need a new
> > internal interface here?
> >
> > I think you can just add the notify_* arguments to the existing OSDEP
> > function and have those OS backends which don't implement that feature
> > return ENOSYS if notify_offset != 0 (or ~0 or whatever invalid value
> > works).
> >
> > Why doesn't the *_notify variant take a prot argument?
>
> At least for the byte-clear portion of the notify, the page must be writable
> or the notify will not work. I suppose an event channel alone could be used
> for a read-only mapping, but normally the unmapping of a read-only mapping is
> not an important event - although I guess you could contrive a use for this
> type of notificaiton, so there's no reason not to allow it.
If you combine this the two functions then returning EINVAL for attempts
to map without PROT_WRITE (or whatever perm is necessary) would be
reasonable IMHO.
> > I'd be tempted to do away with notify_result too -- if the caller asked
> > for notification and we fail to give that then we can cleanup and return
> > an error. If they want to try again without the notification then that's
> > up to them.
>
> The source of the error might be unclear, but this would make the interface
> cleaner.
>
> >
> >> +}
> >> +
> >> +
> >> int xc_gnttab_munmap(xc_gnttab *xcg,
> >> void *start_address,
> >> uint32_t count)
> >> diff --git a/tools/libxc/xc_linux_osdep.c b/tools/libxc/xc_linux_osdep.c
> >> index dca6718..3040cb6 100644
> >> --- a/tools/libxc/xc_linux_osdep.c
> >> +++ b/tools/libxc/xc_linux_osdep.c
> >> @@ -613,6 +613,62 @@ static void
> >> *linux_gnttab_map_domain_grant_refs(xc_gnttab *xcg, xc_osdep_handle
> >> return do_gnttab_map_grant_refs(xcg, h, count, &domid, 0, refs, prot);
> >> }
> >>
> >> +static void *linux_gnttab_map_grant_ref_notify(xc_gnttab *xch,
> >> xc_osdep_handle h,
> >> + uint32_t domid, uint32_t
> >> ref,
> >> + uint32_t notify_offset,
> >> + evtchn_port_t notify_port,
> >> + int *notify_result)
> >> +{
> >> + int fd = (int)h;
> >> + int rv = 0;
> >> + struct ioctl_gntdev_map_grant_ref map;
> >> + struct ioctl_gntdev_unmap_notify notify;
> >> + void *addr;
> >> +
> >> + map.count = 1;
> >> + map.refs[0].domid = domid;
> >> + map.refs[0].ref = ref;
> >> +
> >> + if ( ioctl(fd, IOCTL_GNTDEV_MAP_GRANT_REF, &map) ) {
> >> + PERROR("xc_gnttab_map_grant_ref: ioctl MAP_GRANT_REF failed");
> >> + return NULL;
> >> + }
> >> +
> >> + addr = mmap(NULL, XC_PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd,
> >> map.index);
> >> + if ( addr == MAP_FAILED )
> >> + {
> >> + int saved_errno = errno;
> >> + struct ioctl_gntdev_unmap_grant_ref unmap_grant;
> >> +
> >> + PERROR("xc_gnttab_map_grant_ref: mmap failed");
> >> + unmap_grant.index = map.index;
> >> + unmap_grant.count = 1;
> >> + ioctl(fd, IOCTL_GNTDEV_UNMAP_GRANT_REF, &unmap_grant);
> >> + errno = saved_errno;
> >> + return NULL;
> >> + }
> >
> > The non-notify variant handles EAGAIN, why doesn't this one need to do
> > so?
>
> The non-notify variant doesn't need to handle EAGAIN anymore (with modern
> kernels, at least... perhaps it should remain for older kernels). Also,
> do_gnttab_map_grant_refs does not handle EAGAIN.
OK I guess that is fine (although if you combine them as I suggest it
comes back?)
I hadn't noticed that we have both map_gratn_ref and map_grant_refs,
that seems like an area which could be cleaned up. (not saying you
should, just noticing it)
>
> >> +
> >> + notify.index = map.index;
> >> + notify.action = 0;
> >> + if (notify_offset >= 0) {
> >> + notify.index += notify_offset;
> >> + notify.action |= UNMAP_NOTIFY_CLEAR_BYTE;
> >> + }
> >> + if (notify_port >= 0) {
> >> + notify.event_channel_port = notify_port;
> >> + notify.action |= UNMAP_NOTIFY_SEND_EVENT;
> >> + }
> >> + if (notify.action)
> >> + rv = ioctl(fd, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, ¬ify);
> >
> > Is there a race if the other end (or this process) dies between the MAP
> > ioctl and here?
> >
> > Ian.
> >
>
> Technically it's a race, but it doesn't impact any reasonable use of the
> notification. The local process can't actually be using the shared page
> at this point, and the other side will not be certain that the map has
> actually succeeded until after the function returns (and it is notified
> in some way - libvchan changes the notify byte from 2->1 at this point).
>
> If the domain whose memory we are mapping crashes, this ioctl will succeed
> unless the event channel it refers to has already been invalidated - but
> either way, the notifications are now irrelevant as there is nobody to
> listen to them.
OK.
Thanks,
Ian.
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|