On 12/14/2010 05:40 PM, Jeremy Fitzhardinge wrote:
> On 12/14/2010 02:06 PM, Daniel De Graaf wrote:
>>>> +static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
>>>> +{
>>>> + struct gntalloc_file_private_data *priv = filp->private_data;
>>>> + struct gntalloc_gref *gref;
>>>> +
>>>> + if (debug)
>>>> + printk("%s: priv %p, page %lu\n", __func__,
>>>> + priv, vma->vm_pgoff);
>>>> +
>>>> + /*
>>>> + * There is a 1-to-1 correspondence of grant references to shared
>>>> + * pages, so it only makes sense to map exactly one page per
>>>> + * call to mmap().
>>>> + */
>>> Single-page mmap makes sense if the only possible use-cases are for
>>> single-page mappings, but if you're talking about framebuffers and the
>>> like is seems like a very awkward way to use mmap. It would be cleaner
>>> from an API perspective to have a user-mode defined flat address space
>>> indexed by pgoff which maps to an array of grefs, so you can sensibly do
>>> a multi-page mapping.
>>>
>>> It would also allow you to hide the grefs from usermode entirely. Then
>>> its just up to usermode to choose suitable file offsets for itself.
>> I considered this, but wanted to keep userspace compatability with the
>> previously created interface.
>
> Is that private to you, or something in broader use?
This module was used as part of Qubes (http://www.qubes-os.org). The device
path has changed (/dev/gntalloc to /dev/xen/gntalloc), and the API change
adds useful functionality, so I don't think we must keep compatibility. This
will also allow cleaning up the interface to remove parameters that make no
sense (owner_domid, for example).
>> If there's no reason to avoid doing so, I'll
>> change the ioctl interface to allocate an array of grants and calculate the
>> offset similar to how gntdev does currently (picks a suitable open slot).
>
> I guess there's three options: you could get the kernel to allocate
> extents, make usermode do it, or have one fd per extent and always start
> from offset 0. I guess the last could get very messy if you want to
> have lots of mappings... Making usermode define the offsets seems
> simplest and most flexible, because then they can stitch together the
> file-offset space in any way that's convenient to them (you just need to
> deal with overlaps in that space).
Would it be useful to also give userspace control over the offsets in gntdev?
One argument for doing it in the kernel is to avoid needing to track what
offsets are already being used (and then having the kernel re-check that).
While this isn't hard, IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR only exists in
order to relieve userspace of the need to track its mappings, so this
seems to have been a concern before.
Another use case of gntalloc that may prove useful is to have more than
one application able to map the same grant within the kernel. For gntdev,
this can be done by mapping the pages multiple times (although that may
not be the most efficient). Moving the gntalloc lists out of the file
private data would allow any user of gntalloc to map any shared page. The
primary reason not to do this is that it prevents automatic cleanup of
allocated pages on close (which is important when the userspace app doing
the mapping crashes).
>> Userspace does still have to know about grefs, of course, but only to pass
>> to the domain doing the mapping, not for its own mmap().
>
> Ah, yes, of course.
>
>>>> +
>>>> + /* This flag prevents Bad PTE errors when the memory is unmapped. */
>>>> + vma->vm_flags |= VM_RESERVED;
>>>> + vma->vm_flags |= VM_DONTCOPY;
>>>> + vma->vm_flags |= VM_IO;
>>> If you set VM_PFNMAP then you don't need to deal with faults.
>> Useful to know. Is that more efficient/preferred to defining a
>> .fault handler? I used this method because that's what is used in
>> kernel/relay.c.
>
> Well, as you currently have it, your mmap() function doesn't map
> anything, so you're relying on demand faulting to populate the ptes.
> Since you've already allocated the pages that's just a soft fault, but
> it means you end up with a lot of per-page entries into the hypervisor.
>
> If you make mmap pre-populate all the ptes (a nice big fat batch
> operation), then you should never get faults on the vma. You can set
> PFNMAP to make sure of that (since you're already setting all the
> "woowoo vma" flags, that makes sense).
>
> Its actual meaning is "this vma contains pages which are not really
> kernel memory, so paging them doesn't make sense" - ie, device or
> foreign memory (we use it in gntdev).
>
> In this case, the pages are normal kernel pages but they're being given
> over to a "device" and so are no longer subject to normal kernel
> lifetime rules. So I think PFNMAP makes sense in this case too.
>
>
> J
>
Agreed; once mapped, the frame numbers (GFN & MFN) won't change until
they are unmapped, so pre-populating them will be better.
--
Daniel De Graaf
National Security Agency
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|