> Many Xen hypercalls pass mlocked pointers as parameters for
> both input and output. For example, xc_get_pfn_list() is a
> nice one with multiple levels of structures/mlocking.
>
> Considering just the tools for the moment, those pointers are
> userspace addresses. Ultimately the hypervisor ends up with
> that userspace address, from which it reads and writes data.
> This is OK for x86, since userspace, kernel, and hypervisor
> all share the same virtual address space (and userspace has
> carefully mlocked the relevant memory).
>
> On PowerPC though, the hypervisor runs in real mode (no MMU
> translation).
> Unlike x86, PowerPC exceptions arrive in real mode, and also
> PowerPC does not force a TLB flush when switching between
> real and virtual modes. So a virtual address is pretty much
> worthless as a hypervisor parameter; performing the MMU
> translation in software is infeasible.
I think I'd prefer to hide all of this through co-operation between the
kernel and the hypervisor's copy to/from user functions.
The kernel can easily translate a virtual address and length into a list
of pseudo-physical frame numbers and an initial offset. Xen's copy from
user function can then use this list when doing its work.
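
A rough sketch of that idea follows, for discussion only. Every name in it
(struct frame_list, build_frame_list, copy_from_frame_list, MAX_FRAMES) is
made up for illustration and is not an existing Linux or Xen interface. The
flat page_table array and the flat mem array are toy stand-ins for the
kernel's real page-table walk and for however Xen reaches a guest frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define MAX_FRAMES 16   /* arbitrary limit for this sketch */

/* Hypothetical descriptor passed to the hypervisor instead of a raw pointer. */
struct frame_list {
    unsigned long first_offset;     /* offset of the data within the first frame */
    unsigned long length;           /* total number of bytes described */
    unsigned long nr_frames;        /* valid entries in pfn[] */
    unsigned long pfn[MAX_FRAMES];  /* pseudo-physical frame numbers */
};

/*
 * Kernel side: translate (vaddr, len) into a frame list.
 * page_table[n] is assumed to hold the frame number of virtual page n.
 * Returns 0 on success, -1 if the range is unmapped or needs too many frames.
 */
static int build_frame_list(unsigned long vaddr, unsigned long len,
                            const unsigned long *page_table,
                            unsigned long nr_pages, struct frame_list *fl)
{
    unsigned long vpn = vaddr >> PAGE_SHIFT;
    unsigned long end = vaddr + len;

    assert(page_table != NULL && fl != NULL);
    fl->first_offset = vaddr & (PAGE_SIZE - 1);
    fl->length = len;
    fl->nr_frames = 0;
    for (; (vpn << PAGE_SHIFT) < end; vpn++) {
        if (vpn >= nr_pages || fl->nr_frames == MAX_FRAMES)
            return -1;
        fl->pfn[fl->nr_frames++] = page_table[vpn];
    }
    return 0;
}

/*
 * Hypervisor side: gather the bytes described by the list into dst.
 * mem is assumed to be memory indexed by frame number.
 */
static void copy_from_frame_list(void *dst, const struct frame_list *fl,
                                 const unsigned char *mem)
{
    unsigned char *out = dst;
    unsigned long off = fl->first_offset;
    unsigned long left = fl->length;
    unsigned long i;

    for (i = 0; i < fl->nr_frames && left > 0; i++) {
        unsigned long chunk = PAGE_SIZE - off;  /* bytes left in this frame */
        if (chunk > left)
            chunk = left;
        memcpy(out, mem + (fl->pfn[i] << PAGE_SHIFT) + off, chunk);
        out += chunk;
        left -= chunk;
        off = 0;  /* only the first frame starts mid-page */
    }
}
```

Only the first frame carries an offset; every later frame is consumed from
its start, so virtually consecutive pages need not be contiguous in
pseudo-physical space.
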
Ian
> Although it rarely passes parameters by pointer, the way the
> pSeries hypervisor handles this is having the kernel always
> pass a "pseudo-physical"
> address (to borrow Xen terminology), which is trivially
> translatable to a "machine" address in the hypervisor. The
> processor has some notion of a large (e.g. 64M) chunk of
> contiguous machine memory, so the hypervisor keeps a table of
> chunks which can be used to translate pseudo-physical addresses.
>
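
To make the chunk-table translation described above concrete, here is a
minimal sketch. It assumes 64M chunks as in the example; struct chunk_table
and pseudo_to_machine are invented names, not taken from the pSeries
hypervisor or from Xen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SHIFT 26                          /* 64M chunks */
#define CHUNK_SIZE  ((uint64_t)1 << CHUNK_SHIFT)

/* Hypothetical per-domain table: entry i is the machine base of chunk i. */
struct chunk_table {
    size_t nr_chunks;
    const uint64_t *machine_base;
};

/*
 * Translate a pseudo-physical address to a machine address.
 * Returns 0 and stores the result in *maddr, or -1 if paddr lies
 * outside the chunks owned by the domain.
 */
static int pseudo_to_machine(const struct chunk_table *t, uint64_t paddr,
                             uint64_t *maddr)
{
    uint64_t idx = paddr >> CHUNK_SHIFT;

    assert(t != NULL && maddr != NULL);
    if (idx >= t->nr_chunks)
        return -1;
    *maddr = t->machine_base[idx] + (paddr & (CHUNK_SIZE - 1));
    return 0;
}
```

The lookup is one shift, one bounds check and one add, which is why it is
practical in real mode where a software page-table walk is not.
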
> Of course, userspace doesn't know pseudo-physical addresses,
> only the kernel does. So one way or another, to pass
> parameters by pointer to the PPC hypervisor, the kernel is
> going to need to translate them. That also means userspace
> memory areas will be limited to one page (since virtually
> consecutive pages may not be representable by a single
> pseudo-physical address).
>
> If we're stuck with structure addresses in hypercalls, one
> possible solution is to modify libxc so that all parameter
> addresses are physical pointers within the same page, then
> pass that page's physical address into the hypercall.
> Something like this:
>
> ulong magicpage_vaddr;
> ulong magicpage_paddr;
>
> libxc_init() {
> #ifdef __powerpc__
> posix_memalign((void **)&magicpage_vaddr, PAGE_SIZE, PAGE_SIZE);
> mlock(magicpage_vaddr, PAGE_SIZE);
> magicpage_paddr = new_translate_syscall(magicpage_vaddr);
> #endif
> ...
> }
>
> xc_get_pfn_list() {
> dom0_op_t *op;
> ulong op_paddr;
> magicalloc(&op, &op_paddr, sizeof(dom0_op_t));
> ...
> }
>
> #ifdef __powerpc__
> magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes) {
> *usable_addr = magicpage_vaddr + offset;
> *hcall_addr = magicpage_paddr + offset;
> offset += bytes;
> }
>
> do_xen_hypercall(ptr) {
> ptr -= magicpage_vaddr - magicpage_paddr;
> do_privcmd(..., ptr);
> }
> #endif
>
> (Note that this is for discussion only, not a proposed interface.)
>
> Each architecture would provide its own magicalloc and
> do_xen_hypercall, and for x86 magicalloc would be
> malloc+mlock and both pointers would be the same. x86
> do_xen_hypercall would remain unchanged. Basically, any
> current use of mlock in libxc would be replaced with calls to
> magicalloc.
>
> For example, if we're willing to change the embedded pointers
> in dom0_ops to offsets, we do not need to invent a new
> "translate" system call.
>
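
One way to picture the offsets variant mentioned above is the sketch below.
It is an assumption-laden illustration: struct example_op and
resolve_offset are hypothetical, and the real dom0_op layout is not shown
here. The embedded pointer becomes an offset from the start of the shared
page, and the consumer checks that the range stays inside that page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical op layout: the buffer is named by an offset into the page. */
struct example_op {
    uint32_t cmd;
    uint32_t buf_offset;   /* replaces an embedded pointer */
    uint32_t buf_len;
};

/*
 * Consumer side: turn (page base, offset, length) into a usable pointer.
 * Returns NULL if the range would extend past the end of the page.
 */
static void *resolve_offset(void *page, uint32_t offset, uint32_t len)
{
    assert(page != NULL);
    if (offset > PAGE_SIZE || len > PAGE_SIZE - offset)
        return NULL;
    return (unsigned char *)page + offset;
}
```

Because only the page's own address has to be known to the hypervisor, no
per-pointer translation is needed for anything that lives inside the page.
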
> Other suggestions are welcome.
>
> --
> Hollis Blanchard
> IBM Linux Technology Center
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>