WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

RE: [Xen-devel] passing hypercall parameters by pointer

To: "Hollis Blanchard" <hollisb@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] passing hypercall parameters by pointer
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Wed, 17 Aug 2005 21:44:25 +0100
Cc: Jimi Xenidis <jimix@xxxxxxxxxxxxxx>
Delivery-date: Wed, 17 Aug 2005 20:42:34 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcWjZVq2w3oCeTJVTs67J8Weva1pcAABoQkQ
Thread-topic: [Xen-devel] passing hypercall parameters by pointer
> Many Xen hypercalls pass mlocked pointers as parameters for 
> both input and output. For example, xc_get_pfn_list() is a 
> nice one with multiple levels of structures/mlocking.
> 
> Considering just the tools for the moment, those pointers are 
> userspace addresses. Ultimately the hypervisor ends up with 
> that userspace address, from which it reads and writes data. 
> This is OK for x86, since userspace, kernel, and hypervisor 
> all share the same virtual address space (and userspace has 
> carefully mlocked the relevant memory).
> 
> On PowerPC though, the hypervisor runs in real mode (no MMU 
> translation). Unlike x86, PowerPC exceptions arrive in real 
> mode, and PowerPC also does not force a TLB flush when 
> switching between real and virtual modes. So a virtual 
> address is pretty much worthless as a hypervisor parameter; 
> performing the MMU translation in software is infeasible.

I think I'd prefer to hide all of this through co-operation between the
kernel and the hypervisor's copy-to/from-user routines.

The kernel can easily translate a virtual address and length into a list
of pseudo-physical frame numbers plus an initial offset. Xen's
copy-from-user function can then use this list when doing its work.
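
A minimal sketch of how such a frame list might be consumed, assuming
4 KiB pages. The names here (struct frame_list, frames_spanned,
copy_from_frame_list, toy_map_frame) are hypothetical and are not Xen
interfaces; the toy backing store only stands in for however the
hypervisor would really reach a pseudo-physical frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12                        /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical descriptor the kernel would build for one user buffer. */
struct frame_list {
    const uint64_t *pfns;   /* pseudo-physical frame numbers, in buffer order */
    size_t nr_frames;       /* number of entries in pfns */
    size_t first_offset;    /* offset of the buffer within the first frame */
    size_t length;          /* total buffer length in bytes */
};

/* Number of frames a buffer of len bytes starting at vaddr touches. */
static size_t frames_spanned(uintptr_t vaddr, size_t len)
{
    size_t off = vaddr & (SKETCH_PAGE_SIZE - 1);
    if (len == 0)
        return 0;
    return (off + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* Toy backing store: four "frames" standing in for guest memory. */
static unsigned char toy_memory[4][(size_t)1 << SKETCH_PAGE_SHIFT];

static const void *toy_map_frame(uint64_t pfn)
{
    return toy_memory[pfn % 4];
}

/*
 * Copy the described buffer into dst one frame at a time.
 * Returns 0 on success, -1 if the descriptor is inconsistent.
 */
static int copy_from_frame_list(void *dst, const struct frame_list *fl,
                                const void *(*map_frame)(uint64_t pfn))
{
    unsigned char *out = dst;
    size_t remaining = fl->length;
    size_t off = fl->first_offset;

    if (off >= SKETCH_PAGE_SIZE)
        return -1;
    for (size_t i = 0; remaining > 0; i++) {
        if (i >= fl->nr_frames)
            return -1;                   /* list too short for the length */
        size_t chunk = SKETCH_PAGE_SIZE - off;
        if (chunk > remaining)
            chunk = remaining;
        const unsigned char *src = map_frame(fl->pfns[i]);
        memcpy(out, src + off, chunk);
        out += chunk;
        remaining -= chunk;
        off = 0;                         /* later frames start at offset 0 */
    }
    return 0;
}
```

Only the first frame needs an offset; every later frame is read from
its start, so the list plus one offset and a length fully describes a
buffer that is virtually contiguous but physically scattered.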

Ian


> Although it rarely passes parameters by pointer, the way the 
> pSeries hypervisor handles this is having the kernel always 
> pass a "pseudo-physical" 
> address (to borrow Xen terminology), which is trivially 
> translatable to a "machine" address in the hypervisor. The 
> processor has some notion of a large (e.g. 64M) chunk of 
> contiguous machine memory, so the hypervisor keeps a table of 
> chunks which can be used to translate pseudo-physical addresses.
> 
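
The chunk-table lookup described above could look roughly like the
following. This is only a sketch assuming 64 MiB chunks as in the
example; struct chunk_table and pseudo_to_machine are hypothetical
names, not the pSeries hypervisor's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SHIFT 26                          /* 64 MiB chunks (assumption) */
#define CHUNK_SIZE  ((uint64_t)1 << CHUNK_SHIFT)

/* Hypothetical per-domain table: one machine base address per chunk. */
struct chunk_table {
    const uint64_t *machine_base;
    size_t nr_chunks;
};

/*
 * Translate a pseudo-physical address to a machine address.
 * Returns 0 on success, -1 if the address lies outside the domain's chunks.
 */
static int pseudo_to_machine(const struct chunk_table *t, uint64_t paddr,
                             uint64_t *maddr)
{
    uint64_t idx = paddr >> CHUNK_SHIFT;        /* which chunk */
    if (idx >= t->nr_chunks)
        return -1;
    *maddr = t->machine_base[idx] + (paddr & (CHUNK_SIZE - 1));
    return 0;
}
```

The lookup is a shift, a bounds check, and an add, which is why a
pseudo-physical address is so much cheaper for the hypervisor to
handle than a virtual one.
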
> Of course, userspace doesn't know pseudo-physical addresses, 
> only the kernel does. So one way or another, to pass 
> parameters by pointer to the PPC hypervisor, the kernel is 
> going to need to translate them. That also means userspace 
> memory areas will be limited to one page (since virtually 
> consecutive pages may not be representable by a single 
> pseudo-physical address).
> 
> If we're stuck with structure addresses in hypercalls, one 
> possible solution is to modify libxc so that all parameter 
> addresses are physical pointers within the same page, then 
> pass that page's physical address into the hypercall. 
> Something like this:
> 
> ulong magicpage_vaddr;
> ulong magicpage_paddr;
> ulong offset;   /* next free byte within the magic page */
> 
> libxc_init() {
> #ifdef __powerpc__
>       /* one page-aligned, locked page holds all hypercall parameters */
>       posix_memalign((void **)&magicpage_vaddr, PAGE_SIZE, PAGE_SIZE);
>       mlock((void *)magicpage_vaddr, PAGE_SIZE);
>       magicpage_paddr = new_translate_syscall(magicpage_vaddr);
> #endif
>       ...
> }
> 
> xc_get_pfn_list() {
>       dom0_op_t *op;
>       ulong op_paddr;
>       magicalloc((ulong *)&op, &op_paddr, sizeof(dom0_op_t));
>       ...
> }
> 
> #ifdef __powerpc__
> magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes) {
>       *usable_addr = magicpage_vaddr + offset;
>       *hcall_addr = magicpage_paddr + offset;
>       offset += bytes;
> }
> 
> do_xen_hypercall(ptr) {
>       /* convert a magic-page virtual address to its physical address */
>       ptr -= magicpage_vaddr - magicpage_paddr;
>       do_privcmd(..., ptr);
> }
> #endif
> 
> (Note that this is for discussion only, not a proposed interface.)
> 
> Each architecture would provide its own magicalloc and 
> do_xen_hypercall. For x86, magicalloc would be 
> malloc+mlock and both pointers would be the same; x86 
> do_xen_hypercall would remain unchanged. Basically, any 
> current use of mlock in libxc would be replaced with calls to 
> magicalloc.
> 
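
For comparison, the x86 variant described above (malloc+mlock, with
both pointers the same) might be sketched as follows. The name
magicalloc_x86 and the int return convention are assumptions made for
illustration; this is not an existing libxc function.

```c
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Hypothetical x86 magicalloc: allocate and lock a buffer, and hand back
 * the same address twice, since x86 Xen can use the virtual address directly.
 * Returns 0 on success, -1 on allocation or mlock failure.
 */
static int magicalloc_x86(unsigned long *usable_addr,
                          unsigned long *hcall_addr, size_t bytes)
{
    void *p = malloc(bytes);
    if (p == NULL)
        return -1;
    if (mlock(p, bytes) != 0) {          /* may fail under RLIMIT_MEMLOCK */
        free(p);
        return -1;
    }
    *usable_addr = (unsigned long)p;     /* address the tools dereference */
    *hcall_addr = (unsigned long)p;      /* address passed to the hypercall */
    return 0;
}
```

Callers would then be written once against the two-pointer interface,
and only the PowerPC build would ever see the two values differ.
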
> Alternatively, if we're willing to change the embedded pointers 
> in dom0_ops to offsets, we do not need to invent a new 
> "translate" system call.
> 
> Other suggestions are welcome.
> 
> --
> Hollis Blanchard
> IBM Linux Technology Center
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
> 
