This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] passing hypercall parameters by pointer

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] passing hypercall parameters by pointer
From: Hollis Blanchard <hollisb@xxxxxxxxxx>
Date: Wed, 17 Aug 2005 14:51:06 -0500
Cc: Jimi Xenidis <jimix@xxxxxxxxxxxxxx>
Delivery-date: Wed, 17 Aug 2005 19:51:13 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: IBM Linux Technology Center
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.8.2
Many Xen hypercalls pass mlocked pointers as parameters for both input and 
output. For example, xc_get_pfn_list() is a nice one with multiple levels of 

Considering just the tools for the moment, those pointers are userspace 
addresses. Ultimately the hypervisor ends up with that userspace address, from 
which it reads and writes data. This is OK for x86, since userspace, kernel, 
and hypervisor all share the same virtual address space (and userspace has 
carefully mlocked the relevant memory).

On PowerPC though, the hypervisor runs in real mode (no MMU translation).  
Unlike x86, PowerPC exceptions arrive in real mode, and also PowerPC does not 
force a TLB flush when switching between real and virtual modes. So a virtual 
address is pretty much worthless as a hypervisor parameter; performing the 
MMU translation in software is infeasible.

Although it rarely passes parameters by pointer, the way the pSeries 
hypervisor handles this is having the kernel always pass a "pseudo-physical" 
address (to borrow Xen terminology), which is trivially translatable to a 
"machine" address in the hypervisor. The processor has some notion of a large 
(e.g. 64M) chunk of contiguous machine memory, so the hypervisor keeps a 
table of chunks which can be used to translate pseudo-physical addresses.
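
As a rough illustration of the chunk-table lookup described above, here is a
minimal sketch in C. The table layout, the names chunk_table and
pseudo_to_machine, and the fixed 64M chunk size are assumptions made for this
example; they are not taken from the pSeries hypervisor or from Xen.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed chunk size: 64M, the example figure used in the text. */
#define CHUNK_SHIFT   26
#define CHUNK_SIZE    ((uint64_t)1 << CHUNK_SHIFT)
#define INVALID_MADDR UINT64_MAX

/*
 * Hypothetical per-domain table: entry i holds the machine base address of
 * the chunk backing pseudo-physical range [i*CHUNK_SIZE, (i+1)*CHUNK_SIZE).
 */
struct chunk_table {
    const uint64_t *machine_base;
    size_t nr_chunks;
};

/*
 * Translate a pseudo-physical address to a machine address.
 * Returns INVALID_MADDR if the address lies outside the table.
 */
static uint64_t pseudo_to_machine(const struct chunk_table *t, uint64_t paddr)
{
    uint64_t idx = paddr >> CHUNK_SHIFT;

    if (idx >= t->nr_chunks)
        return INVALID_MADDR;
    /* Chunk base plus the offset of the address within its chunk. */
    return t->machine_base[idx] + (paddr & (CHUNK_SIZE - 1));
}
```

The lookup is one shift, one bounds check and one add, which is why this kind
of translation is cheap enough to do in real mode.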

Of course, userspace doesn't know pseudo-physical addresses, only the kernel 
does. So one way or another, to pass parameters by pointer to the PPC 
hypervisor, the kernel is going to need to translate them. That also means  
userspace memory areas will be limited to one page (since virtually 
consecutive pages may not be representable by a single pseudo-physical 
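
To make the one-page limit concrete, the following sketch checks whether a
userspace buffer stays within a single page. The helper name fits_in_one_page
and the 4K page size are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the real value is per-architecture. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Return true if the buffer [vaddr, vaddr + len) lies entirely within one
 * page, so a single translated address can describe all of it.
 */
static bool fits_in_one_page(uintptr_t vaddr, size_t len)
{
    size_t off = (size_t)(vaddr & (SKETCH_PAGE_SIZE - 1));

    if (len == 0 || len > SKETCH_PAGE_SIZE)
        return false;
    /* Written this way so vaddr + len is never computed and cannot wrap. */
    return off <= SKETCH_PAGE_SIZE - len;
}
```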

If we're stuck with structure addresses in hypercalls, one possible solution 
is to modify libxc so that all parameter addresses are physical pointers 
within the same page, then pass that page's physical address into the 
hypercall. Something like this:

ulong magicpage_vaddr;
ulong magicpage_paddr;
ulong offset;   /* next free byte within the magic page */

libxc_init() {
#ifdef __powerpc__
        void *page;
        posix_memalign(&page, PAGE_SIZE, PAGE_SIZE);
        magicpage_vaddr = (ulong)page;
        magicpage_paddr = new_translate_syscall(magicpage_vaddr);
#endif
}

xc_get_pfn_list() {
        dom0_op_t *op;
        ulong op_paddr;
        magicalloc((ulong *)&op, &op_paddr, sizeof(dom0_op_t));
}

#ifdef __powerpc__
/* hand out the next 'bytes' of the magic page under both addresses */
magicalloc(ulong *usable_addr, ulong *hcall_addr, int bytes) {
        *usable_addr = magicpage_vaddr + offset;
        *hcall_addr = magicpage_paddr + offset;
        offset += bytes;
}

do_xen_hypercall(ptr) {
        /* convert a magic-page virtual address to its physical address */
        ptr -= magicpage_vaddr - magicpage_paddr;
        do_privcmd(..., ptr);
}
#endif

(Note that this is for discussion only, not a proposed interface.)

Each architecture would provide its own magicalloc and do_xen_hypercall. For 
x86, magicalloc would be malloc+mlock and both pointers would be the same, and 
do_xen_hypercall would remain unchanged. Basically, any current use of mlock 
in libxc would be replaced with a call to magicalloc.
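
For comparison, the x86 variant described above might look roughly like the
sketch below. The names magicalloc_x86 and magicfree_x86, the void ** argument
types and the error-return convention are assumptions for this example, not
part of libxc.

```c
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Hypothetical x86 magicalloc: plain malloc + mlock. The address used by the
 * caller and the address handed to the hypercall are the same virtual address.
 * Returns 0 on success, -1 on failure (outputs are left untouched on failure).
 */
static int magicalloc_x86(void **usable_addr, void **hcall_addr, size_t bytes)
{
    void *p = malloc(bytes);

    if (p == NULL)
        return -1;
    /* Pin the memory so it stays resident while the hypervisor accesses it. */
    if (mlock(p, bytes) != 0) {
        free(p);
        return -1;
    }
    *usable_addr = p;
    *hcall_addr = p;
    return 0;
}

/* Matching release helper: unpin, then free. */
static void magicfree_x86(void *p, size_t bytes)
{
    munlock(p, bytes);
    free(p);
}
```

Note that mlock can fail when the process exceeds its locked-memory limit, so
a caller would need to handle the -1 return.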

For example, if we're willing to change the embedded pointers in dom0_ops to 
offsets, we do not need to invent a new "translate" system call.
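
A minimal sketch of what offset-based parameters could look like follows. The
structure op_with_offset, its fields and the helpers are hypothetical and are
not the actual dom0_op layout; the sketch assumes the whole argument block
lives in one page whose base address the receiving side already knows.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the single page holding the op and its data. */
#define ARG_PAGE_SIZE 4096u

/*
 * Hypothetical op layout: instead of an embedded pointer to an array, the op
 * records where the array sits relative to the start of the page.
 */
struct op_with_offset {
    uint32_t cmd;
    uint32_t array_off;   /* byte offset of the array within the page */
    uint32_t array_len;   /* number of 64-bit entries */
};

/* Resolve (offset, size) against the page base; NULL if it leaves the page. */
static void *resolve_offset(void *page_base, uint32_t off, uint32_t bytes)
{
    if (off > ARG_PAGE_SIZE || bytes > ARG_PAGE_SIZE - off)
        return NULL;
    return (unsigned char *)page_base + off;
}

/* Locate the array described by the op stored at the start of the page. */
static uint64_t *op_array(void *page_base)
{
    struct op_with_offset *op = page_base;

    /* Reject misaligned offsets and lengths that cannot fit in the page. */
    if (op->array_off % sizeof(uint64_t) != 0)
        return NULL;
    if (op->array_len > ARG_PAGE_SIZE / sizeof(uint64_t))
        return NULL;
    return resolve_offset(page_base, op->array_off,
                          op->array_len * (uint32_t)sizeof(uint64_t));
}
```

Because only the page base needs translating, each side can apply the same
offsets to whatever address it uses for that page.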

Other suggestions are welcome.

Hollis Blanchard
IBM Linux Technology Center
