This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] passing hypercall parameters by pointer

To: Arun Sharma <arun.sharma@xxxxxxxxx>
Subject: Re: [Xen-devel] passing hypercall parameters by pointer
From: Hollis Blanchard <hollisb@xxxxxxxxxx>
Date: Wed, 17 Aug 2005 17:11:13 -0500
Cc: Jimi Xenidis <jimix@xxxxxxxxxxxxxx>, Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>, "Ling, Xiaofeng" <xiaofeng.ling@xxxxxxxxx>
Delivery-date: Wed, 17 Aug 2005 22:10:45 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4303A6FC.6040601@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: IBM Linux Technology Center
References: <A95E2296287EAD4EB592B5DEEFCE0E9D282BB8@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <mailman.1124311483.4826@xxxxxxxxxxxxxxxxxxxx> <4303A6FC.6040601@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.8.2
On Wednesday 17 August 2005 16:07, Arun Sharma wrote:
> Ian Pratt wrote:
> >>Many Xen hypercalls pass mlocked pointers as parameters for
> >>both input and output. For example, xc_get_pfn_list() is a
> >>nice one with multiple levels of structures/mlocking.
> >>
> >>Considering just the tools for the moment, those pointers are
> >>userspace addresses. Ultimately the hypervisor ends up with
> >>that userspace address, from which it reads and writes data.
> >>This is OK for x86, since userspace, kernel, and hypervisor
> >>all share the same virtual address space (and userspace has
> >>carefully mlocked the relevant memory).
> This is a problem even on x86 for VMX domains, which execute hypercalls
> because of paravirtualized device drivers.
> >>On PowerPC though, the hypervisor runs in real mode (no MMU
> >>translation).
> >>Unlike x86, PowerPC exceptions arrive in real mode, and also
> >>PowerPC does not force a TLB flush when switching between
> >>real and virtual modes. So a virtual address is pretty much
> >>worthless as a hypervisor parameter; performing the MMU
> >>translation in software is infeasible.
> >
> > I think I'd prefer to hide all of this by co-operation between the
> > kernel and the hypervisor's copy to/from user.
> This is basically what Xiaofeng attempted to do in this patch:
> http://article.gmane.org/gmane.comp.emulators.xen.devel/11107
> although the virtual -> pseudo physical is also done in the hypervisor.
> Please let us know if the patch is acceptable in light of your email.

This patch does perform MMU translation in software. Even if you like that on 
x86, trying to do that on PowerPC is considerably more expensive. Just the 
page table lookup could be 16 loads and compares, and that's not counting 
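For a rough sense of where that number comes from, here is a toy model of a 
PowerPC hashed-page-table (HPT) group search. The entry layout is simplified 
and purely illustrative, but it shows why a lookup can mean 8 loads and 
compares per group, and up to 16 across the primary and secondary hash groups:

```c
#include <stdint.h>
#include <stddef.h>

/* Toy model of a PowerPC hashed page table (HPT) lookup. Field layout
 * is simplified; real HPT entries carry more state than this. */
#define PTES_PER_GROUP 8

struct toy_pte {
    uint64_t vsid_api;   /* tag matched against the faulting address */
    uint64_t rpn;        /* real page number */
    int      valid;
};

/* Scan one PTE group for a matching tag, counting the loads+compares.
 * A full software walk may scan both the primary and secondary group,
 * i.e. up to 2 * PTES_PER_GROUP = 16 of these per translation. */
static const struct toy_pte *
hpt_search_group(const struct toy_pte *group, uint64_t tag, int *loads)
{
    for (int i = 0; i < PTES_PER_GROUP; i++) {
        (*loads)++;                      /* one load+compare per slot */
        if (group[i].valid && group[i].vsid_api == tag)
            return &group[i];
    }
    return NULL;
}
```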

> > The kernel can easily translate a virtual address and length into a list
> > of pseudo-physical frame numbers and initial offset. Xen's copy from
> > user function can then use this list when doing its work.
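Sketching that idea (the names and the lookup hook are hypothetical, not real 
kernel API): the kernel walks the buffer a page at a time and emits the 
initial offset plus one frame number per page:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical translation hook: in a real kernel this would walk the
 * page tables; here it is just a function pointer for illustration. */
typedef unsigned long (*va_to_pfn_fn)(unsigned long vaddr);

/* Identity mapping, for demonstration only. */
static unsigned long toy_lookup(unsigned long vaddr)
{
    return vaddr >> PAGE_SHIFT;
}

/* Turn (vaddr, len) into an initial offset plus one pfn per page --
 * the shape of list a hypervisor-side copy_from_user could consume. */
static size_t buf_to_pfn_list(unsigned long vaddr, size_t len,
                              va_to_pfn_fn lookup,
                              unsigned long *pfns, size_t max_pfns,
                              size_t *offset)
{
    size_t n = 0;
    unsigned long va = vaddr & ~(PAGE_SIZE - 1);   /* round down to page */
    unsigned long end = vaddr + len;

    *offset = vaddr & (PAGE_SIZE - 1);
    while (va < end && n < max_pfns) {
        pfns[n++] = lookup(va);
        va += PAGE_SIZE;
    }
    return n;   /* number of frames covering the buffer */
}
```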
> The other alternative (which we talked about at OLS) is to use a couple
> of pinned pages for parameter passing - but it doesn't work very well for:
> a) Multiple levels of structures/pointers
> b) Arguments which may be bigger than a couple of pages
> (xc_get_pfn_list() for a bigmem domain for example).

This is pretty much the proposal I sent earlier. The multiple levels of 
pointers can be handled as I showed, by creating an allocator that manages 
those couple of pages.
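A minimal sketch of such an allocator, assuming a pre-pinned region and 
8-byte alignment (all names here are illustrative, not Xen API):

```c
#include <stddef.h>

/* Bump allocator over a small pre-pinned region, so every argument
 * structure handed to the hypervisor falls inside known frames. */
struct arg_arena {
    char  *base;   /* start of the pinned pages */
    size_t size;   /* total bytes available */
    size_t used;   /* bump pointer */
};

static void *arena_alloc(struct arg_arena *a, size_t len)
{
    size_t aligned = (len + 7) & ~(size_t)7;   /* round up to 8 bytes */
    if (a->used + aligned > a->size)
        return NULL;                           /* arena exhausted */
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

static void arena_reset(struct arg_arena *a)
{
    a->used = 0;   /* recycle the region after the hypercall returns */
}
```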

I have no answer for parameters that are very large, but I wonder how many 
cases there are. For example, DOM0_READCONSOLE could just be limited to 4KB 
reads, and if there's more data than that, call it again. Perhaps there is 
some case-specific solution to xc_get_pfn_list() as well.
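The "call it again" pattern might look like this; the read_chunk callback is 
a stand-in for the actual hypercall, and the toy data source exists purely 
for demonstration:

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 4096

/* Stand-in for a DOM0_READCONSOLE-style hypercall: fill dst with up to
 * max bytes starting at offset off, returning how many were produced. */
typedef size_t (*read_chunk_fn)(char *dst, size_t off, size_t max);

/* Toy data source pretending the console holds 10000 bytes of 'x'. */
static size_t toy_source(char *dst, size_t off, size_t max)
{
    const size_t TOTAL = 10000;
    size_t avail = off < TOTAL ? TOTAL - off : 0;
    size_t n = avail < max ? avail : max;
    memset(dst, 'x', n);
    return n;
}

/* Pull a large buffer in 4KB chunks, so no single call's argument
 * exceeds one pinned page. */
static size_t read_all(char *dst, size_t total, read_chunk_fn read_chunk)
{
    size_t done = 0;
    while (done < total) {
        size_t want = total - done;
        if (want > CHUNK)
            want = CHUNK;
        size_t got = read_chunk(dst + done, done, want);
        if (got == 0)
            break;          /* no more data */
        done += got;
    }
    return done;
}
```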

Hollis Blanchard
IBM Linux Technology Center
