xen-devel

RE: [Xen-devel] understanding __linear_l2_table and friends

To: "Ian Pratt" <Ian.Pratt@xxxxxxxxxxxx>, "Gerd Knorr" <kraxel@xxxxxxxxxxx>
Subject: RE: [Xen-devel] understanding __linear_l2_table and friends
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Thu, 21 Apr 2005 14:51:34 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, ak@xxxxxxx, Scott Parish <srparish@xxxxxxxxxx>
 
One key design decision with PAE para-virtualized guests is how to
handle the per-pagetable (as opposed to per-domain) mappings that exist
in the hypervisor reserved area. The only ones of these that spring to
mind are in fact the linear pagetable mappings.

PAE Linux currently uses a single L2 for all kernel mappings, shared
across all pagetables. Thus, when we do the mmu_ext_op hypercall to
switch cr3, we'd need to write new values into the appropriate L2 of
the destination pagetable before re-loading cr3. (Since in reality
there'll only ever be one such L2 for the domain, it makes sense to
leave an open map_domain_mem to it.)
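
To make the cr3-switch step concrete, here is a rough sketch in plain C
of rewriting the linear-map slots in the shared kernel L2. It is a toy
model, not Xen code: the toy_* names, the slot index 508 and the flag
handling are all assumptions made for illustration. It relies only on
the fact that a PAE L3 has four entries, so a linear mapping of the
whole pagetable needs four L2 slots, one per L2 page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_L3_ENTRIES   4                      /* PAE: four L3 entries */
#define TOY_L2_ENTRIES   512                    /* 512 8-byte entries per L2 page */
#define TOY_PRESENT      0x1ULL
#define TOY_ADDR_MASK    0x000FFFFFFFFFF000ULL
#define TOY_LINEAR_SLOT  508                    /* hypothetical slot, not Xen's layout */

typedef uint64_t toy_pte_t;

/* Hypothetical stand-in for one guest pagetable: just its four L3 entries. */
struct toy_pagetable {
    toy_pte_t l3[TOY_L3_ENTRIES];
};

/*
 * Point the four linear-map slots of the shared kernel L2 at the L2 pages
 * of the pagetable we are about to switch to. Absent L3 entries clear the
 * corresponding slot. A real implementation would reload cr3 afterwards.
 */
static void toy_install_linear_slots(toy_pte_t *shared_l2,
                                     const struct toy_pagetable *next)
{
    assert(shared_l2 != NULL && next != NULL);
    assert(TOY_LINEAR_SLOT + TOY_L3_ENTRIES <= TOY_L2_ENTRIES);

    for (int i = 0; i < TOY_L3_ENTRIES; i++) {
        toy_pte_t e = next->l3[i];
        shared_l2[TOY_LINEAR_SLOT + i] =
            (e & TOY_PRESENT) ? ((e & TOY_ADDR_MASK) | TOY_PRESENT) : 0;
    }
}
```

Because the four slots are rewritten on every switch, the shared L2
changes each time, which is what leads to the TLB and SMP issues below.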

The downside of this scheme is that it will cripple the TLB flush filter
on Opteron. Linux used to do this until 2.6.11 anyhow, and no-one really
complained much. The far bigger problem is that it won't work for SMP
guests, at least without making the L2 per VCPU and updating the L3
accordingly using mm ref counting, which would be messy but do-able.

The alternative is to hack PAE Linux to force the L2 containing kernel
mappings to be per-pagetable rather than shared. The downside of this is
that we use an extra 4KB per pagetable, and have the hassle of faulting
in kernel L2 mappings on demand (like non-PAE Linux has to). This plays
nicely with the TLB flush filter, and is fine for SMP guests.

The simplest thing of all in the first instance is to turn all of the
linear pagetable accesses into macros taking (exec_domain, offset) and
then just implement them using pagetable walks.
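
As a rough illustration of what such a macro could expand to, here is a
self-contained toy in plain C. Everything in it is hypothetical: the
toy_* names, the fake "machine memory" array and the choice of a
virtual page number as the offset are assumptions, not Xen's actual
interfaces. It only shows a three-level PAE walk (2 + 9 + 9 index bits)
standing in for a linear pagetable access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12
#define TOY_PRESENT     0x1ULL
#define TOY_ADDR_MASK   0x000FFFFFFFFFF000ULL
#define TOY_NR_FRAMES   16
#define TOY_ENTRIES     512

typedef uint64_t toy_pte_t;

/* Toy machine memory: each frame is viewed as a table of 512 entries. */
static toy_pte_t toy_frames[TOY_NR_FRAMES][TOY_ENTRIES];

/* Hypothetical stand-in for exec_domain: only the pagetable base frame. */
struct toy_exec_domain {
    uint64_t cr3_mfn;
};

static toy_pte_t *toy_table(uint64_t mfn)
{
    return mfn < TOY_NR_FRAMES ? toy_frames[mfn] : NULL;
}

/* Follow one entry down a level; NULL if it is not present. */
static toy_pte_t *toy_next_level(toy_pte_t e)
{
    if (!(e & TOY_PRESENT))
        return NULL;
    return toy_table((e & TOY_ADDR_MASK) >> TOY_PAGE_SHIFT);
}

/* Helper for building test pagetables: table[idx] -> target frame. */
static void toy_set_entry(uint64_t table_mfn, unsigned idx, uint64_t target_mfn)
{
    assert(table_mfn < TOY_NR_FRAMES && idx < TOY_ENTRIES);
    toy_frames[table_mfn][idx] = (target_mfn << TOY_PAGE_SHIFT) | TOY_PRESENT;
}

/* Walk L3 -> L2 and return a pointer to the L2 entry for this page. */
static toy_pte_t *toy_walk_l2e(const struct toy_exec_domain *ed, uint32_t vpn)
{
    toy_pte_t *l3 = toy_table(ed->cr3_mfn);
    toy_pte_t *l2 = l3 ? toy_next_level(l3[(vpn >> 18) & 0x3]) : NULL;
    return l2 ? &l2[(vpn >> 9) & 0x1ff] : NULL;
}

/* Walk one level further and return a pointer to the L1 entry. */
static toy_pte_t *toy_walk_l1e(const struct toy_exec_domain *ed, uint32_t vpn)
{
    toy_pte_t *l2e = toy_walk_l2e(ed, vpn);
    toy_pte_t *l1 = l2e ? toy_next_level(*l2e) : NULL;
    return l1 ? &l1[vpn & 0x1ff] : NULL;
}

/* What the former array-style linear accesses could become. */
#define TOY_LINEAR_L2E(ed, off) toy_walk_l2e((ed), (off))
#define TOY_LINEAR_L1E(ed, off) toy_walk_l1e((ed), (off))
```

Callers would then write TOY_LINEAR_L1E(ed, va >> 12) where they used to
index the linear table directly, and would need to handle a NULL result
where the old code would have taken a fault.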

What do you guys think? Implement option #3 in the first instance, then
aim for #2.

One completely different approach would be to first implement a PAE
guest using the "translate, internal" shadow mode, where we don't have to
worry about any of this gory stuff. Once it's working, we could then
implement a paravirtualized mode to improve performance and save memory.
Getting shadow mode working on PAE shouldn't be too hard, as it's been
written with 2, 3 and 4 level pagetables in mind.

The shadow mode approach could be implemented in parallel with the
paravirt approach. We could even turn it into a race to the first
multiuser boot :-)

Cheers,
Ian

