WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 

xen-devel

Re: [Xen-devel] understanding __linear_l2_table and friends II

Thinking about this a bit more:

On Thu, Apr 21, 2005 at 02:51:34PM +0100, Ian Pratt wrote:
> The downside of this scheme is that it will cripple the TLB flush filter
> on Opteron. Linux used to do this until 2.6.11 anyhow, and no-one really
> complained much. The far bigger problem is that it won't work for SMP
> guests, at least without making the L2 per VCPU and updating the L3
> accordingly using mm ref counting, which would be messy but do-able.
> 
> The alternative is to hack PAE Linux to force the L2 containing kernel
> mappings to be per-pagetable rather than shared. The downside of this is
> that we use an extra 4KB per pagetable, and have the hassle of faulting
> in kernel L2 mappings on demand (like non-PAE Linux has to). This plays
> nicely with the TLB flush filter, and is fine for SMP guests. 

<without having looked at the Xen code much, but some familiarity with
the i386 linux code>

I thought about this a bit more and your second alternative sounds
much better. Faulting on the kernel mappings is very infrequent,
and usually after some time the PGD is fully set up and only the lower
levels of the kernel mappings change with vmalloc etc. On x86-64 Linux
I even initialize it when the PGD is created from a static template
page. The remaining cases for very big vmalloc can be handled on demand
without too much code. It should be pretty easy to do on i386 too.


> 
> The simplest thing of all in the first instance is to turn all of the
> linear pagetable accesses into macros taking (exec_domain, offset) and
> then just implement them using pagetable walks.
> 
> What do you guys think? Implement option #3 in the first instance, then
> aim for #2.

I don't get your numbering, didn't you have only two options?
Or does the one below count too?

> 
> One completely different approach would be to first implement a PAE
> guest using the "translate, internal" shadow mode where we don't have to
> worry about any of this gory stuff. Once its working, we could then
> implement a paravirtualized mode to improve performance and save memory.
> Getting shadow mode working on PAE shouldn't be too hard, as its been
> written with 2, 3 and 4 level pagetables in mind.

That sounds attractive too, except that duplicated page tables
can be a killer on some workloads (a database with many processes and
lots of shared memory ends up with a lot of memory tied up
in page tables, even with hugetlb). And databases are among the most
common workloads for PAE. It might be a good idea to avoid it
at least for the para case.

-Andi

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel