
To: "Xu, Anthony" <anthony.xu@xxxxxxxxx>
Subject: Re: [Xen-ia64-devel][PATCH][RFC] Task: support huge page RE: [Xen-ia64-devel] Xen/IA64 Healthiness Report -Cset#11460
From: Isaku Yamahata <yamahata@xxxxxxxxxxxxx>
Date: Fri, 29 Sep 2006 14:58:52 +0900
Cc: Magnus Damm <magnus@xxxxxxxxxxxxx>, Tristan Gingold <Tristan.Gingold@xxxxxxxx>, xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 28 Sep 2006 22:59:17 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <51CFAB8CB6883745AE7B93B3E084EBE207DC53@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-ia64-devel-request@lists.xensource.com?subject=help>
List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
List-post: <mailto:xen-ia64-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ia64-devel>, <mailto:xen-ia64-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ia64-devel>, <mailto:xen-ia64-devel-request@lists.xensource.com?subject=unsubscribe>
References: <51CFAB8CB6883745AE7B93B3E084EBE207DC53@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
Hi Anthony.

On Fri, Sep 29, 2006 at 10:28:36AM +0800, Xu, Anthony wrote:

> The page allocator is still based on 16K pages; otherwise it would impact
> small memory allocations, such as xmalloc, and many places would need to
> be modified.
> 
> So I think the first step is:
> the page allocator still uses 16K pages, but we allocate huge pages for
> the domain.
> 
> As for allocation failure, the first step is: if allocation fails,
> creating the domain fails.
> The next step is defragmentation.

For the first step, it sounds reasonable not to care about allocation
failure.
I think that page fragmentation should be addressed as an eventual goal.
It might have an impact on the design.
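
To make that concrete, here is a minimal sketch of the first step. The
names (alloc_contig_pages, populate_domain_huge) are illustrative
stand-ins, not the actual xen/ia64 interfaces:

/* Hypothetical stand-ins; the real xen/ia64 allocator API may differ. */
struct domain;
struct page_info;
struct page_info *alloc_contig_pages(struct domain *d, unsigned int order);

#define PAGE_SHIFT   14                          /* 16KB base page */
#define HPAGE_SHIFT  24                          /* 16MB huge page */
#define HPAGE_ORDER  (HPAGE_SHIFT - PAGE_SHIFT)  /* order 10 */

/*
 * First step: the page allocator itself still works on 16KB pages, but
 * domain memory is requested as 16MB contiguous chunks.  If any chunk
 * cannot be allocated, domain creation simply fails; no fallback or
 * defragmentation yet.
 */
int populate_domain_huge(struct domain *d, unsigned long nr_chunks)
{
    unsigned long i;

    for (i = 0; i < nr_chunks; i++) {
        struct page_info *pg = alloc_contig_pages(d, HPAGE_ORDER);
        if (pg == NULL)
            return -1;    /* first step: creating the domain fails */
        /* ... enter the 16MB chunk into the domain's p2m here ... */
    }
    return 0;
}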


> >> 4. Per_LP_VHPT may need to be modified to support huge page.
> >
> >Do you mean hash collision?
> 
> I don't mean hash collision.
> Per_LP_VHPT is the long-format VHPT, so it can essentially support huge
> pages, but much of the VHPT code assumes the page size is 16K, so itir.ps
> is always 16K in some code sequences.

Understood.
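
Something like the following shows where the hard-coded 16K assumption
would have to become a parameter (the entry layout is simplified and the
helper name is made up):

typedef unsigned long u64;

/* Simplified long-format VHPT entry; the real layout differs in detail. */
struct vhpt_lf_entry {
    u64 page_flags;   /* pte */
    u64 itir;         /* itir.ps occupies bits 2..7 */
    u64 tag;
    u64 padding;
};

/*
 * Many code paths in effect do "itir.ps = 14" (16KB) today.  To support
 * huge pages, the page-size order has to be threaded through instead of
 * being assumed.
 */
static void vhpt_set_entry(struct vhpt_lf_entry *e, u64 pte, u64 tag,
                           unsigned int ps)   /* 14, 24, 28, ... */
{
    e->page_flags = pte;
    e->itir       = (u64)ps << 2;   /* ps field at bits 2..7 */
    e->tag        = tag;
}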


> >* Presumably there are two goals
> >  - Support one large page size(e.g. 16MB) to map kernel.
> >  - Support hugetlbfs whose page size might be different from 16MB.
> >
> >  I.e. support three page sizes, normal page size 16KB, kernel mapping
> >  page size 16MB and hugetlbfs page size 256MB.
> >  I think hugetlbfs support can be addressed specialized way.
> 
> The kernel uses a 16M identity mapping with rr7.ps=16M, so if Xen
> allocates 16M contiguous chunks for the domain, then Xen can set the
> machine rr7.ps to 16M instead of 16K; then all VHPT entries for region 7
> in Per_LP_VHPT use the 16M page size.
> 
> I'm using rhel4-u2 as the guest; by default, rhel4-u2 sets rr4.ps=256M.
> 
> For the latest kernels that support hugetlbfs, the biggest page size is 4G.
> 
> My goal is supporting 256M; if we can do that, then supporting an even
> bigger page size like 1G or 4G is trivial. :-)
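
For reference, the rr values involved can be written down directly; this
just encodes the architectural layout (ve in bit 0, ps in bits 2..7, rid
in bits 8..31):

typedef unsigned long u64;

/* Build a region register value from its fields (simplified). */
static u64 make_rr(u64 rid, unsigned int ps, int ve)
{
    return (rid << 8) | ((u64)ps << 2) | (ve ? 1 : 0);
}

/*
 * Region 7 mapped with 16MB pages instead of 16KB ones:
 *   make_rr(rid, 14, 1)   -- ps = 14 -> 16KB (today)
 *   make_rr(rid, 24, 1)   -- ps = 24 -> 16MB (kernel identity mapping)
 * Guest hugetlbfs regions would use ps = 28 (256MB), 30 (1GB) or
 * 32 (4GB).
 */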

I wanted to say that an implementation for hugetlbfs may be different
from an implementation for large pages that map the kernel.
So we should describe not only the page size, but also its purpose.
I have re-summarized the goals. What do you think?

- Support a 16MB large page size to map the kernel area.
  Although Linux can be configured to use a 64MB page size to map the
  kernel, we won't support 64MB (at least for the first prototype).
  It would be nice to support both kernel mapping page sizes, 16MB and
  64MB, but that can be addressed in a second phase.
  A domain uses only one of them; potentially different domains may use
  different kernel mapping page sizes.
  e.g. domainA uses 16MB to map its kernel, domainB uses 64MB.

- Support hugetlbfs with a 256MB page size.
  If possible, it would be nice to also support a 1GB or 4GB page size.
  The huge page size is determined by a Linux boot-time option, and only
  a single page size out of 256MB, 1GB and 4GB is used by the hugetlbfs
  of a given domain.
  Potentially different domains may use different huge page sizes.
  e.g. domainA uses a 256MB huge page size, domainB uses 1GB.
  For the first prototype, it is reasonable to support only 256MB.

- Page fragmentation should be addressed.
  This isn't addressed in the first step.
  When a large contiguous page allocation fails, a fallback path is
  executed with the normal 16KB page size, possibly at degraded
  performance (a sketch of this follows below).
  Or we try a sort of defragmentation to create a large contiguous
  region of memory.
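
The fallback mentioned in the last item could look roughly like this,
reusing the hypothetical allocator names from the sketch above:

/*
 * Eventual goal: try a 16MB contiguous allocation first; when
 * fragmentation makes that impossible, fall back to 1024 separate 16KB
 * pages covering the same range (correct, but with degraded TLB
 * behaviour).
 */
int populate_chunk_with_fallback(struct domain *d)
{
    unsigned long i;
    struct page_info *pg = alloc_contig_pages(d, HPAGE_ORDER);

    if (pg != NULL)
        return 0;   /* fast path: enter one 16MB mapping */

    /* Fallback path: 2^HPAGE_ORDER normal 16KB pages. */
    for (i = 0; i < (1UL << HPAGE_ORDER); i++) {
        pg = alloc_contig_pages(d, 0);   /* a single 16KB page */
        if (pg == NULL)
            return -1;
        /* ... map this 16KB page at the next guest frame ... */
    }
    return 0;
}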
    

> >hugetlbfs
> >* Some specialized path can be implemented to support hugetlbfs.
> >  - For domU
> >    paravirtualize hugetlbfs for domU.
> >    Hook into alloc_fresh_huge_page() in Linux. Then xen/ia64 is aware
> >    of large pages.
> >    Probably a new flag in the p2m entry, or some other data structure,
> >    would be introduced.
> >    For xenLinux, the region number, RGN_HPAGE, can be used as a check
> >    before entering the hugetlbfs-specialized path.
> 
> That's good, but first Xen needs to allocate contiguous chunks for domU.
> 
> >  - For domVTI
> >    Can the use of hugetlbfs be detected somehow?
> 
> On the domVTI side, Xen doesn't know about hugetlbfs, but Xen can capture
> the guest accessing rr and see whether a new preferred page size (rr.ps)
> is set (the preferred page size is the page size used most in that
> region). If Xen can set the same preferred page size into the machine
> rr.ps, that's great: most TLB misses can be handled by assembly code,
> meaning the translation can be found in the long-format VHPT; otherwise
> Xen needs to look up the VTLB in C code.
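
On the domU side, the hook plus p2m flag could be sketched like this; the
bit position and the helper are purely illustrative, and the real p2m
format may differ:

typedef unsigned long u64;

/*
 * Hypothetical marking for the paravirt path: when xenLinux's
 * alloc_fresh_huge_page() obtains a huge page, a hypercall could tell
 * Xen to flag the covering p2m entries, so Xen knows that a machine-
 * contiguous huge frame backs this guest range.
 */
#define _PAGE_HPAGE (1UL << 60)   /* illustrative software bit */

static void p2m_mark_huge_range(u64 *p2m, unsigned long gpfn,
                                unsigned long nr_pages)
{
    unsigned long i;

    for (i = 0; i < nr_pages; i++)
        p2m[gpfn + i] |= _PAGE_HPAGE;
}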

The hugetlbfs page size is a Linux boot-time option, so I think it might
be acceptable to require a domain configuration option.
In fact, you already added order options in your prototype.
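
Combining your capture idea with such an option, the domVTI path might
look like this; the entry point and helper names are made up for
illustration:

typedef unsigned long u64;
struct vcpu;

/*
 * Hypothetical: is this region of the domain backed by machine-contiguous
 * chunks of at least 2^ps bytes?  (e.g. decided from the "order" domain
 * configuration option)
 */
int region_backed_contig(struct vcpu *v, unsigned int region,
                         unsigned int ps);

/*
 * On a virtualized mov-to-rr, mirror the guest's preferred page size into
 * the machine rr only when the backing memory really is contiguous at
 * that size; otherwise keep 16KB and let the slower VTLB-in-C path handle
 * the larger guest pages.
 */
void capture_guest_set_rr(struct vcpu *v, unsigned int region, u64 guest_rr)
{
    unsigned int guest_ps   = (guest_rr >> 2) & 0x3f;  /* rr.ps, bits 2..7 */
    unsigned int machine_ps =
        region_backed_contig(v, region, guest_ps) ? guest_ps : 14;

    /* ... write the machine rr for this region with ps = machine_ps ... */
    (void)machine_ps;
}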

thanks.
-- 
yamahata

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel