This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Vanilla Linux and has_foreign_mapping

To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
From: Michael Abd-El-Malek <mabdelmalek@xxxxxxx>
Date: Tue, 29 Apr 2008 12:39:28 -0400
Cc: Mark McLoughlin <markmc@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Andrea Arcangeli <andrea@xxxxxxxxxxxx>, Eduardo Habkost <ehabkost@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Christoph Lameter <clameter@xxxxxxx>
Delivery-date: Tue, 29 Apr 2008 09:39:57 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <48125C42.6030709@xxxxxxxx>
References: <C4329BC3.16E37%keir.fraser@xxxxxxxxxxxxx> <48112345.5000503@xxxxxxxx> <481210C0.6070109@xxxxxxx> <48122153.1070007@xxxxxxxx> <48122378.2090802@xxxxxxx> <48125C42.6030709@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Apr 25, 2008, at 6:33 PM, Jeremy Fitzhardinge wrote:
Michael Abd-El-Malek wrote:
How about we do the following:


tlb = tlb_gather_mmu(mm, 1);
/* Don't update_hiwater_rss(mm) here, do_exit already did */
/* Use -1 here to ensure all VMAs in the mm are unmapped */
end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);


We'll reintroduce has_foreign_mappings. If has_foreign_mappings is _not_ set, then arch_exit_mmap_pre can unpin the page tables early and arch_exit_mmap_post will do nothing. If has_foreign_mappings is set, then arch_exit_mmap_pre won't do anything, and arch_exit_mmap_post will do the actual xen_exit_mmap call.
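
To make the pre/post split concrete, here is a minimal user-space sketch of the dispatch described above. It is not kernel code: struct mm_sketch and xen_exit_mmap_sketch are hypothetical stand-ins, and the real hooks would take a struct mm_struct and call xen_exit_mmap.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm state; only the flag matters here. */
struct mm_sketch {
    bool has_foreign_mappings;
    int unpinned;               /* how many times the page tables were unpinned */
};

/* Stand-in for xen_exit_mmap(): it only records that unpinning happened. */
static void xen_exit_mmap_sketch(struct mm_sketch *mm)
{
    mm->unpinned++;
}

/* Runs before unmap_vmas(): unpin early only when no foreign mappings exist. */
static void arch_exit_mmap_pre(struct mm_sketch *mm)
{
    if (!mm->has_foreign_mappings)
        xen_exit_mmap_sketch(mm);
}

/* Runs after unmap_vmas(): unpin late when foreign mappings had to go first. */
static void arch_exit_mmap_post(struct mm_sketch *mm)
{
    if (mm->has_foreign_mappings)
        xen_exit_mmap_sketch(mm);
}
```

Either way the page tables are unpinned exactly once per mm; the flag only decides on which side of unmap_vmas that happens.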

What do you think?

I'm thinking along the lines of:

 1. steal the "private" field in struct page for Xen pte pages
 2. if we install a grant mapping in that page, allocate a secondary
    page and point private to it.  In that secondary page, keep an
    array of grant handles corresponding to the grant mappings in the
    pte page (non-grant mappings have an invalid handle).
 3. In unpin_page, if we're unpinning a pte page with a non-null
    private page, then walk the private page to tear down the grant
    mappings, free the private page, and unpin the pte page normally.
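
The three steps above could be sketched roughly as follows. This is a user-space illustration under stated assumptions, not the proposed patch: struct pte_page_sketch stands in for a pte page plus the "private" field of its struct page, INVALID_HANDLE is a made-up marker value, and the unmap_calls counter stands in for issuing GNTTABOP_unmap_grant_ref.

```c
#include <stdint.h>
#include <stdlib.h>

#define PTRS_PER_PTE_SKETCH 512            /* assumption: 8-byte ptes, 4 KiB page */
#define INVALID_HANDLE ((uint32_t)~0u)     /* assumption: marks non-grant slots */

struct pte_page_sketch {
    uint64_t pte[PTRS_PER_PTE_SKETCH];
    uint32_t *private;                     /* step 1: NULL until a grant is mapped */
};

static int unmap_calls;                    /* counts simulated grant unmaps */

/* Step 2: on the first grant mapping, allocate the secondary array and
 * record the handle at the same index as the pte it belongs to. */
static int record_grant(struct pte_page_sketch *pg, unsigned idx, uint32_t handle)
{
    if (idx >= PTRS_PER_PTE_SKETCH)
        return -1;
    if (!pg->private) {
        pg->private = malloc(PTRS_PER_PTE_SKETCH * sizeof(*pg->private));
        if (!pg->private)
            return -1;
        for (unsigned i = 0; i < PTRS_PER_PTE_SKETCH; i++)
            pg->private[i] = INVALID_HANDLE;
    }
    pg->private[idx] = handle;
    return 0;
}

/* Step 3: walk the secondary array, tear down each grant mapping,
 * free the array, then fall through to the normal unpin. */
static void unpin_page_sketch(struct pte_page_sketch *pg)
{
    if (pg->private) {
        for (unsigned i = 0; i < PTRS_PER_PTE_SKETCH; i++) {
            if (pg->private[i] != INVALID_HANDLE) {
                unmap_calls++;             /* real code would unmap by handle */
                pg->pte[i] = 0;
            }
        }
        free(pg->private);
        pg->private = NULL;
    }
    /* the normal unpin of the pte page would happen here */
}
```

A pte page that never holds a grant mapping pays nothing beyond the NULL check in the unpin path.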

I like it because it 1) avoids the need for any core kernel hooks, and 2) decouples unpinning grant pages from the mechanism used to actually map the grant pages; 2a) the metadata for granted pages is stored with the pagetable (effectively), so the grant driver doesn't need to do anything special to make it work. It also means all the information needed to tear down the mapping is available for normal unmap operations (i.e., we can do it in set_pte without needing a special zap_pte hook).

I like this approach!

No doubt I'm overlooking something important.  What is it?

Some drivers may need to do additional tasks besides just clearing the PTE. For example, when zapping my kernel PTE, I need to restore the physical mapping of the page. (On the initial set_pte (which I've overridden), I removed the physical backing of the page.)

If we really want to avoid a zap_pte hook, I suppose we can add flags to the page/PTE that indicate things like "this page needs to have its physical backing restored".
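
As a rough illustration of the flag idea, a teardown path could look at per-slot flags and decide what extra work a mapping needs. Everything here is hypothetical: the flag names, the counters, and where the flags would actually live (page flags, spare PTE bits, or the secondary page) are all open questions.

```c
#include <stdint.h>

#define SLOT_GRANT        (1u << 0)   /* hypothetical: slot holds a grant mapping */
#define SLOT_RESTORE_PHYS (1u << 1)   /* hypothetical: restore physical backing */

struct teardown_counts {
    unsigned unmapped;                /* grant mappings torn down */
    unsigned restored;                /* pages whose backing was restored */
};

/* Tear down one slot; the flags replace a driver-specific zap_pte hook. */
static void teardown_slot(uint32_t flags, struct teardown_counts *c)
{
    if (!(flags & SLOT_GRANT))
        return;                       /* ordinary mapping, nothing extra to do */
    c->unmapped++;                    /* real code would unmap the grant here */
    if (flags & SLOT_RESTORE_PHYS)
        c->restored++;                /* real code would restore the backing here */
}
```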

I guess one concern is if the per-grant-mapping data is larger than a pte, then the private "page" will either need to be larger than a page, or be a more complex structure than a simple array. The kernel and user handles would be stored separately, since they'd have separate ptes anyway. Looks like it will need to be a (domid, ref, handle) tuple, which would be 10 bytes. Are refs and/or handles really 32-bit quantities? Hm, though it looks like GNTTABOP_unmap_grant_ref only uses the handle, so that's quite convenient.

Do we even need to store a domid? The grant handle is all you need to unmap the grant. And that's 32 bits.
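
The size question can be checked with a bit of arithmetic. The sketch below assumes 4 KiB pages and takes the pte size as a parameter (8 bytes for 64-bit or PAE ptes, 4 bytes for non-PAE 32-bit); the helper names are made up for illustration.

```c
#define PAGE_SIZE_SKETCH 4096u        /* assumption: 4 KiB pages */

/* Number of ptes in one pte page. */
static unsigned ptes_per_page(unsigned pte_bytes)
{
    return PAGE_SIZE_SKETCH / pte_bytes;
}

/* Bytes needed to keep one metadata entry per pte. */
static unsigned table_bytes(unsigned pte_bytes, unsigned entry_bytes)
{
    return ptes_per_page(pte_bytes) * entry_bytes;
}

/* Does a flat array of entries fit in a single secondary page? */
static int fits_in_page(unsigned pte_bytes, unsigned entry_bytes)
{
    return table_bytes(pte_bytes, entry_bytes) <= PAGE_SIZE_SKETCH;
}
```

With 8-byte ptes there are 512 slots, so a 10-byte (domid, ref, handle) tuple needs 5120 bytes and overflows the page, while a bare 4-byte handle needs 2048 bytes. With 4-byte ptes there are 1024 slots, and a 4-byte handle fills the secondary page exactly.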

Would this scheme work? Does it seem reasonable? Does it solve the problem?

It's definitely reasonable, clean, and would solve the problem. My only concern is stated above.

If you think that having a "restore physical backing" page/PTE flag is OK, then I'm willing to make a 2.6.25 patch for this. The next couple of weeks are a bit hectic, but I can have it done by mid-May.

