WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

[Xen-devel] Re: [PATCH 0 of 4] mm+paravirt+xen: add pte read-modify-writ

To: Zachary Amsden <zach@xxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH 0 of 4] mm+paravirt+xen: add pte read-modify-write abstraction
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Fri, 23 May 2008 21:32:39 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>, kvm-devel <kvm-devel@xxxxxxxxxxxxxxxxxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Virtualization Mailing List <virtualization@xxxxxxxxxxxxxx>, Hugh Dickins <hugh@xxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
In-reply-to: <1211567273.7465.36.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <patchbomb.1211552448@localhost> <1211567273.7465.36.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Zachary Amsden wrote:
> I'm a bit skeptical you can get such a semantic to work without a very
> heavyweight method in the hypervisor.  How do you guarantee no other CPU
> is fizzling the A/D bits in the page table (it can be done by hardware
> with direct page tables), unless you use some kind of IPI?  Is this why
> it is still 7x?

No, you just use cmpxchg. It's pretty lightweight really. Xen holds a lock internally to stop other cpus from updating the pte in software, so the only source of modification is the hardware itself; the cmpxchg loop is guaranteed to terminate because the A/D bits can only transition from 0->1.
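
As a rough illustration of the loop described above, here is a minimal user-space sketch in C11. It is not the Xen or Linux code: the function name pte_update_preserve_ad and the PTE_* macros are hypothetical, and the bit positions simply mirror the usual x86 layout. It assumes software writers are already serialized by a lock, so the only concurrent change is hardware setting the Accessed/Dirty bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag names; positions follow the usual x86 PTE layout. */
#define PTE_RW       (UINT64_C(1) << 1)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)
#define PTE_AD_MASK  (PTE_ACCESSED | PTE_DIRTY)

/*
 * Clear the bits in `clear` and set the bits in `set` without losing an
 * A/D bit that hardware sets concurrently. Returns the value written.
 */
static uint64_t pte_update_preserve_ad(_Atomic uint64_t *ptep,
                                       uint64_t clear, uint64_t set)
{
    /* The caller changes protection bits only, never A/D. */
    assert(((clear | set) & PTE_AD_MASK) == 0);

    uint64_t old = atomic_load(ptep);
    uint64_t new;

    do {
        /* Recompute from the freshest value, carrying A/D over as-is. */
        new = (old & ~clear) | set;
        /* On failure, `old` is reloaded with the current PTE contents. */
    } while (!atomic_compare_exchange_strong(ptep, &old, new));

    return new;
}
```

Under the stated assumption, each failed compare-exchange means one more A/D bit went from 0 to 1, so the loop can retry at most twice per PTE.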

I haven't really gone into depth as to exactly where the 7x number comes from. I could increase the batch size (currently max of 32 pte updates/hypercall), and some of it is plain overhead from the in-kernel infrastructure. A simpler and more hackish approach which basically pastes the Xen hypercall directly into the mprotect loop gets the overhead down to about 5.5x.
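
To make the batching idea concrete, here is a small self-contained sketch in C. Only the limit of 32 updates per call comes from the message above; the struct and function names (pte_batch, batch_queue, batch_flush) are made up for illustration, and batch_flush is a stand-in that applies the queued writes locally where a real guest would issue a single hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 32   /* per-hypercall limit mentioned in the email */

struct pte_update {
    uint64_t *ptep;    /* which PTE to write */
    uint64_t  val;     /* new value for it */
};

struct pte_batch {
    struct pte_update ops[BATCH_MAX];
    size_t        count;     /* updates currently queued */
    unsigned long flushes;   /* simulated hypercalls issued so far */
};

/* Stand-in for the hypercall: apply every queued update in one go. */
static void batch_flush(struct pte_batch *b)
{
    if (b->count == 0)
        return;
    for (size_t i = 0; i < b->count; i++)
        *b->ops[i].ptep = b->ops[i].val;
    b->flushes++;
    b->count = 0;
}

/* Queue one update, flushing automatically when the batch fills up. */
static void batch_queue(struct pte_batch *b, uint64_t *ptep, uint64_t val)
{
    assert(b->count < BATCH_MAX);
    b->ops[b->count].ptep = ptep;
    b->ops[b->count].val = val;
    if (++b->count == BATCH_MAX)
        batch_flush(b);
}
```

A caller would queue one entry per PTE inside the mprotect loop and call batch_flush once at the end, so the per-call cost is paid once per batch and not once per PTE.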

> Still, a 7x gain from asynchronous batching is very nice.  I wonder if
> that means the average mprotect size in your benchmark is 7 pages.

Yeah, it's around 7x. The batching pays off even for single page mprotects, because the trap and emulate of xchg is so expensive.


>> I believe that other virtualization systems, whether they use direct
>> paging like Xen, or a shadow pagetable scheme (vmi, kvm, lguest), can
>> make use of this interface to improve the performance.

> On VMI, we don't trap the xchg of the pte, thus we don't have any
> bottleneck here to begin with.

If you're doing code rewriting then I guess you can effectively do the same trick at that point. If not, then presumably you take a fault for the first pte updated in the mprotect and then sync the shadow up when the tlb flush happens; batching that trap and the tlb flush would give you some benefit for small mprotects.

   J
