Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
From: Stephan Diestelhorst <stephan.diestelhorst@xxxxxxx>
Date: Mon, 10 Oct 2011 16:01:50 +0200
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>, Nick Piggin <npiggin@xxxxxxxxx>, KVM <kvm@xxxxxxxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, maintainers <x86@xxxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, Marcelo Tosatti <mtosatti@xxxxxxxxxx>, Kleen <andi@xxxxxxxxxxxxxx>, "Andi@xxxxxxxxxxxxxx" <Andi@xxxxxxxxxxxxxx>, Avi, Jan Beulich <JBeulich@xxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, "the@xxxxxxxxxxxxxx" <the@xxxxxxxxxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxx>, Kivity <avi@xxxxxxxxxx>
Delivery-date: Mon, 10 Oct 2011 07:05:13 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <2523929.AGG4U997NO@d-allen>
Organization: AMD OSRC
References: <cover.1315878463.git.jeremy.fitzhardinge@xxxxxxxxxx> <4E8DE7F1.3050108@xxxxxxxx> <2523929.AGG4U997NO@d-allen>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/4.7.1 (Linux/3.0.4-030004-generic; KDE/4.7.1; x86_64; ; )

On Monday 10 October 2011, 07:00:50 Stephan Diestelhorst wrote:
> On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote:
> > On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote:
> > > On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote:
> > >> Which certainly should *work*, but from a conceptual standpoint, isn't
> > >> it just *much* nicer to say "we actually know *exactly* what the upper
> > >> bits were".
> > > Well, we really do NOT want atomicity here. What we really rather want
> > > is sequentiality: free the lock, make the update visible, and THEN
> > > check if someone has gone sleeping on it.
> > >
> > > Atomicity only conveniently enforces that the three do not happen in a
> > > different order (with the store becoming visible after the checking
> > > load).
> > >
> > > This does not have to be atomic, since spurious wakeups are not a
> > > problem, in particular not with the FIFO-ness of ticket locks.
> > >
> > > For that the fence, additional atomic etc. would be IMHO much cleaner
> > > than the crazy overflow logic.
> > 
> > All things being equal I'd prefer lock-xadd just because it's easier to
> > analyze the concurrency for, crazy overflow tests or no.  But if
> > add+mfence turned out to be a performance win, then that would obviously
> > tip the scales.
> > 
> > However, it looks like locked xadd also has better performance:  on
> > my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
> > than locked xadd, so that pretty much settles it unless you think
> > there'd be a dramatic difference on an AMD system.
> 
> Indeed, the fences are usually slower than locked RMWs, in particular
> if you do not need to add an instruction. I originally missed the
> amazing stunt GCC pulled off by replacing the branch with carry
> flag magic. It seems that two twisted minds have found each other
> here :)
> 
> One of my concerns was adding a branch in here... so that is settled,
> and if everybody else feels like this is easier to reason about...
> go ahead :) (I'll keep my itch to myself then.)

Just that I can't... if performance is a concern, adding the LOCK
prefix to the addb outperforms the xadd significantly:

Taking the mean over 100 runs on my Phenom II, this comes out as
follows (percentages are relative to locked-xadd):

locked-add   0.648500 s   80%
add-rmwtos   0.707700 s   88%
locked-xadd  0.807600 s  100%
add-barrier  1.270000 s  157%

With huge read contention added in (as cheaply as possible):
locked-add.openmp   0.640700 s   84%
add-rmwtos.openmp   0.658400 s   86%
locked-xadd.openmp  0.763800 s  100%

And the numbers for write contention are crazy, but they also favor the
locked-add version:
locked-add.openmp   0.571400 s   71%
add-rmwtos.openmp   0.699900 s   87%
locked-xadd.openmp  0.800200 s  100%
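
For anyone reading without the attachments, here is a minimal sketch, in
C11 atomics, of two of the unlock variants compared above (add-barrier and
locked-add). It is not the attached benchmark code: the struct layout,
TICKET_INC, SLOWPATH_FLAG and both function names are assumptions made up
for illustration. The point is only the ordering discussed earlier in the
thread: release the lock, make that store visible, and THEN check whether
someone went to sleep on it.

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout and constants, chosen for illustration only. */
#define TICKET_INC    2u   /* tickets advance in steps of 2 ...             */
#define SLOWPATH_FLAG 1u   /* ... leaving bit 0 of tail as a "sleeper" flag */

static_assert(ATOMIC_CHAR_LOCK_FREE == 2,
              "sketch assumes lock-free byte atomics");

struct ticketlock {
    _Atomic uint8_t head;  /* ticket currently being served               */
    _Atomic uint8_t tail;  /* next ticket to hand out, plus SLOWPATH_FLAG */
};

/* "add-barrier": plain add, full fence, then look for sleepers. */
static inline bool unlock_add_fence(struct ticketlock *l)
{
    /* Only the lock holder writes head, so a separate load and store
     * are enough; no atomic read-modify-write is needed here. */
    uint8_t h = atomic_load_explicit(&l->head, memory_order_relaxed);
    atomic_store_explicit(&l->head, (uint8_t)(h + TICKET_INC),
                          memory_order_release);

    /* Full fence: the store above is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);

    return atomic_load_explicit(&l->tail, memory_order_seq_cst)
           & SLOWPATH_FLAG;
}

/* "locked-add": one locked read-modify-write doubles as the fence. */
static inline bool unlock_locked_add(struct ticketlock *l)
{
    /* The old value is discarded, so an x86 compiler may emit a
     * LOCK-prefixed add here instead of a LOCK-prefixed xadd. */
    atomic_fetch_add_explicit(&l->head, TICKET_INC, memory_order_seq_cst);

    return atomic_load_explicit(&l->tail, memory_order_seq_cst)
           & SLOWPATH_FLAG;
}
```

Both functions return true when the (hypothetical) sleeper flag is set, in
which case the caller would go on to wake the waiter. Neither needs the
release and the check to be one atomic step, because a spurious wakeup is
harmless. The locked-xadd variant on the combined head/tail word, with its
overflow handling, is deliberately left out of this sketch.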

Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@xxxxxxx, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH
Einsteinring 24
85609 Aschheim
Germany

Geschaeftsfuehrer: Alberto Bozzo;
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551 

Attachment: add-rmwtos.c
Attachment: add-rmwtos.openmp.c
Attachment: locked-add.c
Attachment: locked-xadd.openmp.c
Attachment: locked-add.openmp.c
Attachment: locked-xadd.c
