xen-devel
Re: [Xen-devel] Performance overhead of paravirt_ops on nativeidentif
To: |
Jan Beulich <JBeulich@xxxxxxxxxx> |
Subject: |
Re: [Xen-devel] Performance overhead of paravirt_ops on nativeidentified |
From: |
Jeremy Fitzhardinge <jeremy@xxxxxxxx> |
Date: |
Wed, 20 May 2009 15:42:08 -0700 |
Cc: |
Nick Piggin <npiggin@xxxxxxx>, Xiaohui Xin <xiaohui.xin@xxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, Xin Li <xin.li@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>, Ingo Molnar <mingo@xxxxxxx> |
Delivery-date: |
Wed, 20 May 2009 15:42:47 -0700 |
Envelope-to: |
www-data@xxxxxxxxxxxxxxxxxxx |
In-reply-to: |
<4A11280402000078000014E2@xxxxxxxxxxxxxxxxxx> |
List-help: |
<mailto:xen-devel-request@lists.xensource.com?subject=help> |
List-id: |
Xen developer discussion <xen-devel.lists.xensource.com> |
List-post: |
<mailto:xen-devel@lists.xensource.com> |
List-subscribe: |
<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe> |
List-unsubscribe: |
<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe> |
References: |
<4A0B62F7.5030802@xxxxxxxx> <4A0BED040200007800000DB0@xxxxxxxxxxxxxxxxxx> <4A0C58BB.3090303@xxxxxxxx> <4A0D3F8C02000078000010A7@xxxxxxxxxxxxxxxxxx> <4A0DB988.6000009@xxxxxxxx> <4A11280402000078000014E2@xxxxxxxxxxxxxxxxxx> |
Sender: |
xen-devel-bounces@xxxxxxxxxxxxxxxxxxx |
User-agent: |
Thunderbird 2.0.0.21 (X11/20090320) |
Jan Beulich wrote:
Jeremy Fitzhardinge <jeremy@xxxxxxxx> 15.05.09 20:50 >>>
Jan Beulich wrote:
A patch for the pv-ops kernel would require some time. What I can give you
right away - just for reference - are the sources we currently use in our
kernel:
attached.
Hm, I see. Putting a call out to a pv-ops function in the ticket lock
slow path looks pretty straightforward. The need for an extra lock on
the contended unlock side is a bit unfortunate; have you measured to see
what hit that has? Seems to me like you could avoid the problem by
using per-cpu storage rather than stack storage (though you'd need to
copy the per-cpu data to stack when handling a nested spinlock).
Not sure how you'd imagine this to work: The unlock code has to look at all
cpus' data in either case, so an inner lock would still be required imo.
Well, rather than an explicit lock I was thinking of an optimistic
scheme like the pv clock update algorithm.
"struct spinning" would have a version counter it could update using the
even=stable/odd=unstable algorithm. The lock side would simply save the
current per-cpu "struct spinning" state onto its own stack (assuming you
can even have nested spinning), and then update the per-cpu copy with
the current lock. The kick side can check the version counter to make
sure it gets a consistent snapshot of the target cpu's current lock state.
I think that only the locking side requires locked instructions, and the
kick side can be all unlocked.
What's the thinking behind the xen_spin_adjust() stuff?
That's the placeholder for implementing interrupt re-enabling in the irq-save
lock path. The requirement is that if a nested lock attempt hits the same
lock on the same cpu that it failed to get acquired on earlier (but got a ticket
already), tickets for the given (lock, cpu) pair need to be circularly shifted
around so that the innermost requestor gets the earliest ticket. This is what
that function's job will become if I ever get to implement this.
Sounds fiddly.
static __always_inline void __ticket_spin_lock(raw_spinlock_t *lock) {
unsigned int token, count; bool free; __ticket_spin_lock_preamble; if
(unlikely(!free)) token = xen_spin_adjust(lock, token); do { count = 1
<< 10; __ticket_spin_lock_body; } while (unlikely(!count) &&
!xen_spin_wait(lock, token)); }
How does this work? Doesn't it always go into the slowpath loop even if
the preamble got the lock with no contention?
It indeed always enters the slowpath loop, but only for a single pass through
part of its body (the first compare in the body macro will make it exit the loop
right away: 'token' is not only the ticket here, but the full lock->slock
contents). But yes, I think you're right, one could avoid entering the body
altogether by moving the containing loop into the if(!free) body. The logic
went through a number of re-writes, so I must have overlooked that
opportunity on the last round of adjustments.
I was thinking of something like this: (completely untested)
void __ticket_spin_lock(raw_spinlock_t *lock)
{
unsigned short inc = 0x100;
unsigned short token;
do {
unsigned count = 1 << 10;
asm volatile (
" lock xaddw %w0, %1\n"
"1: cmpb %h0, %b0\n"
" je 2f\n"
" rep ; nop\n"
" movb %1, %b0\n"
/* don't need lfence here, because loads are in-order */
" sub $1,%2\n"
" jnz 1b\n"
"2:"
: "+Q" (inc), "+m" (lock->slock), "+r" (count)
:
: "memory", "cc");
if (likely(count != 0))
break;
token = inc;
inc = 0;
} while (unlikely(!xen_spin_wait(lock, token)));
}
where xen_spin_wait() would actually be a pvops call, and perhaps the
asm could be pulled out into an inline to deal with the 8/16 bit ticket
difference.
J
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|
|
|