[Xen-devel] [RFO] #2: removing a concurrency bottleneck

To:	"Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject:	[Xen-devel] [RFO] #2: removing a concurrency bottleneck
From:	Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date:	Thu, 19 Mar 2009 10:52:08 -0700 (PDT)

Request for opinion #2:

In order to remove a (last?) concurrency bottleneck in tmem,
I have to replicate a pair of fairly large buffers: one is
two pages and the other is eight pages.  (Note that if tmem
ever works on ia64, the page size is larger.)  Since the buffers
are too large for the stack, they are currently declared as
globals and protected by a single lock.  But the buffers are used
for compression, which can take quite a bit of time (up
to tens of thousands of cycles and probably >80% of the
total time spent in tmem), so that lock is held for long
stretches and any other cpu needing the buffers spins on it.

I see two solutions: cascading or per-cpu.

In per-cpu, I would allocate at system initialization one
pair of buffers for each cpu (question: num_present_cpus,
num_online_cpus, or num_possible_cpus?).  Then no lock
is required.
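
A minimal user-space sketch of the per-cpu idea, not tmem code.
The names (buf_pair, percpu_bufs_init, percpu_bufs_get), the 4 KiB
page size, and the use of malloc are all assumptions made for
illustration; in the hypervisor the cpu index would come from the
running cpu and the memory from Xen's own allocator.  It only works
lock-free if the code using a pair cannot move to another cpu
while it holds the buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages; ia64 would differ */
#define SMALL_BUF_PAGES  2u     /* the two-page buffer */
#define LARGE_BUF_PAGES  8u     /* the eight-page buffer */

/* One pair of compression buffers (hypothetical layout). */
struct buf_pair {
    unsigned char *small;
    unsigned char *large;
};

static struct buf_pair *percpu_bufs;  /* one entry per cpu */
static unsigned int percpu_count;

/* Release every pair allocated so far; safe on a partial allocation. */
void percpu_bufs_free(void)
{
    unsigned int i;

    for (i = 0; i < percpu_count; i++) {
        free(percpu_bufs[i].small);
        free(percpu_bufs[i].large);
    }
    free(percpu_bufs);
    percpu_bufs = NULL;
    percpu_count = 0;
}

/* Allocate one pair per cpu at initialization.  Returns 0 or -1. */
int percpu_bufs_init(unsigned int nr_cpus)
{
    unsigned int i;

    percpu_bufs = calloc(nr_cpus, sizeof(*percpu_bufs));
    if (percpu_bufs == NULL)
        return -1;
    percpu_count = nr_cpus;

    for (i = 0; i < nr_cpus; i++) {
        percpu_bufs[i].small = malloc(SMALL_BUF_PAGES * SKETCH_PAGE_SIZE);
        percpu_bufs[i].large = malloc(LARGE_BUF_PAGES * SKETCH_PAGE_SIZE);
        if (percpu_bufs[i].small == NULL || percpu_bufs[i].large == NULL) {
            percpu_bufs_free();
            return -1;
        }
    }
    return 0;
}

/* Each cpu indexes its own pair, so no lock is taken here. */
struct buf_pair *percpu_bufs_get(unsigned int cpu)
{
    assert(cpu < percpu_count);
    return &percpu_bufs[cpu];
}
```

The cost is memory: ten pages per cpu, whichever of the three cpu
counts is used to size the array.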

In cascading, I would allocate a small number of pairs
of buffers, perhaps only two or three, and "trylock"
each in turn: if the first is locked, trylock the second,
then the third and so on, and spinlock only if all are in
use.  Statistically this is probably good enough, unless
I choose too small a number and Xen is running on a huge box.
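
The cascade can be sketched as follows.  This is a user-space
illustration built on C11 atomic_flag rather than Xen's spinlock
primitives, and the names (buf_slot, slot_acquire, slot_release),
the choice of three slots, and spinning on the first slot when all
are busy are assumptions, not something the proposal specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 3u  /* assumption: "perhaps only two or three" pairs */

/* One buffer pair plus the flag guarding it (buffers omitted here). */
struct buf_slot {
    atomic_flag busy;
};

static struct buf_slot slots[NR_SLOTS];

/* Mark every slot free; call once before first use. */
void slots_init(void)
{
    unsigned int i;

    for (i = 0; i < NR_SLOTS; i++)
        atomic_flag_clear(&slots[i].busy);
}

/* Try once to take a slot; true on success, false if already held. */
static bool slot_trylock(struct buf_slot *s)
{
    return !atomic_flag_test_and_set_explicit(&s->busy,
                                              memory_order_acquire);
}

/*
 * Cascade: trylock each slot in order and return the index of the
 * first one obtained.  If every slot is busy, spin on slot 0 until
 * it becomes free (the fallback slot is an arbitrary choice).
 */
unsigned int slot_acquire(void)
{
    unsigned int i;

    for (i = 0; i < NR_SLOTS; i++)
        if (slot_trylock(&slots[i]))
            return i;

    while (!slot_trylock(&slots[0]))
        ;  /* busy-wait */
    return 0;
}

/* Give the slot back so another cpu can use its buffers. */
void slot_release(unsigned int idx)
{
    assert(idx < NR_SLOTS);
    atomic_flag_clear_explicit(&slots[idx].busy, memory_order_release);
}
```

A caller would wrap the compression step between slot_acquire()
and slot_release(), using the buffers belonging to the returned
index.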

I suppose a combination of the two would be to cascade,
but dynamically choose and allocate the quantity of
buffers based on (maybe log+1 of?) the number of cpus
(again, present, online, or possible?).  But this is
probably going overboard.
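
For a sense of scale, here is one reading of "log+1", assuming a
base-2 logarithm rounded down; the base, the rounding, and the
function name cascade_buf_count are assumptions.  Under that
reading the count equals the number of bits needed to represent
the cpu count.

```c
#include <assert.h>

/*
 * Number of buffer pairs for a given cpu count, taken here as
 * floor(log2(nr_cpus)) + 1.  A count of zero is treated as one cpu
 * so that at least one pair always exists.
 */
unsigned int cascade_buf_count(unsigned int nr_cpus)
{
    unsigned int count = 0;

    if (nr_cpus == 0)
        nr_cpus = 1;

    /* Count the bits up to and including the highest set bit. */
    while (nr_cpus != 0) {
        count++;
        nr_cpus >>= 1;
    }
    assert(count >= 1);
    return count;
}
```

With this formula a 4-cpu box gets 3 pairs and a 64-cpu box gets 7,
so the memory cost grows far more slowly than with one pair per
cpu.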

Opinions?  And if per-cpu, is the current Xen infrastructure
sufficiently robust to handle hot-plug CPUs, and should I handle
them too?

Thanks,
Dan
