This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] [RFO] #2: removing a concurrency bottleneck

To: "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] [RFO] #2: removing a concurrency bottleneck
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Thu, 19 Mar 2009 10:52:08 -0700 (PDT)
Delivery-date: Thu, 19 Mar 2009 10:54:14 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Request for opinion #2:

In order to remove a (last?) concurrency bottleneck in tmem,
I have to replicate a pair of fairly large buffers: one is
two pages and the other is eight pages.  (Note that if tmem
ever works on ia64, the page size is larger.)  Since the buffers
are too large for the stack, they are declared as globals
and protected by a single lock.  But the buffers are used
for compression, which can take quite a bit of time (up
to tens of thousands of cycles, and probably >80% of the
total time spent in tmem), so that lock is a magnet for
contention.

I see two solutions: cascading or per-cpu.

In per-cpu, I would allocate at system initialization one
pair of buffers for each cpu (question: num_present_cpus,
num_online_cpus, or num_possible_cpus?).  Then no lock
is required.

In cascading, I would allocate a small number of pairs
of buffers, perhaps only two or three, and "trylock"
each, falling back to trylock the second if locked,
then the third and so on, then spinlock if all are in
use.  Statistically this is probably good enough, unless
I choose too small a number and Xen is running on a huge box.

I suppose a combination of the two would be to cascade,
but dynamically choose and allocate the quantity of
buffers based on (maybe log+1 of?) the number of cpus
(again, present, online, or possible?).  But this is
probably going overboard.

Opinions?  And if per-cpu, is the current Xen infrastructure
sufficiently robust to handle hot-plug CPUs, and if so, should
my code handle them too?


Xen-devel mailing list
