xen-devel

Re: [Xen-devel] [PATCH 5 of 9] Fine-grained concurrency control structure for the p2m

To: "Tim Deegan" <tim@xxxxxxx>
Subject: Re: [Xen-devel] [PATCH 5 of 9] Fine-grained concurrency control structure for the p2m
From: andres@xxxxxxxxxxxxxxxx
Date: Wed, 2 Nov 2011 07:20:09 -0700
Cc: olaf@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, andres@xxxxxxxxxxxxxx, keir.xen@xxxxxxxxx, Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>, adin@xxxxxxxxxxxxxx
In-reply-to: <20111027144333.GM59656@xxxxxxxxxxxxxxxxxxxxx>
References: <patchbomb.1319690025@xxxxxxxxxxxxxxxxxxx> <a23e1262b1240dcabfa0.1319690030@xxxxxxxxxxxxxxxxxxx> <20111027144333.GM59656@xxxxxxxxxxxxxxxxxxxxx>
Hey there,
(many inlines on this one)

> At 00:33 -0400 on 27 Oct (1319675630), Andres Lagar-Cavilla wrote:
>> Introduce a fine-grained concurrency control structure for the p2m. This
>> allows for locking 2M-aligned chunks of the p2m at a time, exclusively.
>> Recursive locking is allowed. Global locking of the whole p2m is also
>> allowed for certain operations. Simple deadlock detection heuristics are
>> put in place.
>>
>> Note the patch creates backwards-compatible shortcuts that will lock the
>> p2m globally. So it should remain functionally identical to what is
>> currently in place.
>>
>> Signed-off-by: Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>
>
> Wow.  What a lot of code. :)  I took a look through, but I can't
> guarantee to have got all the details.  Things I saw:
>
> - You use atomic_t for the count but only ever update it under a
>   lock. :)  If you just need to be sure of atomic writes, then
>   atomic_set will do that without using a locked increment/decrement.

I'm a bit flaky on my atomics. And paranoid. I'll be more careful next time.
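
For the record, a minimal sketch of what that suggestion amounts to; the
struct and field names here are hypothetical, not from the patch:

#include <xen/spinlock.h>
#include <asm/atomic.h>

/* Hypothetical range bookkeeping, for illustration only. */
struct p2m_range {
    spinlock_t lock;
    atomic_t   count;
};

/* Under the lock there is no concurrent writer, so a plain atomic_set()
 * (a single aligned store) is enough for lock-free readers to see a
 * whole value; the bus-locked atomic_inc() buys nothing here. */
static void range_get(struct p2m_range *r)
{
    spin_lock(&r->lock);
    atomic_set(&r->count, atomic_read(&r->count) + 1);
    spin_unlock(&r->lock);
}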

>
> - You allocate the bitmaps from xenheap - they should really be using
>   p2m memory, so as to avoid changing the memory overhead of the domain
>   as it runs.   That will involve map_domain_page()ing the bitmaps as
>   you go, but at least on x86_64 that's very cheap.

p2m_alloc_ptp? Sure.
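
Roughly the sketch I have in mind; p2m_alloc_ptp() and map_domain_page()
are the in-tree helpers, but the type argument, the bitmap bookkeeping,
and the error path below are illustrative:

#include <xen/domain_page.h>
#include <xen/errno.h>
#include <asm/p2m.h>

/* Sketch: take one bitmap page from p2m memory rather than the xenheap,
 * and map it transiently while touching it (cheap on x86_64). */
static int mark_range_locked(struct p2m_domain *p2m, struct page_info **pg,
                             unsigned int bit)
{
    unsigned long *bitmap;

    if ( *pg == NULL && (*pg = p2m_alloc_ptp(p2m, 0 /* illustrative */)) == NULL )
        return -ENOMEM;                        /* instead of panic() */

    bitmap = map_domain_page(page_to_mfn(*pg));
    set_bit(bit, bitmap);
    unmap_domain_page(bitmap);
    return 0;
}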

>
> - panic() on out-of-memory is pretty rude.
>
Yeah, but unwinding all possible lock callers to handle ENOMEM was over my
threshold. The reality is that on your run-of-the-mill 4GB domain there are
only 4 or 5 single-page allocations; if those fail, you have bigger problems.

> But stepping back, I'm not sure that we need all this just yet.  I think
> it would be worth doing the interface changes with a single p2m lock and
> measuring how bad it is before getting stuck in to fine-grained locking
> (fun though it might be).

Completely agree. I think this will also ease adoption and bug isolation,
and it'll allow me to be more gradual. I'll rework the order. Thanks, very
good suggestion.

>
> I suspect that if this is a contention point, allowing multiple readers
> will become important, especially if there are particular pages that
> often get emulated access.
>
> And also, I'd  like to get some sort of plan for handling long-lived
> foreign mappings, if only to make sure that this phase-1 fix doesn't
> conflict wih it.
>

If foreign mappings hold a lock/ref on a p2m subrange, then they'll
disallow global operations, and you'll get a clash between log-dirty and,
say, qemu. Ka-blam, there goes live migration.

Read-only foreign mappings are only problematic insofar as paging happens.
With proper serialization of p2m updates/lookups (global or fine-grained),
that problem is gone.

Writable foreign mappings are trickier because of sharing and w^x. Is
there a reason left, today, not to type an HVM domain's page PGT_writable
when a foreign mapping happens? That would solve the sharing problems. w^x
really can't be solved short of putting the vcpu on a waitqueue
(preferable, in my view), or destroying the mapping and forcing the foreign
OS to remap later. All a few steps ahead, I hope.

Who/what's using w^x by the way? If the refcount is zero, I think I know
what I'll do ;)

That is my current "long term plan".

> Oh, one more thing:
>
>> +/* Some deadlock book-keeping. Say CPU A holds a lock on range A, and
>> + * CPU B holds a lock on range B. Now, CPU A wants to lock range B and
>> + * vice-versa. Deadlock. We detect this by remembering the start of the
>> + * current locked range. We keep a fairly small stack of guards (8),
>> + * because we don't anticipate a great deal of recursive locking,
>> + * because (a) recursive locking is rare (b) it is evil (c) only PoD
>> + * seems to do it (is PoD therefore evil?) */
>
> If PoD could ba adjusted not to do it, could we get rid of all the
> recursive locking entirely?  That would simplify things a lot.
>

My comment is an exaggeration. In a fine-grained scenario, recursive
locking happens massively throughout the code; we just have to live with
it. The "evil" adjective was gratuitous ranting on my part.
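
For context, the heuristic that comment describes boils down to something
like the sketch below (one plausible reading of it; all names here are
illustrative, not the patch's):

#include <xen/smp.h>
#include <xen/lib.h>

#define NR_GUARDS 8   /* small guard stack, per the comment above */

/* Remember, per CPU, the start of each range currently held, and insist
 * that ranges are only ever taken in ascending order.  CPU A holding a
 * high range while asking for a lower one (and vice-versa on CPU B) is
 * exactly the A-wants-B / B-wants-A pattern being guarded against. */
static unsigned long guard_stack[NR_CPUS][NR_GUARDS];
static unsigned int  guard_depth[NR_CPUS];

static void deadlock_check(unsigned long start_gfn)
{
    unsigned int cpu = smp_processor_id();
    unsigned int d = guard_depth[cpu];

    BUG_ON(d == NR_GUARDS);                            /* guard overflow */
    BUG_ON(d && start_gfn < guard_stack[cpu][d - 1]);  /* out of order */

    guard_stack[cpu][d] = start_gfn;
    guard_depth[cpu] = d + 1;
}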

What is a real problem is that PoD sweeps can cause deadlocks. There is a
simple step to mitigate this: start the sweep from the current gfn and
never wrap around; too bad if the gfn is too high. But this alters the
sweeping algorithm, so I'll deal with it when its turn comes.
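
A sketch of that mitigation, assuming the ascending-order locking
discipline above (the function and the loop body are illustrative;
p2m->max_mapped_pfn and SUPERPAGE_PAGES are the in-tree names):

#include <asm/p2m.h>
#include <asm/page.h>

/* Sweep only from the current gfn upward, never wrapping around, so
 * range locks are always taken in ascending order and the deadlock
 * heuristic stays happy.  Gfns below the starting point simply don't
 * get swept ("too bad if the gfn is too high"). */
static void pod_sweep_sketch(struct p2m_domain *p2m, unsigned long start_gfn)
{
    unsigned long gfn;

    for ( gfn = start_gfn & ~(SUPERPAGE_PAGES - 1);
          gfn <= p2m->max_mapped_pfn;
          gfn += SUPERPAGE_PAGES )
    {
        /* lock_range(p2m, gfn); scan the 2M chunk for reclaimable
         * zero pages; unlock_range(p2m, gfn); */
    }
}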

Andres
> Tim.