This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] Re: [PATCH] mem_sharing: fix race condition of nominate and

To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
From: Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
Date: Fri, 7 Jan 2011 11:54:27 +0800
Cc: tinnycloud <tinnycloud@xxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 06 Jan 2011 19:55:20 -0800
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=zU6B7IPUNBfG93gUis5GfjcdE8RDnr7pZ5weqQZBdPc=; b=VPKj1CfPLR+3CvPrfmfiyM8gAXMnevG+dExUAck1PhEZfOdACljvWoX8x4inQgM/k1 TII9jTQ9mp3tMIOcD+71vmCKBiXfplvWTvQcZdZPgwARlR5rFyUosyBirk/oICqy+M1b rBKSs4gMFn3fl88gfRtzv43Z9edZpdzJl3kdo=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=QmFuQs2eR7TAKTb5A5IiREk3lDH09M86+ySOfdH9680p1Ii8lK+HJq9RvyuOaV/OBo 46B21aXfw+FwT08Tq9WtlWXCh5bTt5ooQ1fscP1matYsLLLy4dh01NCosS72ZsSZiVyL FxNbQTIStr2qp0QzIkgVYYCm8lbFKDZ7Neec4=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110106165450.GO21948@xxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTinMp1v1zex2BfcUuszotPuxJFWZQNUp40gu_gxL@xxxxxxxxxxxxxx> <20110106165450.GO21948@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi, Tim

On Fri, Jan 7, 2011 at 12:54 AM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
At 16:11 +0000 on 06 Jan (1294330319), Jui-Hao Chiang wrote:
> Hi, this patch does the following
> (1) When updating/checking p2m type for mem_sharing, we must hold shr_lock
> (2) For nominate operation, if the page is already nominated, return the handle from page_info->shr_handle
> (3) For unshare operation, it is possible that multiple users unshare a page via hvm_hap_nested_page_fault() at the same time. If the page is already un-shared by someone else, simply return success.

I'm going to apply this, since it looks like an improvement, but I'm not
convinced it properly solves the problem.

It seems tinnycloud's case is when dom0 try to RW maps a shared page, which should unshare it properly, and change the type count.
But there is still a bug hidden in page_make_sharable() which fails to recover type count when the call fails.
Now I trace it again, and found something different than what we have discussed before.
I will find it and submit patch again.

> NOTE: we assume that nobody holds page_alloc_lock/p2m_lock before calling nominate/share/unshare.

As I told you earlier, that's not the case.  p2m_teardown() can call
mem_sharing_unshare_page() with the p2m lock held.

Oops, I forgot again.
After this change, unshare() has a potential problem of deadlock for shr_lock and p2m_lock with different locking order.
Assume two CPUs do the following
CPU1: hvm_hap_nested_page_fault() => unshare() => p2m_change_type() (locking order: shr_lock, p2m_lock)
CPU2: p2m_teardown() => unshare() (locking order: p2m_lock, shr_lock)
When CPU1 grabs shr_lock and CPU2 grabs p2m_lock, they deadlock later.

So it seems better to fix the following rules
(1) Fix locking order: p2m_lock ---> shr_lock
(2) Any function in mem_sharing, if modifying/checking p2m entry is necessary, it must hold p2m_lock and then shr_lock. Later on, when changing p2m entries, don't call any p2m function which locks p2m again

So for p2m functions, it seems better to provide some functions which don't call p2m_lock again.
What do you think? If that's ok, I will do it in this way.

Xen-devel mailing list