[Xen-devel] Re: One (possible) x86 get_user_pages bug

To:	Nick Piggin <npiggin@xxxxxxxxx>
Subject:	[Xen-devel] Re: One (possible) x86 get_user_pages bug
From:	Xiaowei Yang <xiaowei.yang@xxxxxxxxxx>
Date:	Fri, 28 Jan 2011 15:17:31 +0800
Cc:	Kaushik Barde <kbarde@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Kenneth Lee <liguozhu@xxxxxxxxxx>, Nick Piggin <npiggin@xxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, Jan Beulich <JBeulich@xxxxxxxxxx>, wangzhenguo@xxxxxxxxxx, linqaingmin <linqiangmin@xxxxxxxxxx>, fanhenglong@xxxxxxxxxx, Wu Fengguang <fengguang.wu@xxxxxxxxx>, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
In-reply-to:	<AANLkTikhoG4fgXjD9hS9yB+BVqeF3GXtTHiQVMn0n5TS@xxxxxxxxxxxxxx>
References:	<4D416D9A.9010603@xxxxxxxxxx> <4D41A651020000780002ED36@xxxxxxxxxxxxxxxxxx> <AANLkTikhoG4fgXjD9hS9yB+BVqeF3GXtTHiQVMn0n5TS@xxxxxxxxxxxxxx>

On 2011-1-28 5:24, Nick Piggin wrote:

On Fri, Jan 28, 2011 at 3:07 AM, Jan Beulich<JBeulich@xxxxxxxxxx>  wrote:

On 27.01.11 at 14:05, Xiaowei Yang<xiaowei.yang@xxxxxxxxxx>  wrote:

We created a scenario to reproduce the bug:
----------------------------------------------------------------
// proc1/proc1.2 are 2 threads sharing one page table.
// proc1 is the parent of proc2.

proc1               proc2          proc1.2
...                 ...            // in gup_pte_range()
...                 ...            pte = gup_get_pte()
...                 ...            page1 = pte_page(pte)  // (1)
do_wp_page(page1)   ...            ...
...                 exit_map()     ...
...                 ...            get_page(page1)        // (2)
-----------------------------------------------------------------

do_wp_page() and exit_map() cause page1 to be released into free list
before get_page() in proc1.2 is called. The longer the delay between
(1) and (2), the easier it is to trigger the BUG_ON.
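
To make the interleaving above concrete, here is a minimal single-threaded
user-space model of it. This is a sketch, not kernel code: struct toy_page,
toy_put_page(), toy_get_page() and replay_race() are hypothetical names
invented for illustration, and the model assumes the check that fires is the
one that rejects taking a reference on a page whose count is already zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the reference count matters here. */
struct toy_page {
    int refcount;   /* 0 means the page is back on the free list */
};

/* Hypothetical model of dropping one reference to a page. */
static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/*
 * Hypothetical model of get_page(): returns false where the check is
 * assumed to fire, i.e. when a reference is taken on a free page.
 */
static bool toy_get_page(struct toy_page *p)
{
    if (p->refcount == 0)
        return false;
    p->refcount++;
    return true;
}

/*
 * Replay the interleaving one step at a time.
 * Returns true if step (2) hits a page that has already been freed.
 */
static bool replay_race(void)
{
    /* page1 is mapped by proc1 and by its child proc2: two references. */
    struct toy_page page1 = { .refcount = 2 };
    struct toy_page *pte = &page1;

    struct toy_page *seen = pte;   /* (1) proc1.2 reads the PTE           */
    toy_put_page(&page1);          /* proc1: do_wp_page() releases page1  */
    pte = NULL;                    /* the PTE no longer points at page1   */
    toy_put_page(&page1);          /* proc2: exit_map() releases page1    */

    (void)pte;
    return !toy_get_page(seen);    /* (2) proc1.2 uses the stale pointer  */
}
```

Calling replay_race() returns true in this model: by the time step (2) runs,
both references are gone and the pointer read at step (1) is stale.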


Contrary to my initial response, I don't think this can happen outside
of Xen: do_wp_page() won't reach page_cache_release() when
gup_pte_range() is running for the same mm on another CPU,
since it can't get past ptep_clear_flush() (waiting for the CPU
in get_user_pages_fast() to re-enable interrupts).
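
The handshake described here can be sketched as a small step-by-step model.
It is only an illustration under stated assumptions: struct toy_cpu and the
toy_* functions are hypothetical names, NCPUS is arbitrary, and the model
reduces "servicing the flush IPI" to clearing a pending flag.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 2

/* Toy CPU state: just the interrupt flag and a pending flush IPI. */
struct toy_cpu {
    bool irqs_enabled;
    bool ipi_pending;
};

/* Flusher side: post a flush IPI to every other CPU. */
static void toy_send_flush_ipi(struct toy_cpu cpus[], int self)
{
    assert(self >= 0 && self < NCPUS);
    for (int i = 0; i < NCPUS; i++)
        if (i != self)
            cpus[i].ipi_pending = true;
}

/* A CPU services a pending IPI only while its interrupts are enabled. */
static void toy_cpu_poll(struct toy_cpu *cpu)
{
    if (cpu->irqs_enabled && cpu->ipi_pending)
        cpu->ipi_pending = false;   /* local flush done, IPI acknowledged */
}

/* The flusher may go on to release the page only once every IPI is acked. */
static bool toy_flush_done(const struct toy_cpu cpus[])
{
    for (int i = 0; i < NCPUS; i++)
        if (cpus[i].ipi_pending)
            return false;
    return true;
}
```

With CPU 1 modelled as being inside the fast path (irqs_enabled == false),
toy_flush_done() stays false after toy_send_flush_ipi(cpus, 0) no matter how
often CPU 1 is polled. It turns true only after CPU 1 sets irqs_enabled and
is polled again. A flush that does not go through IPIs has no such wait.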


Yeah, this cannot happen on native.

We made an experimental patch that prevents the PTE from being modified
in the middle of gup_pte_range(). With it, the BUG_ON no longer triggers.

However, from the comments embedded in gup.c, it seems deliberate to
avoid the lock in the fast path. The question is: if so, how to avoid
the above scenario?


Nick, since you wrote the initial implementation, would
you be able to estimate whether disabling get_user_pages_fast()
altogether for Xen would perform measurably worse than
adding the locks (but continuing to avoid acquiring mm->mmap_sem)
as suggested by Xiaowei? That's of course only if the latter is correct
at all, of which I haven't fully convinced myself yet.


You must have some way to guarantee existence of Linux page
tables when you walk them in order to resolve a TLB refill.

x86 does this with IPIs and a hardware fill that is atomic with respect to
interrupts. So fast gup can disable interrupts to walk the page tables. I
don't think it is fragile; it is tied directly to the system ISA (of course
that can change, but as Peter said, other things would have to change too).

Other architectures use RCU for this, so fast gup uses a
lockless-pagecache-like protocol for that.

If Xen is not using IPIs for flush, it should use whatever locks or
synchronization its TLB refill is using.
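
For readers unfamiliar with the lockless-pagecache-style protocol mentioned
above, here is a rough user-space sketch of its shape: take a reference only
if the count is non-zero, then re-read the PTE and back out if it changed.
All names (struct spec_page, spec_pte_t, the spec_* functions) are
hypothetical, and the sketch leaves out the RCU part entirely. It assumes the
page structure itself stays valid to read, which the real protocol has to
guarantee by other means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct spec_page {
    atomic_int refcount;             /* 0 means the page has been freed */
};

/* Toy PTE: an atomic pointer to the mapped page, or NULL if unmapped. */
typedef _Atomic(struct spec_page *) spec_pte_t;

/* Take a reference only if the page is still live (count > 0). */
static bool spec_get_page_unless_zero(struct spec_page *p)
{
    int old = atomic_load(&p->refcount);
    while (old != 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference taken earlier. */
static void spec_put_page(struct spec_page *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);
    assert(old > 0);
    (void)old;
}

/*
 * Speculative lookup: read the PTE, try to pin the page, then re-read
 * the PTE. Returns the pinned page, or NULL if the caller should fall
 * back to the slow path.
 */
static struct spec_page *spec_gup(spec_pte_t *ptep)
{
    struct spec_page *page = atomic_load(ptep);

    if (page == NULL || !spec_get_page_unless_zero(page))
        return NULL;                 /* unmapped, or already freed */

    if (atomic_load(ptep) != page) { /* PTE changed under us */
        spec_put_page(page);
        return NULL;
    }
    return page;
}
```

In this model a page whose count has already dropped to zero is never
pinned, so the stale-pointer case ends in a fallback rather than a reference
on a free page.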

Thanks everyone! It's very clear now that the problem only occurs on the
Xen PV kernel, which doesn't use IPIs to flush the TLB and so lacks the
implicit sync mechanism. For now we can disable the fast path as a
temporary solution until a better one comes up. Compared to adding extra
locks, the slow path may not be bad, and get_user_pages_fast() is actually
not called that much in our environment.

However, this issue raises another concern: could there be other places
inside the Xen PV kernel that have the same sync problem?


Thanks,
Xiaowei
