xen-devel
[Xen-devel] Re: One (possible) x86 get_user_pages bug
To: "Xiaowei Yang" <xiaowei.yang@xxxxxxxxxx>, "Nick Piggin" <npiggin@xxxxxxxxx>
Subject: [Xen-devel] Re: One (possible) x86 get_user_pages bug
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Thu, 27 Jan 2011 16:07:29 +0000
Cc: Kaushik Barde <kbarde@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Kenneth Lee <liguozhu@xxxxxxxxxx>, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, wangzhenguo@xxxxxxxxxx, linqaingmin <linqiangmin@xxxxxxxxxx>, fanhenglong@xxxxxxxxxx, Wu Fengguang <fengguang.wu@xxxxxxxxx>
>>> On 27.01.11 at 14:05, Xiaowei Yang <xiaowei.yang@xxxxxxxxxx> wrote:
> We created a scenario to reproduce the bug:
> ----------------------------------------------------------------
> // proc1/proc1.2 are 2 threads sharing one page table.
> // proc1 is the parent of proc2.
>
> proc1              proc2        proc1.2
> ...                ...          // in gup_pte_range()
> ...                ...          pte = gup_get_pte()
> ...                ...          page1 = pte_page(pte)  // (1)
> do_wp_page(page1)  ...          ...
> ...                exit_map()   ...
> ...                ...          get_page(page1)        // (2)
> -----------------------------------------------------------------
>
> do_wp_page() and exit_map() cause page1 to be released into free list
> before get_page() in proc1.2 is called. The longer the delay between
> (1)&(2), the easier the BUG_ON shows.
Contrary to my initial response, I don't think this can happen outside
of Xen: do_wp_page() won't reach page_cache_release() while
gup_pte_range() is running for the same mm on another CPU,
since it can't get past ptep_clear_flush() (it has to wait for the CPU
in get_user_pages_fast() to re-enable interrupts).
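
To make the ordering argument concrete, here is a toy user-space model in C.
It is only a sketch of the reasoning above, not kernel code: struct toy_cpu
and both toy_* helpers are hypothetical names invented for illustration, and
the model assumes the flush is delivered as an IPI that a CPU with interrupts
disabled cannot acknowledge.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and helpers, not kernel APIs. */
struct toy_cpu {
    bool irqs_enabled; /* false while inside the get_user_pages_fast() walk */
    bool uses_mm;      /* true if this CPU is running the mm in question */
};

/*
 * Assumption: a remote TLB flush completes only once every other CPU
 * using the mm has taken the flush IPI. A CPU with interrupts disabled
 * cannot take it, so the flushing CPU keeps waiting.
 */
static bool toy_flush_can_complete(const struct toy_cpu *cpus, size_t n,
                                   size_t self)
{
    for (size_t i = 0; i < n; i++) {
        if (i == self || !cpus[i].uses_mm)
            continue;
        if (!cpus[i].irqs_enabled)
            return false;
    }
    return true;
}

/*
 * Models the ordering in do_wp_page(): the old page can be released only
 * after ptep_clear_flush() has returned, i.e. after the flush completed.
 */
static bool toy_may_release_old_page(const struct toy_cpu *cpus, size_t n,
                                     size_t self)
{
    return toy_flush_can_complete(cpus, n, self);
}
```

With CPU 0 in do_wp_page() and CPU 1 between steps (1) and (2) with
interrupts off, toy_may_release_old_page() stays false until CPU 1 turns
interrupts back on, which happens only after its get_page().
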
> An experimental patch is made to prevent the PTE being modified in the
> middle of gup_pte_range(). The BUG_ON disappears afterward.
>
> However, from the comments embedded in gup.c, it seems deliberate to
> avoid the lock in the fast path. The question is: if so, how to avoid
> the above scenario?
Nick, since you did the initial implementation, would
you be able to estimate whether disabling get_user_pages_fast()
altogether for Xen would perform measurably worse than
adding the locks (but continuing to avoid acquiring mm->mmap_sem)
as suggested by Xiaowei? That's of course only if the latter is correct
at all, of which I haven't fully convinced myself yet.
Jan