[Xen-devel] Re: One (possible) x86 get_user_pages bug

To: "Xiaowei Yang" <xiaowei.yang@xxxxxxxxxx>, "Nick Piggin" <npiggin@xxxxxxxxx>
Subject: [Xen-devel] Re: One (possible) x86 get_user_pages bug
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Thu, 27 Jan 2011 14:49:42 +0000
Cc: Kaushik Barde <kbarde@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Kenneth Lee <liguozhu@xxxxxxxxxx>, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, wangzhenguo@xxxxxxxxxx, linqaingmin <linqiangmin@xxxxxxxxxx>, fanhenglong@xxxxxxxxxx, Wu Fengguang <fengguang.wu@xxxxxxxxx>
>>> On 27.01.11 at 14:05, Xiaowei Yang <xiaowei.yang@xxxxxxxxxx> wrote:
> We created a scenario to reproduce the bug:
> ----------------------------------------------------------------
> // proc1/proc1.2 are 2 threads sharing one page table.
> // proc1 is the parent of proc2.
> proc1               proc2          proc1.2
> ...                 ...            // in gup_pte_range()
> ...                 ...            pte = gup_get_pte()
> ...                 ...            page1 = pte_page(pte)  // (1)
> do_wp_page(page1)   ...            ...
> ...                 exit_map()     ...
> ...                 ...            get_page(page1)        // (2)
> -----------------------------------------------------------------
> do_wp_page() and exit_map() cause page1 to be released into free list 
> before get_page() in proc1.2 is called. The longer the delay between 
> (1)&(2), the easier the BUG_ON shows.

The scenario indeed seems to apply independently of virtualization,
but the window obviously can be unbounded unless running natively.

However, going through all the comments in gup.c again, I wonder
whether pv Xen guests violate the major assumption made there: the
comments describe interrupts being off as preventing (or sufficiently
deferring) remote CPUs doing TLB flushes. In pv Xen guests,
however, non-local TLB flushes are not done by sending IPIs;
the hypercall interface is used instead. If that's indeed the
case, I would have expected quite a few bug reports, but I'm
unaware of any - Nick, am I overlooking something here?
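
To make the window between (1) and (2) in the quoted scenario concrete, here is a minimal user-space model of it. It is only a sketch: `struct page`, `struct pte`, `page_try_get()`, `fast_get()`, `cow_and_exit()`, `cow_only()` and `run_scenario()` are hypothetical stand-ins, not kernel definitions. The guarded "take a reference only if the count is non-zero, then re-check the pte" pattern is one possible way to narrow the race and is not claimed to be what gup.c does. It also says nothing about the TLB flush question raised above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
struct page { atomic_int refcount; };          /* 0 models "on the free list" */
struct pte  { _Atomic(struct page *) page; };

static struct page replacement;                /* page installed by the COW fault */

/* Take a reference only while the page is still live (refcount > 0). */
static bool page_try_get(struct page *p)
{
    int old = atomic_load(&p->refcount);
    while (old > 0) {
        /* On failure, old is reloaded and the loop condition is re-checked. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

static void page_put(struct page *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);
    assert(old > 0);                           /* catch refcount underflow */
}

/* Models what other CPUs do between steps (1) and (2). */
typedef void (*window_fn)(struct pte *ptep);

/* Models do_wp_page() plus exit_map(): pte repointed, last reference dropped. */
static void cow_and_exit(struct pte *ptep)
{
    struct page *old = atomic_exchange(&ptep->page, &replacement);
    page_put(old);
}

/* Models do_wp_page() alone: pte repointed, old page still referenced. */
static void cow_only(struct pte *ptep)
{
    atomic_exchange(&ptep->page, &replacement);
}

static struct page *fast_get(struct pte *ptep, window_fn in_window)
{
    struct page *p = atomic_load(&ptep->page); /* (1) read the pte */
    if (in_window)
        in_window(ptep);                       /* the race window */
    if (p == NULL || !page_try_get(p))         /* (2) guarded get_page() */
        return NULL;
    if (atomic_load(&ptep->page) != p) {       /* pte changed under us */
        page_put(p);
        return NULL;
    }
    return p;
}

/* Returns 1 if the fast path obtained page1, 0 if it had to back off. */
static int run_scenario(window_fn in_window)
{
    struct page page1;
    struct pte pte;

    atomic_init(&page1.refcount, 1);
    atomic_init(&replacement.refcount, 1);
    atomic_init(&pte.page, &page1);

    return fast_get(&pte, in_window) != NULL;
}
```

In this model `run_scenario(NULL)` succeeds, while `run_scenario(cow_and_exit)` and `run_scenario(cow_only)` both back off: the first because the reference count already reached zero, the second because the pte no longer points at the page that was read at (1).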
