This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: One (possible) x86 get_user_pages bug

To: Kaushik Barde <kbarde@xxxxxxxxxx>
Subject: [Xen-devel] Re: One (possible) x86 get_user_pages bug
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Mon, 31 Jan 2011 10:04:30 -0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, 'Kenneth Lee' <liguozhu@xxxxxxxxxx>, 'Peter Zijlstra' <a.p.zijlstra@xxxxxxxxx>, 'Marcelo Tosatti' <mtosatti@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, 'Jan Beulich' <JBeulich@xxxxxxxxxx>, wangzhenguo@xxxxxxxxxx, 'Xiaowei Yang' <xiaowei.yang@xxxxxxxxxx>, 'linqaingmin' <linqiangmin@xxxxxxxxxx>, fanhenglong@xxxxxxxxxx, 'Avi Kivity' <avi@xxxxxxxxxx>, 'Wu Fengguang' <fengguang.wu@xxxxxxxxx>, 'Nick Piggin' <npiggin@xxxxxxxxx>
Delivery-date: Mon, 31 Jan 2011 10:05:30 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <001801cbc0cc$00d98d70$028ca850$@com>
References: <4D416D9A.9010603@xxxxxxxxxx> <4D419416020000780002ECB7@xxxxxxxxxxxxxxxxxx> <4D41B90D.5000305@xxxxxxxx> <4D456139.4090508@xxxxxxxxxx> <001801cbc0cc$00d98d70$028ca850$@com>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.7
On 01/30/2011 02:21 PM, Kaushik Barde wrote:
> I agree i.e. deviation from underlying arch consideration is not a good
> idea.
> Also, agreed, hypervisor knows which page entries are ready for TLB flush
> across vCPUs. 
> But using the above knowledge along with an IPI-based TLB flush is a better
> solution.  Its ability to synchronize with pCPU-based IPIs and TLB flushes
> across vCPUs is key.

I'm not sure I follow you here.  The issue with TLB flush IPIs is that
the hypervisor doesn't know the purpose of the IPI and ends up
(potentially) waking up a sleeping VCPU just to flush its tlb - but
since it was sleeping there were no stale TLB entries to flush.

Xen's TLB flush hypercalls can optimise that case by only sending a real
IPI to PCPUs which are actually running target VCPUs.  In other cases,
where a PCPU is known to have stale entries but it isn't running a
relevant VCPU, it can just mark a deferred TLB flush which gets executed
before the VCPU runs again.

In other words, Xen can take significant advantage of getting a
higher-level call ("flush these TLBs") compared to just a simple IPI.
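
As a rough illustration of the decision described above, here is a minimal
sketch in C.  It is not Xen's actual code: the struct, its fields, and both
function names are hypothetical, and "sending an IPI" and "flushing the local
TLB" are reduced to a counter and a flag so the logic can be followed (and
compiled) in isolation.

```c
#include <assert.h>
#include <stdbool.h>

#define NOT_RUNNING (-1)

/* Hypothetical per-VCPU bookkeeping; not Xen's real data structures. */
struct vcpu {
    int  pcpu;           /* PCPU currently running this VCPU, or NOT_RUNNING */
    bool tlb_dirty;      /* some PCPU's TLB may still hold this VCPU's entries */
    bool flush_deferred; /* a flush must happen before this VCPU runs again */
};

/*
 * Handle a "flush these TLBs" request for the VCPUs selected by target_mask.
 * Returns the number of real IPIs that would have been sent.
 */
int flush_tlb_vcpus(struct vcpu *vcpus, int nr, unsigned int target_mask)
{
    int ipis = 0;

    assert(nr >= 0 && nr <= 32); /* the mask is a 32-bit bitmap in this sketch */

    for (int i = 0; i < nr; i++) {
        struct vcpu *v = &vcpus[i];

        if (!(target_mask & (1u << i)))
            continue;

        if (v->pcpu != NOT_RUNNING) {
            /* Target is actually running: a real IPI to v->pcpu is needed. */
            ipis++;
        } else if (v->tlb_dirty) {
            /* Not running, but stale entries may exist: defer the flush. */
            v->flush_deferred = true;
        }
        /* Otherwise: not running and nothing stale, so nothing to do. */
    }
    return ipis;
}

/*
 * Called when the scheduler is about to run v on pcpu.
 * Returns true if a deferred flush was executed first.
 */
bool vcpu_schedule_in(struct vcpu *v, int pcpu)
{
    bool flushed = v->flush_deferred;

    v->flush_deferred = false; /* stand-in for the actual local TLB flush */
    v->pcpu = pcpu;
    v->tlb_dirty = true;       /* running again, so the TLB will refill */
    return flushed;
}
```

With three target VCPUs (one running, one descheduled with stale entries, one
asleep with nothing stale), only the first costs an IPI; the second gets its
flush on the next schedule-in, and the third is never woken at all.  A guest
sending plain IPIs cannot make that distinction on the hypervisor's behalf.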

Are you suggesting that the hypervisor should export some kind of "known
dirty TLB" table to the guest, and have the guest work out which VCPUs
need IPIs sent to them?  How would that work?

> IPIs themselves should be a few hundred uSecs in terms of latency. Also, why
> should a pCPU be in a sleep state for an active vCPU's scheduled page workload?

A "few hundred uSecs" is really very slow - that's nearly a
millisecond.  It's worth spending some effort to avoid those kinds of
delays.

> -Kaushik
> -----Original Message-----
> From: Avi Kivity [mailto:avi@xxxxxxxxxx] 
> Sent: Sunday, January 30, 2011 5:02 AM
> To: Jeremy Fitzhardinge
> Cc: Jan Beulich; Xiaowei Yang; Nick Piggin; Peter Zijlstra;
> fanhenglong@xxxxxxxxxx; Kaushik Barde; Kenneth Lee; linqaingmin;
> wangzhenguo@xxxxxxxxxx; Wu Fengguang; xen-devel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; Marcelo Tosatti
> Subject: Re: One (possible) x86 get_user_pages bug
> On 01/27/2011 08:27 PM, Jeremy Fitzhardinge wrote:
>> And even just considering virtualization, having non-IPI-based tlb
>> shootdown is a measurable performance win, since a hypervisor can
>> optimise away a cross-VCPU shootdown if it knows no physical TLB
>> contains the target VCPU's entries.  I can imagine the KVM folks could
>> get some benefit from that as well.
> It's nice to avoid the IPI (and waking up a cpu if it happens to be 
> asleep) but I think the risk of deviating too much from the baremetal 
> arch is too large, as demonstrated by this bug.
> (well, async page faults is a counterexample, I wonder if/when it will 
> bite us)
