[Xen-devel] Re: Race between ept_get_entry / ept_set_entry

To:	xen-devel@xxxxxxxxxxxxxxxxxxx
Subject:	[Xen-devel] Re: Race between ept_get_entry / ept_set_entry
From:	George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date:	Thu, 26 Aug 2010 14:05:22 +0100
Cc:	"Li, Xin" <xin.li@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Delivery-date:	Thu, 26 Aug 2010 06:06:30 -0700
Dkim-signature:	v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=LwB2l1ABUS6CjBy5RYpBJlWiZxZKro7EaPKMzKLuAwY=; b=fYAjGS9JrX5blzbNzGdrerkKx46pOQobujYK+xTHz6Er8y0zhLNXhuYgR3n1C0mWY+ SJ0xc+0MBHPYMhhiRYtpercDEEkFJDtEik8CGvCsg6TVQCGgnsuxjJbcCKTiXYDg3/M0 KJHie2Tvv6t+KvsB7StzDQ6/GnHu5MMVBauc4=
Domainkey-signature:	a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=SOoMCX6WYisoDcMdKmkpXgxX2kfE2dtUbtTlgD4cBMCqNQEXHhkmZKXwzlpyYwu2/l jQTR6yTVRvFqQ8xZvlZFGanvMUbUZS7g9AYI/m8CaaM0p8JPll/ETGhdRSIokmxzbteO f7vMp/fzEgv3RJPBsrA+Yge9HmhtvHJK7aPmI=
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to:	<AANLkTi=Kohcwesi3pBNV2-xFKrCF696Xeped8QGnwoMK@xxxxxxxxxxxxxx>
List-help:	<mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>
List-post:	<mailto:xen-devel@lists.xensource.com>
List-subscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References:	<AANLkTi=Kohcwesi3pBNV2-xFKrCF696Xeped8QGnwoMK@xxxxxxxxxxxxxx>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

FWIW, modifying ept_next_level() and ept_set_entry() to be
{write,read}-once (i.e., reading once and working with a local copy;
or making the entry in a local copy and writing it all at once) seems
to work as well, at least on 64-bit.

But overall, unless there's a measured reason to avoid locks, I think
the best thing for long-term stability is just to have ept_get_entry()
grab the p2m lock.

 -George

On Thu, Aug 26, 2010 at 11:35 AM, George Dunlap
<George.Dunlap@xxxxxxxxxxxxx> wrote:
> In the course of doing some fixes for my populate-on-demand testing, I
> found that a Windows Server 2008 VM with 30G static max and 24G ram
> (i.e., booting ballooned) crashed 1-2 times out of ten during boot,
> reporting MMIO errors.
>
> I managed to get a trace of this crash.  Strangely enough, the trace
> indicated that the page the NPF occured on was populate-on-demand --
> but that hvm_hap_nested_page_fault() injected a GP anyway.
>
> The only way this would be possible is if the gfn_to_mfn_query() in
> the trace function got a p2m type of p2m_popluate_on_demand, but the
> gfn_to_mfn_current() in hvm_hap_nested_page_fault() got a p2m type of
> p2m_mmio_dm.
>
> Looking at the trace (snippet attached), the failed NPF happened on
> d1v1; but almost simultaneously on d1v0, an NPF fault happened that
> caused a populate-on-demand demand populate.  That demand populate
> happened to be of a superpage that was shared with the gpa fault on
> d1v1.
>
> So, the first query on d1v1 (correctly) got a PoD; but the second
> query, instead of either causing the demand-populate, or successfully
> getting the result of d1v0's demand populate, returned failure,
> causing the guest to crash.
>
> I looked in the p2m-ept.c code, and noticed (once again) that
> ept_get_entry() can be called without the p2m lock held.  I added
> conditional locks, and am running the test again. The guest has now
> booted 20 times successfully without crashing (whereas before, the
> average was about 2 in 10 crashing).
>
> Looking closely at the code, I can see one potential race:
> * entry starts out PoD, not-present.
> * v0 finds the entry PoD, allocates a page, calls set_p2m_entry(),
> which calls ept_set_entry().
> * v1 begins to walk the pagetable; at some point, it calls
> ept_next_level(), which finds the flags all clear (entry->epte & 7 ==
> 0)
> * v0 ept_set_entry() changes the p2m type from p2m_populate_on_demand
> to p2m_ram_rw
> * v1 ept_next_level() reads entry->avail1 and finds that it is not
> p2m_populate_on_demand, so it returns GUEST_TABLE_MAP_FAILED
> * v0 ept_set_entry() sets the flags to present.
>
> Is there a good reason not to just grab the p2m lock when walking the
> ept tables?  We could conceivably do some cleverness to avoid this
> kind of race, but unless there's a significant performance gain, I
> think the simple approach is better.
>
>  -George
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

WARNING - OLD ARCHIVES

xen-devel

[Xen-devel] Re: Race between ept_get_entry / ept_set_entry