This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] Re: Race between ept_get_entry / ept_set_entry

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Re: Race between ept_get_entry / ept_set_entry
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Thu, 26 Aug 2010 14:05:22 +0100
Cc: "Li, Xin" <xin.li@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Delivery-date: Thu, 26 Aug 2010 06:06:30 -0700
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=LwB2l1ABUS6CjBy5RYpBJlWiZxZKro7EaPKMzKLuAwY=; b=fYAjGS9JrX5blzbNzGdrerkKx46pOQobujYK+xTHz6Er8y0zhLNXhuYgR3n1C0mWY+ SJ0xc+0MBHPYMhhiRYtpercDEEkFJDtEik8CGvCsg6TVQCGgnsuxjJbcCKTiXYDg3/M0 KJHie2Tvv6t+KvsB7StzDQ6/GnHu5MMVBauc4=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=SOoMCX6WYisoDcMdKmkpXgxX2kfE2dtUbtTlgD4cBMCqNQEXHhkmZKXwzlpyYwu2/l jQTR6yTVRvFqQ8xZvlZFGanvMUbUZS7g9AYI/m8CaaM0p8JPll/ETGhdRSIokmxzbteO f7vMp/fzEgv3RJPBsrA+Yge9HmhtvHJK7aPmI=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTi=Kohcwesi3pBNV2-xFKrCF696Xeped8QGnwoMK@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTi=Kohcwesi3pBNV2-xFKrCF696Xeped8QGnwoMK@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
FWIW, modifying ept_next_level() and ept_set_entry() to be
{write,read}-once (i.e., reading once and working with a local copy;
or making the entry in a local copy and writing it all at once) seems
to work as well, at least on 64-bit.

But overall, unless there's a measured reason to avoid locks, I think
the best thing for long-term stability is just to have ept_get_entry()
grab the p2m lock.


On Thu, Aug 26, 2010 at 11:35 AM, George Dunlap
<George.Dunlap@xxxxxxxxxxxxx> wrote:
> In the course of doing some fixes for my populate-on-demand testing, I
> found that a Windows Server 2008 VM with 30G static max and 24G ram
> (i.e., booting ballooned) crashed 1-2 times out of ten during boot,
> reporting MMIO errors.
> I managed to get a trace of this crash.  Strangely enough, the trace
> indicated that the page the NPF occured on was populate-on-demand --
> but that hvm_hap_nested_page_fault() injected a GP anyway.
> The only way this would be possible is if the gfn_to_mfn_query() in
> the trace function got a p2m type of p2m_popluate_on_demand, but the
> gfn_to_mfn_current() in hvm_hap_nested_page_fault() got a p2m type of
> p2m_mmio_dm.
> Looking at the trace (snippet attached), the failed NPF happened on
> d1v1; but almost simultaneously on d1v0, an NPF fault happened that
> caused a populate-on-demand demand populate.  That demand populate
> happened to be of a superpage that was shared with the gpa fault on
> d1v1.
> So, the first query on d1v1 (correctly) got a PoD; but the second
> query, instead of either causing the demand-populate, or successfully
> getting the result of d1v0's demand populate, returned failure,
> causing the guest to crash.
> I looked in the p2m-ept.c code, and noticed (once again) that
> ept_get_entry() can be called without the p2m lock held.  I added
> conditional locks, and am running the test again. The guest has now
> booted 20 times successfully without crashing (whereas before, the
> average was about 2 in 10 crashing).
> Looking closely at the code, I can see one potential race:
> * entry starts out PoD, not-present.
> * v0 finds the entry PoD, allocates a page, calls set_p2m_entry(),
> which calls ept_set_entry().
> * v1 begins to walk the pagetable; at some point, it calls
> ept_next_level(), which finds the flags all clear (entry->epte & 7 ==
> 0)
> * v0 ept_set_entry() changes the p2m type from p2m_populate_on_demand
> to p2m_ram_rw
> * v1 ept_next_level() reads entry->avail1 and finds that it is not
> p2m_populate_on_demand, so it returns GUEST_TABLE_MAP_FAILED
> * v0 ept_set_entry() sets the flags to present.
> Is there a good reason not to just grab the p2m lock when walking the
> ept tables?  We could conceivably do some cleverness to avoid this
> kind of race, but unless there's a significant performance gain, I
> think the simple approach is better.
>  -George

Xen-devel mailing list

<Prev in Thread] Current Thread [Next in Thread>