xen-devel


Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops do

from [Pasi Kärkkäinen]


To:	Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject:	Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0
From:	Pasi Kärkkäinen <pasik@xxxxxx>
Date:	Thu, 11 Jun 2009 20:24:52 +0300
Cc:	Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to:	<4A312037.10300@xxxxxxxx>
References:	<1244209979.27370.188.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605154130.GB24960@xxxxxxxxxxxxxxx> <1244217948.27370.213.camel@xxxxxxxxxxxxxxxxxxxxxx> <1244218353.27370.216.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605181925.GC24960@xxxxxxxxxxxxxxx> <1244475935.27370.309.camel@xxxxxxxxxxxxxxxxxxxxxx> <1244476858.27370.325.camel@xxxxxxxxxxxxxxxxxxxxxx> <4A2E9BC3.4060507@xxxxxxxx> <1244710938.27370.502.camel@xxxxxxxxxxxxxxxxxxxxxx> <4A312037.10300@xxxxxxxx>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent:	Mutt/1.5.13 (2006-08-11)

On Thu, Jun 11, 2009 at 08:18:15AM -0700, Jeremy Fitzhardinge wrote:
> On 06/11/09 02:02, Ian Campbell wrote:
> >On Tue, 2009-06-09 at 13:28 -0400, Jeremy Fitzhardinge wrote:
> >   
> >>Ian Campbell wrote:
> >>     
> >>>I wonder how this interacts with the logic in
> >>>arch/x86/xen/mmu.c:xen_pin_page() which holds the lock while waiting for
> >>>the (deferred) pin multicall to occur? Hmm, no this is about the
> >>>PagePinned flag on the struct page which is out of date WRT the actual
> >>>pinned status as Xen sees it -- we update the PagePinned flag early in
> >>>xen_pin_page() long before Xen sees the pin hypercall so this window is the
> >>>other way round to what would be needed to trigger this bug.
> >>>
> >>>       
> >>Yes, it looks like you could get a bad mapping here.  An obvious fix
> >>would be to defer clearing the pinned flag in the page struct until
> >>after the hypercall has issued.  That would make the racy
> >>kmap_atomic_pte map RO, which would be fine unless it actually tries to
> >>modify it (but I can't imagine it would do that unlocked).
> >>     
> >
> >But would it redo the mapping after taking the lock? It doesn't look
> >like it does (why would it). So we could end up writing to an unpinned
> >pte via a R/O mapping.
> >   
> 
> Hm, yep.  One thing I noticed is that set_pte() is used very rarely, so 
> it would be no cost to always use a hypercall in that case.  But 
> xen_set_pte_at() ends up calling xen_set_pte() as well, and I think 
> that's more common.  Certainly we need to make sure that we're actually 
> taking advantage of late-pin by direct writing unpinned ptes.
> 
> I've been thinking of rearranging the set_pte(_at) pvops a little bit 
> anyway; it's not obvious we're really getting much benefit from using the 
> update_va_mapping hypercall, and if we're not using it, then the 
> set_pte_at pvop is taking a lot of unused parameters.
> 
> If we switch to just using mmu_update, then we can just pass the address 
> and pte value.  But we could also pass the struct page * (which makes a 
> bit of conceptual sense), so we could easily directly test whether the pte 
> is pinned, and either use a direct write or hypercall accordingly.
> 
> >As an experiment I tried the simple approach of flushing the multicalls
> >explicitly in xen_unpin_page and then clearing the Pinned bit and it all
> >goes a bit wrong. eip is "ptep->pte_low = 0" so I think the unpinned but
> >R/O theory holds...
> >   
> 
> Yes, I think the theory is sound.  But I'm curious why Pasi seems to be 
> able to hit the race easily, but we have not...
> 

Yeah, I've been thinking about that too.

My hardware is ~5 years old, but it has been running stably with multiple
distributions and kernel versions, under various kinds of load. I think the
hardware itself should be fine.

At the moment I've been running Fedora 10 and Fedora 11 on it, and both seem
stable with the distro-provided kernels.

I.e. I'm only seeing the problem with the pv_ops dom0 kernel.

My installation is pretty basic/standard: root-fs on an LVM volume. I can't
really think of anything special about it.

And the problem seems to be _always_ reproducible with a simple
"make clean && make bzImage && make modules" command on dom0.

Anyway, I'll continue testing. Hopefully we get this hunted down :)
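
For reference, here is how I read the idea quoted above (pass the struct
page * and pick a direct write or a hypercall depending on whether the pte
page is pinned). This is only a rough userspace model, not kernel code: every
name in it (model_pt_page, model_stats, model_mmu_update, model_set_pte) is
made up for illustration, and the "hypercall" is just a counted store.

```c
/* Userspace model only; none of these names are real kernel or Xen APIs. */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_pteval_t;

/* Stand-in for the struct page of a pagetable page and its PagePinned flag. */
struct model_pt_page {
    bool pinned;
};

/* Counters so a caller can see which path each update took. */
struct model_stats {
    unsigned direct_writes;
    unsigned hypercalls;
};

/* Stand-in for an mmu_update hypercall: here it only counts and stores. */
static void model_mmu_update(struct model_stats *st,
                             model_pteval_t *ptep, model_pteval_t val)
{
    st->hypercalls++;
    *ptep = val;
}

/* Choose the update path from the pinned flag of the pte's page. */
static void model_set_pte(struct model_stats *st,
                          const struct model_pt_page *pg,
                          model_pteval_t *ptep, model_pteval_t val)
{
    if (pg->pinned) {
        /* Pinned pagetable pages are read-only to the guest. */
        model_mmu_update(st, ptep, val);
    } else {
        /* Unpinned page: a plain store is enough (the late-pin case). */
        st->direct_writes++;
        *ptep = val;
    }
}
```

Calling model_set_pte() with pinned == false bumps direct_writes, and with
pinned == true it bumps hypercalls. Note that this does nothing about the
race itself: the choice is only as good as the flag, so it still depends on
the flag agreeing with what Xen thinks at the moment of the write.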

-- Pasi
