WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops do

On Thu, Jun 11, 2009 at 08:18:15AM -0700, Jeremy Fitzhardinge wrote:
> On 06/11/09 02:02, Ian Campbell wrote:
> >On Tue, 2009-06-09 at 13:28 -0400, Jeremy Fitzhardinge wrote:
> >   
> >>Ian Campbell wrote:
> >>     
> >>>I wonder how this interacts with the logic in
> >>>arch/x86/xen/mmu.c:xen_pin_page() which holds the lock while waiting for
> >>>the (deferred) pin multicall to occur? Hmm, no this is about the
> >>>PagePinned flag on the struct page which is out of date WRT the actual
> >>>pinned status as Xen sees it -- we update the PagePinned flag early in
> >>>xen_pin_page() long before Xen sees the pin hypercall so this window is the
> >>>other way round to what would be needed to trigger this bug.
> >>>
> >>>       
> >>Yes, it looks like you could get a bad mapping here.  An obvious fix
> >>would be to defer clearing the pinned flag in the page struct until
> >>after the hypercall has issued.  That would make the racy
> >>kmap_atomic_pte map RO, which would be fine unless it actually tries to
> >>modify it (but I can't imagine it would do that unlocked).
> >>     
> >
> >But would it redo the mapping after taking the lock? It doesn't look
> >like it does (why would it). So we could end up writing to an unpinned
> >pte via a R/O mapping.
> >   
> 
> Hm, yep.  One thing I noticed is that set_pte() is used very rarely, so 
> it would be no cost to always use a hypercall in that case.  But 
> xen_set_pte_at() ends up calling xen_set_pte() as well, and I think 
> that's more common.  Certainly we need to make sure that we're actually 
> taking advantage of late-pin by direct writing unpinned ptes.
> 
> I've been thinking of rearranging the set_pte(_at) pvops a little bit 
> anyway; it's not obvious we're really getting much benefit from using the 
> update_va_mapping hypercall, and if we're not using it, then the 
> set_pte_at pvop is taking a lot of unused parameters.
> 
> If we switch to just using mmu_update, then we can just pass the address 
> and pte value.  But we could also pass the struct page * (which makes a 
> bit of conceptual sense), so we could easily test directly whether the pte 
> is pinned, and either use a direct write or hypercall accordingly.
> 
> >As an experiment I tried the simple approach of flushing the multicalls
> >explicitly in xen_unpin_page and then clearing the Pinned bit and it all
> >goes a bit wrong. eip is "ptep->pte_low = 0" so I think the unpinned but
> >R/O theory holds...
> >   
> 
> Yes, I think the theory is sound.  But I'm curious why Pasi seems to be 
> able to hit the race easily, but we have not...
> 

Yeah, I've been thinking about that too.. 

My hardware is ~5 years old, but it has been running stable with multiple
distributions and kernel versions, on various types of loads. I think the
hardware should be all fine.

Atm I've been running Fedora 10 and Fedora 11 on it, both seem stable with
the distro-provided kernels.

i.e. I'm only seeing the problem on the pv_ops dom0 kernel.

My installation is pretty basic/standard.. root-fs on LVM-volume. Can't
really think of anything special.. 

And the problem seems to be _always_ reproducible with a simple 
"make clean && make bzImage && make modules" command on dom0 .. 

Anyway, I'll continue testing. Hopefully we get this hunted down :)

-- Pasi
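
For reference, the two orderings discussed in the quoted text can be
modelled in user space roughly as below. This is only a sketch, not kernel
code: struct model_page, choose_mapping(), try_map(), model_set_pte() and
the two scenario functions are hypothetical names made up for illustration.
It assumes that Xen rejects a writable mapping of a page it still has
pinned, and that a direct write through a read-only mapping faults.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel or Xen APIs. */
enum mapping { MAP_RO, MAP_RW };
enum outcome { OUTCOME_OK, OUTCOME_BAD_MAPPING, OUTCOME_WRITE_FAULT };

struct model_page {
    bool flag_pinned; /* PagePinned flag in struct page (kernel's view) */
    bool xen_pinned;  /* pin status as Xen sees it */
};

/* The unlocked mapper chooses protection from the flag alone. */
static enum mapping choose_mapping(const struct model_page *pg)
{
    return pg->flag_pinned ? MAP_RO : MAP_RW;
}

/* Assumption: Xen rejects a writable mapping of a page it still has pinned. */
static enum outcome try_map(const struct model_page *pg, enum mapping m)
{
    return (m == MAP_RW && pg->xen_pinned) ? OUTCOME_BAD_MAPPING : OUTCOME_OK;
}

/* Flag-based dispatch: hypercall if flagged pinned, else a direct write,
 * which faults if the mapping in hand is read-only. */
static enum outcome model_set_pte(const struct model_page *pg, enum mapping m)
{
    if (pg->flag_pinned)
        return OUTCOME_OK;           /* update goes through Xen */
    return (m == MAP_RO) ? OUTCOME_WRITE_FAULT : OUTCOME_OK;
}

/* Current ordering: flag cleared first, unpin hypercall still queued. */
enum outcome unpin_clears_flag_early(void)
{
    struct model_page pg = { .flag_pinned = true, .xen_pinned = true };
    enum mapping m;
    enum outcome res;

    pg.flag_pinned = false;          /* flag updated early */
    m = choose_mapping(&pg);         /* racy mapper runs in the window: RW */
    res = try_map(&pg, m);           /* Xen still has the page pinned */
    pg.xen_pinned = false;           /* deferred multicall flushed later */
    return res;
}

/* Proposed ordering: flush the unpin first, clear the flag afterwards. */
enum outcome unpin_clears_flag_late(void)
{
    struct model_page pg = { .flag_pinned = true, .xen_pinned = true };
    enum mapping m = choose_mapping(&pg);  /* racy mapper runs first: RO */

    if (try_map(&pg, m) != OUTCOME_OK)
        return OUTCOME_BAD_MAPPING;
    pg.xen_pinned = false;           /* unpin hypercall issued */
    pg.flag_pinned = false;          /* flag cleared only afterwards */
    return model_set_pte(&pg, m);    /* stale RO mapping, direct write */
}
```

In this model the early-clear ordering ends in a rejected RW mapping, and
the late-clear ordering ends in a write fault through the stale RO mapping,
which would be consistent with the "unpinned but R/O" theory above. It says
nothing about how often either window is hit on real hardware.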

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
