This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops do

To: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Subject: Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Wed, 22 Jul 2009 11:16:25 -0700
Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 22 Jul 2009 11:18:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1244711914.27370.511.camel@xxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20090522080655.GA24960@xxxxxxxxxxxxxxx> <20090604202656.GR24960@xxxxxxxxxxxxxxx> <1244197217.27370.146.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605112347.GY24960@xxxxxxxxxxxxxxx> <1244201864.27370.172.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605133850.GA24960@xxxxxxxxxxxxxxx> <1244209979.27370.188.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605154130.GB24960@xxxxxxxxxxxxxxx> <1244217948.27370.213.camel@xxxxxxxxxxxxxxxxxxxxxx> <1244218353.27370.216.camel@xxxxxxxxxxxxxxxxxxxxxx> <20090605181925.GC24960@xxxxxxxxxxxxxxx> <1244475935.27370.309.camel@xxxxxxxxxxxxxxxxxxxxxx> <1244476858.27370.325.camel@xxxxxxxxxxxxxxxxxxxxxx> <4A2E9BC3.4060507@xxxxxxxx> <1244710938.27370.502.camel@xxxxxxxxxxxxxxxxxxxxxx> <1244711914.27370.511.camel@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Lightning/1.0pre Thunderbird/3.0b2
On 06/11/09 02:18, Ian Campbell wrote:
> Pasi, to validate the theory that you are seeing races between unpinning
> and kmap_atomic_pte can you give this biguglystick approach to solving
> it a go.

I gave up trying to solve this in any kind of clever way and just
decided to go with a slightly cleaned-up version of this patch. 
Unfortunately, I don't think it actually solves the problem because it
doesn't prevent unpin from happening while the page is still kmapped; it
just ends up using the spinlock as a barrier to move the race around to
some timing which is presumably mostly avoids the problem.

In principle the fix is to take the lock in xen_kmap_atomic and release
it in xen_kunmap_atomic.  Unfortunately this is pretty ugly and complex
because kmaps are 1) inherently per-cpu, and 2) there can be 2 levels of
kmapping at once.  This means that we'd need 2 per-cpu locks to allow us
to hold these locks for the mapping duration without introducing
deadlocks.  However unpin (and pin, in principle) need to take *all*
these locks to exclude kmap on all cpus.

This is total overkill, since we only really care about excluding kmap
and unpin from a given pagetable, which suggests that the locks should
be part of the mm rather than global.  Unfortunately k(un)map_atomic_pte
don't get the mm of the pagetable they're trying to pin, and I don't
think we can assume its the current mm.

Another (pretty hideous) approach might be to make unpin check the state
of the KMAP_PTE[01] slots in each CPU's kmap fixmaps and see if a
mapping exists for a page that its currently unpinning.  This also has
the problem of being inherently racy; if we unpin the page, there's
going to be a little window after the unpin and before the kmap pte
update (even if they're back-to-back batched hypercalls).

I guess we could have a global rw spinlock; kmap/unmap takes it for
reading, and so can be concurrent with all other kmaps, but pin/unpin
takes it for writing to exclude them.  That would work, but runs the
risk of pin/unpin from being livelocked out (I don't think rwspins will
block new readers if there's a pending writer).  Ugly, but its the only
thing I can think of which actually solves the problem.

Oh, crap, we don't have a kunmap_atomic_pte hook.

Thoughts?  Am I overthinking this and missing something obvious?



> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 1729178..beeb8e8 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -1145,9 +1145,12 @@ static int xen_unpin_page(struct mm_struct *mm, struct 
> page *page,
>       return 0;               /* never need to flush on unpin */
>  }
> +static DEFINE_SPINLOCK(hack_lock); /* Hack to sync unpin against 
> kmap_atomic_pte */
> +
>  /* Release a pagetables pages back as normal RW */
>  static void __xen_pgd_unpin(struct mm_struct *mm, pgd_t *pgd)
>  {
> +     spin_lock(&hack_lock);
>       xen_mc_batch();
>       xen_do_pin(MMUEXT_UNPIN_TABLE, PFN_DOWN(__pa(pgd)));
> @@ -1173,6 +1176,7 @@ static void __xen_pgd_unpin(struct mm_struct *mm, pgd_t 
> *pgd)
>       __xen_pgd_walk(mm, pgd, xen_unpin_page, USER_LIMIT);
>       xen_mc_issue(0);
> +     spin_unlock(&hack_lock);
>  }
>  static void xen_pgd_unpin(struct mm_struct *mm)
> @@ -1521,6 +1525,9 @@ static void xen_pgd_free(struct mm_struct *mm, pgd_t 
> *pgd)
>  static void *xen_kmap_atomic_pte(struct page *page, enum km_type type)
>  {
>       pgprot_t prot = PAGE_KERNEL;
> +     void *ret;
> +
> +     spin_lock(&hack_lock);
>       if (PagePinned(page))
>               prot = PAGE_KERNEL_RO;
> @@ -1530,7 +1537,11 @@ static void *xen_kmap_atomic_pte(struct page *page, 
> enum km_type type)
>                      page_to_pfn(page), type,
>                      (unsigned long)pgprot_val(prot) & _PAGE_RW ? "WRITE" : 
> "READ");
> -     return kmap_atomic_prot(page, type, prot);
> +     ret = kmap_atomic_prot(page, type, prot);
> +
> +     spin_unlock(&hack_lock);
> +
> +     return ret;
>  }
>  #endif

Xen-devel mailing list

<Prev in Thread] Current Thread [Next in Thread>
  • Re: [Xen-devel] xen.git branch reorg / success with 2.6.30-rc3 pv_ops dom0, Jeremy Fitzhardinge <=