
Re: [Xen-devel] Re: [PATCH 1 of 4] Nested p2m: implement "flush" as a first-class action



At 17:21 +0200 on 23 Jun (1308849681), Christoph Egger wrote:
> 
> Now I see a significant slowdown of the L2 guest with multiple vcpus:
> with p2m_flush_nestedp2m() I see ten times as many IPIs per vcpu.
> 
> Please add back nestedhvm_vmcx_flushtlbdomain().

It wasn't safe.  You need to kick other CPUs off the p2m table before
you free its memory.  But I'll look into avoiding those IPIs.

Tim.

> On 06/23/11 17:04, Christoph Egger wrote:
> >
> >I have a fix for this crash. See inline.
> >
> >Christoph
> >
> >
> >On 06/23/11 14:56, Christoph Egger wrote:
> >>On 06/23/11 14:50, Christoph Egger wrote:
> >>>
> >>>This patch crashes the host (and it doesn't disappear with the other
> >>>patches applied):
> >>
> >>err.. it crashes the host when the l1 guest boots and hvmloader invokes
> >>the VGA BIOS.
> >>
> >>>(XEN) Assertion 'pfn_to_pdx(ma >> PAGE_SHIFT) < (DIRECTMAP_SIZE >> PAGE_SHIFT)'
> >>>failed at xen/include/asm/x86_64/page.h:99
> >>>(XEN) ----[ Xen-4.2-unstable  x86_64  debug=y  Tainted:    C ]----
> >>>(XEN) CPU:    1
> >>>(XEN) RIP:    e008:[<ffff82c4801d2007>] p2m_flush_locked+0x1a8/0x2be
> >>>(XEN) RFLAGS: 0000000000010212   CONTEXT: hypervisor
> >>>(XEN) rax: 0000000000000000   rbx: ffff830421db1d90   rcx: 0000000000000000
> >>>(XEN) rdx: f82f608076a60000   rsi: 000f82f608076a60   rdi: 0000000000000000
> >>>(XEN) rbp: ffff830425217a08   rsp: ffff8304252179c8   r8:  0000000000000000
> >>>(XEN) r9:  0000000000000004   r10: 00000000fffffffb   r11: 0000000000000003
> >>>(XEN) r12: ffff830421db1d90   r13: ffff82f608076a60   r14: ffff830403bbebe8
> >>>(XEN) r15: ffff830403bbe000   cr0: 000000008005003b   cr4: 00000000000406f0
> >>>(XEN) cr3: 0000000420f2b000   cr2: 000000000045f000
> >>>(XEN) ds: 0017   es: 0017   fs: 0000   gs: 0000   ss: e010   cs: e008
> >>>(XEN) Xen stack trace from rsp=ffff8304252179c8:
> >>>(XEN)    ffff830403bbebe8 ffff830421db1d90 ffff830421db1d90 0000000000000000
> >>>(XEN)    ffff830403bbe000 ffff830403bbebe8 ffff830403bbebe8 ffff830403bbeab0
> >>>(XEN)    ffff830425217a38 ffff82c4801d2599 0000000000000001 0000000000000067
> >>>(XEN)    ffff830403bbeab0 ffff830403af4000 ffff830425217a88 ffff82c480208a8c
> >>>(XEN)    0000000125941810 ffff830403bbe000 0000000000000000 ffff830425217b18
> >>>(XEN)    ffff830421db1cc0 00000000000ff000 ffffffffffffffff 0000000000000000
> >>>(XEN)    ffff830425217a98 ffff82c4801ce585 ffff830425217b68 ffff82c4801d4b4e
> >>>(XEN)    0000000000000200 1000000000000000 ffff830425217b28 ffff830425217b20
> >>>(XEN)    0000000000000001 ffff830425217ae8 ffff830425217ef8 ffff830403af4000
> >>>(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >>>(XEN)    0000000000000000 0000000000000000 0000000000000000 ffff830403af4000
> >>>(XEN)    0000000000403af4 ffff82c4801d5ec1 ffff830425217bfc ffffffffffffffff
> >>>(XEN)    0000000000000000 0000000000000001 00000000000ff000 ffff830421db1cc0
> >>>(XEN)    ffff830425217bb8 ffff82c4801d02d1 0000000100000007 ffff830403bbe000
> >>>(XEN)    0000000125217bc8 ffff8280019fafc0 ffff82c400cfd7e0 000000000033f5f8
> >>>(XEN)    ffff830421db1cc0 0000000000000001 ffff830425217c28 ffff82c4801d053d
> >>>(XEN)    ffff830425217bfc ffff830425217bf8 0000000025217bf0 00000000000ff000
> >>>(XEN)    0000000000000001 ffff830425217c10 0000000000000007 ffff830421db1cc0
> >>>(XEN)    ffff830421db1cc0 ffff830425217f18 ffff82c4802e1260 0000000000000000
> >>>(XEN)    ffff830425217c78 ffff82c4801d11fa 000000000033f5f7 00000000000ff000
> >>>(XEN) Xen call trace:
> >>>(XEN)    [<ffff82c4801d2007>] p2m_flush_locked+0x1a8/0x2be
> >>>(XEN)    [<ffff82c4801d2599>] p2m_flush_nestedp2m+0xea/0x140
> >>>(XEN)    [<ffff82c480208a8c>] hap_write_p2m_entry+0x223/0x246
> >>>(XEN)    [<ffff82c4801ce585>] paging_write_p2m_entry+0x6e/0x75
> >>>(XEN)    [<ffff82c4801d4b4e>] p2m_set_entry+0x42b/0x6a8
> >>>(XEN)    [<ffff82c4801d02d1>] set_p2m_entry+0xf2/0x140
> >>>(XEN)    [<ffff82c4801d053d>] p2m_remove_page+0x1dd/0x1ec
> >>>(XEN)    [<ffff82c4801d11fa>] guest_physmap_remove_page+0x109/0x148
> >>>(XEN)    [<ffff82c480168000>] arch_memory_op+0x6af/0x1039
> >>>(XEN)    [<ffff82c4801134fb>] do_memory_op+0x1bd7/0x1c2c
> >>>(XEN)    [<ffff82c480217708>] syscall_enter+0xc8/0x122
> >>>(XEN)
> >>>(XEN)
> >>>(XEN) ****************************************
> >>>
> >>>Christoph
> >>>
> >>>
> >>>
> >>>On 06/22/11 18:10, Tim Deegan wrote:
> >>>># HG changeset patch
> >>>># User Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> >>>># Date 1308758648 -3600
> >>>># Node ID c323e69a0a08ce9f1e54d2e2fa2edd9845bc8efe
> >>>># Parent  b7e5a25663329254cba539e21f4fbd5b32c67556
> >>>>Nested p2m: implement "flush" as a first-class action
> >>>>rather than using the teardown and init functions.
> >>>>This makes the locking clearer and avoids an expensive scan of all
> >>>>pfns that's only needed for non-nested p2ms.  It also moves the
> >>>>tlb flush into the proper place in the flush logic, avoiding a
> >>>>possible race against other CPUs.
> >>>>
> >>>>Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> >>>>
> >>>>diff -r b7e5a2566332 -r c323e69a0a08 xen/arch/x86/hvm/nestedhvm.c
> >>>>--- a/xen/arch/x86/hvm/nestedhvm.c    Tue Jun 21 18:28:53 2011 +0100
> >>>>+++ b/xen/arch/x86/hvm/nestedhvm.c    Wed Jun 22 17:04:08 2011 +0100
> >>>>@@ -119,12 +119,6 @@ nestedhvm_vmcx_flushtlb(struct p2m_domai
> >>>>         cpus_clear(p2m->p2m_dirty_cpumask);
> >>>>     }
> >>>>
> >>>>-void
> >>>>-nestedhvm_vmcx_flushtlbdomain(struct domain *d)
> >>>>-{
> >>>>-    on_selected_cpus(d->domain_dirty_cpumask, nestedhvm_flushtlb_ipi, d, 1);
> >>>>-}
> >>>>-
> >>>>     bool_t
> >>>>     nestedhvm_is_n2(struct vcpu *v)
> >>>>     {
> >>>>diff -r b7e5a2566332 -r c323e69a0a08 xen/arch/x86/mm/p2m.c
> >>>>--- a/xen/arch/x86/mm/p2m.c   Tue Jun 21 18:28:53 2011 +0100
> >>>>+++ b/xen/arch/x86/mm/p2m.c   Wed Jun 22 17:04:08 2011 +0100
> >>>>@@ -1050,20 +1050,41 @@ p2m_getlru_nestedp2m(struct domain *d, s
> >>>>         return lrup2m;
> >>>>     }
> >>>>
> >>>>-static int
> >>>>+/* Reset this p2m table to be empty */
> >>>>+static void
> >>>>     p2m_flush_locked(struct p2m_domain *p2m)
> >>>>     {
> >>>>-    ASSERT(p2m);
> >>>>-    if (p2m->cr3 == CR3_EADDR)
> >>>>-        /* Microoptimisation: p2m is already empty.
> >>>>-         * => about 0.3% speedup of overall system performance.
> >>>>-         */
> >>>>-        return 0;
> >>>>+    struct page_info *top, *pg;
> >>>>+    struct domain *d = p2m->domain;
> >>>>+    void *p;
> >+    mfn_t top_mfn;
> >
> >>>>
> >>>>-    p2m_teardown(p2m);
> >>>>-    p2m_initialise(p2m->domain, p2m);
> >>>>-    p2m->write_p2m_entry = nestedp2m_write_p2m_entry;
> >>>>-    return p2m_alloc_table(p2m);
> >>>>+    p2m_lock(p2m);
> >>>>+
> >>>>+    /* "Host" p2m tables can have shared entries &c that need a bit more
> >>>>+     * care when discarding them */
> >>>>+    ASSERT(p2m_is_nestedp2m(p2m));
> >>>>+    ASSERT(page_list_empty(&p2m->pod.super));
> >>>>+    ASSERT(page_list_empty(&p2m->pod.single));
> >>>>+
> >>>>+    /* This is no longer a valid nested p2m for any address space */
> >>>>+    p2m->cr3 = CR3_EADDR;
> >>>>+
> >>>>+    /* Zap the top level of the trie */
> >
> >+    top_mfn = pagetable_get_mfn(p2m_get_pagetable(p2m));
> >+    top = mfn_to_page(top_mfn);
> >+    p = map_domain_page(mfn_x(top_mfn));
> >
> >>>>+    clear_page(p);
> >>>>+    unmap_domain_page(p);
> >>>>+
> >>>>+    /* Make sure nobody else is using this p2m table */
> >>>>+    nestedhvm_vmcx_flushtlb(p2m);
> >>>>+
> >>>>+    /* Free the rest of the trie pages back to the paging pool */
> >>>>+    while ( (pg = page_list_remove_head(&p2m->pages)) )
> >>>>+        if ( pg != top )
> >>>>+            d->arch.paging.free_page(d, pg);
> >>>>+    page_list_add(top, &p2m->pages);
> >>>>+
> >>>>+    p2m_unlock(p2m);
> >>>>     }
> >>>>
> >>>>     void
> >>>>@@ -1074,9 +1095,8 @@ p2m_flush(struct vcpu *v, struct p2m_dom
> >>>>         ASSERT(v->domain == d);
> >>>>         vcpu_nestedhvm(v).nv_p2m = NULL;
> >>>>         nestedp2m_lock(d);
> >>>>-    BUG_ON(p2m_flush_locked(p2m) != 0);
> >>>>+    p2m_flush_locked(p2m);
> >>>>         hvm_asid_flush_vcpu(v);
> >>>>-    nestedhvm_vmcx_flushtlb(p2m);
> >>>>         nestedp2m_unlock(d);
> >>>>     }
> >>>>
> >>>>@@ -1086,12 +1106,8 @@ p2m_flush_nestedp2m(struct domain *d)
> >>>>         int i;
> >>>>
> >>>>         nestedp2m_lock(d);
> >>>>-    for (i = 0; i < MAX_NESTEDP2M; i++) {
> >>>>-        struct p2m_domain *p2m = d->arch.nested_p2m[i];
> >>>>-        BUG_ON(p2m_flush_locked(p2m) != 0);
> >>>>-        cpus_clear(p2m->p2m_dirty_cpumask);
> >>>>-    }
> >>>>-    nestedhvm_vmcx_flushtlbdomain(d);
> >>>>+    for ( i = 0; i < MAX_NESTEDP2M; i++ )
> >>>>+        p2m_flush_locked(d->arch.nested_p2m[i]);
> >>>>         nestedp2m_unlock(d);
> >>>>     }
> >>>>
> >>>>@@ -1104,7 +1120,7 @@ p2m_get_nestedp2m(struct vcpu *v, uint64
> >>>>         volatile struct nestedvcpu *nv = &vcpu_nestedhvm(v);
> >>>>         struct domain *d;
> >>>>         struct p2m_domain *p2m;
> >>>>-    int i, rv;
> >>>>+    int i;
> >>>>
> >>>>         if (cr3 == 0 || cr3 == CR3_EADDR)
> >>>>             cr3 = v->arch.hvm_vcpu.guest_cr[3];
> >>>>@@ -1136,9 +1152,7 @@ p2m_get_nestedp2m(struct vcpu *v, uint64
> >>>>          */
> >>>>         for (i = 0; i < MAX_NESTEDP2M; i++) {
> >>>>             p2m = p2m_getlru_nestedp2m(d, NULL);
> >>>>-        rv = p2m_flush_locked(p2m);
> >>>>-        if (rv == 0)
> >>>>-            break;
> >>>>+        p2m_flush_locked(p2m);
> >>>>         }
> >>>>         nv->nv_p2m = p2m;
> >>>>         p2m->cr3 = cr3;
> >>>>diff -r b7e5a2566332 -r c323e69a0a08 xen/include/asm-x86/hvm/nestedhvm.h
> >>>>--- a/xen/include/asm-x86/hvm/nestedhvm.h     Tue Jun 21 18:28:53 2011 +0100
> >>>>+++ b/xen/include/asm-x86/hvm/nestedhvm.h     Wed Jun 22 17:04:08 2011 +0100
> >>>>@@ -61,7 +61,6 @@ unsigned long *nestedhvm_vcpu_iomap_get(
> >>>>         (!!vcpu_nestedhvm((v)).nv_vmswitch_in_progress)
> >>>>
> >>>>     void nestedhvm_vmcx_flushtlb(struct p2m_domain *p2m);
> >>>>-void nestedhvm_vmcx_flushtlbdomain(struct domain *d);
> >>>>
> >>>>     bool_t nestedhvm_is_n2(struct vcpu *v);
> >>>>
> >>>>
> 
> 
> -- 
> ---to satisfy European Law for business letters:
> Advanced Micro Devices GmbH
> Einsteinring 24, 85689 Dornach b. Muenchen
> Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
> Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
> Registergericht Muenchen, HRB Nr. 43632
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
