Re: [Xen-devel] [RFC/PATCH] Improve speed of mapping guest memory into Dom0
Mats,
Your patch has been white-space damaged and will not apply. You should
use git send-email, which will do the right thing. I've also CC'd Konrad,
who is the maintainer for the Xen parts of the kernel.
On 14/11/12 11:13, Mats Petersson wrote:
> [a long, rambling commit message?]
The text as-is isn't really suitable for a commit message (too much
irrelevant stuff) and there is no suitable subject line.
> I have also found that the munmap() call used to unmap the guest memory
> from Dom0 is about 35% slower in 3.7 kernel than in the 2.6 kernel (3.8M
> cycles vs 2.8M cycles).
This performance reduction only occurs with 32-bit guests, as Xen then
traps-and-emulates both halves of the 64-bit PTE write.
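For reference, the PAE clear path that gets trapped twice looks roughly
like this (paraphrased from arch/x86/include/asm/pgtable-3level.h -- a
sketch of where the two halves come from, not the exact source, and the
paravirt plumbing on top of it is omitted):

	static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
	{
		pte_t res;

		/* Clearing a 64-bit PAE PTE takes an atomic xchg of
		 * the low half plus a separate write of the high half
		 * -- two operations for Xen to trap and emulate in a
		 * PV guest. */
		res.pte_low = xchg(&ptep->pte_low, 0);
		res.pte_high = ptep->pte_high;
		ptep->pte_high = 0;

		return res;
	}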
> I think this could be made quicker by using a
> direct write of zero rather than the compare exchange operation that is
> currently used [which traps into Xen, performs the compare & exchange] -
This is something I noticed but never got around to producing a patch.
How about this (uncompiled!) patch?
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1146,8 +1146,16 @@ again:
 					    page->index > details->last_index))
 					continue;
 			}
-			ptent = ptep_get_and_clear_full(mm, addr, pte,
-							tlb->fullmm);
+			/*
+			 * No need for the expensive atomic get and
+			 * clear for anonymous mappings as the dirty
+			 * and young bits are not used.
+			 */
+			if (page && PageAnon(page))
+				pte_clear(mm, addr, pte);
+			else
+				ptent = ptep_get_and_clear_full(mm, addr, pte,
+								tlb->fullmm);
 			tlb_remove_tlb_entry(tlb, pte, addr);
 			if (unlikely(!page))
 				continue;
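For comparison, the two helpers in that hunk boil down to roughly the
following on 64-bit (again paraphrased, not verbatim): an atomic xchg
that a PV guest has to trap into Xen for, versus a plain store of zero.

	static inline pte_t native_ptep_get_and_clear(pte_t *xp)
	{
		/* atomic read-and-zero: trapped and emulated under Xen PV */
		return native_make_pte(xchg(&xp->pte, 0));
	}

	static inline void native_pte_clear(struct mm_struct *mm,
					    unsigned long addr, pte_t *xp)
	{
		/* plain store of zero: no atomic read-modify-write to emulate */
		native_set_pte(xp, native_make_pte(0));
	}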
Now for the patch itself. On the whole, the approach looks good and the
real-world performance improvements are nice. Specific comments inline.
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -2542,3 +2561,77 @@ out:
> return err;
> }
> EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_range);
> +
> +/* like xen_remap_domain_mfn_range, but does a list of mfn's, rather
> + * than the, for xen, quite useless, consecutive pages.
> + */
/*
* Like xen_remap_domain_mfn_range(), but more efficiently handles MFNs
* that are not contiguous (which is common for a domain's memory).
*/
> +int xen_remap_domain_mfn_list(struct vm_area_struct *vma,
> + unsigned long addr,
> + unsigned long *mfn, int nr,
> + int *err_ptr,
> + pgprot_t prot, unsigned domid)
xen_remap_domain_mfn_array() ? It's not taking a list.
> +{
> + struct remap_list_data rmd;
> + struct mmu_update mmu_update[REMAP_BATCH_SIZE];
This is surprisingly large (256 bytes of stack: REMAP_BATCH_SIZE is 16
and struct mmu_update is 16 bytes) but I note that the existing
xen_remap_domain_mfn_range() does the same thing so I guess it's ok.
> + int batch;
> + int done;
> + unsigned long range;
> + int err = 0;
> +
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return -EINVAL;
> +
> + prot = __pgprot(pgprot_val(prot) | _PAGE_IOMAP);
> +
> + BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
> +
> + rmd.mfn = mfn;
> + rmd.prot = prot;
> +
> + while (nr) {
> + batch = min(REMAP_BATCH_SIZE, nr);
> + range = (unsigned long)batch << PAGE_SHIFT;
> +
> + rmd.mmu_update = mmu_update;
> + err = apply_to_page_range(vma->vm_mm, addr, range,
> + remap_area_mfn_list_pte_fn, &rmd);
> + if (err)
> + {
> + printk("xen_remap_domain_mfn_list: apply_to_range:
> err=%d\n", err);
Stray printk?
> + goto out;
> + }
> +
> + err = HYPERVISOR_mmu_update(mmu_update, batch, &done, domid);
> + if (err < 0)
> + {
> + int i;
> + /* TODO: We should remove this printk later */
> + printk("xen_remap_domain_mfn_list: mmu_update: err=%d,
> done=%d, batch=%d\n", err, done, batch);
Yes, you should...
> + err_ptr[done] = err;
> +
> + /* now do the remaining part of this batch */
> + for(i = done+1; i < batch; i++)
> + {
> + int tmp_err = HYPERVISOR_mmu_update(&mmu_update[i], 1, NULL, domid);
> + if (tmp_err < 0)
> + {
> + err_ptr[i] = tmp_err;
> + }
> + }
There's no need to fall back to doing it page-by-page here. You can
still batch the remainder.
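Something along these lines (an untested sketch, reusing your convention
of recording the error at the failing index in err_ptr):

	int index = 0;

	while (index < batch) {
		int done = 0;
		int ret = HYPERVISOR_mmu_update(&mmu_update[index],
						batch - index, &done, domid);
		if (ret >= 0)
			break;		/* the remainder succeeded */
		/* record the error, then resubmit everything after
		 * the failing entry as one batch */
		err_ptr[index + done] = ret;
		index += done + 1;
	}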
> +
> + goto out;
> + }
> +
> + nr -= batch;
> + addr += range;
> + err_ptr += batch;
> + }
> +
> + err = 0;
> +out:
> +
> + xen_flush_tlb_all();
Probably not that important anymore since we would now do far fewer TLB
flushes, but this TLB flush is only needed if the PTEs being updated
were already present -- if they were all clear then the TLB flush can be
omitted entirely.
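I.e., something like this sketch, assuming remap_area_mfn_list_pte_fn()
is taught to set a (hypothetical) flag in struct remap_list_data
whenever it replaces a present PTE:

	/* only flush if we actually replaced a present PTE */
	if (rmd.any_pte_was_present)
		xen_flush_tlb_all();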
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_list);
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index 8adb9cc..b39a7b7 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -151,6 +151,41 @@ static int traverse_pages(unsigned nelem, size_t size,
> return ret;
> }
>
> +/*
> + * Similar to traverse_pages, but use each page as a "block" of
> + * data to be processed as one unit.
> + */
> +static int traverse_pages_block(unsigned nelem, size_t size,
> + struct list_head *pos,
> + int (*fn)(void *data, int nr, void *state),
> + void *state)
> +{
> + void *pagedata;
> + unsigned pageidx;
> + int ret = 0;
> +
> + BUG_ON(size > PAGE_SIZE);
> +
> + pageidx = PAGE_SIZE;
> + pagedata = NULL; /* hush, gcc */
What was gcc upset about? I don't see anything it could get confused about.
> @@ -260,17 +295,17 @@ struct mmap_batch_state {
> xen_pfn_t __user *user_mfn;
> };
>
> -static int mmap_batch_fn(void *data, void *state)
> +static int mmap_batch_fn(void *data, int nr, void *state)
> {
> xen_pfn_t *mfnp = data;
> +
> struct mmap_batch_state *st = state;
> int ret;
>
> - ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, *mfnp, 1,
> - st->vma->vm_page_prot, st->domain);
> + BUG_ON(nr < 0);
Is this BUG_ON() useful?
David
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel