Re: [Xen-devel] [RFC/PATCH] Improve speed of mapping guest memory into Dom0
Mats,
Your patch has been white-space damaged and will not apply. You should
use git send-email, which will do the right thing. I've also CC'd Konrad,
who is the maintainer for the Xen parts of the kernel.
On 14/11/12 11:13, Mats Petersson wrote:
> [a long, rambling commit message?]
The text as-is isn't really suitable for a commit message (too much
irrelevant stuff) and there is no suitable subject line.
> I have also found that the munmap() call used to unmap the guest memory
> from Dom0 is about 35% slower in 3.7 kernel than in the 2.6 kernel (3.8M
> cycles vs 2.8M cycles).
This performance reduction only occurs with 32-bit guests, as Xen then
traps-and-emulates both halves of the 64-bit PTE write.
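For reference, the PAE clear path that gets trapped twice looks roughly
like this (paraphrased from arch/x86/include/asm/pgtable-3level.h -- a
sketch of where the two halves come from, not the exact source, and the
paravirt plumbing on top of it is omitted):

	static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
	{
		pte_t res;

		/* Clearing a 64-bit PAE PTE takes an atomic xchg of
		 * the low half plus a separate write of the high half
		 * -- two operations for Xen to trap and emulate in a
		 * PV guest. */
		res.pte_low = xchg(&ptep->pte_low, 0);
		res.pte_high = ptep->pte_high;
		ptep->pte_high = 0;

		return res;
	}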
> I think this could be made quicker by using a
> direct write of zero rather than the compare exchange operation that is
> currently used [which traps into Xen, performs the compare & exchange] -
This is something I noticed but never got around to producing a patch.
How about this (uncompiled!) patch?
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1146,8 +1146,16 @@ again:
 					    page->index > details->last_index))
 					continue;
 			}
-			ptent = ptep_get_and_clear_full(mm, addr, pte,
-							tlb->fullmm);
+			/*
+			 * No need for the expensive atomic get and
+			 * clear for anonymous mappings as the dirty
+			 * and young bits are not used.
+			 */
+			if (page && PageAnon(page))
+				pte_clear(mm, addr, pte);
+			else
+				ptent = ptep_get_and_clear_full(mm, addr, pte,
+								tlb->fullmm);
 			tlb_remove_tlb_entry(tlb, pte, addr);
 			if (unlikely(!page))
 				continue;
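For comparison, the two helpers in that hunk boil down to roughly the
following on 64-bit (again paraphrased, not verbatim): an atomic xchg
that a PV guest has to trap into Xen for, versus a plain store of zero.

	static inline pte_t native_ptep_get_and_clear(pte_t *xp)
	{
		/* atomic read-and-zero: trapped and emulated under Xen PV */
		return native_make_pte(xchg(&xp->pte, 0));
	}

	static inline void native_pte_clear(struct mm_struct *mm,
					    unsigned long addr, pte_t *xp)
	{
		/* plain store of zero: no atomic read-modify-write to emulate */
		native_set_pte(xp, native_make_pte(0));
	}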
Now for the patch itself. On the whole, the approach looks good and the
real-world performance improvements are nice. Specific comments inline.
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -2542,3 +2561,77 @@ out:
> return err;
> }
> EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_range);
> +
> +/* like xen_remap_domain_mfn_range, but does a list of mfn's, rather
> + * than the, for xen, quite useless, consecutive pages.
> + */
/*
* Like xen_remap_domain_mfn_range(), but more efficiently handles MFNs
* that are not contiguous (which is common for a domain's memory).
*/
> +int xen_remap_domain_mfn_list(struct vm_area_struct *vma,
> + unsigned long addr,
> + unsigned long *mfn, int nr,
> + int *err_ptr,
> + pgprot_t prot, unsigned domid)
xen_remap_domain_mfn_array() ? It's not taking a list.
> +{
> + struct remap_list_data rmd;
> + struct mmu_update mmu_update[REMAP_BATCH_SIZE];
This is surprisingly large (256 bytes of stack: REMAP_BATCH_SIZE is 16
and struct mmu_update is 16 bytes) but I note that the existing
xen_remap_domain_mfn_range() does the same thing so I guess it's ok.
> + int batch;
> + int done;
> + unsigned long range;
> + int err = 0;
> +
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return -EINVAL;
> +
> + prot = __pgprot(pgprot_val(prot) | _PAGE_IOMAP);
> +
> + BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
> +
> + rmd.mfn = mfn;
> + rmd.prot = prot;
> +
> + while (nr) {
> + batch = min(REMAP_BATCH_SIZE, nr);
> + range = (unsigned long)batch << PAGE_SHIFT;
> +
> + rmd.mmu_update = mmu_update;
> + err = apply_to_page_range(vma->vm_mm, addr, range,
> + remap_area_mfn_list_pte_fn, &rmd);
> + if (err)
> + {
> + printk("xen_remap_domain_mfn_list: apply_to_range:
> err=%d\n", err);
Stray printk?
> + goto out;
> + }
> +
> + err = HYPERVISOR_mmu_update(mmu_update, batch, &done, domid);
> + if (err < 0)
> + {
> + int i;
> + /* TODO: We should remove this printk later */
> + printk("xen_remap_domain_mfn_list: mmu_update: err=%d,
> done=%d, batch=%d\n", err, done, batch);
Yes, you should...
> + err_ptr[done] = err;
> +
> + /* now do the remaining part of this batch */
> + for(i = done+1; i < batch; i++)
> + {
> + int tmp_err = HYPERVISOR_mmu_update(&mmu_update[i], 1, NULL, domid);
> + if (tmp_err < 0)
> + {
> + err_ptr[i] = tmp_err;
> + }
> + }
There's no need to fall back to doing it page-by-page here. You can
still batch the remainder.
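Something along these lines (an untested sketch, reusing your convention
of recording the error at the failing index in err_ptr):

	int index = 0;

	while (index < batch) {
		int done = 0;
		int ret = HYPERVISOR_mmu_update(&mmu_update[index],
						batch - index, &done, domid);
		if (ret >= 0)
			break;		/* the remainder succeeded */
		/* record the error, then resubmit everything after
		 * the failing entry as one batch */
		err_ptr[index + done] = ret;
		index += done + 1;
	}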
> +
> + goto out;
> + }
> +
> + nr -= batch;
> + addr += range;
> + err_ptr += batch;
> + }
> +
> + err = 0;
> +out:
> +
> + xen_flush_tlb_all();
Probably not that important anymore since we would now do far fewer TLB
flushes, but this TLB flush is only needed if the PTEs being updated
were already present -- if they were all clear then the TLB flush can be
omitted entirely.
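I.e., something like this sketch, assuming remap_area_mfn_list_pte_fn()
is taught to set a (hypothetical) flag in struct remap_list_data
whenever it replaces a present PTE:

	/* only flush if we actually replaced a present PTE */
	if (rmd.any_pte_was_present)
		xen_flush_tlb_all();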
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(xen_remap_domain_mfn_list);
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index 8adb9cc..b39a7b7 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -151,6 +151,41 @@ static int traverse_pages(unsigned nelem, size_t size,
> return ret;
> }
>
> +/*
> + * Similar to traverse_pages, but use each page as a "block" of
> + * data to be processed as one unit.
> + */
> +static int traverse_pages_block(unsigned nelem, size_t size,
> + struct list_head *pos,
> + int (*fn)(void *data, int nr, void *state),
> + void *state)
> +{
> + void *pagedata;
> + unsigned pageidx;
> + int ret = 0;
> +
> + BUG_ON(size > PAGE_SIZE);
> +
> + pageidx = PAGE_SIZE;
> + pagedata = NULL; /* hush, gcc */
What was gcc upset about? I don't see anything it could get confused about.
> @@ -260,17 +295,17 @@ struct mmap_batch_state {
> xen_pfn_t __user *user_mfn;
> };
>
> -static int mmap_batch_fn(void *data, void *state)
> +static int mmap_batch_fn(void *data, int nr, void *state)
> {
> xen_pfn_t *mfnp = data;
> +
> struct mmap_batch_state *st = state;
> int ret;
>
> - ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, *mfnp, 1,
> - st->vma->vm_page_prot, st->domain);
> + BUG_ON(nr < 0);
Is this BUG_ON() useful?
David
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel