On Fri, Sep 23, 2011 at 10:35 AM, Hongkaixing <hongkaixing@xxxxxxxxxx> wrote:
> Hi, Olaf
>
> We have tested the xenpaging feature and found some problems.
> (1) The test case is as follows: we start a VM with PoD enabled, and
> xenpaging is started at the same time.
> This case causes many problems; we finally fixed the bug, and the patch
> is attached below.
>
> (2) There is a very serious problem: we have observed many VM crashes,
> and the error code is not always the same.
> We suspect a conflict between xenpaging and memory mapping.
> For instance, if Dom0 maps DomU's memory into its own address space and
> that memory is then paged out, a panic is caused when Dom0 accesses
> the memory area.
> Also, I really do not know whether a qemu device can perceive memory
> that has been modified by xenpaging.
>
> thanks!
>
> Here is the patch to solve the PoD problem.
> 1) Fix the p2m_pod_decrease_reservation() function to take the paging
> type into account:
> --- ./origin/xen/arch/x86/mm/p2m.c 2011-09-05 20:39:30.000000000 +0800
> +++ ./b/xen/arch/x86/mm/p2m.c 2011-09-23 23:46:19.000000000 +0800
Kaixing,
Thank you for telling us about these issues. (My apologies; my Chinese
is not good.) As Olaf says, it would help us a lot if you could submit
the patches in a form that is easier for us to apply and to comment on:
namely, either attach all the patches as separate files, or send one
e-mail per patch, with the patch being the last thing in the e-mail.

There are some suggestions on how to post patches here:
http://wiki.xensource.com/xenwiki/SubmittingXenPatches. My preferred
technique is to make a patch series using Mercurial Queues, and then
use the PatchBomb extension (mentioned in the link above) to mail the
patches as a series, using "hg email -o".
-George
> @@ -675,6 +675,23 @@
> BUG_ON(p2md->pod.entry_count < 0);
> pod--;
> }
> + else if ( steal_for_cache && p2m_is_paging(t) )
> + {
> + struct page_info *page;
> +        /* Allocate a new page to replenish the PoD cache */
> + page = alloc_domheap_page(d, 0);
> + if ( unlikely(page == NULL) )
> + {
> + goto out_entry_check;
> + }
> + set_p2m_entry(d, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid);
> + p2m_mem_paging_drop_page(d, gpfn+i);
> + p2m_pod_cache_add(d, page, 0);
> + steal_for_cache = ( p2md->pod.entry_count > p2md->pod.count );
> + nonpod--;
> + ram--;
> + }
> + /* for other ram types */
> else if ( steal_for_cache && p2m_is_ram(t) )
> {
> struct page_info *page;
>
> 2) Fix the race between PoD and xenpaging.
> The situation can be described as follows:
>
>   xenpaging                          POD
>
>   mfn = gfn_to_mfn()                 mfn = gfn_to_mfn()
>   check p2m type                     check p2mt
>                                      p2m_lock
>                                      change p2m type
>                                      p2m_unlock
>                                      add to pod list
>   p2m_lock()
>   change p2m type
>   p2m_unlock
>   put_page()
>
> The result is that a page is added to the pod list and then removed
> from the list by put_page().
> My suggestion is to extend the range of the p2m lock to cover the p2m
> type check.
>
> In the p2m_mem_paging_nominate() function:
> @@ -2532,7 +2561,8 @@
> mfn_t mfn;
> int ret;
>
> - mfn = gfn_to_mfn(d, gfn, &p2mt);
> + p2m_lock(d->arch.p2m);
> + mfn = gfn_to_mfn_query(d, gfn, &p2mt);
>
> /* Check if mfn is valid */
> ret = -EINVAL;
> @@ -2580,13 +2610,12 @@
> goto out;
>
> /* Fix p2m entry */
> - p2m_lock(d->arch.p2m);
> set_p2m_entry(d, gfn, mfn, 0, p2m_ram_paging_out);
> - p2m_unlock(d->arch.p2m);
>
> ret = 0;
>
> out:
> + p2m_unlock(d->arch.p2m);
> return ret;
> }
> In the p2m_mem_paging_evict() function:
> @@ -2595,34 +2624,39 @@
> struct page_info *page;
> p2m_type_t p2mt;
> mfn_t mfn;
> -
> + int ret;
> + p2m_lock(d->arch.p2m);
> /* Get mfn */
> - mfn = gfn_to_mfn(d, gfn, &p2mt);
> + mfn = gfn_to_mfn_query(d, gfn, &p2mt);
> +
> + ret = -EINVAL;
> if ( unlikely(!mfn_valid(mfn)) )
> - return -EINVAL;
> + goto out;
>
> if (p2mt != p2m_ram_paging_out)
> {
> printk("p2m_mem_paging_evict type %d\n", p2mt);
> - return -EINVAL;
> + goto out;
> }
> /* Get the page so it doesn't get modified under Xen's feet */
> page = mfn_to_page(mfn);
> if ( unlikely(!get_page(page, d)) )
> - return -EINVAL;
> + goto out;
>
> /* Decrement guest domain's ref count of the page */
> if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> put_page(page);
>
> /* Remove mapping from p2m table */
> - p2m_lock(d->arch.p2m);
> set_p2m_entry(d, gfn, _mfn(PAGING_MFN), 0, p2m_ram_paged);
> - p2m_unlock(d->arch.p2m);
>
> /* Put the page back so it gets freed */
> put_page(page);
> +
> + ret = 0;
>
> +out:
> + p2m_unlock(d->arch.p2m);
> -    return 0;
> +    return ret;
> }
>
> 3) Fix the vmx_load_pdptrs() function in vmx.c.
> In this situation the page directory table has been paged out.
> Although using mdelay() is a bad idea, it is better than crashing Xen.
>
> void vmx_load_pdptrs(struct vcpu *v)
> {
> unsigned long cr3 = v->arch.hvm_vcpu.guest_cr[3], mfn;
> uint64_t *guest_pdptrs;
> p2m_type_t p2mt;
> char *p;
> unsigned int try_count = 0;
> /* EPT needs to load PDPTRS into VMCS for PAE. */
> if ( !hvm_pae_enabled(v) || (v->arch.hvm_vcpu.guest_efer & EFER_LMA) )
> return;
> if ( cr3 & 0x1fUL )
> goto crash;
> mfn = mfn_x(gfn_to_mfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt));
> if ( !p2m_is_ram(p2mt) )
> goto crash;
> +    if ( p2m_is_paging(p2mt) )
> +    {
> +        p2m_mem_paging_populate(v->domain, cr3 >> PAGE_SHIFT);
> +        do
> +        {
> +            mdelay(1);
> +            try_count++;
> +            if ( try_count > 65535 )
> +            {
> +                goto crash;
> +            }
> +            mfn = mfn_x(gfn_to_mfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt));
> +        } while ( !mfn_valid(mfn) );
> +    }
> +
> p = map_domain_page(mfn);
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>