This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] re: [PATCH] mem_sharing: fix race condition of nominate and

Hi Jui-Hao:


         Under what situation hvm_hap_nested_page_fault will be called?


         Attach is the latest log with CPU id.


         Before each call of unshare() I print out the caller.


         From the log u will find the error mfn is MFN=2df895,  in line 488

       Line 37:(XEN) ===>mem_sharing_share_pages mfn 2df895 gfn 520798 p2md d did 2

Is the log in mem_sharing_share_pages, from below line 632


618    /* gfn lists always have at least one entry => save to call list_entry */

619     mem_sharing_gfn_account(gfn_get_info(&ce->gfns), 1);

620     mem_sharing_gfn_account(gfn_get_info(&se->gfns), 1);

621     list_for_each_safe(le, te, &ce->gfns)

622     {

623         gfn = list_entry(le, struct gfn_info, list);

624         /* Get the source page and type, this should never fail

625          * because we are under shr lock, and got non-null se */

626         BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));

627         /* Move the gfn_info from ce list to se list */

628         list_del(&gfn->list);

629         d = get_domain_by_id(gfn->domain);

630         BUG_ON(!d);

631         gfn_to_mfn(d, gfn->gfn, &p2mt);

632         printk("===>mem_sharing_share_pages mfn %lx gfn %lu p2md %x did %d\n", se->mfn, gfn->gfn, p2mt,d->domain_id);

633         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0);

634         put_domain(d);

635         list_add(&gfn->list, &se->gfns);

636         put_page_and_type(cpage);

637     }  


Sent: Jui-Hao Chiang [mailto:juihaochiang@xxxxxxxxx]
Date: 2011
110 16:10
To: tinnycloud
CC: Tim Deegan; xen-devel@xxxxxxxxxxxxxxxxxxx
Sub: Re: [PATCH] mem_sharing: fix race condition of nominate and unshare


Hi, tinnycloud:

Thanks for your testing info.
I assume you have mem_sharing_unshare_page called with successful return value, otherwise the mod_l1_entry won't be called.
Could you call mem_sharing_debug_gfn() before unshare() return success?
In addition, are there multiple CPUs touching the same page? e.g. you can print out the cpu id inside unshare() and the

After this change, unshare() has a potential problem of deadlock for shr_lock and p2m_lock with different locking order.
Assume two CPUs do the following
CPU1: hvm_hap_nested_page_fault() => unshare() => p2m_change_type() (locking order: shr_lock, p2m_lock)
CPU2: p2m_teardown() => unshare() (locking order: p2m_lock, shr_lock)
When CPU1 grabs shr_lock and CPU2 grabs p2m_lock, they deadlock later.

 1.       mem_sharing_unshare_page() has the routine  called from gfn_to_mfn_unshare, which is called by gnttab_transfer  

Since no bug report on grant_table right now, so I think this is safe for now

Also  p2m_tear_down è mem_sharing_unshare_page() , its flag is MEM_SHARING_DESTROY_GFN, and won’t has the chance to

call set_shared_p2m_entry()


Of course, the p2m_teardown won't call set_shared_p2m_entry. But this does not change my argument that p2m_teardown() hold p2m_lock to wait on shr_lock. Actaully, after looking for a while, I rebut myself that the scenario of deadlock won't exist.
When p2m_teardown is called, the domain is dying in its last few steps (device, irq are released), and there is no way for hvm_hap_nested_page_fault() to happen on the memory of the dying domain. If this case is eliminated, then my patch should not have deadlock problem. Any comments?

3.       set_shared_p2m_entry() which call set_p2m_entry() is not in p2m_lock, and I found in other code set_p2m_entry is called in p2m_lock,

so here I think it is a problem


So I think at least set_p2m_entry should be put into p2m_lock.

I’ll do more investigation base on this.


Attachment: log.txt
Description: Text document

Xen-devel mailing list
<Prev in Thread] Current Thread [Next in Thread>