xen-devel

[Xen-devel] RE: [PATCH] mem_sharing: fix race condition of nominate and unshare

To: <tim.deegan@xxxxxxxxxx>
Subject: [Xen-devel] RE: [PATCH] mem_sharing: fix race condition of nominate and unshare
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Thu, 20 Jan 2011 17:37:56 +0800
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, juihaochiang@xxxxxxxxx
Delivery-date: Thu, 20 Jan 2011 01:38:42 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <20110120091934.GG8286@xxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BLU157-w1861EFE53CB51FC710011FDAF10@xxxxxxx>, <AANLkTimOz_uauDEnu_XaPEgwD1EZJWEgOO1oiFccFNs1@xxxxxxxxxxxxxx>, <20110113092427.GJ5651@xxxxxxxxxxxxxxxxxxxxxxx>, <AANLkTinSga8xDkuH0BsqbhbBtvgwgbn=T0qmg9y9CeGr@xxxxxxxxxxxxxx>, <20110113155344.GN5651@xxxxxxxxxxxxxxxxxxxxxxx>, <BLU157-w507CBBB94539BFDB92B339DAF30@xxxxxxx>, <AANLkTimC9OYTiHYaeeytH5OCN-EF2v6L=KDVMwBjtB0z@xxxxxxxxxxxxxx>, <BLU157-w995512A939ADE678A6401DAF40@xxxxxxx>, <AANLkTikj6gJ2we+3FcfmqdeSkrvBFyj=p6R47UcmS3Rk@xxxxxxxxxxxxxx>, <BLU157-w352F69CD38F5FBFCA60477DAF90@xxxxxxx>, <20110120091934.GG8286@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
I'll do the check. Thanks.
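 
(A minimal sketch of that check, assuming it is dropped into mem_sharing_alloc_page() right where alloc_domheap_page() fails -- the printk wording is illustrative only, not from any patch:)
 
    /* Hypothetical debug output: dump the domain's page accounting, so an
     * over-allocation (tot_pages having reached max_pages) shows up in the
     * log when the shared-page allocation fails. */
    page = alloc_domheap_page(d, 0);
    if ( page == NULL )
        gdprintk(XENLOG_WARNING,
                 "shared page alloc failed: dom%d tot_pages=%u max_pages=%u\n",
                 d->domain_id, d->tot_pages, d->max_pages);
 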
Well, during the test I still hit two other failures:
 
1) When all domains are destroyed, the handle count in the hash table sometimes does not drop to 0.
I print the handle count; most of the time it is 0 after all domains are destroyed, but occasionally I see:
(XEN) ===>total handles 2 total gfns 2 next_handle: 713269
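 
(The line above comes from a debug counter added for this test, roughly like the sketch below. The shr_hash[] bucket walk reflects how the hash table looks in my tree and is an assumption -- adjust the names to the actual layout:)
 
    /* Debug sketch only: count the handles and gfn_info entries still left
     * in the sharing hash table after all guests are destroyed. */
    static void mem_sharing_count_leftovers(void)
    {
        unsigned int i, handles = 0, gfns = 0;
        shr_hash_entry_t *e;
        struct list_head *le;

        for ( i = 0; i < SHR_HASH_LENGTH; i++ )
            for ( e = shr_hash[i]; e != NULL; e = e->next )
            {
                handles++;
                list_for_each(le, &e->gfns)
                    gfns++;
            }

        printk("===>total handles %u total gfns %u next_handle: %lu\n",
               handles, gfns, (unsigned long)next_handle);
    }
 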
 
2) set_shared_p2m_entry failed:
 745     list_for_each_safe(le, te, &ce->gfns)
 746     {
 747         gfn = list_entry(le, struct gfn_info, list);
 748         /* Get the source page and type, this should never fail
 749          * because we are under shr lock, and got non-null se */
 750         BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));
 751         /* Move the gfn_info from ce list to se list */
 752         list_del(&gfn->list);
 753         d = get_domain_by_id(gfn->domain);
 754 //      mem_sharing_debug_gfn(d, gfn->gfn);
 755         BUG_ON(!d);
 756         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0);
 757         put_domain(d);
 758         list_add(&gfn->list, &se->gfns);
 759         put_page_and_type(cpage);
 760 //      mem_sharing_debug_gfn(d, gfn->gfn);  
 
 
(XEN) printk: 33 messages suppressed.
(XEN) p2m.c:2442:d0 set_mmio_p2m_entry: set_p2m_entry failed! mfn=0023dbb7
(XEN) Xen BUG at mem_sharing.c:756
(XEN) ----[ Xen-4.0.0  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c4801bfd90>] mem_sharing_share_pages+0x370/0x3d0
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff83040ed20000   rcx: 0000000000000092
(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48021eac4
(XEN) rbp: ffff8305a4bbe1b0   rsp: ffff82c48035fc58   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: 00000000fffffffb   r11: ffff82c4801318d0
(XEN) r12: ffff8305a4bbe1a0   r13: ffff8305a61d42a0   r14: ffff82f6047b76e0
(XEN) r15: ffff8304e5e918c8   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 00000005203fc000   cr2: 00000000027b8000
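 
(A possible debugging variant of the BUG_ON at mem_sharing.c:756, sketched below -- instrumentation only, not a fix. The message text is made up here, and mem_sharing_debug_gfn() is just the helper already commented out in the snippet above:)
 
    d = get_domain_by_id(gfn->domain);
    BUG_ON(!d);
    if ( set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0 )
    {
        /* Sketch: report the failing gfn instead of crashing the host, so
         * its p2m entry can be inspected afterwards. */
        gdprintk(XENLOG_ERR,
                 "set_shared_p2m_entry failed: dom%d gfn=%lx -> mfn=%lx\n",
                 d->domain_id, (unsigned long)gfn->gfn,
                 (unsigned long)mfn_x(se->mfn));
        mem_sharing_debug_gfn(d, gfn->gfn);
    }
    put_domain(d);
 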


 
 
 
> Date: Thu, 20 Jan 2011 09:19:34 +0000
> From: Tim.Deegan@xxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; juihaochiang@xxxxxxxxx
> Subject: Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
>
> At 07:19 +0000 on 20 Jan (1295507976), MaoXiaoyun wrote:
> > Hi:
> >
> > The latest BUG in mem_sharing_alloc_page from mem_sharing_unshare_page.
> > I printed heap info, which shows plenty memory left.
> > Could domain be NULL during unshare, or should it be locked by rcu_lock_domain_by_id?
> >
>
> 'd' probably isn't NULL; more likely is that the domain is not allowed
> to have any more memory. You should look at the values of d->max_pages
> and d->tot_pages when the failure happens.
>
> Cheers.
>
> Tim.
>
> > -----------code------------
> > 422 extern void pagealloc_info(unsigned char key);
> > 423 static struct page_info* mem_sharing_alloc_page(struct domain *d,
> > 424                                                 unsigned long gfn,
> > 425                                                 int must_succeed)
> > 426 {
> > 427     struct page_info* page;
> > 428     struct vcpu *v = current;
> > 429     mem_event_request_t req;
> > 430
> > 431     page = alloc_domheap_page(d, 0);
> > 432     if(page != NULL) return page;
> > 433
> > 434     memset(&req, 0, sizeof(req));
> > 435     if(must_succeed)
> > 436     {
> > 437         /* We do not support 'must_succeed' any more. External operations such
> > 438          * as grant table mappings may fail with OOM condition!
> > 439          */
> > 440         pagealloc_info('m');
> > 441         BUG();
> > 442     }
> >
> > -------------serial output-------
> > (XEN) Physical memory information:
> > (XEN) Xen heap: 0kB free
> > (XEN) heap[14]: 64480kB free
> > (XEN) heap[15]: 131072kB free
> > (XEN) heap[16]: 262144kB free
> > (XEN) heap[17]: 524288kB free
> > (XEN) heap[18]: 1048576kB free
> > (XEN) heap[19]: 1037128kB free
> > (XEN) heap[20]: 3035744kB free
> > (XEN) heap[21]: 2610292kB free
> > (XEN) heap[22]: 2866212kB free
> > (XEN) Dom heap: 11579936kB free
> > (XEN) Xen BUG at mem_sharing.c:441
> > (XEN) ----[ Xen-4.0.0 x86_64 debug=n Not tainted ]----
> > (XEN) CPU: 0
> > (XEN) RIP: e008:[<ffff82c4801c0531>] mem_sharing_unshare_page+0x681/0x790
> > (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> > (XEN) rax: 0000000000000000 rbx: ffff83040092d808 rcx: 0000000000000096
> > (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c48021eac4
> > (XEN) rbp: 0000000000000000 rsp: ffff82c48035f5e8 r8: 0000000000000001
> > (XEN) r9: 0000000000000001 r10: 00000000fffffff5 r11: 0000000000000008
> > (XEN) r12: ffff8305c61f3980 r13: ffff83040eff0000 r14: 000000000001610f
> > (XEN) r15: ffff82c48035f628 cr0: 000000008005003b cr4: 00000000000026f0
> > (XEN) cr3: 000000052bc4f000 cr2: ffff880120126e88
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c48035f5e8:
> > (XEN) ffff8305c61f3990 00018300bf2f0000 ffff82f604e6a4a0 000000002ab84078
> > (XEN) ffff83040092d7f0 00000000001b9c9c ffff8300bf2f0000 000000010eff0000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000d0000010f ffff8305447ec000 000000000001610f
> > (XEN) 0000000000273525 ffff82c48035f724 ffff830502c705a0 ffff82f602c89a00
> > (XEN) ffff83040eff0000 ffff82c48010bfa9 ffff830572c5dbf0 000000000029e07f
> > (XEN) 0000000000000000 ffff830572c5dbf0 000000008035fbe8 ffff82c48035f6f8
> > (XEN) 0000000100000002 ffff830572c5dbf0 ffff83063fc30000 ffff830572c5dbf0
> > (XEN) 0000035900000000 ffff88010d14bbe0 ffff880159e09000 00003f7e00000002
> > (XEN) ffffffffffff0032 ffff88010d14bbb0 ffff830438dfa920 0000000d8010a650
> > (XEN) 0000000000000100 ffff83063fc30000 ffff8305f9203730 ffffffffffffffea
> > (XEN) ffff88010d14bb70 0000000000000000 ffff88010d14bc10 ffff88010d14bbc0
> > (XEN) 0000000000000002 ffff82c48010da9b 0000000000000202 ffff82c48035fec8
> > (XEN) ffff82c48035f7c8 00000000801880af ffff83063fc30010 0000000000000000
> > (XEN) ffff82c400000008 ffff82c48035ff28 0000000000000000 ffff88010d14bbc0
> > (XEN) ffff880159e08000 0000000000000000 0000000000000000 00020000000002d7
> > (XEN) 00000000003f2b38 ffff8305b1f4b6b8 ffff8305b30f0000 ffff880159e09000
> > (XEN) 0000000000000000 0000000000000000 000200000000008a 00000000003ed1f9
> > (XEN) ffff83063fc26450 ffff8305b30f0000 ffff880159e0a000 0000000000000000
> > (XEN) 0000000000000000 00020000000001fa 000000000029e2ba ffff83063fc26fd0
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c4801c0531>] mem_sharing_unshare_page+0x681/0x790
> > (XEN) [<ffff82c48010bfa9>] gnttab_map_grant_ref+0xbf9/0xe30
> > (XEN) [<ffff82c48010da9b>] do_grant_table_op+0x14b/0x1080
> > (XEN) [<ffff82c48010fb44>] do_xen_version+0xb4/0x480
> > (XEN) [<ffff82c4801b8215>] set_p2m_entry+0x85/0xc0
> > (XEN) [<ffff82c4801bc92e>] set_shared_p2m_entry+0x1be/0x2f0
> > (XEN) [<ffff82c480121c4c>] xmem_pool_free+0x2c/0x310
> > (XEN) [<ffff82c4801bfaf8>] mem_sharing_share_pages+0xd8/0x3d0
> > (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
> > (XEN) [<ffff82c48011c519>] cpumask_raise_softirq+0x89/0xa0
> > (XEN) [<ffff82c480118351>] csched_vcpu_wake+0x101/0x1b0
> > (XEN) [<ffff82c48014717d>] vcpu_kick+0x1d/0x80
> > (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
> > (XEN) [<ffff82c48015a1d8>] get_page+0x28/0xf0
> > (XEN) [<ffff82c48015ed72>] do_update_descriptor+0x1d2/0x210
> > (XEN) [<ffff82c480113d7e>] do_multicall+0x14e/0x340
> > (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Xen BUG at mem_sharing.c:441
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Manual reset required ('noreboot' specified)
> >
> > > Date: Mon, 17 Jan 2011 17:02:02 +0800
> > > Subject: Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
> > > From: juihaochiang@xxxxxxxxx
> > > To: tinnycloud@xxxxxxxxxxx
> > > CC: xen-devel@xxxxxxxxxxxxxxxxxxx; tim.deegan@xxxxxxxxxx
> > >
> > > Hi, tinnycloud:
> > >
> > > Do you have xenpaging tools running properly?
> > > I haven't gone through that one, but it seems you have run out of memory.
> > > When this case happens, mem_sharing will request memory to the
> > > xenpaging daemon, which tends to page out and free some memory.
> > > Otherwise, the allocation would fail.
> > > Is this your scenario?
> > >
> > > Bests,
> > > Jui-Hao
> > >
> > > 2011/1/17 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> > > > Another failure on BUG() in mem_sharing_alloc_page()
> > > >
> > > >     memset(&req, 0, sizeof(req));
> > > >     if(must_succeed)
> > > >     {
> > > >         /* We do not support 'must_succeed' any more. External operations
> > > >          * such as grant table mappings may fail with OOM condition!
> > > >          */
> > > >         BUG();===================>bug here
> > > >     }
> > > >     else
> > > >     {
> > > >         /* All foreign attempts to unshare pages should be handled through
> > > >          * 'must_succeed' case. */
> > > >         ASSERT(v->domain->domain_id == d->domain_id);
> > > >         vcpu_pause_nosync(v);
> > > >         req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> > > >     }
> > > >
>
> --
> Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel