Very likely. If you look in xen/arch/x86/mm/p2m.c, the two functions
which check a page to see if it can be reclaimed are
"p2m_pod_zero_check*()". A little ways into each function there's a
giant "if()" which has all of the conditions for reclaiming a page,
starting with p2m_is_ram(). The easiest way to fix it is to add
p2m_is_shared() to that "if" statement.
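
Something like this (a sketch, not a literal patch: the exact shape of
that "if" differs between versions, "type" stands for whatever variable
holds the p2m type of the candidate page, and the elided checks stand
for the count_info/type_info tests that are already there):

    /* p2m_pod_zero_check_superpage() / p2m_pod_zero_check() */
    if ( p2m_is_ram(type)
         && !p2m_is_shared(type)   /* new: never reclaim a shared page;
                                    * its handle lives where the list
                                    * field would go */
         /* ... existing count_info / type_info checks ... */ )
    {
        /* page is a candidate for the PoD cache */
    }
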
-George
2011/1/19 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> Hi George:
>
> I am working on the Xen mem_sharing code, and I think the bug below is
> related to POD.
> (Tests show that when POD is enabled the bug is hit easily; when it is
> disabled, the bug does not occur.)
>
> As I understand it, when a domU starts with POD it gets its memory from
> the POD cache, and in some situations POD will scan for zero pages to
> reuse (linking the page into the POD cache page list). From the
> page_info definition, the list field and the sharing handle occupy the
> same position. I think that when reusing a page, POD does not check the
> page type, so if it is a shared page it can still be put into the POD
> cache, and thus its handle gets overwritten (see the sketch below).
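>
> By "same position" I mean the first union in struct page_info
> (xen/include/asm-x86/mm.h); roughly, and from memory:
>
>     struct page_info
>     {
>         union {
>             /* Each frame can be threaded onto a doubly-linked list. */
>             struct page_list_entry list;
>             /* For shared/sharable pages, the sharing handle. */
>             uint64_t shr_handle;
>             /* ... other members of this union ... */
>         };
>         /* ... count_info, type_info, etc. ... */
>     };
>
> So linking a shared page onto the POD cache page list writes over
> exactly the bytes that hold shr_handle.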
>
> So maybe we need to check the page type before putting a page into the
> cache. What's your opinion?
> Thanks.
>
>>--------------------------------------------------------------------------------
>>From: tinnycloud@xxxxxxxxxxx
>>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>CC: tim.deegan@xxxxxxxxxx
>>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and
>> unshare
>>Date: Tue, 18 Jan 2011 20:05:16 +0800
>>
>>Hi:
>>
>> It later turned out to be caused by the patch code below (I am using
>> blktap2). The handle returned from here later becomes ch in
>> mem_sharing_share_pages(), and mem_sharing_share_pages() then ends up
>> with ch == sh, which causes the problem.
>>
>>+    /* Return the handle if the page is already shared */
>>+    page = mfn_to_page(mfn);
>>+    if (p2m_is_shared(p2mt)) {
>>+        *phandle = page->shr_handle;
>>+        ret = 0;
>>+        goto out;
>>+    }
>>+
>>
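>>To make the failure mode concrete, the sequence seems to be roughly as
>>follows (sketch, not the literal code):
>>
>>    /* 1. nominate(gfn) hits the hunk above for an already-shared page
>>     *    and hands back the existing handle;
>>     * 2. the tools then call mem_sharing_share_pages(sh, ch) with
>>     *    ch == sh;
>>     * 3. share_pages() merges ch's gfn list into sh's and then deletes
>>     *    ch's hash entry, which is the same entry, so the handle that
>>     *    those gfns still reference is gone;
>>     * 4. the next mem_sharing_unshare_page() looks the handle up, gets
>>     *    a NULL hash_entry and faults.
>>     *
>>     * A possible (untested) guard, before share_pages() touches the
>>     * hash table:
>>     */
>>    if ( sh == ch )
>>        return 0;   /* both handles already name the same shared page */
>>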
>>But after I removed that code, the tests still failed, and this
>> handle's value does not make sense:
>>
>>
>>(XEN) ===>total handles 17834 total gfns 255853
>>(XEN) handle 13856642536914634
>>(XEN) Debug for domain=1, gfn=19fed, Debug page: MFN=349c0a is ci=8000000000000008, ti=8400000000000007, owner_id=32755
>>(XEN) ----[ Xen-4.0.0 x86_64 debug=n Not tainted ]----
>>(XEN) CPU: 15
>>(XEN) RIP: e008:[<ffff82c4801bff4b>] mem_sharing_unshare_page+0x19b/0x720
>>(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
>>(XEN) rax: 0000000000000000 rbx: ffff83063fc67f28 rcx: 0000000000000092
>>(XEN) rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c48021e9c4
>>(XEN) rbp: ffff830440000000 rsp: ffff83063fc67c48 r8: 0000000000000001
>>(XEN) r9: 0000000000000000 r10: 00000000fffffff8 r11: 0000000000000005
>>(XEN) r12: 0000000000019fed r13: 0000000000000000 r14: 0000000000000000
>>(XEN) r15: ffff82f606938140 cr0: 000000008005003b cr4: 00000000000026f0
>>(XEN) cr3: 000000055513c000 cr2: 0000000000000018
>>(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
>>(XEN) Xen stack trace from rsp=ffff83063fc67c48:
>>(XEN) 02c5f6c8b70fed66 39ef64058b487674 ffff82c4801a6082 0000000000000000
>>(XEN) 00313a8b00313eca 0000000000000001 0000000000000009 ffff830440000000
>>(XEN) ffff83063fc67cb8 ffff82c4801df6f9 0000000000000040 ffff83063fc67d04
>>(XEN) 0000000000019fed 0000000d000001ed ffff83055458d000 ffff83063fc67f28
>>(XEN) 0000000000019fed 0000000000349c0a 0000000000000030 ffff83063fc67f28
>>(XEN) 0000000000000030 ffff82c48019baa6 ffff82c4802519c0 0000000d8016838e
>>(XEN) 0000000000000000 00000000000001aa ffff8300bf554000 ffff82c4801b3864
>>(XEN) ffff830440000348 ffff8300bf554000 ffff8300bf5557f0 ffff8300bf5557e8
>>(XEN) 00000032027b81f2 ffff82c48026f080 ffff82c4801a9337 ffff8300bf448000
>>(XEN) ffff8300bf554000 ffff830000000000 0000000019fed000 ffff8300bf2f2000
>>(XEN) ffff82c48019985d 0000000000000080 ffff8300bf554000 0000000000019fed
>>(XEN) ffff82c4801b08ba 000000000001e000 ffff82c48014931f ffff8305570c6d50
>>(XEN) ffff82c480251080 00000032027b81f2 ffff8305570c6d50 ffff83052f3e2200
>>(XEN) 0000000f027b7de0 ffff82c48011e07a 000000000000000f ffff82c48026f0a0
>>(XEN) 0000000000000082 0000000000000000 0000000000000000 0000000000009e44
>>(XEN) ffff8300bf554000 ffff8300bf2f2000 ffff82c48011e07a 000000000000000f
>>(XEN) ffff8300bf555760 0000000000000292 ffff82c48011afca 00000032028a8fc0
>>(XEN) 0000000000000292 ffff82c4801a93c3 00000000000000ef ffff8300bf554000
>>(XEN) ffff8300bf554000 ffff8300bf5557e8 ffff82c4801a6082 ffff8300bf554000
>>(XEN) 0000000000000000 ffff82c4801a0cc8 ffff8300bf554000 ffff8300bf554000
>>(XEN) Xen call trace:
>>(XEN) [<ffff82c4801bff4b>] mem_sharing_unshare_page+0x19b/0x720
>>(XEN) [<ffff82c4801a6082>] vlapic_has_pending_irq+0x42/0x70
>>(XEN) [<ffff82c4801df6f9>] ept_get_entry+0xa9/0x1c0
>>(XEN) [<ffff82c48019baa6>] hvm_hap_nested_page_fault+0xd6/0x190
>>(XEN) [<ffff82c4801b3864>] vmx_vmexit_handler+0x304/0x1a90
>>(XEN) [<ffff82c4801a9337>] pt_restore_timer+0x57/0xb0
>>(XEN) [<ffff82c48019985d>] hvm_do_resume+0x1d/0x130
>>(XEN) [<ffff82c4801b08ba>] vmx_do_resume+0x11a/0x1c0
>>(XEN) [<ffff82c48014931f>] context_switch+0x76f/0xf00
>>(XEN) [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>>(XEN) [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>>(XEN) [<ffff82c48011afca>] schedule+0x1ea/0x500
>>(XEN) [<ffff82c4801a93c3>] pt_update_irq+0x33/0x1e0
>>(XEN) [<ffff82c4801a6082>] vlapic_has_pending_irq+0x42/0x70
>>(XEN) [<ffff82c4801a0cc8>] hvm_vcpu_has_pending_irq+0x88/0xa0
>>(XEN) [<ffff82c4801b267b>] vmx_vmenter_helper+0x5b/0x150
>>(XEN) [<ffff82c4801adaa3>] vmx_asm_do_vmentry+0x0/0xdd
>>(XEN)
>>(XEN) Pagetable walk from 0000000000000018:
>>(XEN) L4[0x000] = 0000000000000000 ffffffffffffffff
>>(XEN)
>>(XEN) ****************************************
>>(XEN) Panic on CPU 15:
>>(XEN) FATAL PAGE FAULT
>>(XEN) [error_code=0000]
>>(XEN) Faulting linear address: 0000000000000018
>>(XEN) ****************************************
>>(XEN)
>>(XEN) Manual reset required ('noreboot' specified)
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------------------------------------
>>>From: tinnycloud@xxxxxxxxxxx
>>>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>>CC: tim.deegan@xxxxxxxxxx
>>>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and
>>> unshare
>>>Date: Tue, 18 Jan 2011 17:42:32 +0800
>>
>>>Hi Tim & Jui-Hao:
>>
>>> When I use a Linux HVM guest instead of a Windows HVM guest, more bugs show up.
>>
>>> I only start one VM, and when I destroy it, Xen crashes in
>>> mem_sharing_unshare_page(), where at line 709 hash_entry is NULL.
>>> Later I found that the handle had been removed in
>>> mem_sharing_share_pages(); please refer to the logs below.
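>>>
>>> For what it's worth, the faulting address 0000000000000018 is what a
>>> dereference of a NULL hash_entry plus a small field offset would look
>>> like. A defensive check at that point would only turn the crash into
>>> an error (the real bug is that the handle goes away), but purely as an
>>> illustration, something like:
>>>
>>>     /* sketch only; hash_entry/handle/gfn named as in
>>>      * mem_sharing_unshare_page() */
>>>     if ( hash_entry == NULL )
>>>     {
>>>         gdprintk(XENLOG_ERR, "unshare: no hash entry for handle %lx "
>>>                  "(gfn %lx)\n", (unsigned long)handle, gfn);
>>>         return -1;   /* error convention here is a guess */
>>>     }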
>>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel