WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Re: [Xen-devel] [memory sharing] bug on get_page_and_type

To: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Mon, 31 Jan 2011 12:47:11 +0000
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, tim.deegan@xxxxxxxxxx, juihaochiang@xxxxxxxxx
At first glance, it does look like a race, of the kind that would
normally be prevented by grabbing the p2m lock.  I'd have to take a
closer look to be sure, but I won't be able to for at least another day
or two.  Maybe Tim can comment.

 -George

2011/1/31 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> Hi George:
>
>       Thanks for your help. I think I really need to learn hg a bit.
>       While looking into the memory sharing code, one point of
> confusion has really been bothering me.
>       I hope you can enlighten me.
>
>       There is *check action code* that does steps like the following:
>       1) use gfn_to_mfn to get the p2m_type_t p2mt
>       2) check the type of p2mt
>       3) take some action if the check in step 2 fails
>
>       Meanwhile, when *nominating a page* to be shared, the code does
> these steps:
>       a) check whether the page can be shared
>       b) call page_make_sharable to make it sharable
>       c) if b) succeeds, call p2m_change_type to change the page type to
> p2m_ram_shared
>
>       My confusion is: if the *nominate page* code reaches step b, and
> before it gets to step c the *check action code* reaches its step 2, the
> check will find that the page type is not p2m_ram_shared and go on to
> step 3. Only later does the *nominate page* code reach step c.
>
> Take xen/common/memory.c for example: line 179 is much like step 2, and
> if that check fails the code falls through to line 196, which clears the
> _PGC_allocated flag. So this might mean the *nominate page* code changes
> the page type to p2m_ram_shared after the page has actually already been
> freed.
>
> Am I right?
>
>
> 155 int guest_remove_page(struct domain *d, unsigned long gmfn)
> 156 {
> 157     struct page_info *page;
> 158 #ifdef CONFIG_X86
> 159     p2m_type_t p2mt;
> 160 #endif
> 161     unsigned long mfn;
> 162
> 163 #ifdef CONFIG_X86
> 164     mfn = mfn_x(gfn_to_mfn(d, gmfn, &p2mt));
> 165 #else
> 166     mfn = gmfn_to_mfn(d, gmfn);
> 167 #endif
> 168     if ( unlikely(!mfn_valid(mfn)) )
> 169     {
> 170         gdprintk(XENLOG_INFO, "Domain %u page number %lx invalid\n",
> 171                 d->domain_id, gmfn);
> 172         return 0;
> 173     }
> 174
> 175     page = mfn_to_page(mfn);
> 176 #ifdef CONFIG_X86
> 177     /* If gmfn is shared, just drop the guest reference (which may or
> may not
> 178      * free the page) */
> 179     if(p2m_is_shared(p2mt))
> 180     {
> 181         put_page_and_type(page);
> 182         guest_physmap_remove_page(d, gmfn, mfn, 0);
> 183         return 1;
> 184     }
> 185
> 186 #endif /* CONFIG_X86 */
> 187     if ( unlikely(!get_page(page, d)) )
> 188     {
> 189         gdprintk(XENLOG_INFO, "Bad page free for domain %u\n",
> d->domain_id);
> 190         return 0;
> 191     }
> 192
> 193     if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
> 194         put_page_and_type(page);
> 195
> 196     if ( test_and_clear_bit(_PGC_allocated, &page->count_info) )
> 197          put_page(page);
> 198
> 199     guest_physmap_remove_page(d, gmfn, mfn, 0);
> 200
> 201     put_page(page);
> 202
> 203     return 1;
> 204 }
>
>> Date: Mon, 31 Jan 2011 10:49:37 +0000
>> Subject: Re: [Xen-devel] [memory sharing] bug on get_page_and_type
>> From: George.Dunlap@xxxxxxxxxxxxx
>> To: tinnycloud@xxxxxxxxxxx
>> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; tim.deegan@xxxxxxxxxx;
>> juihaochiang@xxxxxxxxx
>>
>> Xiaoyun,
>>
>> Thanks for all of your work getting page sharing working. When
>> submitting patches, please break them down into individual chunks,
>> each of which does one thing. Each patch should also include a
>> comment saying what the patch does and why, and a Signed-off-by line
>> indicating that you certify that the copyright holder (possibly you)
>> is placing the code under the GPL.
>>
>> Using mercurial queues:
>> http://mercurial.selenic.com/wiki/MqExtension
>> and the mercurial patchbomb extension:
>> http://mercurial.selenic.com/wiki/PatchbombExtension
>> are particularly handy for this process.
>>
>> Quick comment: The ept-locking part of the patch is going to be
>> NACK-ed, as it will cause circular locking dependencies with the
>> hap_lock. c/s 22526:7a5ee380 has an explanation of the circular
>> dependency, and a fix which doesn't introduce a circular dependency.
>> (It may need to be adapted a bit to apply to 4.0-testing)
>>
>> -George
>>
>> On Mon, Jan 31, 2011 at 10:13 AM, tinnycloud <tinnycloud@xxxxxxxxxxx>
>> wrote:
>> > Hi:
>> >
>> >
>> > Attached is the whole patch suite for the latest xen-4.0-testing,
>> > changeset 21443.
>> >
>> > It comes from George, Tim and JuiHao, also, has some extra debug info.
>> >
>> >
>> >
>> > On most occasions, memory sharing works fine, but a bug still exists.
>> >
>> >        I’ve been tracing this problem for a while.
>> >
>> >        There is a bug on get_page_and_type() in
>> > mem_sharing_share_pages()
>> >
>> >
>> >
>> >
>> > --------------------------------------------mem_sharing_share_pages()-----------------------
>> >
>> > 789         if(!get_page_and_type(spage, dom_cow, PGT_shared_page)){
>> >
>> > 790             mem_sharing_debug_gfn(cd, gfn->gfn);
>> >
>> > 791             mem_sharing_debug_gfn(sd, sgfn->gfn);
>> >
>> > 792             printk("c->dying %d s->dying %d spage %p se->mfn %lx\n",
>> > cd->is_dying, sd->is_dying, spage, se->mfn);
>> >
>> > 793             printk("Debug page: MFN=%lx is ci=%lx, ti=%lx,
>> > owner_id=%d\n",
>> >
>> > 794                     mfn_x(page_to_mfn(spage)),
>> >
>> > 795                     spage->count_info,
>> >
>> >  796                     spage->u.inuse.type_info,
>> >
>> > 797                     page_get_owner(spage)->domain_id);
>> >
>> > 798             BUG();
>> >
>> > 799         }
>> >
>> >
>> >
>> >
>> >
>> > The panic log below contains the debug output from lines 790-798.
>> >
>> > We can see ci=180000000000000, which is PGC_state_free,
>> >
>> > so it looks like a shared page has been freed unexpectedly.
>> >
>> >
>> >
>> >
>> >
>> > (XEN) teardown 64
>> >
>> > (XEN) teardown 66
>> >
>> > blktap_sysfs_destroy
>> >
>> > blktap_sysfs_create: adding attributes for dev ffff880109d0c200
>> >
>> > blktap_sysfs_destroy
>> >
>> > __ratelimit: 1 callbacks suppressed
>> >
>> > blktap_sysfs_destroy
>> >
>> > (XEN) Debug for domain=83, gfn=1e8dc, Debug page: MFN=1306dc is
>> > ci=8000000000000005, ti=8400000000000001, owner_id=32755
>> >
>> > (XEN) Debug for domain=79, gfn=1dc95, Invalid MFN=ffffffffffffffff
>> >
>> > (XEN) c->dying 0 s->dying 0 spage ffff82f60a0f12a0 se->mfn 507895
>> >
>> > (XEN) Debug page: MFN=507895 is ci= 180000000000000,
>> > ti=8400000000000001,
>> > owner_id=32755
>> >
>> > (XEN) Xen BUG at mem_sharing.c:798
>> >
>> > (XEN) ----[ Xen-4.0.2-rc2-pre  x86_64  debug=n  Not tainted ]----
>> >
>> > (XEN) CPU:    0
>> >
>> > (XEN) RIP:    e008:[<ffff82c4801c3760>]
>> > mem_sharing_share_pages+0x5b0/0x5d0
>> >
>> > (XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
>> >
>> > (XEN) rax: 0000000000000000   rbx: ffff83044ef76000   rcx:
>> > 0000000000000092
>> >
>> > (XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi:
>> > ffff82c4802237c4
>> >
>> > (XEN) rbp: ffff83030d99e310   rsp: ffff82c48035fc48    r8:
>> > 0000000000000001
>> >
>> > (XEN) r9:  0000000000000001   r10: 00000000fffffff8   r11:
>> > 0000000000000005
>> >
>> > (XEN) r12: ffff83050b558000   r13: ffff830626ec5740   r14:
>> > 0000000000347967
>> >
>> > (XEN) r15: ffff83030d99e068   cr0: 0000000080050033   cr4:
>> > 00000000000026f0
>> >
>> > (XEN) cr3: 000000010e642000   cr2: 00002abdc4809000
>> >
>> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
>> >
>> > (XEN) Xen stack trace from rsp=ffff82c48035fc48:
>> >
>> > (XEN)    000000000001e8dc ffff83030d99e068 0000000000507895
>> > ffff83030d99e050
>> >
>> > (XEN)    ffff82f60a0f12a0 ffff82f60260db80 ffff83030d99e300
>> > ffff830626afbbd0
>> >
>> > (XEN)    0000000080372980 ffff82c48035fe38 ffff83023febe000
>> > 00000000008f7000
>> >
>> > (XEN)    0000000000305000 0000000000000006 0000000000000006
>> > ffff82c4801c44a4
>> >
>> > (XEN)    ffff82c480258188 ffff82c48011cb89 000000000001e8dc
>> > fffffffffffffff3
>> >
>> > (XEN)    ffff82c48035fe28 ffff82c480148503 000003d932971281
>> > 0000000000000080
>> >
>> > (XEN)    000003d9329611d6 ffff82c48018aedf ffff82c4803903a8
>> > ffff82c48035fd38
>> >
>> > (XEN)    ffff82c480390380 ffff82c48035fe28 0000000000000002
>> > 0000000000000000
>> >
>> > (XEN)    0000000000000002 0000000000000000 ffff83023febe000
>> > ffff83023ff80080
>> >
>> > (XEN)    ffff82c48035fe58 0000000000000000 ffff82c48022ca80
>> > 0000000000000000
>> >
>> > (XEN)    0000000000000002 ffff82c48018a452 0000000000000000
>> > ffff82c48016f6bb
>> >
>> > (XEN)    ffff82c48035fe58 ffff82c4801538ea 000000023ab79067
>> > fffffffffffffff3
>> >
>> > (XEN)    ffff82c48035fe28 00000000008f7000 0000000000305000
>> > 0000000000000006
>> >
>> > (XEN)    0000000000000006 ffff82c4801042b3 0000000000000000
>> > ffff82c48010d2a5
>> >
>> > (XEN)    ffff830477fb51e8 0000000000000000 00007ff000000200
>> > ffff8300bf76e000
>> >
>> > (XEN)    0000000600000039 0000000000000000 00007fc09d744003
>> > 0000000000325258
>> >
>> > (XEN)    0000000000347967 ffffffffff600429 000000004d463b17
>> > 0000000000062fe2
>> >
>> > (XEN)    0000000000000000 00007fc09d744070 00007fc09d744000
>> > 00007fffcfec0d20
>> >
>> > (XEN)    00007fc09d744078 0000000000430fd8 00007fffcfec0d88
>> > 00000000019f1ca8
>> >
>> > (XEN)    0000f05c00000000 00007fc09f4c2718 0000000000000000
>> > 0000000000000246
>> >
>> > (XEN) Xen call trace:
>> >
>> > (XEN)    [<ffff82c4801c3760>] mem_sharing_share_pages+0x5b0/0x5d0
>> >
>> > (XEN)    [<ffff82c4801c44a4>] mem_sharing_domctl+0xe4/0x130
>> >
>> > (XEN)    [<ffff82c48011cb89>] cpumask_raise_softirq+0x89/0xa0
>> >
>> > (XEN)    [<ffff82c480148503>] arch_do_domctl+0x14f3/0x22d0
>> >
>> > (XEN)    [<ffff82c48018aedf>] handle_hpet_broadcast+0x16f/0x1d0
>> >
>> > (XEN)    [<ffff82c48018a452>] hpet_legacy_irq_tick+0x42/0x50
>> >
>> > (XEN)    [<ffff82c48016f6bb>] timer_interrupt+0xb/0x130
>> >
>> > (XEN)    [<ffff82c4801538ea>] ack_edge_ioapic_irq+0x2a/0x70
>> >
>> > (XEN)    [<ffff82c4801042b3>] do_domctl+0x163/0xfe0
>> >
>> > (XEN)    [<ffff82c48010d2a5>] do_grant_table_op+0x75/0x1ad0
>> >
>> > (XEN)    [<ffff82c4801e7169>] syscall_enter+0xa9/0xae
>> >
>> > (XEN)
>> >
>> > (XEN)
>> >
>> > (XEN) ****************************************
>> >
>> > (XEN) Panic on CPU 0:
>> >
>> > (XEN) Xen BUG at mem_sharing.c:798
>> >
>> > (XEN) ****************************************
>> >
>> > (XEN)
>> >
>> > (XEN) Manual reset required ('noreboot' specified)
>> >
>> >
>> >
>> >
>> >
>
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
