To: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Subject: Re: [Xen-devel] RE: mem_sharing: summarized problems when domain is dying
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Mon, 24 Jan 2011 14:08:01 +0000
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, tim.deegan@xxxxxxxxxx, juihaochiang@xxxxxxxxx
I think it would be best if each separate issue you're facing had its own
thread.  This looks like a Linux crash -- please include the kernel
version you're using, and whatever other information might be appropriate.

 -George

2011/1/24 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> Hi:
>
>        Another bug was found while testing memory sharing.
>        In this test I start 24 Linux HVMs, and each of them reboots through
> "xm reboot" every 30 minutes.
>        After several hours, some of the HVMs crash. All of the crashed HVMs
> stop during boot.
>        The bug still occurs even if I forbid page sharing by patching tapdisk
> so that xc_memshr_nominate_gref()
>        always returns failure.
>
>        No unusual log output was found.
>
>        I was able to dump the crash stack.
>        What could be happening?
>        Thanks.
>
> PID: 2307   TASK: ffff810014166100  CPU: 0   COMMAND: "setfont"
>  #0 [ffff8100123cd900] xen_panic_event at ffffffff88001d28
>  #1 [ffff8100123cd920] notifier_call_chain at ffffffff80066eaa
>  #2 [ffff8100123cd940] panic at ffffffff8009094a
>  #3 [ffff8100123cda30] oops_end at ffffffff80064fca
>  #4 [ffff8100123cda40] do_page_fault at ffffffff80066dc0
>  #5 [ffff8100123cdb30] error_exit at ffffffff8005dde9
>     [exception RIP: vgacon_do_font_op+363]
>     RIP: ffffffff800515e5  RSP: ffff8100123cdbe8  RFLAGS: 00010203
>     RAX: 0000000000000000  RBX: ffffffff804b3740  RCX: ffff8100000a03fc
>     RDX: 00000000000003fd  RSI: ffff810011cec000  RDI: ffffffff803244c4
>     RBP: ffff810011cec000   R8: d0d6999996000000   R9: 0000009090b0b0ff
>     R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000004
>     R13: 0000000000000001  R14: 0000000000000001  R15: 000000000000000e
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #6 [ffff8100123cdc20] vgacon_font_set at ffffffff8016bec5
>  #7 [ffff8100123cdc60] con_font_op at ffffffff801aa86b
>  #8 [ffff8100123cdcd0] vt_ioctl at ffffffff801a5af4
>  #9 [ffff8100123cdd70] tty_ioctl at ffffffff80038a2c
> #10 [ffff8100123cdeb0] do_ioctl at ffffffff800420d9
> #11 [ffff8100123cded0] vfs_ioctl at ffffffff800302ce
> #12 [ffff8100123cdf40] sys_ioctl at ffffffff8004c766
> #13 [ffff8100123cdf80] tracesys at ffffffff8005d28d (via system_call)
>     RIP: 00000039294cc557  RSP: 00007fff54c4aec8  RFLAGS: 00000246
>     RAX: ffffffffffffffda  RBX: ffffffff8005d28d  RCX: ffffffffffffffff
>     RDX: 00007fff54c4aee0  RSI: 0000000000004b72  RDI: 0000000000000003
>     RBP: 000000001d747ab0   R8: 0000000000000010   R9: 0000000000800000
>     R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000010
>     R13: 0000000000000200  R14: 0000000000000008  R15: 0000000000000008
>     ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
>
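
For reference, here is a minimal sketch of the kind of tapdisk-side stub
described above, which forbids page sharing by making every nomination
report failure.  The wrapper name (td_nominate_gref), the build-time knob,
and the exact libxc signature are assumptions rather than code from the
tree being tested:

/* Hypothetical tapdisk-side wrapper: when the test knob is defined,
 * pretend every nomination fails so the caller falls back to the
 * non-shared path.  This is only meant to rule page sharing in or out
 * as the cause of the guest crashes. */
#include <errno.h>
#include <xenctrl.h>

static int td_nominate_gref(xc_interface *xch, uint32_t domid,
                            grant_ref_t gref, uint64_t *handle)
{
#ifdef FORBID_PAGE_SHARING_FOR_TEST
    errno = ENOSYS;        /* report failure unconditionally */
    return -1;
#else
    return xc_memshr_nominate_gref(xch, domid, gref, handle);
#endif
}
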
>> Date: Fri, 21 Jan 2011 14:45:14 -0500
>> Subject: Re: mem_sharing: summarized problems when domain is dying
>> From: juihaochiang@xxxxxxxxx
>> To: Tim.Deegan@xxxxxxxxxx
>> CC: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>
>> Hi
>>
>> On Fri, Jan 21, 2011 at 11:19 AM, Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
>> wrote:
>> > Hi, Tim:
>> >
>> > From tinnycloud's results, here I summarize the current problems and
>> > findings in mem_sharing when a domain is dying.
>> > (1) When a domain is dying, alloc_domheap_page() and
>> > set_shared_p2m_entry() simply fail, so the shr_lock is not enough
>> > to ensure that the domain won't die in the middle of the mem_sharing code.
>> > As tinnycloud's code shows, would it be better to use
>> > rcu_lock_domain_by_id() before calling the above two functions?
>> >
>>
>> There seems to be no good locking to prevent a domain from changing its
>> is_dying state, so the unshare function can fail part-way through at
>> several points, e.g. alloc_domheap_page() and set_shared_p2m_entry().
>> If that's the case, we need to add some checks, and probably revert
>> whatever we have already done when is_dying changes in the middle.
>>
>> Any comments?
>>
>> Jui-Hao
>
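
Below is a rough sketch of the pattern being discussed: take an RCU
reference on the domain, re-check is_dying around each step that can fail,
and revert any partial work before bailing out.  The names mirror the
functions mentioned in this thread (rcu_lock_domain_by_id, shr_lock,
alloc_domheap_page, set_shared_p2m_entry), but this is illustrative
hypervisor-style code, not a patch against any particular tree:

/* Illustrative only: re-check d->is_dying around each fallible step of
 * an unshare-style operation and undo partial work before returning. */
#include <xen/sched.h>
#include <xen/mm.h>

static int unshare_example(domid_t domid, unsigned long gfn)
{
    struct domain *d = rcu_lock_domain_by_id(domid);
    struct page_info *page;
    int rc = -ESRCH;

    if ( d == NULL )
        return rc;

    shr_lock();                        /* existing sharing lock */

    if ( d->is_dying )                 /* nothing to do, bail out early */
        goto out;

    rc = -ENOMEM;
    page = alloc_domheap_page(d, 0);   /* may fail once d starts dying */
    if ( page == NULL )
        goto out;

    rc = set_shared_p2m_entry(d, gfn, page_to_mfn(page));
    if ( rc != 0 || d->is_dying )
    {
        /* The domain died (or the p2m update failed) in the middle:
         * revert what was done so far. */
        free_domheap_page(page);
        if ( rc == 0 )
            rc = -ESRCH;
        goto out;
    }

    rc = 0;
 out:
    shr_unlock();
    rcu_unlock_domain(d);
    return rc;
}
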

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel