WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-ia64-devel


To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Subject: [PATCH] Fix libxc and pm_timer (Was: [Xen-ia64-devel] Maybe doman_destroy() was not called?)
From: Masaki Kanno <kanno.masaki@xxxxxxxxxxxxxx>
Date: Fri, 24 Aug 2007 01:18:37 +0900
Delivery-date: Thu, 23 Aug 2007 09:19:20 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <47C7E38A180B10kanno.masaki@xxxxxxxxxxxxxx>
List-help: <mailto:xen-ia64-devel-request@lists.xensource.com?subject=help>
List-id: Discussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>
List-post: <mailto:xen-ia64-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ia64-devel>, <mailto:xen-ia64-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ia64-devel>, <mailto:xen-ia64-devel-request@lists.xensource.com?subject=unsubscribe>
References: <47C7E38A180B10kanno.masaki@xxxxxxxxxxxxxx>
Sender: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx

Tue, 21 Aug 2007 09:27:45 +0900, Masaki Kanno wrote:

>Hi all,
>
>I tested the xm create command with the latest xen-ia64-unstable and the 
>attached patch.  The attached patch intentionally causes a contiguous 
>memory shortage in VHPT allocation for an HVM domain.  With this test, 
>I wanted to confirm that domain resources are released correctly 
>when HVM domain creation fails, but I could not confirm it.  
>domain_destroy() did not seem to be called. 
>The following messages are the result of the test.  A different RID 
>range was allocated each time I created an HVM domain. 
>Where do you think the bug is hiding? 
>
> (XEN) domain.c:546: arch_domain_create:546 domain 1 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002fd350000 hash_size 512
> (XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
> (XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
> (XEN) vpd base: 0xf000000007be0000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f6f8c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000004109380: rid=c0000-100000 mp_rid=3000
> (XEN) domain.c:583: arch_domain_create: domain=f000000004109380
> (XEN) vpd base: 0xf000000007b90000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
> (XEN) domain.c:546: arch_domain_create:546 domain 3 pervcpu_vhpt 1
> (XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
> (XEN) tlb_track.c:115: hash 0xf0000002f676c000 hash_size 512
> (XEN) regionreg.c:193: ### domain f000000007bf1380: rid=100000-140000 mp_rid=4000
> (XEN) domain.c:583: arch_domain_create: domain=f000000007bf1380
> (XEN) vpd base: 0xf000000007b50000, vpd size:65536
> (XEN) No enough contiguous memory(16384KB) for init_domain_vhpt
>

Hi,

I found two bugs behind this problem. 

Bug.1:
 copy_from_GFW_to_nvram() in libxc does not call munmap() when the 
 NVRAM data is invalid.  It also skips free() and close() on that path. 
 Bug.1 is fixed by munmap_nvram_page.patch. 
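
For readers who do not have the patch at hand, here is a minimal sketch 
of the kind of error path Bug.1 is about.  It is not the code in 
munmap_nvram_page.patch and not the real copy_from_GFW_to_nvram(); 
example_copy_page(), example_data_is_valid() and EXAMPLE_MAGIC are 
hypothetical names made up for illustration.  It only shows one common 
way to make sure munmap(), free() and close() run on every exit, 
including the "data invalid" one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical validity rule: the page must start with this magic value. */
#define EXAMPLE_MAGIC 0x4d41524eU

static int example_data_is_valid(const void *page)
{
    uint32_t magic;

    memcpy(&magic, page, sizeof(magic));
    return magic == EXAMPLE_MAGIC;
}

/*
 * Copy len bytes from the file at path into out.
 * Returns 0 on success and -1 on any failure.  Every exit goes through
 * the "out" label, so the mapping, the buffer and the descriptor are
 * released on the invalid-data path as well.
 */
int example_copy_page(const char *path, void *out, size_t len)
{
    int rc = -1;
    int fd = -1;
    void *map = MAP_FAILED;
    char *scratch = NULL;

    assert(path != NULL && out != NULL && len >= sizeof(uint32_t));

    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto out;

    scratch = malloc(len);      /* stand-in for a temporary buffer */
    if (scratch == NULL)
        goto out;

    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    if (!example_data_is_valid(map))
        goto out;               /* the kind of exit that used to leak */

    memcpy(scratch, map, len);
    memcpy(out, scratch, len);
    rc = 0;

out:
    if (map != MAP_FAILED)
        munmap(map, len);
    free(scratch);              /* free(NULL) is a no-op */
    if (fd >= 0)
        close(fd);
    return rc;
}
```

A caller only has to check the return value; whether the data was 
invalid or an earlier step failed, nothing stays mapped or open. 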

I ran the test again after fixing Bug.1, but the hypervisor panicked. 
The following messages are the result of that test. 

(XEN) domain.c:546: arch_domain_create:546 domain 2 pervcpu_vhpt 1
(XEN) tlb_track.c:69: allocated 256 num_entries 256 num_free 256
(XEN) tlb_track.c:115: hash 0xf0000002fad00000 hash_size 512
(XEN) regionreg.c:193: ### domain f0000000040fc080: rid=80000-c0000 mp_rid=2000
(XEN) domain.c:583: arch_domain_create: domain=f0000000040fc080
(XEN) *** xen_handle_domain_access: exception table lookup failed, iip=0xf00000000403f530, addr=0x0, spinning...
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN) 
(XEN) CPU 0
(XEN) psr : 0000101008226018 ifs : 800000000000058d ip  : [<f00000000403f530>]
(XEN) ip is at timer_softirq_action+0x170/0x2e0
(XEN) unat: 0000000000000000 pfs : 000000000000058d rsc : 0000000000000003
(XEN) rnat: 0000000000004000 bsps: f000000007c47e20 pr  : 00000000006a9969
(XEN) ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0  : f00000000403f4f0 b6  : f000000004038b80 b7  : a000000100018570
(XEN) f6  : 1003e000001b932157960 f7  : 1003e0000000281bd3682
(XEN) f8  : 000000000000000000000 f9  : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1  : f00000000438ca40 r2  : 0000007da3766757 r3  : f000000007c47fe8
(XEN) r8  : 0000000000000001 r9  : 0000000000000000 r10 : 0000000000000000
(XEN) r11 : 0009804c0270033f r12 : f000000007c47e00 r13 : f000000007c40000
(XEN) r14 : 0000000000000000 r15 : f0000000040fc9b0 r16 : 0000000000000001
(XEN) r17 : f000000007ceaf18 r18 : 0000000000000002 r19 : 0000000000000001
(XEN) r20 : f000000007ceb508 r21 : f0000000040fc9b8 r22 : 0000000000000001
(XEN) r23 : 0000000000000001 r24 : f000000007ceaf18 r25 : f000000007c47e28
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000000 r31 : f000000004400100
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c41668
(XEN)  [<f000000004087640>] panic_domain+0x120/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) domain_crash_sync called from xenmisc.c:152
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) d 0xf000000007c5c080 domid 0
(XEN) vcpu 0xf000000007c40000 vcpu 0
(XEN) 
(XEN) CPU 0
(XEN) psr : 00001011085a6010 ifs : 8000000000000307 ip  : [<a0000001000a6540>]
(XEN) ip is at ???
(XEN) unat: 0000000000000000 pfs : 400000000000038a rsc : 0000000000000007
(XEN) rnat: 0000000000000000 bsps: e000000162a90f70 pr  : 00000000006a9a59
(XEN) ldrs: 0000000002300000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0  : a0000001000a6960 b6  : a000000100018610 b7  : a000000100018570
(XEN) f6  : 000000000000000000000 f7  : 000000000000000000000
(XEN) f8  : 000000000000000000000 f9  : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1  : a0000001011225d0 r2  : e0000001781f3154 r3  : e000000164d58198
(XEN) r8  : e000000164cc8198 r9  : e000000164cc8018 r10 : 0000000000000000
(XEN) r11 : 0000000000000000 r12 : e000000162a97df0 r13 : e000000162a90000
(XEN) r14 : 0000000000000000 r15 : e000000164cc8dd0 r16 : 0000000000001000
(XEN) r17 : e000000164d58dd0 r18 : e000000164cc8da8 r19 : 0000000000000000
(XEN) r20 : e0000001781f3138 r21 : 0000000000000018 r22 : e000000162a90f70
(XEN) r23 : 0000000000000001 r24 : 0000000000000000 r25 : 0000000000000000
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
(XEN) r29 : 0000000000000000 r30 : 0000000000000018 r31 : 400000000000038a
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c416b8
(XEN)  [<f000000004017300>] __domain_crash+0x100/0x140
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41690
(XEN)  [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41668
(XEN)  [<f000000004087680>] panic_domain+0x160/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) 
(XEN) Call Trace:
(XEN)  [<f0000000040af150>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007c478b0 bsp=f000000007c416b8
(XEN)  [<f000000004017310>] __domain_crash+0x110/0x140
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41690
(XEN)  [<f000000004017380>] __domain_crash_synchronous+0x40/0xf0
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41668
(XEN)  [<f000000004087680>] panic_domain+0x160/0x170
(XEN)                                 sp=f000000007c47a80 bsp=f000000007c41600
(XEN)  [<f00000000407ada0>] ia64_do_page_fault+0x6b0/0x6c0
(XEN)                                 sp=f000000007c47bc0 bsp=f000000007c41568
(XEN)  [<f0000000040a7f40>] ia64_leave_kernel+0x0/0x300
(XEN)                                 sp=f000000007c47c00 bsp=f000000007c41568
(XEN)  [<f00000000403f530>] timer_softirq_action+0x170/0x2e0
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41500
(XEN)  [<f00000000403ca30>] do_softirq+0x170/0x220
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN)  [<f0000000040a7f60>] ia64_leave_kernel+0x20/0x300
(XEN)                                 sp=f000000007c47e00 bsp=f000000007c41480
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.


Bug.2:
 The domain resource release path freed the domain structure without 
 stopping (or killing) the PM timer. 
 The VMX flag of VCPU#0 was not set when VHPT allocation for VCPU#0 
 failed.  For this reason, domain_relinquish_resources() did not 
 call vmx_relinquish_guest_resources(), but the domain structure 
 was freed anyway.  As a result, timer_softirq_action() could no 
 longer find the callback function for the still-pending PM timer. 
 Bug.2 is fixed by kill_pm_timer.patch. 
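
To illustrate the failure mode behind Bug.2, here is a small toy model. 
It is not Xen code and not what kill_pm_timer.patch does; every toy_* 
name is hypothetical.  It assumes a global list of pending timers and a 
domain that embeds its timer, and it sketches one way to avoid the 
problem: teardown kills the timer based on whether the timer was 
started, not on a flag that is only set after full initialisation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy pending-timer list, loosely standing in for a timer softirq queue. */
struct toy_timer {
    void (*fn)(void *data);
    void *data;
    struct toy_timer *next;
};

static struct toy_timer *toy_timer_list;

void toy_timer_start(struct toy_timer *t, void (*fn)(void *), void *data)
{
    t->fn = fn;
    t->data = data;
    t->next = toy_timer_list;
    toy_timer_list = t;
}

/* Unlink the timer so the runner never touches it again. */
void toy_timer_kill(struct toy_timer *t)
{
    struct toy_timer **pp;

    for (pp = &toy_timer_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            break;
        }
    }
    t->fn = NULL;
    t->next = NULL;
}

/* Fire every pending timer once; returns how many callbacks ran. */
int toy_run_timers(void)
{
    struct toy_timer *t, *next;
    int fired = 0;

    for (t = toy_timer_list; t != NULL; t = next) {
        next = t->next;
        t->fn(t->data);
        fired++;
    }
    return fired;
}

struct toy_domain {
    int is_vmx;          /* set only when initialisation fully succeeds */
    int timer_started;   /* records what was actually set up */
    int ticks;
    struct toy_timer pm_timer;
};

static void toy_pm_tick(void *data)
{
    ((struct toy_domain *)data)->ticks++;
}

/* fail_late != 0 models a late failure that leaves is_vmx unset. */
struct toy_domain *toy_domain_create(int fail_late)
{
    struct toy_domain *d = calloc(1, sizeof(*d));

    if (d == NULL)
        return NULL;
    toy_timer_start(&d->pm_timer, toy_pm_tick, d);
    d->timer_started = 1;
    if (!fail_late)
        d->is_vmx = 1;
    return d;
}

void toy_domain_destroy(struct toy_domain *d)
{
    assert(d != NULL);
    /* Keyed on timer_started, not is_vmx: a half-built domain is covered. */
    if (d->timer_started)
        toy_timer_kill(&d->pm_timer);
    free(d);
}
```

If toy_domain_destroy() checked is_vmx instead, a domain created with 
fail_late set would be freed while its timer was still on the list, and 
the next toy_run_timers() call would read freed memory. 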


Signed-off-by: Masaki Kanno <kanno.masaki@xxxxxxxxxxxxxx>

Best regards,
 Kan

Attachment: kill_pm_timer.patch
Description: Binary data

Attachment: munmap_nvram_page.patch
Description: Binary data

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel