xen-devel

Re: [Xen-devel] Unplugging a dom0 vcpu and domain destruction

To: George Dunlap <dunlapg@xxxxxxxxx>
Subject: Re: [Xen-devel] Unplugging a dom0 vcpu and domain destruction
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Fri, 20 Feb 2009 10:15:35 -0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Fri, 20 Feb 2009 10:16:05 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <de76405a0902200913v7e7f6b99tf5ef71d46c1f6724@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <de76405a0902170930p7afad4b9ye68358df0f6ff3fe@xxxxxxxxxxxxxx> <C5C0A6EF.2D2D%keir.fraser@xxxxxxxxxxxxx> <de76405a0902200913v7e7f6b99tf5ef71d46c1f6724@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.19 (X11/20090105)
George Dunlap wrote:
OK, I finally popped off all the interrupts on my stack and got back to this.

The put_domain() that finally destroys the domain (after plugging the
cpu back in) is in page_alloc.c:931, in free_domheap_pages().

Here's the callstack from xen:

(XEN)    [<ffff828c80112cd6>] free_domheap_pages+0x3a9/0x427
(XEN)    [<ffff828c8014f0e3>] put_page+0x4b/0x52
(XEN)    [<ffff828c80150236>] put_page_from_l1e+0x137/0x1ae
(XEN)    [<ffff828c80155ed0>] ptwr_emulated_update+0x555/0x57c
(XEN)    [<ffff828c80155fa3>] ptwr_emulated_cmpxchg+0xac/0xb5
(XEN)    [<ffff828c80176511>] x86_emulate+0xf876/0xfb5d
(XEN)    [<ffff828c8014f523>] ptwr_do_page_fault+0x15c/0x190
(XEN)    [<ffff828c80164d8c>] do_page_fault+0x3b8/0x571

So the thing that finally destroys the domain is unmapping its last
outstanding domheap page from dom0's pagetables.  It was unmapped by
vcpu 1 (which had just come back online), in
linux/mm/memory.c:unmap_vmas().
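
To make the chain in that stack trace concrete, here is a minimal C
sketch of the refcounting involved, using simplified stand-ins for
Xen's structures (the function names mirror the trace, but this is not
the actual Xen source): clearing the PTE drops the page's reference,
freeing the page drops its reference on the owning domain, and the
domain's last reference triggers destruction.

    #include <stdatomic.h>

    struct domain    { atomic_int refcnt; };
    struct page_info { atomic_int count_info; struct domain *owner; };

    static void domain_destroy(struct domain *d)
    {
        /* Final teardown: the "zombie" domain only disappears here. */
        (void)d;
    }

    static void put_domain(struct domain *d)
    {
        /* Drop one domain reference; the last one destroys the domain
         * (cf. the put_domain() in free_domheap_pages()). */
        if (atomic_fetch_sub(&d->refcnt, 1) == 1)
            domain_destroy(d);
    }

    static void free_domheap_page(struct page_info *pg)
    {
        /* Returning the page to the heap releases its reference on
         * the owning domain. */
        put_domain(pg->owner);
    }

    static void put_page(struct page_info *pg)
    {
        /* Drop one mapping reference; the last one frees the page. */
        if (atomic_fetch_sub(&pg->count_info, 1) == 1)
            free_domheap_page(pg);
    }

    static void put_page_from_l1e(struct page_info *mapped)
    {
        /* What clearing a present PTE in dom0 ultimately does to the
         * page that PTE mapped. */
        put_page(mapped);
    }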

I confirmed, using the 'q' debug key, that the "zombie domain" still
had two outstanding pages:
(XEN) General information for domain 2:
(XEN)     refcnt=1 dying=2 nr_pages=2 xenheap_pages=0 dirty_cpus={} max_pages=8192
(XEN)     handle=a7c2bcb8-e647-992f-9e15-7313072a36bf vm_assist=00000008
(XEN) Rangesets belonging to domain 2:
(XEN)     Interrupts { }
(XEN)     I/O Memory { }
(XEN)     I/O Ports  { }
(XEN) Memory pages belonging to domain 2:
(XEN)     DomPage 000000000003d64f: caf=00000001, taf=e800000000000001
(XEN)     DomPage 000000000003d64e: caf=00000001, taf=e800000000000001
(XEN) VCPU information and callbacks for domain 2:
(XEN)     VCPU0: CPU0 [has=F] flags=1 poll=0 upcall_pend = 00, upcall_mask = 00 dirty_cpus={} cpu_affinity={0-31}
(XEN)     100 Hz periodic timer (period 10 ms)
(XEN)     Notifying guest (virq 1, port 0, stat 0/-1/0)

I'm not sure if this is relevant, but it looks like while dom0's vcpu 1
was offline, it had a pending interrupt:

(XEN)     VCPU1: CPU0 [has=F] flags=2 poll=0 upcall_pend = 01, upcall_mask = 01 dirty_cpus={} cpu_affinity={0-31}
(XEN)     100 Hz periodic timer (period 10 ms)
(XEN)     Notifying guest (virq 1, port 0, stat 0/-1/-1)
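
That VCPU1 state is the interesting bit: upcall_pend = 01 with
upcall_mask = 01 means an event is pending but delivery is masked. For
reference, a minimal sketch of the relevant head of struct vcpu_info
from Xen's public ABI (xen/include/public/xen.h), with a helper of my
own added to show the delivery condition:

    #include <stdint.h>
    #include <stdbool.h>

    /* First two fields of struct vcpu_info in Xen's public ABI; the
     * remaining fields (evtchn_pending_sel, arch, time) are omitted. */
    struct vcpu_info {
        uint8_t evtchn_upcall_pending; /* an event upcall is waiting  */
        uint8_t evtchn_upcall_mask;    /* nonzero suppresses delivery */
    };

    /* Helper (mine, for illustration): an upcall is only injected
     * while an event is pending and the vcpu has not masked delivery,
     * so a vcpu stuck at pending=1, mask=1 just sits on its event. */
    static bool upcall_deliverable(const struct vcpu_info *vi)
    {
        return vi->evtchn_upcall_pending && !vi->evtchn_upcall_mask;
    }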

So it appears that while vcpu 1 is offline, dom0 never successfully
removes the mappings for the domU; the removal only happens once
vcpu 1 comes back online.

I don't know enough about the unmapping process... Jeremy, do you know
anything about the process for unmapping domU memory from dom0 when
the domU is being destroyed in the linux-2.6.18-xen.hg tree?  More
specifically, if I take dom0's vcpu 1 offline (via the /sys
interface), why doesn't the unmapping happen until I bring vcpu 1
back online?

Is it that the offline cpu still has a cr3 reference to a pagetable, and that's not being given up? Or gdt?
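
A toy model of that cr3 theory (all names here are invented for
illustration; this is not Xen's code): as long as some vcpu's cr3 pins
a top-level pagetable, every present PTE in it keeps its reference on
the mapped page, foreign pages included, so nothing is released until
the pagetable itself gets torn down.

    struct toy_page { int refcnt; };

    struct toy_pagetable {
        int refcnt;              /* pinned by each vcpu whose cr3 points here */
        struct toy_page *pte[4]; /* toy PTEs: mapped pages, possibly foreign  */
    };

    static void toy_pagetable_teardown(struct toy_pagetable *pt)
    {
        /* Only now do the PTE references go away (cf. put_page_from_l1e),
         * letting a dying domU's last pages finally be freed. */
        for (int i = 0; i < 4; i++)
            if (pt->pte[i])
                pt->pte[i]->refcnt--;
    }

    static void toy_vcpu_drop_cr3(struct toy_pagetable *pt)
    {
        /* An offline vcpu that never drops its cr3 never gets here. */
        if (--pt->refcnt == 0)
            toy_pagetable_teardown(pt);
    }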

In the pvops kernels we also keep a reference to the vcpu info structure, since we place it in the kernel's memory rather than keeping it in the shared info structure. For a while that had bugs that left zombie domains lying around, but I don't think anyone backported that stuff to 2.6.18.
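
For context, the registration being referred to goes through the
VCPUOP_register_vcpu_info hypercall in Xen's public ABI
(xen/include/public/vcpu.h). A sketch under that assumption, with the
hypercall itself stubbed out since it only exists inside a Xen guest
kernel:

    #include <stdint.h>

    #define VCPUOP_register_vcpu_info 10  /* from xen/include/public/vcpu.h */

    /* Argument structure from the public ABI: where, in guest memory,
     * the hypervisor should place this vcpu's vcpu_info. Xen takes a
     * reference on the named frame -- the reference the pvops kernels
     * hold, and which once leaked zombie domains when mishandled. */
    struct vcpu_register_vcpu_info {
        uint64_t mfn;    /* machine frame number containing the vcpu_info */
        uint32_t offset; /* byte offset of the vcpu_info within that frame */
        uint32_t rsvd;   /* must be zero */
    };

    /* Stub standing in for HYPERVISOR_vcpu_op(); in a real pvops
     * kernel this is a hypercall into Xen. */
    static int vcpu_op(int cmd, int vcpu, void *arg)
    {
        (void)cmd; (void)vcpu; (void)arg;
        return 0;
    }

    /* Sketch of per-cpu registration, roughly what the pvops kernel
     * does during cpu bringup (names simplified). */
    static int register_vcpu_info(int cpu, uint64_t mfn, uint32_t offset)
    {
        struct vcpu_register_vcpu_info info = {
            .mfn = mfn, .offset = offset, .rsvd = 0
        };
        return vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
    }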

   J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel