Re: [Xen-users] Zombie VMs cannot be destroyed

To: Michael Froh <mike@xxxxxxxxxxx>
Subject: Re: [Xen-users] Zombie VMs cannot be destroyed
From: Tim Post <tim.post@xxxxxxxxxxxxxxx>
Date: Sat, 02 Dec 2006 02:25:45 +0800
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 01 Dec 2006 10:26:08 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <6A79DF76-E17F-4AE3-9891-DAA2DC743D96@xxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Organization: Net Kinetics
References: <6A79DF76-E17F-4AE3-9891-DAA2DC743D96@xxxxxxxxxxx>
Reply-to: tim.post@xxxxxxxxxxxxxxx
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
On Fri, 2006-12-01 at 12:57 -0500, Michael Froh wrote:
> This sounds like a B rate horror movie.  Has anyone else seen Zombie  
> VMs?

Yes. I get them quite often because I mostly use Tyan boards with
on-board SODIMMs for disk caching, and guests tend to hang on shutdown
there. It only seems to happen on Tyan boards.

> 
> I had a number of VMs running and used the following script to  
> destroy them:
> 
> for vm in `xm list | awk '{print $1}' | grep -v Name | grep -v  
> Domain-0`; do xm destroy $vm; done
> 

I hope those aren't ext3 file systems. 'shutdown' would be preferable.
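
For illustration, the same loop with a clean shutdown request instead of
a destroy might look like the sketch below. The function names are made
up for this example, and it assumes the "xm list" layout shown in this
thread: one header line, the domain name in the first column, and dom0
listed as Domain-0.

```shell
# Hypothetical helper: print guest names from `xm list`, skipping the
# header line and Domain-0. Assumes the name is the first column.
guest_names() {
        xm list | awk 'NR > 1 && $1 != "Domain-0" { print $1 }'
}

# Ask every guest to shut down cleanly instead of destroying it.
shutdown_all_guests() {
        for vm in $(guest_names); do
                xm shutdown "$vm"
        done
}
```

Calling shutdown_all_guests only sends the requests; it does not wait
for the guests to actually exit.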

> This destroyed all of the para-virtualized domains running (4 of  
> them) but turned all the HVM VMs into Zombies as shown here:
> 

Destroyed is the word. You may want to fsck them before booting them
again; it would be faster.

> [root@vm0 ~]# xm list
> Name                                      ID Mem(MiB) VCPUs State   Time(s)
> Domain-0                                   0     5074     4 r-----   4912.8
> Zombie-dsl0                               25      256     1 -b---d    552.1
> Zombie-dsl1                               26      256     1 -b---d    552.2
> Zombie-dsl2                               27      256     1 -b---d    550.0
> Zombie-dsl3                               28      256     1 -b---d    554.5
> Zombie-knoppix0                           17      256     1 -b---d   4459.9
> Zombie-knoppix1                           18      256     1 -----d   4425.9
> Zombie-knoppix2                           19      256     1 -b---d   4530.9
> Zombie-knoppix3                           20      256     1 -b---d   4493.7
> 
> 
> Subsequent attempts to destroy the VMs using "xm destroy 25" or "xm  
> destroy Zombie-dsl0" don't do anything.
> 

Zombie VMs are just like zombie processes: they're waiting for
something to happen before they exit. In this case they're waiting for
disks to sync on a VBD that's no longer connected. In effect, you pulled
out the hard drives before the VMs could write out what they had in
their caches, then yanked the power cord and plugged it back in really
quickly.

Bad idea.

> It's curious that the VMs are shown as booting and being destroyed
> (-b---d).
> 

What's being destroyed are your file systems. (In the xm list state
column, 'b' means blocked, not booting, and 'd' means dying.)

> The para-virtualized VMs were named centos[0-3] so it might be a  
> timing issue where only 4 destroys were properly handled and the para- 
> virtualized VMs happened to be the first 4 domains in xm list.
> 
> I will play around a bit to see if I can recreate this consistently
> and if there are options to really destroy the domains.  This is not
> an issue for me since my VM environment is a lab, but in production
> this might be very problematic.
> 

Amen. Try "xm shutdown". If your script has to ensure a dom-U exited,
try something like:

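domname="$1"    # name of the guest to stop
counter=0

# Re-request a shutdown every 5 seconds while the guest is still listed,
# giving up after 20 attempts.
while xm list | grep -q "$domname" && [ "$counter" -lt 20 ]; do
        xm shutdown "$domname"
        sleep 5
        counter=$((counter + 1))
done

# Still listed after 20 attempts: try to sync, then pull the plug.
if [ "$counter" -ge 20 ]; then
        xm pause "$domname"
        xm sysrq "$domname" S
        sleep 5
        xm destroy "$domname"
fi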

Depending on the I/O usage of the guests, you may want to toss in an
"xm sysrq 0 S" too.

Note, "xm shutdown [domname]" is almost always going to exit 0. The only
reason it will not is if [domname] doesn't exist. It is a little tricky
to use in a script.
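
Since the exit status of "xm shutdown" says little, one way around it is
to poll "xm list" until the name disappears. The sketch below is only
that, a sketch: wait_until_gone is a made-up name, the defaults of 20
tries and 5 seconds are arbitrary, and it assumes zombies show up with a
"Zombie-" prefix as in the listing earlier in this thread.

```shell
# Hypothetical helper: return 0 once the named domain is no longer
# listed by `xm list`, polling up to $2 times with $3 seconds between
# polls. The first column is compared exactly, so "dsl1" does not match
# "dsl10". A "Zombie-<name>" entry still counts as present.
wait_until_gone() {
        name=$1 tries=${2:-20} delay=${3:-5}
        while [ "$tries" -gt 0 ]; do
                if ! xm list | awk -v n="$name" \
                    '$1 == n || $1 == "Zombie-" n { found = 1 } END { exit !found }'; then
                        return 0
                fi
                sleep "$delay"
                tries=$((tries - 1))
        done
        return 1
}
```

Something like "xm shutdown dsl0; wait_until_gone dsl0 || xm destroy dsl0"
would then fall back to destroy only after the timeout.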

The above is completely off the top of my head and meant for
illustration only. 

ext3 (or any other journaling file system) gets *very* grumpy if it
can't flush its journal and cached data before shutting down. Save
yourself a few hassles :)

xm destroy = pull out the power cord.

You may try using "xendomains" instead. 

> Mike.
> 

Hope this helps
-Tim

> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
