Re: [Xen-users] Xen system hang or freeze
2009/4/3 Nick Anderson <nick@xxxxxxxxxxxx>
On Fri, Apr 03, 2009 at 03:56:28PM +0100, Paraic Gallagher wrote:
> I am running xen 3.0.3, with CentOS 5.2 based Dom0
> Recently I have noticed some complete system lockups on a few different
> servers. Neither Dom0 or any of the guests respond to pings, connecting a
> keyboard and monitor to the system only shows a blank screen. Nothing is
> written to logs at time of lockup.
I have seen similar issues with one of my servers. I have yet to nail
down the issue.
Distro: Debian Etch
CPU: 2x Quad-Core AMD Opteron(tm) Processor 2350
Disk: 3ware 9650LE with 8 drive Raid6
Xen: 3.2 (from debian repo)
All vms are LVM backed. Not running any HVM guests.
Thanks for the response. After searching the net for a few weeks with no
luck finding similar issues, I was beginning to think I was going crazy!
Some further details:
I have seen the issue on two types of servers, Dell PE 1950 and 2950
2x Quad core Intel Xeon E5410@xxxxxxx
Memory 4G and 16G
Disk, PERC 6/i 1.11, 2x250 Raid1, ST3250620NS Rev: 3BKT
All vms are LVM backed on this system except for Dom0.
For a while I was seeing softlockup on cpu messages scrolling on the
console and thought that may have caused it. Unfortunately, after updating
the kernel the errors went away, but I have had another lockup since then.
I've found a fairly set pattern, though no time periods to predict.
A VM typically goes unresponsive first. If left unchecked for long
enough the host will lock. If caught in time I have had limited
success running xm destroy on the domU. Most of the time running xm
destroy on the domU causes the host to lock immediately requiring a
The most recent lockup was a bit different than what I had in the
The domU locked up (no output on domU console). xm destroy locked
dom0. I rebooted with a remote power strip. dom0 and all domUs came
back up. Nothing in logs as usual. 10 minutes later dom0 was locked
again. I drove to the datacenter, and about 30-45 minutes after the
lock the machine became responsive again (according to the monitoring
server). I was able to display a website running on a vm. Then the
machine went unresponsive again, not responding to physical console
access either. Another hard reboot and things are ok.
That was the first time I had ever had so many lockups so close
together. Typically the lockups seem to be 1-2 weeks apart.
I have even tried setting up netconsole on dom0 to try to catch kernel
errors with no success.
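For anyone wanting to try the same thing, here is a minimal sketch of the receiving side of a netconsole setup, written in Python. It assumes dom0 is already configured to send kernel messages over UDP to the logging host, and that port 6666 (the usual netconsole target port) is in use. The names format_line and listen and the log file name are made up for illustration. It only timestamps whatever arrives, so it cannot capture anything if the kernel hangs before emitting a message.

```python
import socket
import time


def format_line(payload: bytes, sender: str, now: float) -> str:
    """Prefix one received datagram with a UTC timestamp and the sender address."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{stamp} {sender} {text}"


def listen(bind_addr: str = "0.0.0.0", port: int = 6666,
           log_path: str = "netconsole.log") -> None:
    """Receive UDP datagrams forever and append them to log_path.

    The port and log_path defaults are assumptions; adjust them to match
    the netconsole parameters used on dom0.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    # Line buffering so each message reaches the file as soon as it arrives.
    with open(log_path, "a", buffering=1) as log:
        while True:
            payload, (host, _port) = sock.recvfrom(65535)
            log.write(format_line(payload, host, time.time()) + "\n")
```

Calling listen() on a separate machine leaves a timestamped record there, so the last message received before a lockup can be lined up against the monitoring server's timeline.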
This seems to be quite a similar problem from the description, however I haven't
noticed the guest vms locking up prior to Dom0. Something to keep an eye on.
Are you running a particular load on the system at the time or is it somewhat
idle? Seems to be idle in my case before lockup.
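One way to answer the load question after the fact is to have dom0 record its own load average at short intervals and force each record to disk, so the last sample before a hang survives the reboot. The following Python is only a rough sketch under those assumptions. The names format_sample and record, the file path, and the 10 second interval are illustrative. Note that the load average seen in dom0 reflects dom0 itself, not the guests.

```python
import os
import time


def format_sample(now: float, load: tuple) -> str:
    """Render one timestamped load-average sample as a single log line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return "%s load1=%.2f load5=%.2f load15=%.2f" % (stamp, *load)


def record(path: str = "loadavg.log", interval: float = 10.0) -> None:
    """Append a load sample every `interval` seconds, syncing each to disk."""
    with open(path, "a") as log:
        while True:
            log.write(format_sample(time.time(), os.getloadavg()) + "\n")
            log.flush()
            # fsync so the sample is on disk even if the host locks up next.
            os.fsync(log.fileno())
            time.sleep(interval)
```

After a lockup, the timestamp of the final line gives an approximate time of the hang, and the lines before it show whether dom0 was idle or busy leading up to it.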
Nick Anderson <nick@xxxxxxxxxxxx>