This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-users] Xen system hang or freeze

Over the last year, I've experienced a couple of sources of lockups.

The first was resolved by going to the stock xen kernel compiled from source (I had been using the Debian etch kernel; I found commentary online describing the same symptoms on Ubuntu, Red Hat, and CentOS though, each with its distro-specific kernel).

This one tended to result in kernel oops messages (soft IRQ lockups, as I recall). The lockup would start with a domU and within a few minutes would kill the dom0 too. The fastest way to trigger it was to create and shut down domUs, although I don't recall that being the only way.

The second, with the stock kernel, was an errant USB hub attached to a xen host. Removing the hub resolved the issue. These were complete, sudden lockups of the dom0 and all domUs -- basically everything. Higher traffic over the USB port would trigger this lockup.

So, for those who haven't tried the stock xen kernel, and are able to try it (based on driver support, etc.), it might help.


On Apr 5, 2009, at 1:20 PM, Martin Fernau wrote:

I tried this before. I ran your kernel for a few months, but it changed nothing.
I had freezes with this kernel too, in the same way.

On Sunday, 5 April 2009 14:29:20, Andrew Lyon wrote:
On Sat, Apr 4, 2009 at 4:00 PM, Martin Fernau <m.fernau@xxxxxxxxxx> wrote:

I just want to tell you that I have the same issue on one server!
Fujitsu Siemens PRIMERGY TX200 S4
CPU: Intel Xeon Dual Quad E5405 2 GHz
Hardware Raid: LSI Logic / Symbios Logic MegaRAID SAS 1078 with SAS HDDs
I'm running 4 guests on it:
- Win2003
- Win2003
- Gentoo Linux
- Windows XP Prof

Xen 3.3.0 is running on Gentoo with a 2.6.18-xen-r12 kernel.

I had the same problem with the 2.6.18-xen-rX Gentoo kernels, so I
made my own ebuild and patches from the openSUSE Xen patches, you can
get it from http://code.google.com/p/gentoo-xen-kernel/downloads/list


The system hangs roughly every 3-4 weeks, as far as I can tell. This server is
quite new (from Nov 2008) and ServerView doesn't report any hardware problems,
so the hardware seems to be OK. When the server hangs, it doesn't respond to
any kind of input: the network is down (no ping to dom0 or any of the guests),
and the server's own keyboard and monitor don't respond either. Black screen,
nothing more. A hard reset is the only way to bring the system back to life.

/var/log/messages shows nothing. It's like disconnecting the power
cable. I have no idea and no hints about this problem.
At the moment I have a cron job running that collects some system
information from the dom0 every minute. I hope the very last run
(just before the next crash) will show something that points me to the
problem. However, I currently have no clue what kind of information would
be helpful for this purpose.
I currently log the following things every minute:
- dmesg
- free
- netstat -lnp
- ps aux
- w
- vgdisplay
- lvdisplay
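
A per-minute collector like the one described above could be sketched roughly as follows. This is only an illustration, not the poster's actual script: the function name `snapshot` and the 20-second timeout are assumptions made for the example. The one deliberate design choice is the `os.fsync` call, since a hard freeze can otherwise lose output that is still sitting in buffers.

```python
import os
import subprocess
import time

# Commands taken from the list above; extend or trim as needed.
COMMANDS = [
    ["dmesg"],
    ["free"],
    ["netstat", "-lnp"],
    ["ps", "aux"],
    ["w"],
    ["vgdisplay"],
    ["lvdisplay"],
]


def snapshot(commands, log_path, timeout=20):
    """Append the output of each command to log_path and force it to disk.

    'snapshot' and the default timeout are illustrative assumptions.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        for cmd in commands:
            log.write("===== %s  %s =====\n" % (stamp, " ".join(cmd)))
            try:
                result = subprocess.run(
                    cmd,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.STDOUT,
                    universal_newlines=True,
                    timeout=timeout,
                )
                log.write(result.stdout)
            except (OSError, subprocess.TimeoutExpired) as exc:
                # A missing or hanging tool must not stop the other commands.
                log.write("FAILED: %s\n" % exc)
        # Flush Python's buffer, then ask the kernel to write to disk, so the
        # last run before a freeze has a chance of surviving the hard reset.
        log.flush()
        os.fsync(log.fileno())
```

A cron entry running every minute could then call `snapshot(COMMANDS, "/var/log/xen-hang-snapshot.log")`; the log path here is hypothetical.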

Any Xen-related hints about other information that could be helpful?

Interesting that you use LVM too. I also use LVM for my guests and use the
snapshot functionality daily to back up the server to tape. dom0 is running
on a normal partition. I use lvm 2.02.36


On Friday, 3 April 2009 16:56:28, Paraic Gallagher wrote:
Hi all,

This is my first post to the list, I hope someone out there can help!

I am running xen 3.0.3, with CentOS 5.2 based Dom0

Recently I have noticed some complete system lockups on a few different
servers. Neither Dom0 nor any of the guests responds to pings, and connecting a
keyboard and monitor to the system only shows a blank screen. Nothing
is written to the logs at the time of the lockup.

The problem is very difficult to reproduce and seems very random by
nature. Sometimes it happens after a system has been left running for a few
weeks; other times it can happen right after a reboot. I have tried taxing
the system by running various scripts, rebooting numerous times, and
creating/destroying a few guests, etc., but no luck. It seems like a
hardware issue, but it has been reproduced on a few different machines.

For a while (clutching at straws) I thought it was due to changes in the
clock (from daylight saving time), so I tried changing the time backwards and
forwards, but this had no effect.

Has anyone else out there seen a problem like this? Is there any way to
diagnose it when it does happen? (It is very frustrating to have a
hung system that you cannot access to get any information.)

If anyone wants any further info or ideas on what I could try please let
me know.


Xen-users mailing list
