xen-users
Re: [Xen-users] Xen system hang or freeze
Over the last year, I've experienced a couple of sources of lockups.
The first was resolved by going to the stock Xen 2.6.18.8 kernel
compiled from source. I had been using the Debian etch kernel, but I
found commentary online describing the same symptoms on Ubuntu, Red
Hat, and CentOS, each with its distro-specific kernel.
This one tended to result in kernel oops messages (soft IRQ lockups,
as I recall). The lockup would start with a domU and within a few
minutes would kill the dom0 too. The fastest way to trigger this one
was to create and shut down domUs, although I don't recall that being
the only way.
The second, with the stock kernel, was an errant USB hub attached to a
Xen host. Removing the hub resolved the issue. These were complete,
sudden lockups of the dom0 and all domUs: basically everything.
Higher traffic over the USB port would trigger this lockup.
So, for those who haven't tried the stock Xen kernel, and are able to
try it (based on driver support, etc.), it might help.
--t
On Apr 5, 2009, at 1:20 PM, Martin Fernau wrote:
I tried this before. I ran your kernel for a few months, but it
changed nothing. I had freezes with this kernel too, in the same way.
On Sunday, 5 April 2009 14:29:20, Andrew Lyon wrote:
On Sat, Apr 4, 2009 at 4:00 PM, Martin Fernau <m.fernau@xxxxxxxxxx> wrote:
Hi,
I just want to tell you that I have the same issue on one server!
Hardware:
Fujitsu Siemens PRIMERGY TX200 S4
CPU: Intel Xeon Dual Quad E5405 2 GHz
Hardware RAID: LSI Logic / Symbios Logic MegaRAID SAS 1078 with SAS HDDs
I'm running 4 guests on it:
- Win2003
- Win2003
- Gentoo Linux
- Windows XP Prof
Xen 3.3.0 is running on Gentoo with a 2.6.18-xen-r12 kernel.
I had the same problem with the 2.6.18-xen-rX Gentoo kernels, so I
made my own ebuild and patches from the openSUSE Xen patches. You can
get it from http://code.google.com/p/gentoo-xen-kernel/downloads/list
Andy
The system hangs roughly every 3-4 weeks, as far as I can tell. This
server is quite new (from Nov 2008) and ServerView doesn't tell me
anything about hardware problems, so from that point of view the
hardware seems to be OK. When the server hangs, it does not respond to
any kind of input: the network is down (no ping to dom0 or to any of
the guests), and the server's own keyboard and monitor do not respond
either. Black screen, nothing more. A hard reset is the only way to
bring the system back to life.
/var/log/messages shows nothing. It's like disconnecting the power
cable. I have no idea what causes this and no hints to go on.
At the moment I have a cron job running that collects some system
information from the dom0 every minute. I hope that the very last run
(just before the next crash) will show some information that points me
to the problem. However, I currently have no clue which kind of
information would be helpful for this purpose.
I currently log the following things every minute:
- dmesg
- free
- netstat -lnp
- ps aux
- w
- vgdisplay
- lvdisplay
Any hints about other information that could be helpful? Any
Xen-related commands?
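A per-minute collector like the one described above could be sketched as follows. This is only an illustration, not the script actually running on that server: the function names and the idea of one timestamped file per run are assumptions. The optional `XEN_COMMANDS` entries are likewise an assumption that the `xm` toolstack of Xen 3.x is installed; they are not something the thread confirms to be useful. Each file is flushed and fsynced so that the last snapshot has the best chance of being on disk when the machine freezes.

```python
import datetime
import os
import subprocess

# The commands listed in the message above.
COMMANDS = [
    ["dmesg"],
    ["free"],
    ["netstat", "-lnp"],
    ["ps", "aux"],
    ["w"],
    ["vgdisplay"],
    ["lvdisplay"],
]

# Assumption: the xm toolstack is present. Append these to COMMANDS
# if you also want hypervisor-level output in each snapshot.
XEN_COMMANDS = [
    ["xm", "list"],
    ["xm", "info"],
    ["xm", "dmesg"],
]


def run_command(argv, timeout=20):
    """Return combined stdout/stderr of argv, or a short failure note."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
        return result.stdout.decode("utf-8", errors="replace")
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing or hanging tool must not abort the whole snapshot.
        return "[failed: %s]\n" % exc


def write_snapshot(commands, log_dir):
    """Run every command and write the output to one timestamped file."""
    os.makedirs(log_dir, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    path = os.path.join(log_dir, "snapshot-%s.log" % stamp)
    with open(path, "w", encoding="utf-8", errors="replace") as handle:
        for argv in commands:
            handle.write("==== %s ====\n" % " ".join(argv))
            handle.write(run_command(argv))
            handle.write("\n")
        # Push the data to disk now; after a hard hang, anything still
        # sitting in the page cache is lost.
        handle.flush()
        os.fsync(handle.fileno())
    return path
```

A cron entry would then call something like `write_snapshot(COMMANDS, "/var/log/hang-snapshots")` once a minute; that directory name is hypothetical. Old snapshots are not rotated here, so a real setup would need some cleanup as well.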
Interesting that you use LVM too. I also use LVM for my guests and use
the snapshot functionality daily to back up the server to tape. dom0
is running on a normal partition. I use LVM 2.02.36.
Regards,
Martin
On Friday, 3 April 2009 16:56:28, Paraic Gallagher wrote:
Hi all,
This is my first post to the list, I hope someone out there can help!
I am running Xen 3.0.3, with a CentOS 5.2 based Dom0
(kernel-xen-2.6.18-92.1.22.el5).
Recently I have noticed some complete system lockups on a few
different servers. Neither Dom0 nor any of the guests respond to
pings, and connecting a keyboard and monitor to the system only shows
a blank screen. Nothing is written to the logs at the time of the
lockup.
The problem is very difficult to reproduce and seems random by nature.
Sometimes it happens after a system has been left running for a few
weeks; other times it happens right after a reboot. I have tried
taxing the system by running various scripts, rebooting numerous
times, creating and destroying a few guests, etc., but no luck. It
seems like a hardware issue, but it has been reproduced on a few
different machines.
For a while (clutching at straws) I thought it was due to changes in
the clock (from daylight saving time), so I tried changing the time
backwards and forwards, but this had no effect.
Has anyone else out there seen a problem like this? Is there any way
to diagnose it when it does happen? (It is very frustrating to have a
hung system that you cannot access for any information.)
If anyone wants any further info or has ideas on what I could try,
please let me know.
Regards,
Paraic.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users