Re: [Xen-users] Xen system hang or freeze

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Xen system hang or freeze
From: Martin Fernau <m.fernau@xxxxxxxxxx>
Date: Sat, 4 Apr 2009 17:00:57 +0200
Delivery-date: Sat, 04 Apr 2009 08:01:27 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <33b90e520904030756l3d2e2eb5s1b7e50535a9a44c7@xxxxxxxxxxxxxx>
Organization: CPS Entwicklungsgesellschaft mbH
References: <33b90e520904030756l3d2e2eb5s1b7e50535a9a44c7@xxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.11.1 (Linux/2.6.27-gentoo-r8; KDE/4.2.1; i686; ; )
Hi,

I just want to tell you that I have the same issue on one server!
Hardware:
Fujitsu Siemens PRIMERGY TX200 S4
CPU: Intel Xeon Dual Quad E5405, 2 GHz
Hardware Raid: LSI Logic / Symbios Logic MegaRAID SAS 1078 with SAS HDDs
I'm running 4 guests on it:
 - Win2003
 - Win2003
 - Gentoo Linux
 - Windows XP Prof

Xen 3.3.0 is running on Gentoo with a 2.6.18-xen-r12 kernel.

The system hangs roughly every 3-4 weeks, as far as I can tell. The server
is quite new (from Nov 2008) and ServerView doesn't report any hardware
problems, so the hardware appears to be OK.
When the server hangs, it does not respond to any kind of input. The network
is dead (no ping reply from dom0 or any of the guests), and the server's own
keyboard and monitor don't respond either. Black screen, nothing more. A hard
reset is the only way to bring the system back to life.

/var/log/messages shows nothing at all. It's like pulling the power cable.
I have no idea what causes this and no hints to go on.
At the moment I have a cron job running that collects some system information
from dom0 every minute. I hope that the very last run (just before the next
crash) will show me something that points to the problem. However, I
currently have no clue which kind of information would be helpful for this
purpose. I currently log the output of the following commands every minute:
 - dmesg
 - free
 - netstat -lnp
 - ps aux
 - w
 - vgdisplay
 - lvdisplay
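
A per-minute collector like the one described above could be sketched roughly as follows. This is only an illustration, not the actual cron job: the function names, the log directory and the timeout are assumptions, and only the command list comes from the post. Each snapshot is fsync'ed so that the last run before a hard hang has a better chance of actually being on disk rather than lost in the page cache.

```python
import os
import subprocess
import time

# The commands listed in the post.
COMMANDS = [
    ["dmesg"],
    ["free"],
    ["netstat", "-lnp"],
    ["ps", "aux"],
    ["w"],
    ["vgdisplay"],
    ["lvdisplay"],
]


def run_command(argv, timeout=20):
    """Run one command; return its combined stdout/stderr as text.

    Failures (missing binary, timeout) are returned as text instead of
    raised, so that one broken command does not abort the whole snapshot.
    The 20-second timeout is an arbitrary choice for this sketch.
    """
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
            universal_newlines=True,
            errors="replace",
        )
        return result.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "FAILED: {}\n".format(exc)


def write_snapshot(log_dir, commands=COMMANDS, runner=run_command):
    """Write the output of all commands to one timestamped file.

    Returns the path of the file that was written.
    """
    os.makedirs(log_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    path = os.path.join(log_dir, "snapshot-{}.log".format(stamp))
    with open(path, "w") as fh:
        for argv in commands:
            fh.write("===== {} =====\n".format(" ".join(argv)))
            fh.write(runner(argv))
            fh.write("\n")
        # Push the data to disk before the next hang can swallow it.
        fh.flush()
        os.fsync(fh.fileno())
    return path


# Hypothetical usage from a script that cron starts every minute:
# write_snapshot("/var/log/hang-debug")
```

One file per run keeps the last snapshot before a crash easy to find, at the cost of needing some cleanup of old files.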

Any hints about other information that could be helpful? Any Xen-related
commands?

Interesting that you use LVM too. I also use LVM for my guests and use the
snapshot functionality daily to back up the server to tape. dom0 runs on a
normal partition. I use LVM 2.02.36.

Regards,
Martin

On Friday, 3 April 2009 16:56:28, Paraic Gallagher wrote:
> Hi all,
>
> This is my first post to the list, I hope someone out there can help!
>
> I am running xen 3.0.3, with CentOS 5.2 based Dom0
> (kernel-xen-2.6.18-92.1.22.el5)
>
> Recently I have noticed some complete system lockups on a few different
> servers. Neither Dom0 nor any of the guests responds to pings, and
> connecting a keyboard and monitor to the system only shows a blank screen.
> Nothing is written to the logs at the time of the lockup.
>
> The problem is very difficult to reproduce and seems very random by nature.
> Sometimes if a system is left running for a few weeks it will happen, other
> times it can happen after a reboot. I have tried taxing the system running
> various scripts, rebooting numerous times, and creating/destroying a few
> guests, etc but no luck. It seems like a hardware issue but has been
> reproduced on a few different machines.
>
> For a while (clutching at straws) I thought it was due to changes in the
> clock (from daylight savings) so tried changing time backwards and forwards
> but this had no effect.
>
> Has anyone else out there seen a problem like this? Is there any way to
> diagnose it when it does happen? (It is very frustrating to have a hung
> system that you cannot access for any information.)
>
> If anyone wants any further info or ideas on what I could try please let me
> know.
>
> Regards,
> Paraic.

