WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

[Xen-users] Mysterious Server Lockups

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Mysterious Server Lockups
From: Nick Anderson <nick@xxxxxxxxxxxx>
Date: Fri, 13 Feb 2009 08:42:06 -0600
Delivery-date: Fri, 13 Feb 2009 06:43:54 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
Over the past month or two, one of my Xen boxes has been mysteriously
locking up. I have found nothing in any of my logs. I set up netconsole
on dom0, and I don't see any problem messages on my remote logger (I do
see kernel messages such as modprobe, etc.).

I cannot correlate anything with the lockups (no increased traffic or
anything out of the norm). The lockups do seem to be getting more
frequent. The first one was several months ago. The next was a month or
so later, then three weeks, then two weeks (last Saturday), then yesterday.
Yesterday I did notice one thing in my sysstat logs on a domU: my
steal went from <1% (well, normally 0) to more than 5% just before the
lockup. But that was just in a single VM.

Here is the snippet. 

11:35:02 PM       CPU     %user     %nice   %system   %iowait   %steal     %idle
11:35:01 AM       all      8.70      0.00      3.88      0.14     0.00     87.27
11:45:01 AM       all      4.81      0.00      2.35      0.06     5.74     87.03

For some reason, the reboot cleared dom0's sysstat logs for that day up
to the reboot time, so I cannot cross-reference against dom0.
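
A minimal sketch, in Python, of one way to scan saved sar text output for
steal spikes like the one in the snippet above. It assumes the column
order shown there (%user, %nice, %system, %iowait, %steal, %idle); the
function name steal_spikes and the 1.0% threshold are illustrative
choices, not part of sysstat.

```python
import re

# One sar CPU sample line: timestamp, CPU label, then six numeric columns.
# Header lines fail to match because "%user" etc. are not numeric.
SAMPLE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}(?: [AP]M)?)\s+(\S+)\s+"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)

def steal_spikes(sar_text, threshold=1.0):
    """Return (timestamp, steal) for each 'all' row whose %steal exceeds threshold."""
    spikes = []
    for line in sar_text.splitlines():
        m = SAMPLE.match(line.strip())
        if not m or m.group(2) != "all":
            continue  # skip headers, blank lines, and per-CPU rows
        steal = float(m.group(7))  # fifth numeric column is %steal (assumed layout)
        if steal > threshold:
            spikes.append((m.group(1), steal))
    return spikes

sample = """\
11:35:02 PM       CPU     %user     %nice   %system   %iowait   %steal     %idle
11:35:01 AM       all      8.70      0.00      3.88      0.14     0.00     87.27
11:45:01 AM       all      4.81      0.00      2.35      0.06     5.74     87.03
"""
print(steal_spikes(sample))  # [('11:45:01 AM', 5.74)]
```

Running the same check over each domU's saved sar output would show
whether the steal jump is confined to a single VM or appears on several
at once.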

Server Specs:
Supermicro H8DM8-2
16GB memory
2 x Quad-Core AMD Opteron(tm) Processor 2350
3ware 9650LE 8 drive raid 6

I have 3 domUs running.
The domUs are LVM-backed.
One domU has 2 vCPUs; the others have a single vCPU each.

Dom0 and the domUs are all running Debian etch. I am running Xen 3.0.3
from the repository.

Any ideas?
I am considering upgrading to Xen 3.2 from backports, but I don't want
to introduce another variable unless there is a high probability that
upgrading will take care of the issue.

Thanks
-- 
Nick Anderson <nick@xxxxxxxxxxxx>
http://www.cmdln.org


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users