WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

Re: [Xen-users] Mysterious Server Lockups

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Mysterious Server Lockups
From: James Pifer <jep@xxxxxxxxxxxxxxxx>
Date: Fri, 13 Feb 2009 09:50:03 -0500
In-reply-to: <20090213144204.GA10025@tp>
References: <20090213144204.GA10025@tp>
On Fri, 2009-02-13 at 08:42 -0600, Nick Anderson wrote:
> So over the past month or two one of my Xen boxes has been mysteriously
> locking up. I have found nothing in any of my logs. I set up netconsole
> on dom0 and I don't see any problem msgs on my remote logger (I do see
> kernel msgs like modprobe etc ...).
> 
> I cannot correlate anything with the lockups (no increased traffic or
> anything out of the norm). The lockups do seem to be getting more
> frequent. The first one was several months ago. Then it was a month or
> so later, then three weeks, then 2 weeks (last sat), then yesterday.
> Yesterday I did notice one thing in my sysstat logs on a domU. My
> steal went from <1 (well, normally 0) to 5% just before the lockup. But
> that was just in a single VM.
> 
> Here is the snippet. 
> 
> 11:35:02 PM       CPU     %user     %nice   %system   %iowait   %steal     %idle
> 11:35:01 AM       all      8.70      0.00      3.88      0.14     0.00     87.27
> 11:45:01 AM       all      4.81      0.00      2.35      0.06     5.74     87.03
> 
> For some reason the reboot cleared my sysstat logs for that day prior
> to the reboot time in dom0 so I cannot reference against that.
> 
> Server Specs:
> Supermicro H8DM8-2
> 16GB memory
> 2 x Quad-Core AMD Opteron(tm) Processor 2350
> 3ware 9650LE 8 drive raid 6
> 
> I have 3 domUs running.
> The domUs are LVM backed.
> 1 domU has 2 vcpus, the others have a single vcpu.
> 
> Dom0 and domUs are both running Debian etch. I am running Xen 3.0.3
> from the repository.
> 
> Any ideas?
> I am considering upgrading to Xen 3.2 from backports, but I don't want
> to introduce another variable unless there is a high probability
> upgrading will take care of the issue.

Nick,

My setup is VERY different from yours, but I had similar unexplained
outages. Not sure I can even call it a lockup, because I could still
ping dom0, but its screen was blank, and the domUs were hosed.

This was when I was running Xen 3.0.x on SLES 10 SP1. I've updated all my
servers to SLES 10 SP2 running Xen 3.2.x and for a couple of months now I
have not had the problem.

Way too many differences to make a direct comparison, but...
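
Since the one hint so far is the %steal jump in the sysstat output, here is
a rough sketch of how those rows could be scanned for similar jumps on the
other domUs. It is only an illustration: the function names and the 1.0%
threshold are made up for this example, and it assumes each sar row sits on
a single line (the quoted snippet got wrapped by the mailer).

```python
def parse_sar_cpu(text):
    """Parse `sar -u` style CPU rows into dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if "%steal" in fields and "CPU" in fields:
            # Header row: keep the column names from "CPU" onward.
            header = fields[fields.index("CPU"):]
            continue
        if header is None or len(fields) < len(header):
            continue  # blank line, banner, or something we don't understand
        values = fields[-len(header):]
        row = {"time": " ".join(fields[:-len(header)]), "CPU": values[0]}
        try:
            for name, value in zip(header[1:], values[1:]):
                row[name] = float(value)
        except ValueError:
            continue  # not a numeric data row
        rows.append(row)
    return rows


def steal_spikes(rows, threshold=1.0):
    """Return (time, %steal) for rows whose steal exceeds the threshold.

    The 1.0 default is an arbitrary assumption, not a recommended value.
    """
    return [(r["time"], r["%steal"]) for r in rows if r["%steal"] > threshold]


# The rows quoted earlier in this thread, unwrapped onto single lines.
SAMPLE = """\
11:35:02 PM       CPU     %user     %nice   %system   %iowait   %steal     %idle
11:35:01 AM       all      8.70      0.00      3.88      0.14     0.00     87.27
11:45:01 AM       all      4.81      0.00      2.35      0.06     5.74     87.03
"""

if __name__ == "__main__":
    for when, steal in steal_spikes(parse_sar_cpu(SAMPLE)):
        print(f"{when}: %steal = {steal}")
```

Running the same check over every domU's sar output for the day of a lockup
would at least show whether the steal jump really was limited to one VM.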

James


