xen-devel

Re: [Xen-devel] XCP - FYI - An easy way to wedge (and fix) a Cloud

To: Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Subject: Re: [Xen-devel] XCP - FYI - An easy way to wedge (and fix) a Cloud
From: "dwight at supercomputer.org" <dwight@xxxxxxxxxxxxxxxxx>
Date: Wed, 9 Jun 2010 09:58:35 -0700
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
On Tuesday 08 June 2010 01:36:53 pm Daniel Stodden wrote:
> On Tue, 2010-06-08 at 12:04 -0400, dwight at supercomputer.org wrote:
> > It turns out that /var/log had filled up the root filesystem on
> > the master.  500M+ worth of messages in there. After I tracked
> > down the problem, and freed this space up, everything started
> > working again.
>
> Which ones were the files growing too big? I recently caused
> potential trouble with blktap. But there may be more. Both xapi
> and storage management can get quite chatty, although I think this
> improved with xs5.x.
>
> Daniel

I'm going from memory here, as the main focus was on triage, and 
not on proper debugging, fixing, or testing. But if memory serves, 
it was xensource.log.
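
For anyone hitting the same symptom, a check along the following lines 
might point at the culprit more quickly. This is only a rough sketch in 
Python, not anything shipped with XCP: the helper names largest_files 
and percent_used are made up for illustration, and /var/log is simply 
the directory mentioned above.

```python
import os
import shutil


def largest_files(directory, top_n=5):
    """Return up to top_n (size_bytes, path) pairs under directory, largest first."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable (e.g. rotated mid-scan); skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:top_n]


def percent_used(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    log_dir = "/var/log"  # assumption: the log directory named in this thread
    if os.path.isdir(log_dir):
        print(f"{percent_used(log_dir):.1f}% of the filesystem holding {log_dir} is used")
        for size, path in largest_files(log_dir):
            print(f"{size / (1024 * 1024):8.1f} MiB  {path}")
```

Run as root on the master, this prints how full the filesystem is and 
the five largest files under /var/log, which is usually enough to see 
whether a single log has run away.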

It's unlikely that any recent change was the culprit, as this was 
stock XCP 0.1.1.

I have to say that it's something else to reboot and debug an entire 
Cloud. I've dealt with wedged/crashed systems before on 
microcontrollers, small embedded devices, PCs, servers, mainframes 
and supercomputers, including virtualized systems. This is the first 
time I've had to debug and reboot an entire Cloud. 

The main lesson for me is that the debugging interface could be 
improved. This is one of the most critical aspects of any 
development environment.

Being able to get to a single user shell prompt easily from 
the "boot:" prompt would go a long way here.

    -dwight-



