xen-users

Re: [Xen-users] Scary!!! Lost domU!!!

Subject: Re: [Xen-users] Scary!!! Lost domU!!!
From: Jamon Camisso <jamonation@xxxxxxxxx>
Date: Sun, 03 Jan 2010 21:15:28 -0500
James Pifer wrote:
> On Fri, 2010-01-01 at 14:07 -0500, Jamon Camisso wrote:
> > Is there more than just the SLES server using both volumes? If not, have you considered using another filesystem? Personally I've had nothing but trouble with ocfs2 in Debian and CentOS -- clusters would just randomly fall apart. I've also found that unless filesystem throughput is very good, ocfs2 would end up losing writes by getting ahead of itself somehow. All depends on the storage backend I suppose.
>
> I think I know what happened in this case. After a lot of thought, I
> believe the blunder was mine. I remember working with this specific domU
> in early December. I was moving it from my dev machine with local
> storage to the cluster. I did not realize how much space it was actually
> using, so after copying I decided it would be best to leave it on local
> storage since it was not a super critical system.
>
> Here's where I'm speculating. Somewhere along the way I think I screwed
> up and did bring the domU up on the ocfs2 cluster, or I had already
> modified the config. I then started it back up before deleting the one I
> had just copied. I then tried to delete the copy on ocfs2 while it was
> running. Not sure why I may have stopped here when it did not delete,
> maybe sidetracked, don't know. In any case I'm thinking the files were
> marked for deletion.
>
> Then after Christmas I had to reboot the server for a different problem.
> When I stopped the domU, or during reboot, the file deletion actually
> took place. Thankfully I still had a copy of it. Wouldn't have been the
> end of the world except for the work of rebuilding it.
>
> I'm not sure if that is even possible but that's what I'm thinking.
>
> Other than that my ocfs2 cluster has been solid on SLES. Been using it
> for quite some time, well over a year I think.

That sounds plausible. I could see doing the same thing pretty easily. I use xm migrate (live) to make sure that there's only ever one copy of a domU running anywhere. That way I can definitively check from the dom0 which filesystem is being used too -- it must get messy with different storage pools, LVM volumes, raw tap:aio files, etc.
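
On the "is that even possible" question: on an ordinary local POSIX filesystem, deleting a file that a process still has open only removes its name. The data stays usable through the open handle and is freed when the last handle is closed. Below is a minimal Python sketch of that behaviour. It assumes a local Linux filesystem, the file name disk0.img is a hypothetical stand-in for a domU image, and it does not show how ocfs2 handles the same case across cluster nodes.

```python
import os
import tempfile


def unlink_while_open():
    """Delete a file that is still open and report what remains usable."""
    workdir = tempfile.mkdtemp()
    # Hypothetical stand-in for a domU disk image.
    path = os.path.join(workdir, "disk0.img")

    # Hold the "image" open, as a running domU's backend would.
    handle = open(path, "w+b")
    handle.write(b"domU data")
    handle.flush()

    # Remove the directory entry while the handle is still open.
    os.unlink(path)
    name_gone = not os.path.exists(path)

    # The contents are still reachable through the open handle.
    handle.seek(0)
    still_readable = handle.read()

    # Writes keep succeeding too, so the process notices nothing.
    handle.write(b" more")
    handle.seek(0)
    after_write = handle.read()

    # Closing the last handle is when the space is actually released.
    handle.close()
    os.rmdir(workdir)
    return name_gone, still_readable, after_write
```

If the same holds on the cluster filesystem, a domU whose image was deleted while running would carry on normally and the image would only vanish once the domU stopped, which would fit the reboot described above.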

The one doubt I have is the timeline involved. I suppose it is possible that the domU continued merrily along with a filesystem that was losing writes for the rest of the month (a couple of weeks?). It's too bad there isn't a copy of the filesystem around where you could see the logs to confirm it!

Good to hear you've got a backup and that you haven't had problems since the reboot :)

Jamon
