xen-users
Re: [Xen-users] Scary!!! Lost domU!!!
James Pifer wrote:
On Fri, 2010-01-01 at 14:07 -0500, Jamon Camisso wrote:
Is there more than just the sles server using both volumes? If not, have
you considered using another filesystem? Personally I've had nothing but
trouble with ocfs2 in Debian and CentOS -- clusters would just randomly
fall apart. I've also found that unless filesystem throughput is very
good, ocfs2 would end up losing writes by getting ahead of itself
somehow. All depends on the storage backend I suppose.
I think I know what happened in this case. After a lot of thought, I
believe the blunder was mine. I remember working with this specific domU
in early December. I was moving it from my dev machine with local
storage to the cluster. I did not realize how much space it was actually
using, so after copying I decided it would be best to leave it on local
storage since it was not a super critical system.
Here's where I'm speculating. Somewhere along the way I think I screwed
up and did bring the domU up on the ocfs2 cluster or I had already
modified the config. I then started it back up before deleting the one I
just copied. I then tried to delete the copy on ocfs2 while it was
running. Not sure why I may have stopped here when it did not delete,
maybe sidetracked, don't know. In any case I'm thinking the files were
marked for deletion.
Then after Christmas I had to reboot the server for a different problem.
When I stopped the domU, or during reboot, the file deletion actually
took place. Thankfully I still had a copy of it. Wouldn't have been the
end of the world except for the work of rebuilding it.
I'm not sure if that is even possible but that's what I'm thinking.
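
For what it's worth, this matches ordinary POSIX unlink behaviour: removing
a file that a process still has open only removes the directory entry, and
the data is freed once the last open handle is closed. Below is a minimal
Python sketch of that on a local temporary file. It does not involve Xen or
ocfs2, the function name is made up for illustration, and whether the
cluster filesystem behaved exactly this way here is an assumption.

```python
import os
import tempfile


def unlink_while_open():
    """Delete a file that is still open and show the handle keeps working.

    Illustrative only: uses a local temp file, not a domU image on ocfs2.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+") as handle:
        handle.write("disk image contents")
        handle.flush()

        # Remove the directory entry while the file is still open.
        os.unlink(path)
        gone_from_directory = not os.path.exists(path)

        # The open handle can still read the data.
        handle.seek(0)
        still_readable = handle.read()
    # Only now, with the last handle closed, is the data actually freed.
    return gone_from_directory, still_readable


if __name__ == "__main__":
    print(unlink_while_open())
```

A running domU holding its image open would play the role of the open
handle in this sketch, and stopping the domU would play the role of the
close.
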
Other than that my ocfs2 cluster has been solid on sles. Been using it
for quite some time, well over a year I think.
That sounds plausible. I could see doing the same thing pretty easily. I
use xm migrate (live) to make sure that there's only ever one copy of a
domU running anywhere. That way I can definitively check from the dom0
which filesystem is being used too -- it must get messy with different
storage pools, lvm volumes, raw tap:aio files etc.
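
One way to keep track of which backing store each domU uses is to compare
the disk lines across configs. The following is a rough Python sketch, not
an existing tool: it assumes the disk entries have already been read out of
the config files as strings in the usual "backend:path,device,mode" form,
and the domU names and paths in the usage note are invented.

```python
from collections import defaultdict


def backing_path(disk_spec):
    """Return the backing path from a spec like 'tap:aio:/srv/a.img,xvda,w'.

    Assumes the path itself contains no ':' or ',' characters.
    """
    target = disk_spec.split(",", 1)[0]   # drop the device and mode fields
    return target.rsplit(":", 1)[-1]      # drop the 'file:' / 'tap:aio:' prefix


def shared_backing_paths(domu_disks):
    """Map each backing path used by more than one domU to those domU names.

    domu_disks is a dict of domU name -> list of disk spec strings.
    """
    users = defaultdict(list)
    for name, specs in domu_disks.items():
        for spec in specs:
            users[backing_path(spec)].append(name)
    return {path: names for path, names in users.items() if len(names) > 1}
```

Called with something like {"web": ["tap:aio:/ocfs2/web.img,xvda,w"],
"web-copy": ["tap:aio:/ocfs2/web.img,xvda,w"]}, it would report
/ocfs2/web.img as being referenced by both configs.
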
The one doubt I have is the timeline involved. I suppose it is possible
that the domU continued merrily along with a filesystem that was losing
writes for the rest of the month (a couple weeks?). It's too bad there
isn't a copy of the filesystem around where you could see the logs to
confirm it!
Good to hear you've got a backup and that you haven't had problems since
the reboot :)
Jamon
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users