xen-users

Re: [Xen-users] Re: domU corrupt after server crash, help needed trying to recover domU

To: Rudi Ahlers <rudiahlers@xxxxxxxxx>
Subject: Re: [Xen-users] Re: domU corrupt after server crash, help needed trying to recover domU
From: Ciro Iriarte <cyruspy@xxxxxxxxx>
Date: Sun, 10 May 2009 15:08:27 -0400
Cc: "Fajar A. Nugraha" <fajar@xxxxxxxxx>, xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 10 May 2009 12:09:19 -0700
2009/5/10 Rudi Ahlers <rudiahlers@xxxxxxxxx>:
>
>
> On Sun, May 10, 2009 at 9:38 AM, Rudi Ahlers <rudiahlers@xxxxxxxxx> wrote:
>>
>>
>> On Sat, May 9, 2009 at 5:19 AM, Fajar A. Nugraha <fajar@xxxxxxxxx> wrote:
>>>
>>> On Sat, May 9, 2009 at 5:42 AM, Rudi Ahlers <rudiahlers@xxxxxxxxx> wrote:
>>> > Hi Fajar,
>>> >
>>> > I got the commands via Google search, so I didn't know that losetup was
>>> > only meant for file-backed storage.
>>>
>>> If it's a block device (LVM, partition, etc.) you can skip losetup and
>>> go directly to
>>> kpartx -av  /dev/data/hfserver2
>>
>> Really? Cool, now I've learned something :)
>>
>>>
>>> > Unfortunately there's no backups :(
>>>
>>> Ouch. Sorry to hear that.
>>> So that makes it what ... your second corruption?
>>> In my environment FS corruption is USUALLY caused by one of these:
>>> - human error (like the admin mounting the same block device twice on
>>> different servers). This usually happens on shared-storage systems
>>> (SAN, NAS, etc).
>>> - SAN error (like when it got temporarily disconnected, and then
>>> reconnected again)
>>> - server hardware error (bad memory, bad disk controller, etc.)
>>
>> Yes, but on a different server, different client, different reason. The
>> only thing that's the same is the IDC, and the server setup. Both have
>> CentOS on the host node, and run cPanel on the domU VPSes. I'd love to
>> set up a shared NAS and have 2 servers share the data from there, but funds
>> are a bit limited :(
>>
>>>
>>> I suggest you check all three to make sure corruption doesn't happen
>>> again. If both corruptions are on the same hardware, then most likely
>>> the server hardware is bad.
>>
>> The problem is due to the RAM. The ECC (non buffered) Kingston Memory
>> modules don't work as expected on the Dell PE860 platform. Strangely when I
>> put normal desktop RAM into the server, it worked fine. So, I'm taking the
>> RAM back to the supplier on Monday.
>>
>>>
>>> You could PROBABLY still salvage some data from the broken domU. Try
>>> shutting it down, and mount it again on dom0. Sometimes fsck will find
>>> recovered inodes in /lost+found, so perhaps some of your data is still
>>> there.
>>
>> Yes, I'm going to try this and see how far I can get.
>>
>>>
>>> BTW, VolGroup00 IS the name of domU's VG right? It's not dom0's VG?
>>> Cause if it were dom0's you might have more problems ahead.
>>
>> no, the hostnode's LVM has been renamed to /dev/data/root, /dev/home/swap
>> & /dev/data/home for this very reason
>>
>>>
>>> Regards,
>>>
>>> Fajar
>>
>
>
> Just as a matter of interest, the number of recovered files in
> /mnt/cpanel/lost+found/ (this is the mounted VolGroup00 partition) is 32536
>
> And the files all look like this:
>
> -rw-r----- 1 root       32046   94816 Feb 12 10:28 #1865704
> -rw-r--r-- 1 root     root         91 Feb 12 10:28 #1865705
> -rw-r----- 1 root       32052   94816 Feb 12 10:28 #1865707
> -rw-r----- 1 root       32022     901 Feb 12 10:28 #1865709
> -rw-r----- 1 root       32052   94816 Feb 12 10:28 #1865710
> -rw-r----- 1 root       32013   94816 Feb 12 10:28 #1865711
> -rw-r----- 1 root       32013   94816 Feb 12 10:28 #1865713
> -rw-r--r-- 1 root     root         91 Feb 12 10:28 #1865714
> -rw-r----- 1 root       32037   94816 Feb 12 10:28 #1865715
> -rw-r----- 1 root         506   94816 Feb 12 10:44 #1865716
> -rw-r----- 1 root       32037    6202 Feb 12 10:28 #1865717
> -rw-r--r-- 1 root     root         92 Feb 12 10:44 #1865718
> -rw-r--r-- 1 root     root        105 Feb 12 10:44 #1865719
> -rw-r----- 1 root       32029   94816 Feb 12 10:44 #1865721
> -rw-r----- 1 root       32029   94816 Feb 12 10:44 #1865722
> -rw-r----- 1 root       32029   94816 Feb 12 10:44 #1865723
> -rw-r-0m#1865794
> -rw-r----- 1 root       32029   94816 Feb 12 16:47 #1865795
> [root@xen cpanel]# ll lost+found/ | wc -l
>
>
>
> What can I do with these files, apart from deleting them?
>
> --
> Kind Regards
> Rudi Ahlers
> CEO, SoftDux Hosting
> Web: http://www.SoftDux.com
> Office: 087 805 9573
> Cell: 082 554 7532
>

Those were put there by fsck: they are files it recovered but could no
longer link to a name, so they are named after their inode numbers.
Your only hope is to run the "file" command on each of them to try to
guess what they are. If they are not corrupt, you can copy them back to
where they belong, but there's no automatic procedure for this...
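
With 32536 entries, checking them one at a time is not practical, so a
rough first pass could be to count how many files of each type there
are. This is only a sketch: the function name summarize_lost_found is
made up for illustration, and it assumes GNU find and the "file" utility
are installed on dom0.

```shell
#!/bin/sh
# Count the recovered files in a lost+found directory by detected type.
# summarize_lost_found is a hypothetical helper name, not an existing tool.
summarize_lost_found() {
    dir=$1
    # file -b prints only the detected type (no file name), so identical
    # types collapse into a single counted line under uniq -c.
    find "$dir" -maxdepth 1 -type f -exec file -b {} + \
        | sort | uniq -c | sort -rn
}
```

Called as "summarize_lost_found /mnt/cpanel/lost+found", it prints one
line per detected type, most common first. It only reads the files, so
it is safe to run before deciding what to copy back.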

Regards,


-- 
Ciro Iriarte
http://cyruspy.wordpress.com
--

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users