[Xen-users] Disk errors on one Xen domain

To:	xen-users@xxxxxxxxxxxxxxxxxxx
Subject:	[Xen-users] Disk errors on one Xen domain
From:	Nicolas Michel <nicolas.michel@xxxxxxxxx>
Date:	Tue, 25 May 2010 08:43:48 +0200
Delivery-date:	Mon, 24 May 2010 23:45:31 -0700
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxxx
Sender:	xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4

Hello,

I have 3 physical servers with some virtual machines on each.
When I look at dmesg on one of them, I get these errors:

*************************************************************************
[34783.559174] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[34783.559248] hda: task_in_intr: error=0x04 { AbortedCommand }
[34783.559289] ide: failed opcode was: 0xec
[121232.732355] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[121232.732413] hda: task_in_intr: error=0x04 { AbortedCommand }
[121232.732455] ide: failed opcode was: 0xec
[207708.187565] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[207708.187623] hda: task_in_intr: error=0x04 { AbortedCommand }
[207708.187664] ide: failed opcode was: 0xec
[294224.164969] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[294224.165029] hda: task_in_intr: error=0x04 { AbortedCommand }
[294224.165075] ide: failed opcode was: 0xec
[380705.378232] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[380705.378232] hda: task_in_intr: error=0x04 { AbortedCommand }
[380705.378232] ide: failed opcode was: 0xec
[467193.505658] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[467193.505717] hda: task_in_intr: error=0x04 { AbortedCommand }
[467193.505758] ide: failed opcode was: 0xec
[553683.657031] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[553683.657091] hda: task_in_intr: error=0x04 { AbortedCommand }
[553683.657132] ide: failed opcode was: 0xec
[640176.673218] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[640176.673218] hda: task_in_intr: error=0x04 { AbortedCommand }
[640176.673218] ide: failed opcode was: 0xec
[726657.593721] hda: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
[726657.593721] hda: task_in_intr: error=0x04 { AbortedCommand }
[726657.593721] ide: failed opcode was: 0xec
******************************************************************
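
One way to check whether these errors recur on a schedule is to take the differences between the kernel timestamps. The following is only a minimal sketch in Python: the status lines are shortened copies of the ones in the excerpt, and the helper names (burst_timestamps, intervals) are made up for illustration. Opcode 0xec is the ATA IDENTIFY DEVICE command, so a regular spacing might point to a periodic job querying hda rather than random media failures, but that is an assumption and not something the log proves.

```python
import re

# Shortened copies of the status=0x51 lines from the dmesg excerpt.
DMESG_EXCERPT = """\
[34783.559174] hda: task_in_intr: status=0x51
[121232.732355] hda: task_in_intr: status=0x51
[207708.187565] hda: task_in_intr: status=0x51
[294224.164969] hda: task_in_intr: status=0x51
[380705.378232] hda: task_in_intr: status=0x51
[467193.505658] hda: task_in_intr: status=0x51
[553683.657031] hda: task_in_intr: status=0x51
[640176.673218] hda: task_in_intr: status=0x51
[726657.593721] hda: task_in_intr: status=0x51
"""

# Matches "[seconds.micros] hda: task_in_intr: status=..." and captures the time.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+hda: task_in_intr: status=")


def burst_timestamps(text):
    """Return the timestamp (seconds since boot) of each status line."""
    stamps = []
    for line in text.splitlines():
        match = STAMP.match(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def intervals(stamps):
    """Return the gaps in seconds between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


stamps = burst_timestamps(DMESG_EXCERPT)
gaps = intervals(stamps)
for gap in gaps:
    # Print each gap in seconds and in hours.
    print(f"{gap:10.1f} s  ({gap / 3600:.2f} h)")
```

On the excerpt above, every gap comes out close to 24 hours.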

You'll see the full dmesg output in the attached file.

I found some comments on Google about these errors saying that they mean the disk is dying. But this is a relatively recent server (1 year old) with 6 disks in RAID 10.

Since I put that server into production, it has crashed 3 times. Each time it still responds to pings, but there is no SSH access (neither on the Xen domain nor on the virtual machines). Some services on the virtual machines continue to respond, others don't. The only solution is a hard reboot.

dmesg-xen.txt
Description: Text document

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
