Hello everyone,
I'm hoping to find an answer to my problem here. I'm currently installing
Xen 2.0.7 on 30 dual-Opteron machines with 4 GB of memory. The machines have
a Tyan K8SR motherboard with a Silicon Image 3114 SATA controller and 2 x
300 GB Western Digital SATA drives. Each Xen server runs 5 domUs, some
of which have high traffic and therefore heavy disk usage. On certain
machines, after a while, I get the following errors in dom0's dmesg:
ATA: abnormal status 0xD0 on port 0xD080EC87
ATA: abnormal status 0xD0 on port 0xD080EC87
ATA: abnormal status 0xD0 on port 0xD080EC87
ata1: command 0x35 timeout, stat 0x51 host_stat 0x61
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
SCSI error : <0 0 0 0> return code = 0x8000002
sda: Current: sense key=0xb
ASC=0x0 ASCQ=0x0
end_request: I/O error, dev sda, sector 449410141
In dom0's /var/log/messages, I also have this:
Sep 19 00:15:59 x8 kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Sep 19 00:15:59 x8 kernel: ata1: error=0x04 { DriveStatusError }
Sep 19 00:15:59 x8 kernel: SCSI error : <0 0 0 0> return code = 0x8000002
Sep 19 00:15:59 x8 kernel: end_request: I/O error, dev sda, sector 449410141
Once that happens, some of the xenUs report the following in their dmesg:
Buffer I/O error on device sda1, logical block 6848542
lost page write due to I/O error on sda1
end_request: I/O error, dev sda1, sector 55399888
After a while the affected xenU instance either freezes or remounts its
filesystem read-only, and I have to restart it. The dom0 machine never
locks up; only the xenUs are affected.
Does anyone have any idea why these errors occur? Could this be libata
related, or is it a kernel problem? Xen 2.0.7 comes with the 2.6.11.12
kernel, and I'm not sure whether there is a bug in it.
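For anyone reading the log, here is a minimal Python sketch that decodes the status and error bytes quoted above. The bit names follow the standard ATA status and error register layout; the table and function names (STATUS_BITS, ERROR_BITS, decode) are only illustrative and are not taken from libata.

```python
# Bit masks for the ATA status register, highest bit first.
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data (obsolete)
    0x02: "IDX",   # index (obsolete)
    0x01: "ERR",   # error, details are in the error register
}

# Bit masks for the ATA error register, highest bit first.
ERROR_BITS = {
    0x80: "ICRC",   # interface CRC error
    0x40: "UNC",    # uncorrectable data error
    0x20: "MC",     # media changed
    0x10: "IDNF",   # sector ID not found
    0x08: "MCR",    # media change requested
    0x04: "ABRT",   # command aborted
    0x02: "TK0NF",  # track 0 not found
    0x01: "AMNF",   # address mark not found
}


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


if __name__ == "__main__":
    # Values copied from the dmesg output in this post.
    print("status 0xD0:", decode(0xD0, STATUS_BITS))
    print("status 0x51:", decode(0x51, STATUS_BITS))
    print("error  0x04:", decode(0x04, ERROR_BITS))
```

Read this way, 0xD0 has the busy bit set together with ready and seek-complete, 0x51 is ready, seek-complete and error, and 0x04 in the error register is the aborted-command bit, which matches the names the kernel prints in braces.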
Any help is appreciated!
Thanks,
Alex
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users