On Sat, Jan 29, 2011 at 07:03:16PM +0100, Jordan Pittier wrote:
> Thanks for your reply. LSI does indeed have a newer driver for the
> controller, but I can't build it: there's an error when I try to compile
> it [see attachment]. I will give it another try in the next few days.
>
> What is puzzling is that the I/O errors only occur with the Xen
> hypervisor. I am 100% willing to accept that the problem is the driver,
> but how come the exact same kernel (the xenified one) works fine without
> Xen loaded? I am almost a noob in kernel/driver matters, but I thought
> the drivers were entirely in the kernel.
>
Yep, the driver is entirely in the kernel, but that's not the whole story.
The Xen dom0 kernel does its IRQ handling through the Xen hypervisor,
so some drivers may behave differently on baremetal vs. dom0.
Also remember that dom0 is a *vm*, so some timing-related things might
happen differently on baremetal vs. dom0.
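
One way to see the IRQ difference is to compare the controller's row in
/proc/interrupts when booted baremetal and when booted as dom0. Below is a
minimal Python sketch for pulling that row out. The helper name irq_lines
and the device label "ioc0" are assumptions for illustration; check what
your own /proc/interrupts actually shows for the Fusion MPT controller.

```python
def irq_lines(interrupts_text, needle):
    """Return (irq, total_count, description) for rows mentioning `needle`.

    `interrupts_text` is the full content of /proc/interrupts.
    `needle` is a substring of the device name to look for; "ioc0" is
    only an assumed label for the Fusion MPT controller.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return []
    # The header row lists one column per CPU: CPU0 CPU1 ...
    ncpus = len(lines[0].split())
    found = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or needle not in rest:
            continue
        fields = rest.split()
        counts = fields[:ncpus]
        # Summary rows such as ERR or MIS have fewer count columns.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        total = sum(int(c) for c in counts)
        found.append((label.strip(), total, " ".join(fields[ncpus:])))
    return found

# Example use on a live system:
#   with open("/proc/interrupts") as f:
#       print(irq_lines(f.read(), "ioc0"))
```

Running this once on baremetal and once under the hypervisor, and comparing
the description column, should show whether the interrupt is delivered
through a different mechanism in dom0.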
> I will try with the latest kernel in a few days.
>
> SLES11SP1 ships mptfusion 4.22
> (http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage)
> I don't know about RHEL.
>
What driver version does the squeeze kernel have?
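
If it helps, here is a small Python sketch for reading the version a loaded
module reports through sysfs and comparing it against another version
string. It assumes the module is named "mptspi" and that it exposes
/sys/module/mptspi/version; the helper names are made up for this example,
and the dotted-number comparison only works for purely numeric versions
like "3.04.06".

```python
import os

def module_version(name, sysfs_root="/sys/module"):
    """Return the version string a loaded module exposes in sysfs, or None.

    Only modules that declare a version have this file, so None means
    "not loaded or no version declared", not necessarily an error.
    """
    path = os.path.join(sysfs_root, name, "version")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

def version_tuple(version):
    """Turn a dotted numeric version such as "3.04.06" into (3, 4, 6)."""
    return tuple(int(part) for part in version.split("."))

# Example use; "mptspi" is an assumed module name:
#   running = module_version("mptspi")
#   if running and version_tuple(running) < version_tuple("4.22"):
#       print("older than the SLES11SP1 driver:", running)
```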
-- Pasi
> On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> > On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
> >> Hi,
> >> I have been tracking a bug affecting all my servers running Debian
> >> Squeeze for more than a month now, and I desperately need your help :)
> >> I have 10 Sun v20z servers (2*66GB SCSI disks in RAID 1 == mirror). 4 of
> >> them are running Debian Squeeze with the latest Xen Debian kernel
> >> (2.6.32-5-xen-amd64 == 2.6.32-29). The rest are running Debian Lenny
> >> (2.6.26-2-xen-amd64 == 2.6.26-26lenny1).
> >> On a Squeeze box, under very high I/O (such as running an I/O stress
> >> test, i.e. bonnie++), the server starts behaving weirdly and I see
> >> messages like these in kernel.log: [see attachment]. Then the server
> >> becomes totally unresponsive (but doesn't "freeze") and commands such
> >> as "ls" or "reboot" don't work anymore. I have to do a hard reboot.
> >> After the server has rebooted, the RAID array seems degraded (I am
> >> using the mpt-status command) and starts rebuilding. After several
> >> hours, the RAID array is "fine" ("clean"). The RAID controller is
> >> "LSI53C1030" U320, with driver "Fusion MPT SPI Host driver 3.04.06".
> >> I have attached the result of "lsmod".
> >> None of my Lenny boxes are affected by this issue; all of my Squeeze
> >> boxes are.
> >> What does it have to do with Xen? When I boot my Squeeze boxes without
> >> the Xen hypervisor but with the same Xen kernel, bonnie++ runs
> >> absolutely fine.
> >> The issue appears only with the Xen hypervisor loaded.
> >> There is a Debian bug report for this:
> >> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727
> >> Any suggestions?
> >
> > Did you check if LSI has a newer driver version available?
> >
> > Also, you might check which driver version RHEL6 or SLES11SP1
> > ships with, for example. Both of those distros have 2.6.32 kernels too.
> >
> > On one of my testboxes I need to upgrade the LSI driver
> > to a newer version to make it work. This is SAS based LSI though.
> >
> > Can you try using another disk controller?
> >
> > Also: Did you try using the latest kernel (-30) ?
> >
> > -- Pasi
> >
> >