xen-users

Re: [Xen-users] Possible bug with scsi disk and Xen

To: Pasi Kärkkäinen <pasik@xxxxxx>
Subject: Re: [Xen-users] Possible bug with scsi disk and Xen
From: Jordan Pittier <jordan.pittier@xxxxxxxxx>
Date: Sat, 29 Jan 2011 19:03:16 +0100
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Thanks for your reply. LSI does indeed have a newer driver for the
controller, but I can't build it: compilation fails with an error [see
attachment]. I will try again in the next few days.

What is puzzling is that the IO errors only occur with the Xen
hypervisor loaded. I am 100% willing to accept that the problem is in
the driver, but how come the exact same kernel (the xenified one) works
fine without Xen loaded? I am almost a noob with kernels and drivers,
but I thought the drivers lived entirely in the kernel.

I will try with the latest kernel in a few days.

SLES11SP1 ships mptfusion 4.22
(http://www.novell.com/linux/releasenotes/x86_64/SUSE-SLES/11-SP1/#driver-updates-storage).
I don't know about RHEL.
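
To compare those shipped versions with what a box is actually running,
something like the following could be used on both the Xen boot and the
non-Xen boot. It is only a rough sketch under assumptions that are not
confirmed anywhere in this thread: that the controller driver is loaded
as a module named mptspi which exposes /sys/module/mptspi/version, and
that /sys/hypervisor/type is present when the kernel runs under Xen. The
helper names (driver_version, hypervisor_type, parse_version) are made up
for illustration.

```python
from pathlib import Path


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def driver_version(module, sysfs_root="/sys"):
    # Assumption: the module is loaded and declares a version, in which
    # case sysfs exposes it as /sys/module/<module>/version.
    return read_text(Path(sysfs_root) / "module" / module / "version")


def hypervisor_type(sysfs_root="/sys"):
    # Assumption: this file exists and reads "xen" when booted under the
    # Xen hypervisor, and is absent on a bare-metal boot.
    return read_text(Path(sysfs_root) / "hypervisor" / "type")


def parse_version(text):
    """Turn a dotted version such as '3.04.06' into a comparable tuple."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


if __name__ == "__main__":
    # "mptspi" is an assumed module name for the Fusion MPT SPI host driver.
    loaded = driver_version("mptspi")
    print("hypervisor:", hypervisor_type() or "none detected")
    print("mptspi version:", loaded or "not available")
    if loaded:
        # 4.22 is the mptfusion version SLES11SP1 is said to ship.
        print("older than 4.22:", parse_version(loaded) < parse_version("4.22"))
```

Running it once per boot configuration would at least confirm that the
same driver version is in use with and without the hypervisor.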

On Sat, Jan 29, 2011 at 6:02 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> On Sat, Jan 29, 2011 at 04:27:25PM +0100, Jordan Pittier wrote:
>>    Hi,
>>    I have been tracking a bug affecting all my servers running Debian Squeeze
>>    for more than a month now, and I desperately need your help :)
>>    I have 10 Sun v20z servers (2*66GB SCSI disks in RAID 1 == mirror). 4 of
>>    them are running Debian Squeeze with the latest Xen Debian kernel
>>    (2.6.32-5-xen-amd64 == 2.6.32-29). The rest are running Debian Lenny
>>    (2.6.26-2-xen-amd64 == 2.6.26-26lenny1).
>>    On a Squeeze box, under very high IO (such as running an IO stress test,
>>    e.g. bonnie++), the server starts behaving weirdly and I see messages like
>>    these in kernel.log: [see attachment]. Then the server becomes totally
>>    unresponsive (but doesn't "freeze") and commands such as "ls" or "reboot"
>>    don't work anymore. I have to do a hard reboot. After the server has
>>    rebooted, the RAID array seems degraded (I am using the mpt-status command)
>>    and starts rebuilding. After several hours, the RAID array is "fine"
>>    ("clean"). The RAID controller is an "LSI53C1030" U320, with driver "Fusion
>>    MPT SPI Host driver 3.04.06". I have attached the result of "lsmod".
>>    None of my Lenny boxes are affected by this issue; all of my Squeeze boxes
>>    are.
>>    What does it have to do with Xen? When I boot my Squeeze boxes without
>>    the Xen hypervisor but with the same Xen kernel, bonnie++ runs absolutely
>>    fine. The issue appears only with the Xen hypervisor loaded.
>>    There is a Debian bug report for this:
>>    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727
>>    Any suggestions?
>
> Did you check if LSI has a newer driver version available?
>
> Also, you might check which driver version RHEL6 or SLES11SP1
> ships with, for example. Both of those distros have 2.6.32 kernels too.
>
> On one of my testboxes I need to upgrade the LSI driver
> to a newer version to make it work. This is a SAS-based LSI though.
>
> Can you try using another disk controller?
>
> Also: did you try using the latest kernel (-30)?
>
> -- Pasi
>
>
