WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] Re: aic79xx failures with pvops dom0 2.6.32.25

To: Micah Anderson <micah@xxxxxxxxxx>
Subject: Re: [Xen-devel] Re: aic79xx failures with pvops dom0 2.6.32.25
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Mon, 22 Nov 2010 11:28:29 -0500
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 22 Nov 2010 08:30:35 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <87y68my4rv.fsf@xxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <87eiafzp49.fsf@xxxxxxxxxxxxxxxx> <615023213.20101120224150@xxxxxxxxxxxxxx> <87y68my4rv.fsf@xxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
> [   16.572048] scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
> [   16.572051]         <Adaptec AIC7902 Ultra320 SCSI adapter>
> [   16.572053]         aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133MHz, 512 SCBs
> [   16.572598] aic79xx 0000:03:02.1: found PCI INT B -> IRQ 5

That is a rather odd IRQ number, and it is shared amongst all of the devices.
Is there an "OS Compatibility" BIOS option where you can select Linux?
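
To see at a glance which devices ended up on the same IRQ, the "sharing IRQ"
lines in a dmesg capture can be grouped with a few lines of Python. This is
only a minimal sketch: the function name devices_by_irq is hypothetical, and
it assumes the kernel prints the message in the same
"<device>: sharing IRQ <n> with <device>" form as in the log quoted in this
mail.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   aic79xx 0000:03:02.1: sharing IRQ 5 with 0000:02:01.0
# The message format is assumed from the log quoted in this mail.
SHARING_RE = re.compile(
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"sharing IRQ (?P<irq>\d+) with "
    r"(?P<other>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
)


def devices_by_irq(dmesg_text):
    """Return {irq: sorted list of PCI addresses reported as sharing it}."""
    groups = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = SHARING_RE.search(line)
        if match:
            irq = int(match.group("irq"))
            # Both the reporting device and the device it shares with
            # belong to the same IRQ group.
            groups[irq].update((match.group("dev"), match.group("other")))
    return {irq: sorted(devs) for irq, devs in groups.items()}
```

Feeding it the output of dmesg (for example via
devices_by_irq(open("dmesg.txt").read()), where dmesg.txt is a saved capture)
gives one entry per IRQ, which makes an unusually crowded line easy to spot.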

So MA Young found an interesting issue where the IRQs below 16 don't get
programmed correctly, which was fixed in .. some unstable version, but before
we go that route:
1) Go into the serial console, hit Ctrl-A, then '?', then '*', and send the
output. I am curious to see whether the IO-APIC ends up with a proper vector:
during bootup it looks to be set to nothing, but by now it should have a good
value.
> [   16.572681] aic79xx 0000:03:02.1: sharing IRQ 5 with 0000:02:01.0
> [   16.572697] aic79xx 0000:03:02.1: sharing IRQ 5 with 0000:02:03.0
> [   16.572718] aic79xx 0000:03:02.1: sharing IRQ 5 with 0000:02:03.1
> [   16.572733] aic79xx 0000:03:02.1: sharing IRQ 5 with 0000:03:02.0
> [   31.692043] scsi9 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
> [   31.692046]         <Adaptec AIC7902 Ultra320 SCSI adapter>
> [   31.692048]         aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133MHz, 512 SCBs
> [   31.693434] initcall ahd_linux_init+0x0/0x21d [aic79xx] returned 0 after 29539948 usecs
> [   34.820009] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen

So the issue appears on ata1.00 and ata2.00. Those are not handled by the
aic79xx driver, but rather by the ATA one.
> [   34.820009] ata1.00: failed command: READ FPDMA QUEUED
> [   34.820009] ata1.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in
> [   34.820009]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [   34.820009] ata1.00: status: { DRDY }
> [   34.820009] ata1: hard resetting link
> [   34.837701] ata2.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [   34.837701] ata2.00: failed command: READ FPDMA QUEUED
> [   34.837701] ata2.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in
> [   34.837701]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [   34.837701] ata2.00: status: { DRDY }
> [   34.837701] ata2: hard resetting link
> [   35.320058] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   35.320248] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [   35.360130] ata2.00: configured for UDMA/133
> [   35.360147] ata2.00: device reported invalid CHS sector 0
> [   35.360165] ata2: EH complete
> [   35.360562] ata1.00: configured for UDMA/133
> [   35.360574] ata1.00: device reported invalid CHS sector 0
> [   35.360590] ata1: EH complete
> [   37.816044] scsi 0:0:0:0: Attempting to queue an ABORT message:CDB: 0x12 0x0 0x0 0x0 0x24 0x0
> [   37.817596] scsi 0:0:0:0: Command already completed
> [   47.816038] scsi 0:0:0:0: Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
> [   47.817578] scsi 0:0:0:0: Command already completed
> [   47.817616] scsi 0:0:0:0: Attempting to queue a TARGET RESET message:CDB: 0x12 0x0 0x0 0x0 0x24 0x0

However, here the SCSI error handler kicks in too and starts complaining
because of timeouts.

It looks like there could be two issues:
 1). IRQs are not delivered. For that, use the serial console and hit '*' to
     get the IO-APIC output.
 2). IRQs are delivered, but the memory location the cards DMA-ed data to
     contains garbage. Try passing 'swiotlb=force' on the Dom0 kernel command
     line. That will force all of the DMA to go through the SWIOTLB buffer.
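
After rebooting with the extra option it is worth confirming that Dom0
actually picked it up. The following is a minimal Python sketch of such a
check, assuming the Dom0 kernel exposes its command line in /proc/cmdline as
Linux normally does. The helper names kernel_param and swiotlb_forced are
hypothetical, and treating the value as a comma-separated list (so that a
form like 'swiotlb=<size>,force' also matches) is an assumption as well.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Returns None if the parameter is absent, "" for a bare flag, and the
    text after '=' otherwise. If the parameter is repeated, this helper
    returns the last occurrence.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


def swiotlb_forced(cmdline):
    """True if 'force' appears among the comma-separated swiotlb options."""
    value = kernel_param(cmdline, "swiotlb")
    return value is not None and "force" in value.split(",")
```

Running swiotlb_forced(open("/proc/cmdline").read()) inside Dom0 should
return True once the option is in place. If it returns False, the parameter
most likely ended up on the hypervisor line of the boot loader entry rather
than on the Dom0 kernel line.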

> [   47.817652] scsi0: Device reset code sleeping
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
