WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding sh

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>, "Dante Cinco" <dantecinco@xxxxxxxxx>
Subject: Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding showing in-flight=1, (---M)
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Tue, 29 Jun 2010 09:42:10 +0100
Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 29 Jun 2010 01:42:45 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTikFlc5J9L2V3LhyXULstvkQMNMBBbwM6N6mm9_1@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTikFlc5J9L2V3LhyXULstvkQMNMBBbwM6N6mm9_1@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> On 28.06.10 at 20:22, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
> I have an HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) with Xen 4.0.0
> and dom0 Linux 2.6.32.12 x86_64 pvops and domU Linux kernel 2.6.30.1 x86_64.
> I'm using PCI passthrough (pci-stub) to pass my 4-port 8Gb PMC-Sierra Fibre
> Channel HBA to domU. After running I/Os for several hours, both dom0 and
> domU hang and the Xen console shows the interrupt binding below where IRQ
> 66 shows in-flight=1 and mask set (---M). What's the best way to debug this
> problem?

There are potentially two problems here: One is that the guest may
fail to send the EOI notification. You would want to check whether
pirq_guest_eoi() got run after that last occurrence of the interrupt.

The more worrying part is that Xen should time out on a guest failing
to send the EOI notification, and ack the interrupt nevertheless.
Looking at the code I fail to see how the ack_APIC_irq() would get
sent in this case: non-maskable MSIs get this issued from
end_msi_irq(), but ->end doesn't get invoked from
irq_guest_eoi_timer_fn() (only ->enable does). Keir, am I missing
something?
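
To make the concern concrete, here is a minimal toy model in C. It is not Xen code: the struct, the hook table, and every toy_* name are hypothetical, and the only assumption carried over from above is that ->end is where the LAPIC ack for a non-maskable MSI would be issued while ->enable does not issue it. The sketch only shows that a timeout path calling ->enable alone leaves the ack outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, none is Xen's. */
struct toy_irq {
    bool unmasked;     /* set by the ->enable hook */
    bool lapic_acked;  /* set once the deferred LAPIC ack is issued */
};

struct toy_hw_ops {
    void (*enable)(struct toy_irq *irq);
    void (*end)(struct toy_irq *irq);
};

/* Stand-in for ->enable: unmasks, but never touches the LAPIC. */
static void toy_msi_enable(struct toy_irq *irq) { irq->unmasked = true; }

/* Stand-in for ->end on a non-maskable MSI: issues the deferred ack. */
static void toy_msi_end(struct toy_irq *irq) { irq->lapic_acked = true; }

static const struct toy_hw_ops toy_msi_ops = { toy_msi_enable, toy_msi_end };

/*
 * Models the EOI timeout path. With also_call_end == false it mirrors
 * the reading described above (only ->enable is invoked); with true it
 * models a hypothetical variant that also invokes ->end.
 */
void toy_eoi_timeout(struct toy_irq *irq, const struct toy_hw_ops *ops,
                     bool also_call_end)
{
    assert(irq != NULL && ops != NULL);
    if (also_call_end)
        ops->end(irq);
    ops->enable(irq);
}

/* Runs one timeout on a fresh interrupt and reports whether it was acked. */
bool toy_ack_issued_after_timeout(bool also_call_end)
{
    struct toy_irq irq = { false, false };
    toy_eoi_timeout(&irq, &toy_msi_ops, also_call_end);
    return irq.lapic_acked;
}
```

In this model toy_ack_issued_after_timeout(false) returns false and toy_ack_issued_after_timeout(true) returns true. Whether the real timeout path behaves like the first case is exactly the open question.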

On the other hand, I can't see how this can work reliably in the first place: Since
there's no other way to mask such interrupts, sending an ack to the
LAPIC could result in an interrupt storm. Disabling MSI on the
affected device isn't a good option either, as we know there are
devices that switch to legacy IRQ mode irreversibly in that case,
and hence the device becomes unusable (presumably until being
reset). But very likely this would still be better than hanging the
entire box; it probably would just need a more graceful timeout.

Jan

> (XEN)    IRQ:  66 affinity:00000000,00000000,00000000,00000001 vec:b9 type=PCI-MSI         status=00000010 in-flight=1 domain-list=1: 79(---M),
> (XEN)    IRQ:  67 affinity:00000000,00000000,00000000,00000004 vec:d9 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 78(----),
> (XEN)    IRQ:  68 affinity:00000000,00000000,00000000,00000010 vec:22 type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 77(----),
> (XEN)    IRQ:  69 affinity:00000000,00000000,00000000,00000040 vec:2a type=PCI-MSI         status=00000010 in-flight=0 domain-list=1: 76(----),
> 
> (XEN) 07:00.3 - dom 1   - MSIs < 69 >
> (XEN) 07:00.2 - dom 1   - MSIs < 68 >
> (XEN) 07:00.1 - dom 1   - MSIs < 67 >
> (XEN) 07:00.0 - dom 1   - MSIs < 66 >
> 
> (XEN)  MSI    66 vec=b9  fixed  edge   assert phys    cpu dest=00000000 mask=0/0/-1
> (XEN)  MSI    67 vec=d9  fixed  edge   assert phys    cpu dest=00000004 mask=0/0/-1
> (XEN)  MSI    68 vec=22  fixed  edge   assert phys    cpu dest=00000002 mask=0/0/-1
> (XEN)  MSI    69 vec=2a  fixed  edge   assert phys    cpu dest=00000006 mask=0/0/-1
> 
> Thanks.
> 
> Dante
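
When comparing several dumps like the one quoted above, it may help to flag the suspicious bindings mechanically. The following is a rough sketch in C, assuming each interrupt binding record has been joined into one string (the mail wrapped each record across two lines) and that the fields look exactly as in the quoted output. The function name toy_binding_is_stuck is hypothetical and the parsing is deliberately naive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the record shows in-flight > 0 together with an 'M' in
 * the per-domain flags (e.g. "79(---M)"), 0 if it does not, and -1 if
 * the record does not have the expected fields. On success the IRQ
 * number is stored in *irq_out when irq_out is non-NULL.
 */
int toy_binding_is_stuck(const char *record, int *irq_out)
{
    assert(record != NULL);

    const char *irq_s = strstr(record, "IRQ:");
    const char *fl_s  = strstr(record, "in-flight=");
    const char *dl_s  = strstr(record, "domain-list=");
    int irq = -1, in_flight = 0;

    if (!irq_s || !fl_s || !dl_s)
        return -1;
    if (sscanf(irq_s + strlen("IRQ:"), "%d", &irq) != 1)
        return -1;
    if (sscanf(fl_s + strlen("in-flight="), "%d", &in_flight) != 1)
        return -1;

    /* Only the first "(....)" flag group after domain-list= is checked. */
    const char *open  = strchr(dl_s, '(');
    const char *close = open ? strchr(open, ')') : NULL;
    if (!open || !close)
        return -1;
    int masked = memchr(open, 'M', (size_t)(close - open)) != NULL;

    if (irq_out)
        *irq_out = irq;
    return in_flight > 0 && masked;
}
```

Applied to the four records in the dump, only the IRQ 66 record is flagged. A binding with several guests in its domain-list would need the flag check extended to every group.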




_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
