xen-devel
Re: [Xen-devel] domU and dom0 hung with Xen console interrupt binding showing in-flight=1, (---M)
>>> On 17.08.10 at 20:01, Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
> On 17/08/2010 18:28, "Bruce Edge" <bruce.edge@xxxxxxxxx> wrote:
>
>> On Tue, Jun 29, 2010 at 1:42 AM, Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
>>>>>> On 28.06.10 at 20:22, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
>>>> I have an HP Proliant DL380-G6 (dual Xeon E5540 @ 2.53GHz) with Xen 4.0.0
>>>> and dom0 Linux 2.6.32.12 x86_64 pvops and domU Linux kernel 2.6.30.1
>>>> x86_64.
>>>> I'm using PCI passthrough (pci-stub) to pass my 4-port 8Gb PMC-Sierra Fibre
>>>> Channel HBA to domU. After running I/Os for several hours, both dom0 and
>>>> domU hangs and the Xen console shows the interrupt binding below where IRQ
>>>> 66 shows in-flight=1 and mask set (---M). What's the best way to debug this
>>>> problem?
>>>
>>> There are potentially two problems here: One is that the guest may
>>> fail to send the EOI notification. You would want to check whether
>>> pirq_guest_eoi() got run after that last occurrence of the interrupt.
>>>
>>> The more worrying part is that Xen should time out on a guest failing
>>> to send the EOI notification, and ack the interrupt nevertheless.
>>> Looking at the code I fail to see how the ack_APIC_irq() would get
>>> sent in this case: non-maskable MSIs get this issued from
>>> end_msi_irq(), but ->end doesn't get invoked from
>>> irq_guest_eoi_timer_fn() (only ->enable does). Keir, am I missing
>>> something?
>
> I don't think that timer logic is designed to handle non-maskable MSIs, only
> maskable ones. It ought to be not too hard to fix it up for non-maskable
> ones too by issuing the ->end() call from the timer handler?

Yes, that was what I was trying to hint at, but I wasn't sure whether
calling ->end() here has any unintended side effects and/or requires
any extra care (like preventing a subsequent guest-initiated EOI from
calling ->end() again).
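
To make the "extra care" concrete, here is a minimal, single-threaded
sketch of one way to keep ->end() from running twice for the same
delivery. It is a model only, not Xen code: struct model_irq, the
end_pending flag and all model_* functions are hypothetical names
invented for illustration, and real hypervisor code would need the
descriptor lock (or an atomic test-and-clear) around the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one guest-bound interrupt; not Xen's irq_desc. */
struct model_irq {
    bool end_pending;  /* set on delivery, cleared by whoever ends it first */
    int  end_calls;    /* how many times the ->end() stand-in has run */
};

/* Stand-in for the ->end() hook (where the ack would be issued). */
static void model_end(struct model_irq *irq)
{
    irq->end_calls++;
}

/* The interrupt has been handed to the guest; an end is now owed. */
static void model_deliver(struct model_irq *irq)
{
    assert(irq != NULL);
    irq->end_pending = true;
}

/* Run the end hook at most once per delivery; true if this call ran it. */
static bool model_finish(struct model_irq *irq)
{
    assert(irq != NULL);
    if (!irq->end_pending)
        return false;          /* the other path already ended it */
    irq->end_pending = false;
    model_end(irq);
    return true;
}

/* Timeout path: the guest did not send its EOI in time. */
static bool model_timer_fn(struct model_irq *irq)
{
    return model_finish(irq);
}

/* Guest path: a (possibly late) guest-initiated EOI. */
static bool model_guest_eoi(struct model_irq *irq)
{
    return model_finish(irq);
}
```

With this shape, whichever of the timer and the guest EOI runs first
performs the end, and the later one becomes a no-op until the next
delivery sets the flag again.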

While looking at this I came across another thing I don't understand:
__pirq_guest_eoi(), for the ACKTYPE_EOI case, calls __set_eoi_ready()
inside a cpu_test_and_clear() conditional, but __set_eoi_ready() bails
out if it finds !cpu_test_and_clear() on the same bitmap. What's the
point of calling __set_eoi_ready() here then (or what am I missing)?
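
The pattern in question can be modelled in isolation as below. This is
a sketch of the control flow as described in this message, not a copy
of the Xen source: model_mask and the model_* functions are
hypothetical stand-ins. It assumes both calls really operate on the
same bitmap and that nothing sets the bit in between; if either
assumption fails, the conclusion does not carry over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU bitmap; not Xen's cpumask_t. */
typedef uint64_t model_mask;

/* Clear bit `cpu` in *mask and report whether it was set beforehand. */
static bool model_test_and_clear(unsigned cpu, model_mask *mask)
{
    assert(cpu < 64);
    model_mask bit = (model_mask)1 << cpu;
    bool was_set = (*mask & bit) != 0;
    *mask &= ~bit;
    return was_set;
}

/* Inner function: bails out (returns false) if the bit is not set. */
static bool model_set_eoi_ready(unsigned cpu, model_mask *mask)
{
    if (!model_test_and_clear(cpu, mask))
        return false;          /* bail out */
    return true;               /* the real work would happen here */
}

/*
 * Outer function: only calls the inner one when its own test-and-clear
 * on the same bitmap succeeds.
 * Returns 0 if the outer test failed, 1 if the inner call bailed out,
 * 2 if the inner call did its work.
 */
static int model_guest_eoi(unsigned cpu, model_mask *mask)
{
    if (!model_test_and_clear(cpu, mask))
        return 0;
    return model_set_eoi_ready(cpu, mask) ? 2 : 1;
}
```

Under those assumptions model_guest_eoi() can never return 2: the
outer test has already cleared the bit by the time the inner test
looks at it, so the inner call always bails out.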

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel