Keir, when I tried to get the IP address today, I suddenly found I can't
reproduce it anymore. Also, originally if I removed the code that triggers
the software LSC interrupt, the NIC could still work and get an IP address,
but now if I remove that code, the NIC doesn't work anymore.
It is really strange to me; I didn't change anything in the system, and I
don't know of any changes in the lab environment that could cause this.
But I could reproduce it every time before.
Really frustrated by this :-( . Do you think we still need to move the
config space access down? Now the only reason to move it down is that
ack_edge_ioapic_irq() did the mask, and this mask can make the HV more
robust.
Thanks
-- Yunhong Jiang
Jiang, Yunhong <> wrote:
> xen-devel-bounces@xxxxxxxxxxxxxxxxxxx <> wrote:
>> On 28/3/08 08:40, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>>
>>> The investigation result is:
>>> 1) if we mask and ack the interrupt, the interrupt happens 3 times; the
>>> last 2 are masked because they happened while the first one was still
>>> pending in the ISR's handler, and the system is OK.
>>
>> How can you tell it happened three times? If the interrupt is pending in
>> the ISR then only one further pending interrupt can become visible to
>> software, as there is only one pending bit per vector in the IRR.
>
> There are two types of MSI interrupt: one for receive/transmit, and
> one for everything else (this is the one that causes the storm). I added
> a printk for when an interrupt happens while the previous one is still in
> progress, then checked the printk count against the output in
> /proc/interrupts. The count in /proc/interrupts is only 1.
>
>>
>>> So I suppose the problem happens only if the interrupt is triggered by
>>> software. I consulted the HW engineer too but didn't get confirmation;
>>> the only answer I got is that PCI-E needs a rising edge before sending
>>> the 2nd interrupt :(
>>
>> That answer means very little to me. One interesting question to have
>> answered would be: is this a closed-loop or open-loop interrupt storm?
>> I.e., does the device somehow detect the APIC EOI and then trigger a
>> re-send of the MSI (closed loop), or is this an initialisation-time-only
>> open-loop storm where the device is spitting out the MSI regularly until
>> some device register gets written by the interrupt service routine?
>>
>> Given the circumstances, I'm inclined to think it is the latter.
>> Especially since I think the former is impossible, as APIC EOI is not
>> visible outside the processor unless the interrupt came from a
>> level-triggered IO-APIC pin, and even then the EOI would not be visible
>> across the PCI bus!
>>
>> Also it seems *very* likely that this is just an initialisation-time
>> thing, and the device probably behaves very nicely after it is
>> bootstrapped. In
>
> I can't tell, because this interrupt didn't happen again after the
> device is up. Maybe I can change the driver to do more experiments.
>
>> light of this I think we should treat MSI sources as ACKTYPE_NONE in
>> Xen (i.e., require no callback from guest to hypervisor on completion
>> of the interrupt handler). We can then handle the interrupt storm
>> entirely within the hypervisor by detecting the storm, masking the
>> interrupt, and only unmasking on some timeout.
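A minimal standalone sketch of that detect-and-throttle idea (count interrupts per window, mask above a threshold, unmask after a delay). All names, thresholds, and timings here are hypothetical illustrations, not actual Xen code:

```c
#include <stdint.h>

/* Illustrative per-IRQ throttle state for storm detection. */
#define STORM_THRESHOLD 100U            /* interrupts allowed per window */
#define WINDOW_NS       1000000ULL      /* 1 ms observation window */
#define UNMASK_DELAY_NS 10000000ULL     /* stay masked for 10 ms */

struct irq_throttle {
    uint64_t window_start;  /* start of the current counting window */
    uint64_t unmask_at;     /* earliest time a masked IRQ may be unmasked */
    unsigned int count;     /* interrupts seen in the current window */
    int masked;             /* 1 if we have masked the source */
};

/* Called on every interrupt; returns 1 if the source should be masked now. */
static int irq_storm_check(struct irq_throttle *t, uint64_t now)
{
    if (now - t->window_start > WINDOW_NS) {
        t->window_start = now;          /* roll over to a fresh window */
        t->count = 0;
    }
    if (++t->count > STORM_THRESHOLD && !t->masked) {
        t->masked = 1;
        t->unmask_at = now + UNMASK_DELAY_NS;
        return 1;                       /* caller masks the interrupt */
    }
    return 0;
}

/* Called from a periodic timer; returns 1 if the source may be unmasked. */
static int irq_storm_timeout(struct irq_throttle *t, uint64_t now)
{
    if (t->masked && now >= t->unmask_at) {
        t->masked = 0;
        t->count = 0;
        t->window_start = now;
        return 1;                       /* caller unmasks the interrupt */
    }
    return 0;
}
```

How aggressive the thresholds need to be is exactly the question about the interrupted EIP below: a tight closed-loop storm would need a much smaller window than a slow open-loop one.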
>>
>> In your tests, how aggressive was the IRQ storm? If you looked at the
>> interrupted EIP on each interrupt, was it immediately after the APIC was
>> EOIed and EFLAGS.IF set back to 1, or was it some time after? This tells
>> us how aggressively the device is sending out MSIs, and may determine
>> how cunning we need to be regarding interrupt storm detection.
>
> I will try that.
>
>>
>>> I'm not sure if there are any other BRAIN-DEAD devices like this; I
>>> only have this device to test the MSI-X function, but we may need to
>>> make sure it will not break the whole system.
>>
>> Yes, we have to handle this case, unfortunately.
>>
>>> The call-back to the guest is because we are using the ACK-new method
>>> to work around this issue. Yes, it is expensive. Also, this ACK-new
>>> method may cause deadlock, as Haitao suggested in the mail.
>>
>> Yes, that sucks. See my previous email -- if possible it would be great
>> to teach Xen enough about the PCI config space to be able to mask MSIs.
> In fact, Xen already tries to access config space, although that is
> still a bug currently. In VT-d, Xen tries to access FLR directly :)
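For reference, masking an MSI source through config space means finding the mask-bits register in the MSI capability (its offset depends on whether the message address is 64-bit). A sketch of that lookup, modelling config space as a flat byte array for illustration; the helper names are mine, not Xen's:

```c
#include <stdint.h>

/* MSI capability layout per the PCI spec (offsets relative to the cap). */
#define PCI_MSI_FLAGS          2        /* Message Control register */
#define PCI_MSI_FLAGS_64BIT    (1 << 7) /* 64-bit message address */
#define PCI_MSI_FLAGS_MASKBIT  (1 << 8) /* per-vector masking capable */
#define PCI_MSI_MASK_32        12       /* mask bits, 32-bit address layout */
#define PCI_MSI_MASK_64        16       /* mask bits, 64-bit address layout */

static uint16_t conf_read16(const uint8_t *cfg, unsigned int off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t conf_read32(const uint8_t *cfg, unsigned int off)
{
    return cfg[off] | (cfg[off + 1] << 8) | (cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

static void conf_write32(uint8_t *cfg, unsigned int off, uint32_t v)
{
    cfg[off]     = (uint8_t)v;
    cfg[off + 1] = (uint8_t)(v >> 8);
    cfg[off + 2] = (uint8_t)(v >> 16);
    cfg[off + 3] = (uint8_t)(v >> 24);
}

/* Set the mask bit for vector `vec` of the MSI capability at `cap`.
 * Returns 0 on success, -1 if the function has no per-vector masking
 * (in which case we would have to fall back to some other strategy). */
static int msi_mask_vector(uint8_t *cfg, unsigned int cap, unsigned int vec)
{
    uint16_t control = conf_read16(cfg, cap + PCI_MSI_FLAGS);
    unsigned int mask_off;

    if (!(control & PCI_MSI_FLAGS_MASKBIT))
        return -1;

    mask_off = cap + ((control & PCI_MSI_FLAGS_64BIT) ? PCI_MSI_MASK_64
                                                      : PCI_MSI_MASK_32);
    conf_write32(cfg, mask_off, conf_read32(cfg, mask_off) | (1u << vec));
    return 0;
}
```

Note the fallback path: per-vector masking is optional for MSI (unlike MSI-X), so the storm-throttling path cannot assume the mask bits exist.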
>
>>
>>> But if we move the config space handling to the HV, then we don't need
>>> this ACK-new method; that should be OK. But admittedly, that should be
>>> the last method we turn to, since config space should be owned by
>>> domain0.
>>
>> A partial movement into the hypervisor may be the best of a choice of
>> evils.
>
> Sure, we will do that!
>
>> -- Keir
>>
>>
>>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel