Attached this new one which should eliminate the race ultimately.
Xiantao
-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Zhang, Xiantao
Sent: Friday, October 16, 2009 5:50 PM
To: He, Qing
Cc: Cinco, Dante; xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP
ProLiant G6 with dual Xeon 5540 (Nehalem)
He, Qing wrote:
> On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
>> He, Qing wrote:
>>> On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
>>>> He, Qing wrote:
>>>>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>>>>>> According to the description, the issue should be caused by lost
>>>>>> EOI write for the MSI interrupt and leads to permanent interrupt
>>>>>> mask. There should be a race between guest setting new vector and
>>>>>> EOIs old vector for the interrupt. Once guest sets new vector
>>>>>> before it EOIs the old vector, hypervisor can't find the pirq
>>>>>> which corresponds old vector(has changed
>>>>>> to new vector) , so also can't EOI the old vector forever in
>>>>>> hardware level. Since the corresponding vector in real processor
>>>>>> can't be EOIed, so system may lose all interrupts and result the
>>>>>> reported issues ultimately.
>>>>>
>>>>>> But I remembered there should be a timer to handle this case
>>>>>> through a forcible EOI write to the real processor after timeout,
>>>>>> but seems it doesn't function in the expected way.
>>>>>
>>>>> The EOI timer is supposed to deal with the irq sharing problem,
>>>>> since MSI doesn't share, this timer will not be started in the
>>>>> case of MSI.
>>>>
>>>> That maybe a problem if so. If a malicious/buggy guest won't EOI
>>>> the MSI vector, so host may hang due to lack of timeout mechanism?
>>>
>>> Why does host hang? Only the assigned interrupt will block, and
>>> that's exactly what the guest wants :-)
>>
>> Hypervisor shouldn't EOI the real vector until guest EOI the
>> corresponding virtual vector , right ? Not sure.:-)
>
> Yes, it is the algorithm used today.
So it should be still a problem. If guest won't do eoi, host can't do eoi also,
and leads to system hang without timeout mechanism. So we may need to introduce
a timer for each MSI interrupt source to avoid hanging host, Keir?
> After reviewing the code, if the guest really does something like
> changing affinity within the window between an irq fire and eoi,
> there is indeed a problem, attached is the patch. Although I kinda
> doubt it, shouldn't desc->lock in guest protect and make these two
> operations mutual exclusive.
We shouldn't let hypervisor do real EOI before guest does the correponding
virtual EOI, so this patch maybe have a correctness issue. :-)
Attached the fix according to my privious guess, and it should fix the issue.
Xiantao
fix-irq-affinity-msi3.patch
Description: fix-irq-affinity-msi3.patch
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|