WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

To: "Cinco, Dante" <Dante.Cinco@xxxxxxx>, "He, Qing" <qing.he@xxxxxxxxx>
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
From: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Date: Sat, 17 Oct 2009 08:59:08 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Dante,
It looks like a different issue, as you described.  Can you try the following 
change to see whether it works for you?  Just an experiment.
Xiantao

diff -r 0705efd9c69e xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c    Fri Oct 16 09:04:53 2009 +0100
+++ b/xen/arch/x86/hvm/hvm.c    Sat Oct 17 08:48:23 2009 +0800
@@ -243,7 +243,7 @@ void hvm_migrate_pirqs(struct vcpu *v)
             continue;
         irq = desc - irq_desc;
         ASSERT(MSI_IRQ(irq));
-        desc->handler->set_affinity(irq, *cpumask_of(v->processor));
+        //desc->handler->set_affinity(irq, *cpumask_of(v->processor));
         spin_unlock_irq(&desc->lock);
     }
     spin_unlock(&d->event_lock);

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Cinco, Dante
Sent: Saturday, October 17, 2009 2:24 AM
To: Zhang, Xiantao; He, Qing
Cc: Keir; xen-devel@xxxxxxxxxxxxxxxxxxx; Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

Xiantao,
I'm still losing the interrupts with your patch, but I see some differences. To 
simplify the data, I'm only going to focus on the first function of my 
4-function PCI device.

After changing the IRQ affinity, the IRQ is no longer masked (unlike before 
the patch). What stands out to me is that the new vector (219) reported by 
"guest interrupt information" does not match the vector (187) in dom0 lspci. 
Before the patch, the new vector in "guest interrupt information" matched the 
new vector in dom0 lspci (the dest ID in dom0 lspci was unchanged). I also saw 
this message pop up on the Xen console when I changed smp_affinity:

(XEN) do_IRQ: 1.187 No irq handler for vector (irq -1).

187 is the vector from dom0 lspci both before and after the smp_affinity 
change, but "guest interrupt information" reports that the new vector is 219. 
To me, this looks like the new MSI message data (with vector=219) did not get 
written to the PCI device, right?

Here's a comparison before and after changing smp_affinity from ffff to 2 (dom0 
is pvops 2.6.31.1, domU is 2.6.30.1):

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=ffff (default):

dom0 lspci: Address: 00000000fee00000  Data: 40bb (vector=187)

domU lspci: Address: 00000000fee00000  Data: 4071 (vector=113)

qemu-dm-dpm.log: pt_msi_setup: msi mapped with pirq 4f (79)
                 pt_msi_update: Update msi with pirq 4f gvec 71 gflags 0

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000001, Vec:187 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

Xen console: (XEN) [VT-D]iommu.c:1289:d0 domain_context_unmap:PCIe: bdf = 7:0.0
             (XEN) [VT-D]iommu.c:1175:d0 domain_context_mapping:PCIe: bdf = 7:0.0
             (XEN) [VT-D]io.c:301:d0 VT-d irq bind: m_irq = 4f device = 5 intx = 0
             (XEN) io.c:326:d0 pt_irq_destroy_bind_vtd: machine_gsi=79 guest_gsi=36, device=5, intx=0
             (XEN) io.c:381:d0 XEN_DOMCTL_irq_unmapping: m_irq = 0x4f device = 0x5 intx = 0x0

------------------------------------------------------------------------

/proc/irq/48/smp_affinity=2:

dom0 lspci: Address: 00000000fee10000  Data: 40bb (dest ID changed from 0 (APIC ID of CPU0) to 16 (APIC ID of CPU1), vector unchanged)

domU lspci: Address: 00000000fee02000  Data: 40b1 (dest ID changed from 0 (APIC ID of CPU0) to 2 (APIC ID of CPU1), new vector=177)

Guest interrupt information: (XEN) IRQ: 74, IRQ affinity:0x00000002, Vec:219 type=PCI-MSI status=00000010 in-flight=0 domain-list=1: 79(----)

qemu-dm-dpm.log: pt_msi_update: Update msi with pirq 4f gvec 71 gflags 2
                 pt_msi_update: Update msi with pirq 4f gvec b1 gflags 2

------------------------------------------------------------------------

-----Original Message-----
From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx] 
Sent: Friday, October 16, 2009 7:55 AM
To: Zhang, Xiantao; He, Qing
Cc: Cinco, Dante; xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

Attached is a new one, which should eliminate the race entirely. 
Xiantao 

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Zhang, Xiantao
Sent: Friday, October 16, 2009 5:50 PM
To: He, Qing
Cc: Cinco, Dante; xen-devel@xxxxxxxxxxxxxxxxxxx; Keir Fraser
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

He, Qing wrote:
> On Fri, 2009-10-16 at 16:35 +0800, Zhang, Xiantao wrote:
>> He, Qing wrote:
>>> On Fri, 2009-10-16 at 16:22 +0800, Zhang, Xiantao wrote:
>>>> He, Qing wrote:
>>>>> On Fri, 2009-10-16 at 15:32 +0800, Zhang, Xiantao wrote:
>>>>>> According to the description, the issue should be caused by a
>>>>>> lost EOI write for the MSI interrupt, which leads to a
>>>>>> permanently masked interrupt. There should be a race between the
>>>>>> guest setting the new vector and EOIing the old vector for the
>>>>>> interrupt.  Once the guest sets the new vector before it EOIs
>>>>>> the old vector, the hypervisor can't find the pirq that
>>>>>> corresponds to the old vector (it has changed to the new
>>>>>> vector), so it can never EOI the old vector at the hardware
>>>>>> level. Since the corresponding vector in the real processor is
>>>>>> never EOIed, the system may lose all interrupts, ultimately
>>>>>> resulting in the reported issue.
>>>>> 
>>>>>> But I remember there should be a timer to handle this case
>>>>>> through a forced EOI write to the real processor after a
>>>>>> timeout; it doesn't seem to function in the expected way, though.
>>>>> 
>>>>> The EOI timer is supposed to deal with the irq-sharing problem; 
>>>>> since MSI interrupts are not shared, this timer is not started in 
>>>>> the case of MSI.
>>>> 
>>>> That may be a problem, then. If a malicious/buggy guest never EOIs 
>>>> the MSI vector, could the host hang due to the lack of a timeout 
>>>> mechanism?
>>> 
>>> Why would the host hang? Only the assigned interrupt will block, and 
>>> that's exactly what the guest wants :-)
>> 
>> The hypervisor shouldn't EOI the real vector until the guest EOIs the 
>> corresponding virtual vector, right?  Not sure. :-)
> 
> Yes, it is the algorithm used today.

So it should still be a problem. If the guest won't do the EOI, the host can't 
do the EOI either, which leads to a system hang without a timeout mechanism. So 
we may need to introduce a timer for each MSI interrupt source to avoid hanging 
the host. Keir? 

> After reviewing the code: if the guest really does something like 
> changing affinity within the window between an irq firing and its eoi, 
> there is indeed a problem; attached is the patch. Although I kind of 
> doubt it happens: shouldn't desc->lock in the guest protect these and 
> make the two operations mutually exclusive?

We shouldn't let the hypervisor do the real EOI before the guest does the 
corresponding virtual EOI, so this patch may have a correctness issue. :-)

Attached is the fix based on my previous guess; it should fix the issue. 

Xiantao
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

