Dante,
If the device doesn't support the MSI mask bit, the second patch should have no
effect on it. I am also working on backporting more IRQ migration logic from
Linux, which should ensure that the address and vector are both written to the
device before it fires new interrupts. But as I mentioned before, if you want to solve the
guest affinity setting issue, you have to apply the first patch I sent out
(fix-irq-affinity-msi3.patch). :-)
Xiantao
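
For readers following the thread, the window being discussed can be sketched
in C. This is only an illustrative model, not code from Xen or Linux:
struct msi_dev, struct msi_msg and msi_retarget() are hypothetical names, and
the device is a plain struct rather than real PCI config space. It shows that
masking can only cover the address/data rewrite when the device has a mask bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device's MSI state (not a real Xen/Linux type). */
struct msi_dev {
    bool     has_maskbit; /* per-vector masking is optional for MSI */
    bool     masked;      /* current state of the mask bit, if present */
    uint32_t addr;        /* message address (selects the target CPU) */
    uint16_t data;        /* message data (carries the vector) */
};

struct msi_msg {
    uint32_t addr;
    uint16_t data;
};

/*
 * Rewrite the address/data pair. Returns true if the update ran with the
 * vector masked, false if the device has no mask bit and the window between
 * the two writes stayed open.
 */
static bool msi_retarget(struct msi_dev *d, struct msi_msg m)
{
    assert(d);
    bool guarded = d->has_maskbit;

    if (guarded)
        d->masked = true;   /* device must not signal while masked */

    d->addr = m.addr;
    /* Without masking, an interrupt raised here pairs the new address
     * with the old data, i.e. the old vector on the new CPU. */
    d->data = m.data;

    if (guarded)
        d->masked = false;  /* both halves are now consistent */

    return guarded;
}
```

Calling msi_retarget() on a model with has_maskbit set to false still updates
both fields but returns false, which corresponds to the case where the second
patch has no effect.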
Cinco, Dante wrote:
> Xiantao,
>
> I'm sorry, I forgot to mention that I did apply your two patches but
> they didn't have any effect (interrupts are still lost after changing
> smp_affinity, with the "No handler for irq vector" message). I added a
> dprintk in msi_set_mask_bit() and realized that MSI does not have a
> mask bit (MSIX does). My PCI device uses MSI not MSIX. I placed my
> dprintk inside the condition below and it never triggered.
>
> switch (entry->msi_attrib.type) {
> case PCI_CAP_ID_MSI:
> if (entry->msi_attrib.maskbit) {
>
> While debugging this problem, I thought about the potential problem
> of an interrupt firing between the writes for the MSI message address
> and MSI message data. I noticed that pci_conf_write() uses
> spin_lock_irqsave() to disable interrupts before issuing the "out"
> instruction but the writes for the address and data are two separate
> pci_conf_write() calls. To me, it would be safer to write the address
> and data in a single call preceded by spin_lock_irqsave(). That
> way, by the time interrupts are re-enabled, the address and data have
> both been updated.
>
> Dante
>
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, October 22, 2009 2:42 AM
> To: Zhang, Xiantao; Jan Beulich
> Cc: He, Qing; xen-devel@xxxxxxxxxxxxxxxxxxx; Cinco, Dante
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
>
> On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx> wrote:
>
>>> Hmm, then I don't understand which case your patch was a fix for: I
>>> understood that it addresses an issue when the affinity of an
>>> interrupt gets changed (requiring a re-write of the address/data
>>> pair). If the hypervisor can deal with it without masking, then why
>>> did you add it?
>>
>> Hmm, sorry, it seems I misunderstood your question. If the MSI device
>> doesn't support the mask bit (clearing the MSI enable bit doesn't help
>> in this case), the issue may still exist. I just checked the Linux
>> side; it doesn't seem to perform a mask operation when programming
>> MSI, but I don't know why Linux doesn't have such issues. Actually, we
>> do see inconsistent interrupt messages from the device without this
>> patch, and after applying the patch, the issue is gone. It may need
>> further investigation to find out why Linux doesn't need the mask
>> operation.
>
> Linux is quite careful about when it will reprogram vector/affinity
> info isn't it? Doesn't it mark such an update pending and only flush
> it through during next interrupt delivery, or something like that? Do
> we need some of the upstream Linux patches for this?
>
> -- Keir
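
The deferred update Keir describes ("mark such an update pending and only
flush it through during next interrupt delivery") can be sketched as below.
This is a minimal model under stated assumptions, not the actual Linux or Xen
implementation: struct irq_state, irq_set_affinity() and irq_handle() are
hypothetical names, and the hardware registers are plain struct fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IRQ state (not a real Linux/Xen structure). */
struct irq_state {
    uint32_t addr;          /* address currently programmed in the device */
    uint16_t data;          /* data (vector) currently programmed */
    bool     move_pending;  /* an affinity change is waiting to be flushed */
    uint32_t pending_addr;
    uint16_t pending_data;
    unsigned handled;       /* interrupts delivered so far */
};

/* Affinity change request: only record it, do not touch the device. */
static void irq_set_affinity(struct irq_state *s, uint32_t addr, uint16_t data)
{
    assert(s);
    s->pending_addr = addr;
    s->pending_data = data;
    s->move_pending = true;
}

/* Interrupt delivery path: flush a pending change, then handle the IRQ. */
static void irq_handle(struct irq_state *s)
{
    assert(s);
    if (s->move_pending) {
        /* The device has just signalled this interrupt, so in this model
         * the rewrite happens at a known point in the delivery path
         * rather than at an arbitrary time. */
        s->addr = s->pending_addr;
        s->data = s->pending_data;
        s->move_pending = false;
    }
    s->handled++;           /* stand-in for running the real handler */
}
```

In this model, a call to irq_set_affinity() leaves addr and data untouched
until the next irq_handle(), which is the property Keir is asking about.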