>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 13.11.08 12:16 >>>
>On 13/11/08 11:07, "Jan Beulich" <jbeulich@xxxxxxxxxx> wrote:
>
>>> Otherwise we could relax it some (e.g., require N IRQs
>>> to get stacked up rather than just 1; or add explicit rate limiting).
>>
>> For the main problem at hand, that would just reduce the likelihood of the
>> device refusing to work. For the performance issue, that would be an
>> option, as would be reducing the timeout value. However, I would also
>> consider making EVTCHNOP_unmask clear that state, and then perhaps
>> find a way to tell the guest that it should call this even if unmask_evtchn()
>> finds the event channel to be bound to the local CPU. The obvious thing
>> would be to either extend shared_info or have the guest register an
>> address with Xen where per-event-channel overflow status would be
>> reported by Xen.
>
>We already have PHYSDEVOP_eoi. We can force guests to always use that. It'll
>be no slower than level-triggered IO-APIC IRQs, which we've been using since
>forever with no complaints. And yeah, as an extension we could have a
>shared-memory pirq bitmap to indicate dynamically whether eoi is required.
>Quite a secondary concern though, I would say.
Avoiding the EOI query is certainly a secondary issue. What I was asking
was rather a means for the guest to know whether Xen started that EOI
timer, so that it could indicate to Xen to terminate it and unmask the
respective IRQ. This shouldn't require always using PHYSDEVOP_eoi, and
from an abstract point of view also would belong there, but rather in
unmask_evtchn(). Since it would be an obvious thing that if you unmask
an event channel, you also want the underlying PIRQ unmasked, this
could be a compatible addition to the existing EVTCHNOP_unmask. The
only thing missing is a way for the guest to know when to actually use
the hypercall based unmasking - that's what I wanted to add a vector
for.
>>> We only disable MSI when the device does not support masking. Perhaps we
>>> should make disable/enable no-ops in that case?
>>
>> Yes, but don't we need an alternative way to avoid storms then?
>
>We should be delaying LAPIC EOI, just as we do for level-triggered IO-APIC
>IRQs (in that case, because masking RTEs in some cases has stupid side
>effects on some damn stupid chipsets). All that logic exists, just needs
>plumbing in for this particular class of non-maskable MSI.
Like this you mean? Lightly tested it appears to work (but not tested on a
problem system, yet).
--- 2008-10-27.orig/xen/arch/x86/irq.c
+++ 2008-10-27/xen/arch/x86/irq.c
@@ -463,14 +463,19 @@ int pirq_acktype(struct domain *d, int i
/*
* Edge-triggered IO-APIC and LAPIC interrupts need no final
* acknowledgement: we ACK early during interrupt processing.
- * MSIs are treated as edge-triggered interrupts.
*/
if ( !strcmp(desc->handler->typename, "IO-APIC-edge") ||
- !strcmp(desc->handler->typename, "local-APIC-edge") ||
- !strcmp(desc->handler->typename, "PCI-MSI") )
+ !strcmp(desc->handler->typename, "local-APIC-edge") )
return ACKTYPE_NONE;
/*
+ * MSIs are treated as edge-triggered interrupts, except
+ * when there is no proper way to mask them.
+ */
+ if ( desc->handler == &pci_msi_type )
+ return msi_maskable_irq(desc->msi_desc) ? ACKTYPE_NONE : ACKTYPE_EOI;
+
+ /*
* Level-triggered IO-APIC interrupts need to be acknowledged on the CPU
* on which they were received. This is because we tickle the LAPIC to EOI.
*/
--- 2008-10-27.orig/xen/arch/x86/msi.c
+++ 2008-10-27/xen/arch/x86/msi.c
@@ -303,6 +303,13 @@ static void msix_flush_writes(unsigned i
}
}
+int msi_maskable_irq(const struct msi_desc *entry)
+{
+ BUG_ON(!entry);
+ return entry->msi_attrib.type != PCI_CAP_ID_MSI
+ || entry->msi_attrib.maskbit;
+}
+
static void msi_set_mask_bit(unsigned int irq, int flag)
{
struct msi_desc *entry = irq_desc[irq].msi_desc;
@@ -323,8 +330,6 @@ static void msi_set_mask_bit(unsigned in
mask_bits &= ~(1);
mask_bits |= flag;
pci_conf_write32(bus, slot, func, pos, mask_bits);
- } else {
- msi_set_enable(entry->dev, !flag);
}
break;
case PCI_CAP_ID_MSIX:
@@ -654,7 +659,7 @@ static int __pci_enable_msix(struct msi_
pos = pci_find_cap_offset(msi->bus, slot, func, PCI_CAP_ID_MSIX);
control = pci_conf_read16(msi->bus, slot, func, msi_control_reg(pos));
nr_entries = multi_msix_capable(control);
- if (msi->entry_nr > nr_entries)
+ if (msi->entry_nr >= nr_entries)
{
spin_unlock(&pdev->lock);
return -EINVAL;
--- 2008-10-27.orig/xen/include/asm-x86/msi.h
+++ 2008-10-27/xen/include/asm-x86/msi.h
@@ -97,6 +97,8 @@ struct msi_desc {
int remap_index; /* index in interrupt remapping table */
};
+int msi_maskable_irq(const struct msi_desc *);
+
/*
* Assume the maximum number of hot plug slots supported by the system is about
* ten. The worstcase is that each of these slots is hot-added with a device,
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|