[Yunhong Jiang]
>> Right. The reason for bringing up this suggestion now rather than
>> later is that MSI support has not yet found its way into
>> mainline. Whoever decides on the interface used for registering
>> MSI and MSI-X interrupts might want to take multi-message MSIs into
>> account as well.
> Espen, thanks for your comments. I remember Linux has no such
> support, so Linux drivers will not benefit from such an
> implementation. After all, the driver needs to provide an ISR for
> the interrupts. Of course, we need this feature if any OS supports
> it. I didn't implement this because it may require changes to
> various common components and needs more discussion, while Linux has
> no support for it. (Also, I was rushing for the 3.2 cut-off at that
> time :$.)
You're right in that Linux does not currently support this. You can,
however, allocate multiple interrupts using MSI-X. Anyhow, I was not
envisioning this feature being used directly for passthrough device
access. Rather, I was considering the case where a device could be
configured to communicate data directly into a VM (e.g., using
multi-queue NICs) and deliver the interrupt to the appropriate VM. In
this case the frontend in the guest would not need to see a
multi-message MSI device, only the backend in dom0/the driver domain
would need to be made aware of it.
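For reference, a backend driver can already allocate one MSI-X vector
per queue with the standard Linux API. A minimal sketch of that part
(NUM_QUEUES and queue_interrupt() are made-up names, and the error
handling is simplified):

#include <linux/pci.h>
#include <linux/interrupt.h>

#define NUM_QUEUES 4

static struct msix_entry entries[NUM_QUEUES];

/* Illustrative per-queue handler; real handling omitted. */
static irqreturn_t queue_interrupt(int irq, void *dev_id)
{
        return IRQ_HANDLED;
}

static int setup_queue_irqs(struct pci_dev *pdev)
{
        int i, rc;

        for (i = 0; i < NUM_QUEUES; i++)
                entries[i].entry = i;   /* MSI-X table slot to use */

        /* Returns 0 on success; a positive value is the number of
         * vectors actually available, a negative value is an error. */
        rc = pci_enable_msix(pdev, entries, NUM_QUEUES);
        if (rc)
                return rc < 0 ? rc : -ENOSPC;

        for (i = 0; i < NUM_QUEUES; i++) {
                rc = request_irq(entries[i].vector, queue_interrupt,
                                 0, "mq-nic", &entries[i]);
                if (rc)
                        goto fail;
        }
        return 0;

fail:
        while (--i >= 0)
                free_irq(entries[i].vector, &entries[i]);
        pci_disable_msix(pdev);
        return rc;
}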
>> I do not think explicitly specifying destination APIC upon
>> allocation is the best idea. Setting the affinity upon binding the
>> interrupt like it's done today seems like a better approach. This
>> leaves us with dealing with the vectors.
> But what should happen when the vcpu is migrated to another physical
> cpu? I'm not sure about the cost of reprogramming the interrupt
> remapping table; otherwise, that is a good way to achieve the
> affinity.
As you've already said, the interrupt affinity is only set when a pirq
is bound. The interrupt routing is not redirected if the vcpu it's
bound to migrates to another physical cpu. This can (should?) be
changed in the future so that the affinity is either set implicitly
when migrating the vcpu, or explicitly with a rebind call by dom0. In
any case the affinity would be reset by the set_affinity method.
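To illustrate what the implicit variant could look like inside Xen,
here is a rough sketch (the pirq iteration and lookup helpers are made
up for the example; the set_affinity hook on the interrupt controller
is the piece that would do the actual reprogramming):

/* Sketch only: reprogram physical interrupt affinity when a vcpu with
 * bound pirqs moves to another physical cpu. */
static void follow_vcpu_migration(struct vcpu *v, unsigned int new_cpu)
{
    unsigned int pirq, irq;
    irq_desc_t *desc;

    for_each_bound_pirq(v, pirq) {              /* hypothetical iterator */
        irq = pirq_to_irq(v->domain, pirq);     /* hypothetical lookup   */
        desc = &irq_desc[irq];

        spin_lock(&desc->lock);
        if (desc->handler->set_affinity)
            desc->handler->set_affinity(irq, cpumask_of_cpu(new_cpu));
        spin_unlock(&desc->lock);
    }
}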
>> My initial thought was to make use of the new msix_entries[] field
>> in the xen_pci_op structure. This field is already used as an
>> in/out parameter for allocating MSI-X interrupts. The
>> pciback_enable_msi() function can then attempt to allocate multiple
>> interrupts instead of a single one, and return the allocated
>> vectors.
>>
>> The current MSI patchset also lacks a set_affinity() function for
>> changing the APIC destination similar to what is done for, e.g.,
>> IOAPICs. Also similar to IOAPICs, the MSI support should have
>> something like the io_apic_write_remap_rte() for rewriting the
>> interrupt remapping table when enabled.
> For set_affinity(), what do you mean by changing the APIC
> destination? Currently, setting a guest pirq's affinity only affects
> the event channel. The physical interrupt's affinity is only set
> once, when the pirq is bound.
With "changing the APIC destination" I meant changing the destination
CPU of an interrupt while keeping the vector, delivery type,
etc. intact.
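In MSI terms that only means rewriting the destination ID field of the
message address register; the message data, which carries the vector
and delivery mode, stays as it is. A sketch, with the register
accessors left as placeholders:

/* Sketch: update only the destination ID field (bits 19:12) of the MSI
 * message address, leaving the message data untouched.
 * read_msi_addr()/write_msi_addr() are placeholders for the actual
 * config-space (or, later, remapping table) accessors. */
#define MSI_ADDR_DEST_ID_MASK   0x000ff000u
#define MSI_ADDR_DEST_ID_SHIFT  12

static void msi_set_destination(unsigned int irq, unsigned int apic_id)
{
    uint32_t addr = read_msi_addr(irq);               /* placeholder */

    addr &= ~MSI_ADDR_DEST_ID_MASK;
    addr |= (apic_id << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK;

    write_msi_addr(irq, addr);                        /* placeholder */

    /* The message data register is deliberately left alone, so the
     * vector and delivery type do not change. */
}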
> As for rewriting the interrupt remapping table like
> io_apic_write_remap_rte(), I think it will be added later as well.
> I'm also a bit confused by your statement in the previous mail: "The
> necessary changes would enable a device driver for an MSI capable
> device to allocate a range of pirqs and bind these to different
> frontends." What do you mean by different frontends?
Different frontends here means multiple frontend instances residing in
different VMs, all served by a single backend. As alluded to above,
the idea is to have a single backend that has direct access to the
device, and multiple frontends that somehow share some limited direct
access to it. For example, a multi-queue capable NIC could deliver
packets to the queue in the appropriate VM and raise an interrupt in
that VM without involving the domain of the backend driver.
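Put differently, the backend would own the device and the MSI-X vector
range, but hand the individual pirqs to the guests that own the
queues. A rough sketch of the flow (all helper names below are
hypothetical and only meant to show who does what):

static int share_queue_interrupts(struct pci_dev *dev, int nr_queues,
                                  const domid_t *frontend_domid)
{
    int q, pirq;

    for (q = 0; q < nr_queues; q++) {
        /* Backend (in dom0/the driver domain) allocates one MSI-X
         * vector per queue from the range discussed above. */
        pirq = backend_alloc_queue_pirq(dev, q);          /* hypothetical */
        if (pirq < 0)
            return pirq;

        /* Hand the pirq to the domain owning this queue.  Its frontend
         * binds the pirq to a local event channel, so the interrupt
         * for this queue fires directly in that VM, bypassing the
         * backend. */
        grant_pirq_to_domain(frontend_domid[q], pirq);    /* hypothetical */
    }

    return 0;
}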
eSk