Here is what I understand so far:
- When guest gsi isn't shared, MSI-INTx interrupt translation works fine.
- When guest gsi is shared between a passthrough device and an emulated
device, MSI-INTx interrupt translation works, though the guest OS
may receive spurious interrupts.
- Sharing guest gsi among passthrough devices isn't supported.
- Some devices are unsuitable for MSI-INTx interrupt translation.
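
To check my understanding of the second point, here is a minimal sketch in C of the counting scheme as I read it from the explanation quoted below. Only the name gsi_assert_count comes from Xen; struct gsi_model, the gsi_model_* helpers and MODEL_NR_GSI are hypothetical, and the assumption that the EOI drops exactly the translated MSI's assertion is mine.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_GSI 48 /* assumption: guest GSIs 0..47 */

/* Hypothetical model of a per-GSI assert counter (not Xen's real struct). */
struct gsi_model {
    unsigned int gsi_assert_count[MODEL_NR_GSI];
};

/* A source raises the shared line.
 * Returns true when the line goes from inactive to active,
 * i.e. when an interrupt should be injected into the guest. */
static bool gsi_model_assert(struct gsi_model *m, unsigned int gsi)
{
    assert(gsi < MODEL_NR_GSI);
    return m->gsi_assert_count[gsi]++ == 0;
}

/* A source lowers the line, e.g. an emulated device after the
 * guest has cleared its interrupt cause. */
static void gsi_model_deassert(struct gsi_model *m, unsigned int gsi)
{
    assert(gsi < MODEL_NR_GSI);
    if (m->gsi_assert_count[gsi] > 0)
        m->gsi_assert_count[gsi]--;
}

/* Guest EOI: drop the translated MSI's assertion (an MSI has no
 * de-assert of its own), then report whether another source still
 * holds the line, in which case the interrupt must be injected again. */
static bool gsi_model_eoi(struct gsi_model *m, unsigned int gsi)
{
    gsi_model_deassert(m, gsi);
    return m->gsi_assert_count[gsi] != 0;
}
```

In this model, if an emulated device and a translated MSI both assert the same gsi, gsi_model_eoi() returns true and the interrupt is injected again. If the guest has already serviced the remaining source by then, that second injection is a spurious interrupt, but nothing is lost.
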
May I ask a few additional questions?
First, why can't we assign more than 8 devices?
From the guest OS's point of view, an assigned device is always a
single-function device. This means assigned devices use only INTA.
The interrupt routing in the hypervisor is defined as follows.
From xen/include/asm-x86/hvm/irq.h:
#define hvm_pci_intx_gsi(dev, intx) \
(((((dev)<<2) + ((dev)>>3) + (intx)) & 31) + 16)
I think sharing of a guest gsi among passthrough devices doesn't
occur if the number of assigned devices is <= 32.
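
To make the arithmetic concrete, here is a small sketch in C that evaluates the macro above for INTA on every device number and counts collisions. The macro is copied from the header; NR_PCI_DEVS, NR_GSI, INTA, count_inta_gsi_collisions() and print_inta_routing() are hypothetical names of mine. It also ignores the slots already taken by emulated devices, so it only checks the mapping itself.

```c
#include <assert.h>
#include <stdio.h>

/* Copied from the Xen header quoted above. */
#define hvm_pci_intx_gsi(dev, intx) \
    (((((dev) << 2) + ((dev) >> 3) + (intx)) & 31) + 16)

#define NR_PCI_DEVS 32 /* device numbers 0..31 on one bus */
#define NR_GSI      48 /* the macro yields GSIs 16..47 */
#define INTA        0

/* Count how many of the first ndevs device numbers get an INTA gsi
 * that a lower device number has already been given. */
static int count_inta_gsi_collisions(int ndevs)
{
    int used[NR_GSI] = {0};
    int collisions = 0;

    for (int dev = 0; dev < ndevs && dev < NR_PCI_DEVS; dev++) {
        int gsi = hvm_pci_intx_gsi(dev, INTA);

        assert(gsi >= 16 && gsi < NR_GSI);
        if (used[gsi])
            collisions++;
        used[gsi] = 1;
    }
    return collisions;
}

/* Print the INTA routing table for the first ndevs device numbers. */
static void print_inta_routing(int ndevs)
{
    for (int dev = 0; dev < ndevs && dev < NR_PCI_DEVS; dev++)
        printf("dev %2d INTA -> gsi %d\n", dev, hvm_pci_intx_gsi(dev, INTA));
}
```

count_inta_gsi_collisions(32) returns 0, so all 32 device numbers get distinct gsis as long as only INTA is used. Sharing appears only when other pins come into play: for example, hvm_pci_intx_gsi(0, 1) and hvm_pci_intx_gsi(8, 0) are both 17.
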
Second, wouldn't it be useful to maintain a blacklist of devices that are
unsuitable for MSI-INTx translation? The problem seems to be device-specific.
Thanks,
--
Shohei Fujiwara
On Fri, 9 Jan 2009 14:57:16 +0800
Qing He <qing.he@xxxxxxxxx> wrote:
> On Fri, 2009-01-09 at 12:26 +0800, Shohei Fujiwara wrote:
> > There is the assumption that Guest OS handles all causes which happen
> > before Guest OS receives the interrupt. But is the assumption right for
> > all OS?
> >
> > In the case of a level-triggered interrupt, the I/O device asserts the interrupt
> > line, when the cause of interrupt happens. OS handles the cause,
> > I/O device de-asserts interrupt, and OS sends EOI to APICs.
> >
> > When I/O APIC receives EOI, I/O APIC re-transmits interrupt to Local APIC
> > if some interrupt line is asserted.
> >
> > Some OSes might rely on this re-transmission by the I/O APIC.
> >
>
> >
> > But other OS might have the code like the following:
> >
> > do {
> >     ret = action->handler(irq, action->dev_id, regs);
> >     if (ret == IRQ_HANDLED) {
> >         status |= action->flags;
> >         retval |= ret;
> >         break;
> >         ^^^^^^
> >     }
> >     action = action->next;
> > } while (action);
> >
> Hmm, I think I now understand what you mean. If the guest irq is shared by
> a normal IRQ and an MSI-INTx translated IRQ, both sources may assert the
> pin and be pending at the same time. When this irq is injected, if the guest
> handles only one irq source each time, and issues EOI right after it
> clears the normal IRQ, the MSI is lost. Is that what you mean?
>
> There is logic to avoid this from happening, see
> hvm_irq->gsi_assert_count[gsi]. Basically, it's used to count how many
> sources have asserted a shared pin. And at the time of EOI, after the
> decrement of the counter, if it's still not 0, the LAPIC is re-asserted.
> This may result in some spurious interrupts to guest, but that's better
> than losing interrupts.
>
> The sharing of guest irq is generally not a good idea, in fact, this is
> not even well supported in current Xen code. You may have seen something
> like "girq[ggsi].mirq = mirq". That way, we are already stuck.
> Currently, if the number of assigned devices is <= 8, there should be no
> sharing, otherwise, very weird things may happen...
>
> Uncommon devices or OSes can still fail with MSI-INTx translation, for
> example, if the device doesn't behave the same way using INTx and MSI,
> or the guest OS doesn't always clear the interrupt source before issuing EOI.
> That's why I added the per-device disable function: if a device or OS
> doesn't work properly, just turn it off. Fortunately, this is extremely
> rare.
>
> Btw, AFAIK, Windows handles all irq sources in one ISR, similar to
> Linux.
>
> Thanks,
> Qing
> >
> > This code will work on a real machine, because the I/O APIC re-transmits
> > the interrupt if a cause remains to be handled. If some OS has
> > code like the above, the assumption isn't right.
> >
> > Actually, my concern is whether the assumption is right for Windows
> > or not. Do you know about this, or does your patch work well with
> > Windows guests?
> >
> > Thanks,
> > --
> > Shohei Fujiwara
> >
> > > Generally, it's easy to "translate" an edge-triggered interrupt to a
> > > level-triggered one, but not the other way around.
> > >
> > > Thanks,
> > > Qing
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > --
> > > > Shohei Fujiwara
> > > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > > http://lists.xensource.com/xen-devel