Hi, Keir,
These patches are a rebased version of Yunhong’s original patches, which were sent out before Xen 3.2 was released. They enable MSI support and limited MSI-X support in Xen. Here is the original description of the patches from Yunhong’s mail.
The basic idea includes:
1) Keep the vector as a global resource owned by Xen, while splitting pirq into per-domain information.
2) The Domain0 kernel will operate the MSI resource for domain0/domU, while QEMU will operate the MSI resource for the HVM domain.
3) Xen will do the EOI for MSI interrupts.
Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
Not many changes have been made compared with the original patches. However, there are some issues on which we need your comments.
1> The ACK-NEW method is necessary to avoid IRQ storms, but it causes a deadlock. During my tests, I did see deadlocks with the patches applied. When a NIC device is assigned to an HVM domain, the scenario is: Dom0 is waiting for the IDE interrupt (vector 0x21); the HVM domain is blocked waiting for qemu’s IDE emulation; the NIC interrupt (MSI vector 0x31) is waiting to be injected into the HVM domain, which is currently blocked; and the IDE interrupt is waiting behind the NIC interrupt, since the NIC interrupt has higher priority but has not yet been ACKed by Xen. The deadlock is easy to observe when the IDE interrupt and the NIC interrupt are delivered to the same CPU and the guest OS is Vista.
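To make the dependency cycle in issue 1 concrete, here is a minimal toy model of the priority rule involved. It is only a sketch and not Xen code: the class and method names are hypothetical, and it assumes the usual local APIC behaviour that a pending vector is held back while a vector of the same or a higher priority class (vector >> 4) is still in service.

```python
class ToyLocalApic:
    """Toy model of one CPU's interrupt priority logic (hypothetical, not Xen code)."""

    def __init__(self):
        self.pending = set()     # vectors raised but not yet delivered
        self.in_service = set()  # vectors delivered but not yet EOI'd

    @staticmethod
    def priority_class(vector):
        # Assumption: the priority class is the upper four bits of the vector.
        return vector >> 4

    def raise_irq(self, vector):
        self.pending.add(vector)

    def deliver_next(self):
        """Deliver the highest pending vector, or return None if it is blocked."""
        if not self.pending:
            return None
        candidate = max(self.pending)
        if self.in_service:
            highest = max(self.in_service)
            if self.priority_class(candidate) <= self.priority_class(highest):
                return None  # held back until the in-service vector gets its EOI
        self.pending.remove(candidate)
        self.in_service.add(candidate)
        return candidate

    def eoi(self):
        """Retire the highest in-service vector."""
        if self.in_service:
            self.in_service.remove(max(self.in_service))


IDE_VECTOR = 0x21      # Dom0 is waiting for this one
NIC_MSI_VECTOR = 0x31  # assigned to the HVM domain

apic = ToyLocalApic()
apic.raise_irq(NIC_MSI_VECTOR)
first = apic.deliver_next()    # NIC MSI goes in service; EOI is deferred
apic.raise_irq(IDE_VECTOR)
blocked = apic.deliver_next()  # IDE cannot get through while 0x31 is in service
apic.eoi()                     # only the (blocked) guest path would trigger this
after_eoi = apic.deliver_next()
```

In this model the IDE vector is delivered only after the EOI for the NIC vector, and that EOI is exactly what is held up while the HVM domain waits for qemu.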
2> Without ACK-NEW, some misbehaving NIC devices that we observed will cause IRQ storms. I think Yunhong can comment more on this. Basically, writing EOI without masking the source of the MSI will cause an IRQ storm. Although the reason is still under investigation, Xen should handle such bogus devices anyhow, right?
3> Using ACK-OLD and masking the MSI when writing EOI could be a solution. However, Xen does not own the PCI configuration space.
We also tried some workarounds.
One workaround might be to use a timer to force an EOI within some time interval. This method is already implemented in the VT-d code. However, with this approach, once the timer fires and the EOI is written, the result is essentially the same as option 2.
Another approach is to never deliver these two IRQs to the same CPU. But this is really ugly and cannot be applied to UP systems.
We have also considered using the VT-d 2 interrupt remapping feature. According to the spec, there is no bit in the remapping table to mask the interrupt. Therefore, it cannot be combined with option 2 to solve the issue. Masking the interrupt still requires access to the PCI configuration space.
We think the cleanest method may be to move ownership from dom0 to the VMM. However, this is a big change. It should be discussed thoroughly in the community, and we need your comments.
This patch series can serve as discussion material. What are your comments on the patches and the issues, Keir?
Thanks!
Haitao Shan