Re: [Xen-devel] Xen 4.1 rc5 outstanding bugs
On Tue, 8 Mar 2011, Konrad Rzeszutek Wilk wrote:
> On Fri, Feb 18, 2011 at 06:59:55PM +0000, Stefano Stabellini wrote:
> > Hi all,
> > I went through the list of bugs affecting the latest Xen 4.1 RC and
> > I made a list of what seem to be the most serious.
> What is the status of these? I know you made some strides in fixing
> most if not all of them both in the hypervisor and Linux kernel.
We have fixed most of them; see below.
> > All of them affect PCI passthrough and seem to be hypervisor/qemu-xen
> > bugs apart from the last one that is a libxenlight/xl bug.
> > * VF passthrough does not work
> > Passing through a normal NIC seems to work but passing through a VF
> > doesn't.
> > The device appears in the guest but it cannot exchange packets, the
> > guest kernel version doesn't seem to matter.
> > From the qemu logs:
> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already
> > function.
> > It might be the same problem as in the following two bug reports:
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1709
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1742
This one is a Linux kernel bug and is fixed by "set current_state to D0
> > * Xen panic on guest shutdown with PCI Passthrough
> > http://xen.1045712.n5.nabble.com/Xen-panic-on-guest-shutdown-with-PCI-Passthrough-tt3371337.html#none
> > When a guest with a passthrough PCI device is shut down, Xen panics
> > on an NMI - MEMORY ERROR.
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c48015f032>] msi_set_mask_bit+0xea/0x121
> > (XEN) [<ffff82c48015f087>] mask_msi_irq+0xe/0x10
> > (XEN) [<ffff82c480162fb1>] __pirq_guest_unbind+0x298/0x2aa
> > (XEN) [<ffff82c48016315c>] unmap_domain_pirq+0x199/0x307
> > (XEN) [<ffff82c480163321>] free_domain_pirqs+0x57/0x83
> > (XEN) [<ffff82c48015642a>] arch_domain_destroy+0x30/0x2e3
> > (XEN) [<ffff82c480104c59>] complete_domain_destroy+0x6e/0x12a
> > (XEN) [<ffff82c48012adfd>] rcu_process_callbacks+0x173/0x1e1
> > (XEN) [<ffff82c480123327>] __do_softirq+0x88/0x99
> > (XEN) [<ffff82c4801233a2>] do_softirq+0x6a/0x7a
This is a Xen bug (it should cope with NMIs instead of crashing) and is
fixed by "NMI: continue in case of PCI SERR erros".
> > * possible Xen pirq leak at domain shutdown
> > If a guest doesn't support PCI hot-unplug (or is malicious), it
> > won't do anything in response to the ACPI SCI interrupt we send when the
> > domain is destroyed, so unregister_real_device will never be
> > called in qemu and we might be leaking MSIs in Xen (to be
> > verified).
> > http://xen.1045712.n5.nabble.com/template/NamlServlet.jtp?macro=print_post&node=3369367
I think we have verified that there is no pirq leak.
> > * Xen warns about MSIs when assigning a PCI device to a guest
> > also known as "Xen complains msi error when startup"
> > At startup Xen prints multiple:
> > (XEN) Xen WARN at msi.c:635
> > (XEN) Xen WARN at msi.c:648
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732
The warning is still there.
> > * PCI hot-unplug causes a guest crash
> > also known as "fail to detach NIC from guest"
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1736
I cannot reproduce this bug; I think someone at Intel is working on it.
> > * multiple PCI devices passthrough to PV guests or HVM guests with
> > stubdoms is broken with the xl toolstack
> > Cannot assign >1 PCI passthrough devices at domain creation time because
> > libxl creates a bus for the first device and increments "num_devs" node
> > in xenstore for each subsequent device but pciback cannot cope with
> > num_devs changing while the guest is not running to respond to the
> > reconfiguration request. A fix would be to create the entire bus in a
> > single cold-plug operation at start of day.
The problem still persists with stubdoms, but IanJ fixed the PV
passthrough bug, see "libxl: Multi-device passthrough coldplug: do not
wait for unstarted guests".
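To illustrate the "single cold-plug operation" idea from the original
report, here is a rough sketch in Python that models xenstore as a plain
dict. The helper name coldplug_bus, the backend path and the dev-N key
layout are assumptions made up for illustration; this is not libxl's
actual code. The point is only that every device and the final num_devs
value are written in one pass before the guest starts, so pciback never
sees num_devs change.

```python
# Hypothetical model: xenstore is a plain dict. The "num_devs" node name
# comes from the bug report; the path and dev-N key layout are assumptions.

def coldplug_bus(xenstore, backend_path, bdfs):
    """Write all devices and the final num_devs in a single pass."""
    entries = {}
    for index, bdf in enumerate(bdfs):
        entries[f"{backend_path}/dev-{index}"] = bdf
    # num_devs is written once, with its final value, never incremented.
    entries[f"{backend_path}/num_devs"] = str(len(bdfs))
    # A single update stands in for one xenstore transaction.
    xenstore.update(entries)
    return entries


store = {}
coldplug_bus(store, "/backend/pci/1/0", ["0000:03:00.0", "0000:03:00.1"])
```

With this shape there is no per-device increment of num_devs, so nothing
needs to wait for a reconfiguration response from a guest that is not
running yet.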