Thanks for the very detailed and careful review!
On Thu, 22 Apr 2010, Konrad Rzeszutek Wilk wrote:
>
> Pci -> PCI
>
Ok
> > + initializing xenbus and grant_table when running in a Xen HVM
> .. snip ..
> > + domain. As a consequence this driver is required to run any Xen PV
> > + frontend on Xen HVM.
> > @@ -448,6 +452,26 @@ static int gnttab_map(unsigned int start_idx, unsigned
> > int end_idx)
> > unsigned int nr_gframes = end_idx + 1;
> > int rc;
> >
> > + if (xen_hvm_domain()) {
> > + struct xen_add_to_physmap xatp;
> > + unsigned int i = end_idx;
> > + rc = 0;
> > + /*
> > + * Loop backwards, so that the first hypercall has the largest
> > + * index, ensuring that the table will grow only once.
> > + */
> > + do {
> > + xatp.domid = DOMID_SELF;
> > + xatp.idx = i;
> > + xatp.space = XENMAPSPACE_grant_table;
> > + xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
> > + if ((rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap,
> > &xatp)) != 0)
> > + break;
> > + } while (i-- > start_idx);
> > +
> So if the hypercall fails, should we print out a message ? Or is it OK
> to return the rc?
good catch, I actually added a printk there for debugging at some point
but I didn't commit it.
> > + return rc;
> > + }
> > +
> > frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
> > if (!frames)
> > return -ENOMEM;
>
> ... snip ..
> > +++ b/drivers/xen/platform-pci.c
> > @@ -0,0 +1,227 @@
> > +/******************************************************************************
> > + * platform-pci.c
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2005, Intel Corporation.
> > + * Copyright (c) 2007, XenSource Inc.
>
> No 'Copyright (c) 2010, Citrix'?
>
I'll add one
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> > for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > with
> > + * this program; if not, write to the Free Software Foundation, Inc., 59
> > Temple
> > + * Place - Suite 330, Boston, MA 02111-1307 USA.
> > + *
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include <asm/io.h>
>
> I think scripts/checkpatch.pl complains about this. You did
> run script/checkpath.pl, right?
>
> You might want also to put all of those includes in
> an alphabetical order.
>
script/checkpatch.pl complains about a lot of things :)
I'll fix.
> > +
> > +#include <xen/grant_table.h>
> > +#include <xen/platform_pci.h>
> > +#include <xen/interface/platform_pci.h>
> > +#include <xen/xenbus.h>
> > +#include <xen/events.h>
> > +#include <xen/hvm.h>
> > +
> > +#define DRV_NAME "xen-platform-pci"
> > +
> > +MODULE_AUTHOR("ssmith@xxxxxxxxxxxxx and stefano.stabellini@xxxxxxxxxxxxx");
> > +MODULE_DESCRIPTION("Xen platform PCI device");
> > +MODULE_LICENSE("GPL");
> > +
> > +static unsigned long platform_mmio;
> > +static unsigned long platform_mmio_alloc;
> > +static unsigned long platform_mmiolen;
> > +
> > +unsigned long alloc_xen_mmio(unsigned long len)
> > +{
> > + unsigned long addr;
> > +
> > + addr = platform_mmio + platform_mmio_alloc;
> > + platform_mmio_alloc += len;
> > + BUG_ON(platform_mmio_alloc > platform_mmiolen);
> > +
> > + return addr;
> > +}
> > +
> > +static uint64_t get_callback_via(struct pci_dev *pdev)
> > +{
> > + u8 pin;
> > + int irq;
> > +
> > +#ifdef __ia64__
> > + for (irq = 0; irq < 16; irq++) {
> > + if (isa_irq_to_vector(irq) == pdev->irq)
> > + return irq; /* ISA IRQ */
> > + }
>
> Um, is this still neccessary? Have you tested this driver on IA64 with
> this kernel?
No I didn't and it would be difficult for me to.
However reading the code again it seems to me that the ifdef is
unnecessary here, so I'll remove it.
> > +#else /* !__ia64__ */
> > + irq = pdev->irq;
> > + if (irq < 16)
> > + return irq; /* ISA IRQ */
> > +#endif
> > + pin = pdev->pin;
> > +
> > + /* We don't know the GSI. Specify the PCI INTx line instead. */
> > + return (((uint64_t)0x01 << 56) | /* PCI INTx identifier */
> > + ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> > + ((uint64_t)pdev->bus->number << 16) |
> > + ((uint64_t)(pdev->devfn & 0xff) << 8) |
> > + ((uint64_t)(pin - 1) & 3));
> > +}
> > +
> > +static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
> > +{
> > + xen_hvm_evtchn_do_upcall(get_irq_regs());
> > + return IRQ_HANDLED;
> > +}
> > +
> > +int xen_irq_init(struct pci_dev *pdev)
>
> The name sounds awkward. How about 'get_irq' or 'allocate_irq'?
Ok, I'll go for xen_allocate_irq.
> > +{
> > + __set_irq_handler(pdev->irq, handle_edge_irq, 0, NULL);
>
> Is that kosher? I am not that familiar with how QEMU sets up the ACPI
> DSDT to figure out what the device pin (which looking at the qemu source
> is set to '1') corresponds to what IRQ.
>
> Is it possible that with AMD-Vi or Intel VT-d you pass in a PCI device
> that would have an interrupt line that would coincide with this device?
> Or is the hvmloader + QEMU smart enought to take that under account
> and stick this device on an edge interrupt line?
>
> Perhaps doing something like:
> int flags = IRQF_SAMPLE_RANDOM | IRQF_NOBALANCING;
>
> if (pdev->irq > 16)
> flags |= IRQF_SHARED;
> else {
> flags |= IRQF_TRIGGER_RISING | IRQF_DISABLED;
> /* Whack! */
> __set_irq_handler(pdev->irq, handle_edge_irq, 0, NULL);
> }
> return request_irq(pdev->irq, do_.., flags,"xen-pla..", pdev);
>
Yes I did that on purpose because it seems that it is the only thing
able to fix the multivcpu problem I was having: when receiving evtchns
from the pci platform device interrupt line (vector callback not
available), the guest boots fine with a single vcpu but it hangs at boot
with 2 or more vcpus. In particular it seems to deadlock in the xenbus
initialization code because of some missing interrupts/evtchns.
Switching to handle_edge_irq (even though the interrupt is supposed to
be level triggered, but it is emulated, so I guess is not important anyway)
fixes the issue for me. It is also interesting that using vector
callback the problem doesn't occur.
In the default scenario linux uses handle_fasteoi_irq that acks the
interrupt only at the end, after the call to handle_IRQ_event, that in
the evtchn case can loop several times on the pending evtchns so can
take long to complete.
On the other hand handle_edge_irq acks the interrupt right away before
calling handle_IRQ_event, that is the same behaviour of the ipi handler.
Also evtchn handlers can re-enable interrupts and that would cause
handle_edge_irq to be called again if we receive another
interrupt/evtchn, and handle_edge_irq would again ack the interrupt
right away even if it is marked as IRQ_INPROGRESS while handle_fasteoi_irq
would just exit.
I guess I could add a comment before __set_irq_handler.
> > + return request_irq(pdev->irq, do_hvm_evtchn_intr,
> > + IRQF_SAMPLE_RANDOM | IRQF_DISABLED | IRQF_NOBALANCING
> > |
> > + IRQF_TRIGGER_RISING, "xen-platform-pci", pdev);
> > +}
> > +
> > +static int __devinit platform_pci_init(struct pci_dev *pdev,
> > + const struct pci_device_id *ent)
> > +{
> > + int i, ret;
> > + long ioaddr, iolen;
> > + long mmio_addr, mmio_len;
> > + uint64_t callback_via;
> > +
> > + i = pci_enable_device(pdev);
> > + if (i)
> > + return i;
> > +
> > + ioaddr = pci_resource_start(pdev, 0);
> > + iolen = pci_resource_len(pdev, 0);
> > +
> > + mmio_addr = pci_resource_start(pdev, 1);
> > + mmio_len = pci_resource_len(pdev, 1);
> > +
> > + if (mmio_addr == 0 || ioaddr == 0) {
> > + printk(KERN_WARNING DRV_NAME ":no resources found\n");
> Uhh, you can also use 'dev_warn' - much nicer than thse printks.
>
Ok
> Shouldn't you do pci_disable_device(..) ?
>
Yes I should
> > + }
> > +
> > + platform_mmio = mmio_addr;
> > + platform_mmiolen = mmio_len;
> > +
> > + if (!xen_have_vector_callback) {
> > + xen_irq_init(pdev);
>
> Not checking the return value? What if it is -1?
I'll fix
>
> > + callback_via = get_callback_via(pdev);
> > + if ((ret = xen_set_callback_via(callback_via)))
>
> What happens with the xen_have_vector_callback and
> x86_platform_ipi_callback which were set in the previous patch?
> Will you receive a poke on do_hvm_pv_evtchn_intr and do_hvm_evtchn_intr?
>
Yes, exactly.
If xen_have_vector_callback == 1 then I receive evtchns from
do_hvm_pv_evtchn_intr, otherwise from do_hvm_evtchn_intr.
> > + out:
> > + if (ret) {
> > + release_mem_region(mmio_addr, mmio_len);
> > + release_region(ioaddr, iolen);
> No pci_disable_device() ?
I'll fix
> > + printk(KERN_DEBUG DRV_NAME "I/O protocol version %d\n", protocol);
>
> dev_dbg
>
> > +
> > + switch (protocol) {
> > + case 1:
> > + outw(XEN_IOPORT_LINUX_PRODNUM, XEN_IOPORT_PRODNUM);
> > + outl(XEN_IOPORT_LINUX_DRVVER, XEN_IOPORT_DRVVER);
> > + if (inw(XEN_IOPORT_MAGIC) != XEN_IOPORT_MAGIC_VAL) {
> > + printk(KERN_ERR DRV_NAME "blacklisted by host\n");
>
> dev_err
>
> > + no_dev:
> > + printk(KERN_WARNING DRV_NAME "failed backend handshake: %s\n", err);
> dev_warn.
the problem here is that we are called by module_init, we don't have a
struct pci_dev to use with dev_*
> Shouldn't the rc be -ENODEV?
Yes, I'll fix.
> > diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> > new file mode 100644
> > index 0000000..870e7a4
> > --- /dev/null
> > +++ b/include/xen/platform_pci.h
> > @@ -0,0 +1,40 @@
> > +/******************************************************************************
> > + * platform-pci.h
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2004, Intel Corporation. <xiaofeng.ling@xxxxxxxxx>
> > + * Copyright (c) 2007, XenSource Inc.
>
> Copyring 2010 Citrix?
>
I'll add one
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|