Re: [Xen-devel] [PATCH 4 of 6] xen pci platform device driver

Thanks for the very detailed and careful review!

On Thu, 22 Apr 2010, Konrad Rzeszutek Wilk wrote:
> 
> Pci -> PCI
> 

Ok

> > +       initializing xenbus and grant_table when running in a Xen HVM
> .. snip ..
> > +       domain. As a consequence this driver is required to run any Xen PV
> > +       frontend on Xen HVM.
> > @@ -448,6 +452,26 @@ static int gnttab_map(unsigned int start_idx, unsigned 
> > int end_idx)
> >       unsigned int nr_gframes = end_idx + 1;
> >       int rc;
> >
> > +     if (xen_hvm_domain()) {
> > +             struct xen_add_to_physmap xatp;
> > +             unsigned int i = end_idx;
> > +             rc = 0;
> > +             /*
> > +              * Loop backwards, so that the first hypercall has the largest
> > +              * index, ensuring that the table will grow only once.
> > +              */
> > +             do {
> > +                     xatp.domid = DOMID_SELF;
> > +                     xatp.idx = i;
> > +                     xatp.space = XENMAPSPACE_grant_table;
> > +                     xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
> > +                     if ((rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap, 
> > &xatp)) != 0)
> > +                             break;
> > +             } while (i-- > start_idx);
> > +
> So if the hypercall fails, should we print out a message ? Or is it OK
> to return the rc?

good catch, I actually added a printk there for debugging at some point
but I didn't commit it.

> > +             return rc;
> > +     }
> > +
> >       frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
> >       if (!frames)
> >               return -ENOMEM;
> 
> ... snip ..
> > +++ b/drivers/xen/platform-pci.c
> > @@ -0,0 +1,227 @@
> > +/******************************************************************************
> > + * platform-pci.c
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2005, Intel Corporation.
> > + * Copyright (c) 2007, XenSource Inc.
> 
> No 'Copyright (c) 2010, Citrix'?
> 

I'll add one

> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
> > for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along 
> > with
> > + * this program; if not, write to the Free Software Foundation, Inc., 59 
> > Temple
> > + * Place - Suite 330, Boston, MA 02111-1307 USA.
> > + *
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include <linux/interrupt.h>
> > +
> > +#include <asm/io.h>
> 
> I think scripts/checkpatch.pl complains about this. You did
> run script/checkpath.pl, right?
> 
> You might want also to put all of those includes in
> an alphabetical order.
> 

script/checkpatch.pl complains about a lot of things :)
I'll fix.

> > +
> > +#include <xen/grant_table.h>
> > +#include <xen/platform_pci.h>
> > +#include <xen/interface/platform_pci.h>
> > +#include <xen/xenbus.h>
> > +#include <xen/events.h>
> > +#include <xen/hvm.h>
> > +
> > +#define DRV_NAME    "xen-platform-pci"
> > +
> > +MODULE_AUTHOR("ssmith@xxxxxxxxxxxxx and stefano.stabellini@xxxxxxxxxxxxx");
> > +MODULE_DESCRIPTION("Xen platform PCI device");
> > +MODULE_LICENSE("GPL");
> > +
> > +static unsigned long platform_mmio;
> > +static unsigned long platform_mmio_alloc;
> > +static unsigned long platform_mmiolen;
> > +
> > +unsigned long alloc_xen_mmio(unsigned long len)
> > +{
> > +     unsigned long addr;
> > +
> > +     addr = platform_mmio + platform_mmio_alloc;
> > +     platform_mmio_alloc += len;
> > +     BUG_ON(platform_mmio_alloc > platform_mmiolen);
> > +
> > +     return addr;
> > +}
> > +
> > +static uint64_t get_callback_via(struct pci_dev *pdev)
> > +{
> > +     u8 pin;
> > +     int irq;
> > +
> > +#ifdef __ia64__
> > +     for (irq = 0; irq < 16; irq++) {
> > +             if (isa_irq_to_vector(irq) == pdev->irq)
> > +                     return irq; /* ISA IRQ */
> > +     }
> 
> Um, is this still neccessary? Have you tested this driver on IA64 with
> this kernel?

No I didn't and it would be difficult for me to.
However reading the code again it seems to me that the ifdef is
unnecessary here, so I'll remove it.


> > +#else /* !__ia64__ */
> > +     irq = pdev->irq;
> > +     if (irq < 16)
> > +             return irq; /* ISA IRQ */
> > +#endif
> > +     pin = pdev->pin;
> > +
> > +     /* We don't know the GSI. Specify the PCI INTx line instead. */
> > +     return (((uint64_t)0x01 << 56) | /* PCI INTx identifier */
> > +             ((uint64_t)pci_domain_nr(pdev->bus) << 32) |
> > +             ((uint64_t)pdev->bus->number << 16) |
> > +             ((uint64_t)(pdev->devfn & 0xff) << 8) |
> > +             ((uint64_t)(pin - 1) & 3));
> > +}
> > +
> > +static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
> > +{
> > +     xen_hvm_evtchn_do_upcall(get_irq_regs());
> > +     return IRQ_HANDLED;
> > +}
> > +
> > +int xen_irq_init(struct pci_dev *pdev)
> 
> The name sounds awkward. How about 'get_irq' or 'allocate_irq'?

Ok, I'll go for xen_allocate_irq.

> > +{
> > +     __set_irq_handler(pdev->irq, handle_edge_irq, 0, NULL);
> 
> Is that kosher?  I am not that familiar with how QEMU sets up the ACPI
> DSDT to figure out what the device pin (which looking at the qemu source
> is set to '1') corresponds to what IRQ.
> 
> Is it possible that with AMD-Vi or Intel VT-d you pass in a PCI device
> that would have an interrupt line that would coincide with this device?
> Or is the hvmloader + QEMU smart enought to take that under account
> and stick this device on an edge interrupt line?
> 
> Perhaps doing something like:
>         int flags =  IRQF_SAMPLE_RANDOM | IRQF_NOBALANCING;
> 
>         if (pdev->irq > 16)
>                 flags |= IRQF_SHARED;
>         else {
>                 flags |= IRQF_TRIGGER_RISING | IRQF_DISABLED;
>                 /* Whack! */
>                 __set_irq_handler(pdev->irq, handle_edge_irq, 0, NULL);
>         }
>         return request_irq(pdev->irq, do_.., flags,"xen-pla..", pdev);
> 

Yes I did that on purpose because it seems that it is the only thing
able to fix the multivcpu problem I was having: when receiving evtchns
from the pci platform device interrupt line (vector callback not
available), the guest boots fine with a single vcpu but it hangs at boot
with 2 or more vcpus. In particular it seems to deadlock in the xenbus
initialization code because of some missing interrupts/evtchns.

Switching to handle_edge_irq (even though the interrupt is supposed to
be level triggered, but it is emulated, so I guess is not important anyway)
fixes the issue for me. It is also interesting that using vector
callback the problem doesn't occur.

In the default scenario linux uses handle_fasteoi_irq that acks the
interrupt only at the end, after the call to handle_IRQ_event, that in
the evtchn case can loop several times on the pending evtchns so can
take long to complete.
On the other hand handle_edge_irq acks the interrupt right away before
calling handle_IRQ_event, that is the same behaviour of the ipi handler.
Also evtchn handlers can re-enable interrupts and that would cause
handle_edge_irq to be called again if we receive another
interrupt/evtchn, and handle_edge_irq would again ack the interrupt
right away even if it is marked as IRQ_INPROGRESS while handle_fasteoi_irq
would just exit.

I guess I could add a comment before __set_irq_handler.


> > +     return request_irq(pdev->irq, do_hvm_evtchn_intr,
> > +                     IRQF_SAMPLE_RANDOM | IRQF_DISABLED | IRQF_NOBALANCING 
> > |
> > +                     IRQF_TRIGGER_RISING, "xen-platform-pci", pdev);
> > +}
> > +
> > +static int __devinit platform_pci_init(struct pci_dev *pdev,
> > +                                    const struct pci_device_id *ent)
> > +{
> > +     int i, ret;
> > +     long ioaddr, iolen;
> > +     long mmio_addr, mmio_len;
> > +     uint64_t callback_via;
> > +
> > +     i = pci_enable_device(pdev);
> > +     if (i)
> > +             return i;
> > +
> > +     ioaddr = pci_resource_start(pdev, 0);
> > +     iolen = pci_resource_len(pdev, 0);
> > +
> > +     mmio_addr = pci_resource_start(pdev, 1);
> > +     mmio_len = pci_resource_len(pdev, 1);
> > +
> > +     if (mmio_addr == 0 || ioaddr == 0) {
> > +             printk(KERN_WARNING DRV_NAME ":no resources found\n");
> Uhh, you can also use 'dev_warn' - much nicer than thse printks.
> 

Ok

> Shouldn't you do pci_disable_device(..) ?
> 

Yes I should

> > +     }
> > +
> > +     platform_mmio = mmio_addr;
> > +     platform_mmiolen = mmio_len;
> > +
> > +     if (!xen_have_vector_callback) {
> > +             xen_irq_init(pdev);
> 
> Not checking the return value? What if it is -1?

I'll fix

> 
> > +             callback_via = get_callback_via(pdev);
> > +             if ((ret = xen_set_callback_via(callback_via)))
> 
> What happens with the xen_have_vector_callback and
> x86_platform_ipi_callback which were set in the previous patch?
> Will you receive a poke on do_hvm_pv_evtchn_intr and do_hvm_evtchn_intr?
> 

Yes, exactly.
If xen_have_vector_callback == 1 then I receive evtchns from
do_hvm_pv_evtchn_intr, otherwise from do_hvm_evtchn_intr.


> > + out:
> > +     if (ret) {
> > +             release_mem_region(mmio_addr, mmio_len);
> > +             release_region(ioaddr, iolen);
> No pci_disable_device() ?

I'll fix

> > +     printk(KERN_DEBUG DRV_NAME "I/O protocol version %d\n", protocol);
> 
> dev_dbg
> 
> > +
> > +     switch (protocol) {
> > +     case 1:
> > +             outw(XEN_IOPORT_LINUX_PRODNUM, XEN_IOPORT_PRODNUM);
> > +             outl(XEN_IOPORT_LINUX_DRVVER, XEN_IOPORT_DRVVER);
> > +             if (inw(XEN_IOPORT_MAGIC) != XEN_IOPORT_MAGIC_VAL) {
> > +                     printk(KERN_ERR DRV_NAME "blacklisted by host\n");
> 
> dev_err
> 
> > + no_dev:
> > +     printk(KERN_WARNING DRV_NAME  "failed backend handshake: %s\n", err);
> dev_warn.
    
the problem here is that we are called by module_init, we don't have a
struct pci_dev to use with dev_*
    

> Shouldn't the rc be -ENODEV?

Yes, I'll fix.


> > diff --git a/include/xen/platform_pci.h b/include/xen/platform_pci.h
> > new file mode 100644
> > index 0000000..870e7a4
> > --- /dev/null
> > +++ b/include/xen/platform_pci.h
> > @@ -0,0 +1,40 @@
> > +/******************************************************************************
> > + * platform-pci.h
> > + *
> > + * Xen platform PCI device driver
> > + * Copyright (c) 2004, Intel Corporation. <xiaofeng.ling@xxxxxxxxx>
> > + * Copyright (c) 2007, XenSource Inc.
> 
> Copyring 2010 Citrix?
> 

I'll add one

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
WARNING - OLD ARCHIVES

xen-devel

Re: [Xen-devel] [PATCH 4 of 6] xen pci platform device driver