WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] Backport per-device vector map patches to xen 4.1.3

To: Wei Wang2 <wei.wang2@xxxxxxx>, keir@xxxxxxx
Subject: Re: [Xen-devel] Backport per-device vector map patches to xen 4.1.3
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Thu, 17 Nov 2011 11:02:44 -0500
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Dunlap <George.Dunlap@xxxxxxxxxxxxx>, "Huang2, Wei" <Wei.Huang2@xxxxxxx>, "Shin, Jacob" <Jacob.Shin@xxxxxxx>, George@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <JBeulich@xxxxxxxx>
Delivery-date: Thu, 17 Nov 2011 08:06:25 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201110271607.52785.wei.wang2@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <201110271607.52785.wei.wang2@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Oct 27, 2011 at 04:07:51PM +0200, Wei Wang2 wrote:
> Recently we found an issue in xen 4.1. Under heavy I/O stress such as running 
> bonnie++, Dom0 would lost its hard disk with lots of I/O errors. We found 
> that some PCI-E devices was using the same vector as SMBus on AMD platforms 
> and George' patch set that enables per-device vector map can fix this 
> problem.
> 
> Following patches have been back ported and tested by Jacob and Wei H.  
> Please apply them to Xen 4.1.3 

Can they be applied please? I've been running these without hiccups for well,
since they were posted. On both AMD and Intel boxes.

So, Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

> 
> Thanks,
> Wei
> 
> 23752 xen: Infrastructure to allow irqs to share vector maps
> 23753 xen: Option to allow per-device vector maps for MSI IRQs
> 23754 xen: AMD IOMMU: Automatically enable per-device vector maps
> 23786 x86: Fix up irq vector map logic
> 23812 xen: Add global irq_vector_map option
> 23899 AMD-IOMMU: remove dead variable references

> # HG changeset patch
> # User George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> # Date 1311701852 -3600
> # Node ID fa4e2ca9ecffbc432b451f495ad0a403644a6be8
> # Parent  2e0cf9428554da666616982cd0074024ff85b221
> xen: AMD IOMMU: Automatically enable per-device vector maps
> 
> Automatically enable per-device vector maps when using IOMMU,
> unless disabled specifically by an IOMMU parameter.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> 
> diff -r e017a9a2f27c xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c      Tue Jul 26 18:37:16 2011 +0100
> +++ b/xen/arch/x86/irq.c      Fri Oct 21 15:15:13 2011 -0400
> @@ -32,6 +32,7 @@
>  unsigned int __read_mostly nr_irqs;
>  integer_param("nr_irqs", nr_irqs);
>  
> +/* This default may be changed by the AMD IOMMU code */
>  bool_t __read_mostly opt_irq_perdev_vector_map = 0;
>  boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
>  
> diff -r e017a9a2f27c xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c     Tue Jul 26 18:37:16 
> 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c     Fri Oct 21 15:15:13 
> 2011 -0400
> @@ -24,6 +24,9 @@
>  #include <asm/hvm/iommu.h>
>  #include <asm/amd-iommu.h>
>  #include <asm/hvm/svm/amd-iommu-proto.h>
> +
> +extern bool_t __read_mostly opt_irq_perdev_vector_map;
> +extern bool_t __read_mostly iommu_amd_perdev_vector_map;
>  
>  extern unsigned short ivrs_bdf_entries;
>  extern struct ivrs_mappings *ivrs_mappings;
> @@ -164,6 +167,18 @@
>      {
>          printk("AMD-Vi: Error initialization\n");
>          return -ENODEV;
> +    }
> +
> +    /* Enable use of per-device vector map unless otherwise
> +     * specified */
> +    if ( iommu_amd_perdev_vector_map )
> +    {
> +        printk("AMD-Vi: Enabling per-device vector maps\n");
> +        opt_irq_perdev_vector_map=1;
> +    }
> +    else
> +    {
> +        printk("AMD-Vi: WARNING - not enabling per-device vector maps\n");
>      }
>  
>      return scan_pci_devices();
> diff -r e017a9a2f27c xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c Tue Jul 26 18:37:16 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c Fri Oct 21 15:15:13 2011 -0400
> @@ -49,6 +49,7 @@
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_hap_pt_share;
>  bool_t __read_mostly amd_iommu_debug;
> +bool_t __read_mostly iommu_amd_perdev_vector_map = 1;
>  bool_t __read_mostly amd_iommu_perdev_intremap;
>  
>  static void __init parse_iommu_param(char *s)
> @@ -84,6 +85,8 @@
>              iommu_dom0_strict = 1;
>          else if ( !strcmp(s, "sharept") )
>              iommu_hap_pt_share = 1;
> +        else if ( !strcmp(s, "no-perdev-vector-map") )
> +            iommu_amd_perdev_vector_map = 0;
>  
>          s = ss + 1;
>      } while ( ss );

> # HG changeset patch
> # User George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> # Date 1311701836 -3600
> # Node ID 2e0cf9428554da666616982cd0074024ff85b221
> # Parent  ef9ed3d2aa870a37ed5e611be9c524d526a2d604
> xen: Option to allow per-device vector maps for MSI IRQs
> 
> Add a vector-map to pci_dev, and add an option to point MSI-related
> IRQs to the vector-map of the device.
> 
> This prevents irqs from the same device from being assigned
> the same vector on different pcpus.  This is required for systems
> using an AMD IOMMU, since the intremap tables on AMD only look at
> vector, and not destination ID.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> 
> diff -r 84b8504a6125 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c      Tue Jul 26 18:36:58 2011 +0100
> +++ b/xen/arch/x86/irq.c      Fri Oct 21 15:12:46 2011 -0400
> @@ -31,6 +31,9 @@
>  unsigned int __read_mostly nr_irqs_gsi = 16;
>  unsigned int __read_mostly nr_irqs;
>  integer_param("nr_irqs", nr_irqs);
> +
> +bool_t __read_mostly opt_irq_perdev_vector_map = 0;
> +boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
>  
>  u8 __read_mostly *irq_vector;
>  struct irq_desc __read_mostly *irq_desc = NULL;
> @@ -1560,6 +1563,9 @@
>              dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
>                d->domain_id, irq);
>          desc->handler = &pci_msi_type;
> +        if ( opt_irq_perdev_vector_map
> +             && !desc->chip_data->used_vectors )
> +            desc->chip_data->used_vectors = &pdev->info.used_vectors;
>          d->arch.pirq_irq[pirq] = irq;
>          d->arch.irq_pirq[irq] = pirq;
>          setup_msi_irq(pdev, msi_desc, irq);
> diff -r 84b8504a6125 xen/include/xen/pci.h
> --- a/xen/include/xen/pci.h   Tue Jul 26 18:36:58 2011 +0100
> +++ b/xen/include/xen/pci.h   Fri Oct 21 15:12:46 2011 -0400
> @@ -11,6 +11,7 @@
>  #include <xen/types.h>
>  #include <xen/list.h>
>  #include <xen/spinlock.h>
> +#include <xen/irq.h>
>  
>  /*
>   * The PCI interface treats multi-function devices as independent
> @@ -38,6 +39,7 @@
>          u8 bus;
>          u8 devfn;
>      } physfn;
> +   vmask_t used_vectors; 
>  };
>  
>  struct pci_dev {

> # HG changeset patch
> # User George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> # Date 1311701818 -3600
> # Node ID ef9ed3d2aa870a37ed5e611be9c524d526a2d604
> # Parent  590aadf7c46ae979da3552332f592f9492ce6d8b
> xen: Infrastructure to allow irqs to share vector maps
> 
> Laying the groundwork for per-device vector maps.  This generic
> code allows any irq to point to a vector map; all irqs sharing the
> same vector map will avoid sharing vectors.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> 
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/arch/x86/io_apic.c
> --- a/xen/arch/x86/io_apic.c  Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/arch/x86/io_apic.c  Tue Jul 26 18:36:58 2011 +0100
> @@ -449,6 +449,11 @@
>                   irq, vector, smp_processor_id());
>  
>          __get_cpu_var(vector_irq)[vector] = -1;
> +        if ( cfg->used_vectors )
> +        {
> +            ASSERT(test_bit(vector, cfg->used_vectors));
> +            clear_bit(vector, cfg->used_vectors);
> +        }
>          cfg->move_cleanup_count--;
>  unlock:
>          spin_unlock(&desc->lock);
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c      Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/arch/x86/irq.c      Tue Jul 26 18:36:58 2011 +0100
> @@ -108,6 +108,8 @@
>          per_cpu(vector_irq, cpu)[vector] = irq;
>      cfg->vector = vector;
>      cfg->cpu_mask = online_mask;
> +    if ( cfg->used_vectors )
> +        set_bit(vector, cfg->used_vectors);
>      irq_status[irq] = IRQ_USED;
>      if (IO_APIC_IRQ(irq))
>          irq_vector[irq] = vector;
> @@ -172,6 +174,7 @@
>      desc->depth   = 1;
>      desc->msi_desc = NULL;
>      desc->handler = &no_irq_type;
> +    desc->chip_data->used_vectors=NULL;
>      cpus_setall(desc->affinity);
>      spin_unlock_irqrestore(&desc->lock, flags);
>  
> @@ -199,6 +202,9 @@
>  
>      for_each_cpu_mask(cpu, tmp_mask)
>          per_cpu(vector_irq, cpu)[vector] = -1;
> +
> +    if ( cfg->used_vectors )
> +        clear_bit(vector, cfg->used_vectors);
>  
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
> @@ -277,6 +283,7 @@
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
>      cpus_clear(cfg->old_cpu_mask);
> +    cfg->used_vectors = NULL;
>  }
>  
>  int __init init_irq_data(void)
> @@ -402,6 +409,10 @@
>          if (test_bit(vector, used_vectors))
>              goto next;
>  
> +        if (cfg->used_vectors
> +            && test_bit(vector, cfg->used_vectors) )
> +            goto next;
> +
>          for_each_cpu_mask(new_cpu, tmp_mask)
>              if (per_cpu(vector_irq, new_cpu)[vector] != -1)
>                  goto next;
> @@ -417,6 +428,11 @@
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          cfg->vector = vector;
>          cpus_copy(cfg->cpu_mask, tmp_mask);
> +        if ( cfg->used_vectors )
> +        {
> +            ASSERT(!test_bit(vector, cfg->used_vectors));
> +            set_bit(vector, cfg->used_vectors);
> +        }
>  
>          irq_status[irq] = IRQ_USED;
>              if (IO_APIC_IRQ(irq))
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/include/asm-x86/irq.h
> --- a/xen/include/asm-x86/irq.h       Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/include/asm-x86/irq.h       Tue Jul 26 18:36:58 2011 +0100
> @@ -24,11 +24,16 @@
>  #define irq_to_desc(irq)    (&irq_desc[irq])
>  #define irq_cfg(irq)        (&irq_cfg[irq])
>  
> +typedef struct {
> +    DECLARE_BITMAP(_bits,NR_VECTORS);
> +} vmask_t;
> +
>  struct irq_cfg {
>          int  vector;
>          cpumask_t cpu_mask;
>          cpumask_t old_cpu_mask;
>          unsigned move_cleanup_count;
> +        vmask_t *used_vectors;
>          u8 move_in_progress : 1;
>  };
>  

> # HG changeset patch
> # User George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> # Date 1314026133 -3600
> # Node ID 3a05da2dc7c0a5fc0fcfc40c535d1fcb71203625
> # Parent  d1cd78a73a79e0e648937322cdb8d92a7f86327a
> x86: Fix up irq vector map logic
> 
> We need to make sure that cfg->used_vector is only cleared once;
> otherwise there may be a race condition that allows the same vector to
> be assigned twice, defeating the whole purpose of the map.
> 
> This makes two changes:
> * __clear_irq_vector() only clears the vector if the irq is not being
> moved
> * smp_iqr_move_cleanup_interrupt() only clears used_vector if this
> is the last place it's being used (move_cleanup_count==0 after
> decrement).
> 
> Also make use of asserts more consistent, to catch this kind of logic
> bug in the future.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> 
> diff -r d1cd78a73a79 -r 3a05da2dc7c0 xen/arch/x86/io_apic.c
> --- a/xen/arch/x86/io_apic.c  Mon Aug 22 16:15:19 2011 +0100
> +++ b/xen/arch/x86/io_apic.c  Mon Aug 22 16:15:33 2011 +0100
> @@ -485,12 +485,14 @@
>                   irq, vector, smp_processor_id());
>  
>          __get_cpu_var(vector_irq)[vector] = -1;
> -        if ( cfg->used_vectors )
> +        cfg->move_cleanup_count--;
> +
> +        if ( cfg->move_cleanup_count == 0 
> +             &&  cfg->used_vectors )
>          {
>              ASSERT(test_bit(vector, cfg->used_vectors));
>              clear_bit(vector, cfg->used_vectors);
>          }
> -        cfg->move_cleanup_count--;
>  unlock:
>          spin_unlock(&desc->lock);
>      }
> diff -r d1cd78a73a79 -r 3a05da2dc7c0 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c      Mon Aug 22 16:15:19 2011 +0100
> +++ b/xen/arch/x86/irq.c      Mon Aug 22 16:15:33 2011 +0100
> @@ -113,7 +113,10 @@
>      cfg->vector = vector;
>      cfg->cpu_mask = online_mask;
>      if ( cfg->used_vectors )
> +    {
> +        ASSERT(!test_bit(vector, cfg->used_vectors));
>          set_bit(vector, cfg->used_vectors);
> +    }
>      irq_status[irq] = IRQ_USED;
>      if (IO_APIC_IRQ(irq))
>          irq_vector[irq] = vector;
> @@ -207,15 +210,13 @@
>      for_each_cpu_mask(cpu, tmp_mask)
>          per_cpu(vector_irq, cpu)[vector] = -1;
>  
> -    if ( cfg->used_vectors )
> -        clear_bit(vector, cfg->used_vectors);
> -
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
>      init_one_irq_status(irq);
>  
>      if (likely(!cfg->move_in_progress))
>          return;
> +
>      cpus_and(tmp_mask, cfg->old_cpu_mask, cpu_online_map);
>      for_each_cpu_mask(cpu, tmp_mask) {
>          for (vector = FIRST_DYNAMIC_VECTOR; vector <= LAST_DYNAMIC_VECTOR;
> @@ -228,6 +229,12 @@
>               break;
>          }
>       }
> +
> +    if ( cfg->used_vectors )
> +    {
> +        ASSERT(test_bit(vector, cfg->used_vectors));
> +        clear_bit(vector, cfg->used_vectors);
> +    }
>  
>      cfg->move_in_progress = 0;
>  }

> # HG changeset patch
> # User Jan Beulich <jbeulich@xxxxxxxx>
> # Date 1317730316 -7200
> # Node ID a99d75671a911f9c0d5d11e0fe88a0a65863cb44
> # Parent  3d1664cc9e458809e399320204aca8536e401ee1
> AMD-IOMMU: remove dead variable references
> 
> These got orphaned up by recent changes.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> Acked-by: Keir Fraser <keir@xxxxxxx>
> 
> diff -r 599cf097900b xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c     Mon Sep 05 15:00:15 
> 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c     Fri Oct 21 15:39:30 
> 2011 -0400
> @@ -24,9 +24,6 @@
>  #include <asm/hvm/iommu.h>
>  #include <asm/amd-iommu.h>
>  #include <asm/hvm/svm/amd-iommu-proto.h>
> -
> -extern bool_t __read_mostly opt_irq_perdev_vector_map;
> -extern bool_t __read_mostly iommu_amd_perdev_vector_map;
>  
>  extern unsigned short ivrs_bdf_entries;
>  extern struct ivrs_mappings *ivrs_mappings;
> diff -r 599cf097900b xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c Mon Sep 05 15:00:15 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c Fri Oct 21 15:39:30 2011 -0400
> @@ -49,7 +49,6 @@
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_hap_pt_share;
>  bool_t __read_mostly amd_iommu_debug;
> -bool_t __read_mostly iommu_amd_perdev_vector_map = 1;
>  bool_t __read_mostly amd_iommu_perdev_intremap;
>  
>  static void __init parse_iommu_param(char *s)
> @@ -85,8 +84,6 @@
>              iommu_dom0_strict = 1;
>          else if ( !strcmp(s, "sharept") )
>              iommu_hap_pt_share = 1;
> -        else if ( !strcmp(s, "no-perdev-vector-map") )
> -            iommu_amd_perdev_vector_map = 0;
>  
>          s = ss + 1;
>      } while ( ss );

> # HG changeset patch
> # User George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> # Date 1315231215 -3600
> # Node ID 32814ad7458dc842a7c588eee13e5c4ee11709a3
> # Parent  f1349a968a5ac5577d67ad4a3f3490c580dbe264
> xen: Add global irq_vector_map option, set if using AMD global intremap tables
> 
> As mentioned in previous changesets, AMD IOMMU interrupt
> remapping tables only look at the vector, not the destination
> id of an interrupt.  This means that all IRQs going through
> the same interrupt remapping table need to *not* share vectors.
> 
> The irq "vector map" functionality was originally introduced
> after a patch which disabled global AMD IOMMUs entirely.  That
> patch has since been reverted, meaning that AMD intremap tables
> can either be per-device or global.
> 
> This patch therefore introduces a global irq vector map option,
> and enables it if we're using an AMD IOMMU with a global
> interrupt remapping table.
> 
> This patch removes the "irq-perdev-vector-map" boolean
> command-line optino and replaces it with "irq_vector_map",
> which can have one of three values: none, global, or per-device.
> 
> Setting the irq_vector_map to any value will override the
> default that the AMD code sets.
> 
> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> 
> diff -r 6fc00d52c179 docs/src/user.tex
> --- a/docs/src/user.tex       Mon Aug 22 16:15:33 2011 +0100
> +++ b/docs/src/user.tex       Fri Oct 21 15:21:23 2011 -0400
> @@ -4197,6 +4197,10 @@
>  \item [ vcpu\_migration\_delay=$<$minimum\_time$>$] Set minimum time of 
>    vcpu migration in microseconds (default 0). This parameter avoids agressive
>    vcpu migration. For example, the linux kernel uses 0.5ms by default.
> +\item [ irq_vector_map=xxx ] Enable irq vector non-sharing maps.  Setting 
> 'global' 
> +  will ensure that no  IRQs will share vectors.  Setting 'per-device' will 
> ensure 
> +  that no IRQs from the same device will share vectors.  Setting to 'none' 
> will
> +  disable it entirely, overriding any defaults the IOMMU code may set.
>  \end{description}
>  
>  In addition, the following options may be specified on the Xen command
> diff -r 6fc00d52c179 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c      Mon Aug 22 16:15:33 2011 +0100
> +++ b/xen/arch/x86/irq.c      Fri Oct 21 15:21:23 2011 -0400
> @@ -24,6 +24,8 @@
>  #include <asm/mach-generic/mach_apic.h>
>  #include <public/physdev.h>
>  
> +static void parse_irq_vector_map_param(char *s);
> +
>  /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
>  bool_t __read_mostly opt_noirqbalance = 0;
>  boolean_param("noirqbalance", opt_noirqbalance);
> @@ -33,8 +35,10 @@
>  integer_param("nr_irqs", nr_irqs);
>  
>  /* This default may be changed by the AMD IOMMU code */
> -bool_t __read_mostly opt_irq_perdev_vector_map = 0;
> -boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
> +int __read_mostly opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_DEFAULT;
> +custom_param("irq_vector_map", parse_irq_vector_map_param);
> +
> +vmask_t global_used_vector_map;
>  
>  u8 __read_mostly *irq_vector;
>  struct irq_desc __read_mostly *irq_desc = NULL;
> @@ -63,6 +67,26 @@
>  /* irq_ratelimit: the max irq rate allowed in every 10ms, set 0 to disable */
>  static unsigned int __read_mostly irq_ratelimit_threshold = 10000;
>  integer_param("irq_ratelimit", irq_ratelimit_threshold);
> +
> +static void __init parse_irq_vector_map_param(char *s)
> +{
> +    char *ss;
> +
> +    do {
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';
> +
> +        if ( !strcmp(s, "none"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_NONE;
> +        else if ( !strcmp(s, "global"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_GLOBAL;
> +        else if ( !strcmp(s, "per-device"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_PERDEV;
> +
> +        s = ss + 1;
> +    } while ( ss );
> +}
>  
>  /* Must be called when irq disabled */
>  void lock_vector_lock(void)
> @@ -348,6 +372,41 @@
>      end_none
>  };
>  
> +static vmask_t *irq_get_used_vector_mask(int irq)
> +{
> +    vmask_t *ret = NULL;
> +
> +    if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_GLOBAL )
> +    {
> +        struct irq_desc *desc = irq_to_desc(irq);
> +
> +        ret = &global_used_vector_map;
> +
> +        if ( desc->chip_data->used_vectors )
> +        {
> +            printk(XENLOG_INFO "%s: Strange, unassigned irq %d already has 
> used_vectors!\n",
> +                   __func__, irq);
> +        }
> +        else
> +        {
> +            int vector;
> +            
> +            vector = irq_to_vector(irq);
> +            if ( vector > 0 )
> +            {
> +                printk(XENLOG_INFO "%s: Strange, irq %d already assigned 
> vector %d!\n",
> +                       __func__, irq, vector);
> +                
> +                ASSERT(!test_bit(vector, ret));
> +
> +                set_bit(vector, ret);
> +            }
> +        }
> +    }
> +
> +    return ret;
> +}
> +
>  int __assign_irq_vector(int irq, struct irq_cfg *cfg, const cpumask_t *mask)
>  {
>      /*
> @@ -366,6 +425,7 @@
>      int cpu, err;
>      unsigned long flags;
>      cpumask_t tmp_mask;
> +    vmask_t *irq_used_vectors = NULL;
>  
>      old_vector = irq_to_vector(irq);
>      if (old_vector) {
> @@ -380,6 +440,17 @@
>          return -EAGAIN;
>  
>      err = -ENOSPC;
> +
> +    /* This is the only place normal IRQs are ever marked
> +     * as "in use".  If they're not in use yet, check to see
> +     * if we need to assign a global vector mask. */
> +    if ( irq_status[irq] == IRQ_USED )
> +    {
> +        irq_used_vectors = cfg->used_vectors;
> +    }
> +    else
> +        irq_used_vectors = irq_get_used_vector_mask(irq);
> +
>      for_each_cpu_mask(cpu, *mask) {
>          int new_cpu;
>          int vector, offset;
> @@ -405,8 +476,8 @@
>          if (test_bit(vector, used_vectors))
>              goto next;
>  
> -        if (cfg->used_vectors
> -            && test_bit(vector, cfg->used_vectors) )
> +        if (irq_used_vectors
> +            && test_bit(vector, irq_used_vectors) )
>              goto next;
>  
>          for_each_cpu_mask(new_cpu, tmp_mask)
> @@ -424,15 +495,22 @@
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          cfg->vector = vector;
>          cpus_copy(cfg->cpu_mask, tmp_mask);
> +
> +        irq_status[irq] = IRQ_USED;
> +        ASSERT((cfg->used_vectors == NULL)
> +               || (cfg->used_vectors == irq_used_vectors));
> +        cfg->used_vectors = irq_used_vectors;
> +
> +        if (IO_APIC_IRQ(irq))
> +            irq_vector[irq] = vector;
> +
>          if ( cfg->used_vectors )
>          {
>              ASSERT(!test_bit(vector, cfg->used_vectors));
> +
>              set_bit(vector, cfg->used_vectors);
>          }
>  
> -        irq_status[irq] = IRQ_USED;
> -            if (IO_APIC_IRQ(irq))
> -                    irq_vector[irq] = vector;
>          err = 0;
>          local_irq_restore(flags);
>          break;
> @@ -1521,7 +1599,7 @@
>  
>      if ( !IS_PRIV(current->domain) &&
>           !(IS_PRIV_FOR(current->domain, d) &&
> -          irq_access_permitted(current->domain, pirq)))
> +           irq_access_permitted(current->domain, pirq)))
>          return -EPERM;
>  
>      if ( pirq < 0 || pirq >= d->nr_pirqs || irq < 0 || irq >= nr_irqs )
> @@ -1569,11 +1647,21 @@
>  
>          if ( desc->handler != &no_irq_type )
>              dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
> -              d->domain_id, irq);
> +                    d->domain_id, irq);
>          desc->handler = &pci_msi_type;
> -        if ( opt_irq_perdev_vector_map
> +        if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_PERDEV
>               && !desc->chip_data->used_vectors )
> +        {
>              desc->chip_data->used_vectors = &pdev->info.used_vectors;
> +            if ( desc->chip_data->vector != IRQ_VECTOR_UNASSIGNED )
> +            {
> +                int vector = desc->chip_data->vector;
> +                ASSERT(!test_bit(vector, desc->chip_data->used_vectors));
> +
> +                set_bit(vector, desc->chip_data->used_vectors);
> +            }
> +        }
> +
>          d->arch.pirq_irq[pirq] = irq;
>          d->arch.irq_pirq[irq] = pirq;
>          setup_msi_irq(pdev, msi_desc, irq);
> @@ -1584,9 +1672,12 @@
>          d->arch.pirq_irq[pirq] = irq;
>          d->arch.irq_pirq[irq] = pirq;
>          spin_unlock_irqrestore(&desc->lock, flags);
> +
> +        if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_PERDEV )
> +            printk(XENLOG_INFO "Per-device vector maps for GSIs not 
> implemented yet.\n");
>      }
>  
> - done:
> +done:
>      return ret;
>  }
>  
> diff -r 6fc00d52c179 xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c     Mon Aug 22 16:15:33 
> 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c     Fri Oct 21 15:21:23 
> 2011 -0400
> @@ -169,18 +169,35 @@
>          return -ENODEV;
>      }
>  
> -    /* Enable use of per-device vector map unless otherwise
> -     * specified */
> -    if ( iommu_amd_perdev_vector_map )
> +    /*
> +     * AMD IOMMUs don't distinguish between vectors destined for
> +     * different cpus when doing interrupt remapping.  This means
> +     * that interrupts going through the same intremap table
> +     * can't share the same vector.
> +     *
> +     * If irq_vector_map isn't specified, choose a sensible default:
> +     * - If we're using per-device interemap tables, per-device
> +     *   vector non-sharing maps
> +     * - If we're using a global interemap table, global vector
> +     *   non-sharing map
> +     */
> +    if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_DEFAULT )
>      {
> -        printk("AMD-Vi: Enabling per-device vector maps\n");
> -        opt_irq_perdev_vector_map=1;
> +        if ( amd_iommu_perdev_intremap )
> +        {
> +            printk("AMD-Vi: Enabling per-device vector maps\n");
> +            opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_PERDEV;
> +        }
> +        else
> +        {
> +            printk("AMD-Vi: Enabling global vector map\n");
> +            opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_GLOBAL;
> +        }
>      }
>      else
>      {
> -        printk("AMD-Vi: WARNING - not enabling per-device vector maps\n");
> +        printk("AMD-Vi: Not overriding irq_vector_map setting\n");
>      }
> -
>      return scan_pci_devices();
>  }
>  
> diff -r 6fc00d52c179 xen/include/asm-x86/irq.h
> --- a/xen/include/asm-x86/irq.h       Mon Aug 22 16:15:33 2011 +0100
> +++ b/xen/include/asm-x86/irq.h       Fri Oct 21 15:21:23 2011 -0400
> @@ -44,6 +44,13 @@
>  extern u8 *irq_vector;
>  
>  extern bool_t opt_noirqbalance;
> +
> +#define OPT_IRQ_VECTOR_MAP_DEFAULT 0 /* Do the default thing  */
> +#define OPT_IRQ_VECTOR_MAP_NONE    1 /* None */ 
> +#define OPT_IRQ_VECTOR_MAP_GLOBAL  2 /* One global vector map (no vector 
> sharing) */ 
> +#define OPT_IRQ_VECTOR_MAP_PERDEV  3 /* Per-device vetor map (no vector 
> sharing w/in a device) */
> +
> +extern int opt_irq_vector_map;
>  
>  /*
>   * Per-cpu current frame pointer - the location of the last exception frame 
> on

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>