xen-devel
RE: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
 
I just checked the Xen dmesg output. It looks like DMA/IOMMU is the root
cause of this issue. To indicate the source of an interrupt, the Tachyon chip
needs to do a DMA write to a dword memory location. Which iommu option do you
recommend?
  
(XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0] fault addr c00000, iommu reg = ffff82c3fff57000
(XEN) DMAR:[fault reason 05h] PTE Write access is not set
(XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = c00
(XEN)     root_entry = ffff83019ff70000
(XEN)     root_entry[7] = 19cf52001
(XEN)     context = ffff83019cf52000
(XEN)     context[0] = 102_706dc005
(XEN)     l4 = ffff8300706dc000
(XEN)     l4_index = 0
(XEN)     l4[0] = 706db003
(XEN)     l3 = ffff8300706db000
(XEN)     l3_index = 0
(XEN)     l3[0] = 706da003
(XEN)     l2 = ffff8300706da000
(XEN)     l2_index = 6
(XEN)     l2[6] = 0
  
-Ray 
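
For readers following the dump above, here is a small Python sketch of how the fault address appears to break down into the l4/l3/l2 indices that print_vtd_entries reports. It assumes 4 KiB pages and 9 index bits per level, which is what the four-level layout in the dump suggests. It also assumes bit 0 of an entry is the read permission and bit 1 the write permission. The helper names are made up for illustration and are not Xen functions.

```python
# Sketch only: assumes 4 KiB pages and 9 index bits per table level.
# table_indices() and decode_entry() are hypothetical helper names.

LEVEL_BITS = 9
PAGE_SHIFT = 12


def table_indices(gmfn, levels=4):
    """Return the table indices for a frame number, top level (l4) first."""
    mask = (1 << LEVEL_BITS) - 1
    return [(gmfn >> (LEVEL_BITS * (lvl - 1))) & mask
            for lvl in range(levels, 0, -1)]


def decode_entry(entry):
    """Split an entry into permission bits and the address it points to.

    Assumption: bit 0 = read allowed, bit 1 = write allowed, and the low
    12 bits carry flags only. Any high flag bits are ignored here.
    """
    return {
        "read": bool(entry & 0x1),
        "write": bool(entry & 0x2),
        "addr": entry & ~0xFFF,
    }


if __name__ == "__main__":
    fault_addr = 0xC00000            # "fault addr c00000" from the log
    gmfn = fault_addr >> PAGE_SHIFT  # 0xc00, the gmfn the log prints
    l4, l3, l2, l1 = table_indices(gmfn)
    print("gmfn=%x l4=%x l3=%x l2=%x l1=%x" % (gmfn, l4, l3, l2, l1))

    # l4[0] = 706db003 has both permission bits set and points at 706db000.
    print(decode_entry(0x706DB003))
    # l2[6] = 0 has neither bit set, so a DMA write through it faults.
    print(decode_entry(0x0))
```

With these assumptions, gmfn c00 gives l2 index 6, matching l2_index = 6 in the dump. The same arithmetic on gmfn e6f80 from the earlier fault gives l3 index 3 and l2 index 0x137, which suggests the indices in the log are printed in hex.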
   
On Mon, Sep 27, 2010 at 8:26 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:

  "xm dmesg" should give Xen's boot log, and sometimes it contains some
  helpful information, I think, especially if loglvl and guest_loglvl are
  set to all.
  
    
  
I looked at the xm dmesg output and there's nothing more than what I 
already provided, aside from a bunch of commands from me poking at it. 
  
-Bruce 
  
  
  
    
  Thanks 
  --jyh 
    
  
  
  
  
  
    
  
  On Mon, Sep 27, 2010 at 6:15 PM, Jiang, 
  Yunhong <yunhong.jiang@xxxxxxxxx> wrote: 
  Is the 07:0.0 your tachyon device? The
  VT-d fault is suspicious.
  
  
  Yes, there is 1 quad port card in this
  system:
  
  
  
  07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
  07:00.1 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
  07:00.2 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
  07:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
  
  
  
    Also is it possible to share the xen 
    output?  
  
  
  I attached the dom0 boot output. Let me 
  know if you wanted something else.  
  
  
  Also, here's the dom0 console output upon starting the VM. This lockdep
  error started with the release of 2.6.32.21. Note that I'm running the
  same kernel for the domU and dom0.
  
  
  
  [ 1817.684097] ------------[ cut here ]------------
  [ 1817.684113] WARNING: at kernel/lockdep.c:2323 trace_hardirqs_on_caller+0x12f/0x190()
  [ 1817.684119] Hardware name: ProLiant DL380 G6
  [ 1817.684122] Modules linked in: xt_physdev ipv6 osa_mfgdom0 xenfs xen_gntdev fbcon tileblit font bitblit softcursor xen_evtchn xen_pciback radeon ttm drm_kms_helper tun drm i2c_algo_bit ipmi_si i2c_core ipmi_msghandler joydev serio_raw hpwdt hpilo bridge stp llc usbhid hid cciss usb_storage
  [ 1817.684190] Pid: 11, comm: xenwatch Not tainted 2.6.32.21-xenoprof-1 #1
  [ 1817.684195] Call Trace:
  [ 1817.684197]  <IRQ>  [<ffffffff810aa18f>] ? trace_hardirqs_on_caller+0x12f/0x190
  [ 1817.684209]  [<ffffffff8106bed0>] warn_slowpath_common+0x80/0xd0
  [ 1817.684217]  [<ffffffff815f2b80>] ? _spin_unlock_irq+0x30/0x40
  [ 1817.684223]  [<ffffffff8106bf34>] warn_slowpath_null+0x14/0x20
  [ 1817.684229]  [<ffffffff810aa18f>] trace_hardirqs_on_caller+0x12f/0x190
  [ 1817.684234]  [<ffffffff810aa1fd>] trace_hardirqs_on+0xd/0x10
  [ 1817.684240]  [<ffffffff815f2b80>] _spin_unlock_irq+0x30/0x40
  [ 1817.684266]  [<ffffffff813c4fc5>] add_to_net_schedule_list_tail+0x85/0xd0
  [ 1817.684271]  [<ffffffff813c6216>] netif_be_int+0x36/0x160
  [ 1817.684278]  [<ffffffff810e10d0>] handle_IRQ_event+0x70/0x180
  [ 1817.684284]  [<ffffffff810e36e9>] handle_edge_irq+0xc9/0x170
  [ 1817.684291]  [<ffffffff813b8d7f>] __xen_evtchn_do_upcall+0x1bf/0x1f0
  [ 1817.684297]  [<ffffffff813b92fd>] xen_evtchn_do_upcall+0x3d/0x60
  [ 1817.684304]  [<ffffffff8101647e>] xen_do_hypervisor_callback+0x1e/0x30
  [ 1817.684308]  <EOI>  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
  [ 1817.684319]  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
  [ 1817.684325]  [<ffffffff813bce54>] ? xb_write+0x1e4/0x290
  [ 1817.684330]  [<ffffffff813bd8ca>] ? xs_talkv+0x6a/0x1f0
  [ 1817.684336]  [<ffffffff813bd8d8>] ? xs_talkv+0x78/0x1f0
  [ 1817.684341]  [<ffffffff813bdbcd>] ? xs_single+0x4d/0x60
  [ 1817.684346]  [<ffffffff813be502>] ? xenbus_read+0x52/0x80
  [ 1817.684352]  [<ffffffff813c87fc>] ? frontend_changed+0x48c/0x770
  [ 1817.684358]  [<ffffffff813bf76d>] ? xenbus_otherend_changed+0xdd/0x1b0
  [ 1817.684365]  [<ffffffff8101122f>] ? xen_restore_fl_direct_end+0x0/0x1
  [ 1817.684371]  [<ffffffff810ac830>] ? lock_release+0xb0/0x230
  [ 1817.684376]  [<ffffffff813bfae0>] ? frontend_changed+0x10/0x20
  [ 1817.684382]  [<ffffffff813bd4f5>] ? xenwatch_thread+0x55/0x160
  [ 1817.684389]  [<ffffffff81093400>] ? autoremove_wake_function+0x0/0x40
  [ 1817.684394]  [<ffffffff813bd4a0>] ? xenwatch_thread+0x0/0x160
  [ 1817.684400]  [<ffffffff81093086>] ? kthread+0x96/0xb0
  [ 1817.684405]  [<ffffffff8101632a>] ? child_rip+0xa/0x20
  [ 1817.684410]  [<ffffffff81015c90>] ? restore_args+0x0/0x30
  [ 1817.684415]  [<ffffffff81016320>] ? child_rip+0x0/0x20
  
  
  
  
  
    Thanks
    --jyh

    >Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
    >
    >On Mon, Sep 27, 2010 at 12:54 PM, Konrad Rzeszutek Wilk
    ><konrad.wilk@xxxxxxxxxx> wrote:
    >> On Mon, Sep 27, 2010 at 12:16:50PM -0700, Bruce Edge wrote:
    >>> On Mon, Sep 27, 2010 at 10:24 AM, Konrad Rzeszutek Wilk
    >>> <konrad.wilk@xxxxxxxxxx> wrote:
    >>> >
    >>> > On Mon, Sep 27, 2010 at 08:52:39AM -0700, Bruce Edge wrote:
    >>> > > One of our developers who is working on a tachyon driver is
    >>> > > complaining that the pvops domU kernel is not working for these MSI
    >>> > > interrupts.
    >>> > > This is using the current head of xen/2.6.32.x on both a single
    >>> > > Nehalem 920 and a dual E5540. This behavior is consistent with Xen
    >>> > > 4.0.1, 4.0.2.rc1-pre and 4.1.
    >>> > >
    >>> > > Here are his comments:
    >>> > >
    >>> > > - the driver has no problem to enable msi interrupt and request the
    >>> > > interrupt through kernel functions pci_enable_msi & request_irq
    >>> >
    >>> > What shows up in the Xen console when you send the 'q' key? Does it
    >>> > show that the vector is assigned to the appropriate guest?
    >>>
    >>> The Xen console q key shows that the domU is assigned:
    >>>
    >>> (XEN)     Interrupts { 32, 41-42, 47 }
    >>
    >> Aha!
    >>
    >>>
    >>> but the domU thinks it has:
    >>>
    >>> 124/125/126/127
    >>>
    >>> Is there some mapping that's taking place, or is this plain wrong?
    >>
    >> That looks wrong. The IRQ numbers (even though they are MSI vectors) are
    >> setup as IRQ numbers in the DomU guest. You should have seen
    >>
    >> 32:
    >> 41:
    >> 42:
    >> 47:
    >> in your /proc/interrupts on your DomU guest.
    >>
    >> I wonder what broke  - can you use
    >git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
    >> devel/xen-pcifront-0.5 (or pv/pcifront-2.6.32)?
    >
    >Please forgive the git ignorance.
    >
    >Is this the right syntax?
    >
    >git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad:pv/pcifront-2.6.32
    >linux-2.6.32-pv-pcifront
    >
    >Initialized empty Git repository in
    >/import/kaan/bedge/src/xen/kernel/pv-ops/linux-2.6.32-pv-pcifront/.git/
    >fatal: The remote end hung up unexpectedly
    >
    >Or:
    >
    > git clone  git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
    >
    >Initialized empty Git repository in
    >/import/kaan/bedge/src/xen/kernel/pv-ops/xen/.git/
    >remote: error: Could not read 59eab2f8f04147c5aadc99f2034ca7e5b81e890f
    >remote: fatal: Failed to traverse parents of commit
    >979e121cb348add17ed8171bf447b27a3a9d1be3
    >remote: aborting due to possible repository corruption on the remote side.
    >fatal: early EOF
    >fatal: index-pack failed
    >
    >>
    >> It has the latest pcifront driver but without the PVonHVM enhancements
    >> so we can try to eliminate the PvONHVM logic out of the picture.
    >>
    >>>
    >>> >
    >>> > > - the interrupt does happen. But the interrupt service routine of
    >>> > > tachyon driver doesn't detect any interrupt status related to this
    >>> > > interrupt, which inhibits the tachyon chip from coming on-line. And
    >>> > > there are high count of tachyon interrupt in /proc/interrupts
    >>> >
    >>> > Is it checking the PCI_STATUS_INTERRUPT or the appropriate register
    >>> > in the MMIO BAR?
    >>> >
    >>>
    >>> The driver would check the appropriate register (tachyon registers) in
    >>> the MMIO to determine the source of interrupts.
    >>
    >> OK, so that isn't it. Is there anything at these vectors:
    >> 7c, 7d, 7e, and 7f? When you use xen debug-keys 'i' or 'q' it should give you
    >> an inkling what device this is set for.
    >
    >When I run a distro kernel in hvm mode, I get the expected irq mappings:
    >
    >'i' - Note 66 - 69
    >(XEN)    IRQ:  66 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:3a
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=10:127(----),
    >(XEN)    IRQ:  67 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:42
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=10:126(----),
    >(XEN)    IRQ:  68 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:4a
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=10:125(----),
    >(XEN)    IRQ:  69 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:52
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=10:124(----)
    >
    >
    >'q'
    >(XEN)     Interrupts { 32, 41-42, 47, 124-127 }
    >
    >
    >The same data with pv-ops kernel shows:
    >
    >'i'
    >IRQ numbers stop at 65, no 66 - 69 present:
    >
    >(XEN)    IRQ:  63 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:91
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=0:289(----),
    >(XEN)    IRQ:  64 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:99
    >type=PCI-MSI         status=00000002 mapped, unbound
    >(XEN)    IRQ:  65 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:b1
    >type=PCI-MSI         status=00000010 in-flight=0
    >domain-list=0:287(----),
    >(XEN) IO-APIC interrupt information:
    >
    >'q'
    >(XEN)     Interrupts { 32, 41-42, 47 }
    >
    >>
    >>>
    >>> > >
    >>> > > kaan-18-dpm:~# cat /proc/interrupts | grep TACH
    >>> > > >124:     760415          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
    >>> > > >125:     762234          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
    >>> > > >126:     764180          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
    >>> > > >127:     764164          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
    >>> >
    >>> > Can you provide the full dmesg output?
    >>>
    >>> Attached.
    >>>
    >>> Some possibly related messages on dom0 console:
    >>>
    >>> [ 1882.269778] pciback 0000:07:00.0: enabling device (0000 -> 0003)
    >>> [ 1882.269800] xen: registering gsi 32 triggering 0 polarity 1
    >>> [ 1882.269827] xen_allocate_pirq: returning irq 32 for gsi 32
    >>> [ 1882.269834] xen: --> irq=32
    >>> [ 1882.269841] Already setup the GSI :32
    >>> [ 1882.269847] pciback 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
    >>> [ 1882.269866] pciback 0000:07:00.0: setting latency timer to 64
    >>> [ 1882.270463] pciback 0000:07:00.0: Driver tried to write to a
    >>> read-only configuration space field at offset 0x62, size 2. This may
    >>> be harmless, but if you have problems with your device:
    >>
    >> Uhhh, for that I think you need to do 'lspci -vvv -xxx -s 07:00.00'
    >> to find out what is at the configuration space. You could enable
    >> it using the permissive attribute.
    >>
    >>> [ 1882.270465] 1) see permissive attribute in sysfs
    >>> [ 1882.270467] 2) report problems to the xen-devel mailing list along
    >>> with details of your device obtained from lspci.
    >>> [ 1882.270615]   alloc irq_desc for 478 on node 0
    >>> [ 1882.270625]   alloc kstat_irqs on node 0
    >>
    >> So for 478: what do you see? xen-pciback I presume?
    >>> [ 1882.348411] pciback 0000:07:00.1: enabling device (0000 -> 0003)
    >>> [ 1882.348433] xen: registering gsi 42 triggering 0 polarity 1
    >>> [ 1882.348440] xen_allocate_pirq: returning irq 42 for gsi 42
    >>> [ 1882.348445] xen: --> irq=42
    >>> [ 1882.348472] Already setup the GSI :42
    >>> [ 1882.348479] pciback 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
    >>> [ 1882.348497] pciback 0000:07:00.1: setting latency timer to 64
    >>> [ 1882.349063] pciback 0000:07:00.1: Driver tried to write to a
    >>> read-only configuration space field at offset 0x62, size 2. This may
    >>> be harmless, but if you have problems with your device:
    >>> [ 1882.349066] 1) see permissive attribute in sysfs
    >>> [ 1882.349067] 2) report problems to the xen-devel mailing list along
    >>> with details of your device obtained from lspci.
    >>> [ 1882.349205]   alloc irq_desc for 477 on node 0
    >>> [ 1882.349215]   alloc kstat_irqs on node 0
    >>> [ 1882.402893] pciback 0000:07:00.2: enabling device (0000 -> 0003)
    >>> [ 1882.402908] xen: registering gsi 47 triggering 0 polarity 1
    >>> [ 1882.402913] xen_allocate_pirq: returning irq 47 for gsi 47
    >>> [ 1882.402916] xen: --> irq=47
    >>> [ 1882.402921] Already setup the GSI :47
    >>> [ 1882.402925] pciback 0000:07:00.2: PCI INT C -> GSI 47 (level, low) -> IRQ 47
    >>> [ 1882.402938] pciback 0000:07:00.2: setting latency timer to 64
    >>> [ 1882.403280] pciback 0000:07:00.2: Driver tried to write to a
    >>> read-only configuration space field at offset 0x62, size 2. This may
    >>> be harmless, but if you have problems with your device:
    >>> [ 1882.403282] 1) see permissive attribute in sysfs
    >>> [ 1882.403282] 2) report problems to the xen-devel mailing list along
    >>> with details of your device obtained from lspci.
    >>> [ 1882.403380]   alloc irq_desc for 476 on node 0
    >>> [ 1882.403386]   alloc kstat_irqs on node 0
    >>> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
    >>> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0]
    >>> fault addr e6f80000, iommu reg = ffff82c3fff57000
    >>> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
    >>> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = e6f80
    >>> (XEN)     root_entry = ffff83019ff70000
    >>> (XEN)     root_entry[7] = 19cf52001
    >>> (XEN)     context = ffff83019cf52000
    >>> (XEN)     context[0] = 102_706dc005
    >>> (XEN)     l4 = ffff8300706dc000
    >>> (XEN)     l4_index = 0
    >>> (XEN)     l4[0] = 706db003
    >>> (XEN)     l3 = ffff8300706db000
    >>> (XEN)     l3_index = 3
    >>> (XEN)     l3[3] = 702b6003
    >>> (XEN)     l2 = ffff8300702b6000
    >>> (XEN)     l2_index = 137
    >>> (XEN)     l2[137] = 0
    >>> (XEN)     l2[137] not present
    >>> (XEN) traps.c:466:d0 Unhandled nmi fault/trap [#2] on VCPU 0 [ec=0000]
    >>
    >> That is not good. What changed from your earlier emails that this was
    >> triggered?
    >
    >Nothing
    >> Or was it triggered all along?
    >
    >Yes, I just included it for completeness
    >
    >> What happens if you run the system without the iommu enabled?
    >
    >Haven't tried yet. Will check that next.
    >
    >-Bruce
    >
    >_______________________________________________
    >Xen-devel mailing list
    >Xen-devel@xxxxxxxxxxxxxxxxxxx
    >http://lists.xensource.com/xen-devel
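
To compare the HVM and pv-ops cases quoted in this thread without reading the wrapped 'i' debug-key output by eye, a short Python sketch like the one below could pull the fields out of each IRQ line once the wrapped pieces are joined back together. It reads the domain-list field as domain:pirq, which is an interpretation of the output shown here and not something the thread confirms. The function name and regular expression are illustrative only.

```python
import re

# Sketch only: the pattern is fitted to the lines quoted in this thread.
# The domain-list field is read as "<domain>:<pirq>", which is an assumption.
IRQ_RE = re.compile(
    r"IRQ:\s*(?P<irq>\d+)\s+affinity:\S+\s+vec:(?P<vec>[0-9a-f]+)\s+"
    r"type=(?P<type>\S+)\s+status=(?P<status>[0-9a-f]+)"
    r"(?:\s+in-flight=\d+\s+domain-list=(?P<dom>\d+):\s*(?P<pirq>\d+))?"
)


def parse_irq_line(line):
    """Parse one joined 'i' debug-key IRQ line, or return None if no match."""
    m = IRQ_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "irq": int(d["irq"]),
        "vector": int(d["vec"], 16),      # vec: is printed in hex
        "type": d["type"],
        "status": int(d["status"], 16),
        # Unbound IRQs have no domain-list, so these stay None.
        "domain": int(d["dom"]) if d["dom"] is not None else None,
        "pirq": int(d["pirq"]) if d["pirq"] is not None else None,
    }


if __name__ == "__main__":
    hvm_line = ("(XEN)    IRQ:  66 affinity:ffffffff,ffffffff,ffffffff,ffffffff "
                "vec:3a type=PCI-MSI         status=00000010 in-flight=0 "
                "domain-list=10:127(----),")
    pv_line = ("(XEN)    IRQ:  64 affinity:ffffffff,ffffffff,ffffffff,ffffffff "
               "vec:99 type=PCI-MSI         status=00000002 mapped, unbound")
    print(parse_irq_line(hvm_line))
    print(parse_irq_line(pv_line))
```

Read this way, the HVM run has IRQs 66 to 69 bound to domain 10 as 127 down to 124, while the pv-ops run shows nothing bound to the guest for the Tachyon MSIs. The 7c to 7f that Konrad asks about are 124 to 127 in decimal.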
           
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 