WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

RE: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem

To: Bruce Edge <bruce.edge@xxxxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Subject: RE: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
From: "Lin, Ray" <Ray.Lin@xxxxxxx>
Date: Tue, 28 Sep 2010 10:08:57 -0600
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Delivery-date: Tue, 28 Sep 2010 09:10:09 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTikrVg4xNC=+oDvxaVS+tmQwGkNDawcdbxELxt0d@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: ActeyCCTa2H/CD80RoC45Ai781jPYwAXdDgg
Thread-topic: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
I just checked the "xen dmesg" output. It looks like DMA/IOMMU is the root cause of this issue. To identify the source of an interrupt, the Tachyon chip needs to do a DMA write to a dword memory location. Which iommu option do you recommend using?
 
(XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
(XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0] fault addr c00000, iommu reg = ffff82c3fff57000
(XEN) DMAR:[fault reason 05h] PTE Write access is not set
(XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = c00
(XEN)     root_entry = ffff83019ff70000
(XEN)     root_entry[7] = 19cf52001
(XEN)     context = ffff83019cf52000
(XEN)     context[0] = 102_706dc005
(XEN)     l4 = ffff8300706dc000
(XEN)     l4_index = 0
(XEN)     l4[0] = 706db003
(XEN)     l3 = ffff8300706db000
(XEN)     l3_index = 0
(XEN)     l3[0] = 706da003
(XEN)     l2 = ffff8300706da000
(XEN)     l2_index = 6
(XEN)     l2[6] = 0
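
For anyone reading the dump above, here is a rough Python sketch (not Xen code) of how the fault address appears to map to the printed table indices. It assumes 4 KiB pages and 9 index bits per page-table level, and it assumes the indices are printed in hex. The other fault dump in this thread (fault addr e6f80000, l2_index = 137) is only consistent with that reading. The function name `vtd_indices` is made up for illustration.

```python
# Sketch: split a VT-d fault address into the page-table indices seen in the log.
# Assumptions: 4 KiB pages, 512 entries (9 bits) per table level.

PAGE_SHIFT = 12                      # 4 KiB pages
LEVEL_BITS = 9                       # 512 entries per table
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def vtd_indices(fault_addr):
    """Return (gmfn, l4_index, l3_index, l2_index, l1_index) for an address."""
    gmfn = fault_addr >> PAGE_SHIFT
    l1 = gmfn & LEVEL_MASK
    l2 = (gmfn >> LEVEL_BITS) & LEVEL_MASK
    l3 = (gmfn >> (2 * LEVEL_BITS)) & LEVEL_MASK
    l4 = (gmfn >> (3 * LEVEL_BITS)) & LEVEL_MASK
    return gmfn, l4, l3, l2, l1


if __name__ == "__main__":
    # Fault address taken from the log above; values are printed in hex.
    gmfn, l4, l3, l2, l1 = vtd_indices(0xC00000)
    print(f"gmfn = {gmfn:x}, l4_index = {l4:x}, l3_index = {l3:x}, l2_index = {l2:x}")
```

Under these assumptions, the walk stops at the l2 entry for the faulting address, which the log shows as 0 (not present).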
                                                                                    
 
-Ray

 

From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
Sent: Monday, September 27, 2010 9:46 PM
To: Jiang, Yunhong
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Konrad Rzeszutek Wilk
Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem

On Mon, Sep 27, 2010 at 8:26 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:

"xm dmesg" should give Xen's boot log, which sometimes contains helpful information, I think, especially if loglvl and guest_loglvl are set to all.


I looked at the xm dmesg output and there's nothing more than what I already provided, aside from a bunch of commands from me poking at it.

-Bruce
 

 

Thanks

--jyh

 

From: Bruce Edge [mailto:bruce.edge@xxxxxxxxx]
Sent: Tuesday, September 28, 2010 11:16 AM
To: Jiang, Yunhong
Cc: Konrad Rzeszutek Wilk; xen-devel@xxxxxxxxxxxxxxxxxxx


Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem

 

On Mon, Sep 27, 2010 at 6:15 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:

Is the 07:0.0 your tachyon device? The VT-d fault is suspicious.

 

Yes, there is one quad-port card in this system:

 

07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
07:00.1 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
07:00.2 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
07:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)

 

 

Also is it possible to share the xen output?

 

I attached the dom0 boot output. Let me know if you want something else.

Also, here's the dom0 console output upon starting the VM. This lockdep error started with the release of 2.6.32.21. Note that I'm running the same kernel for the domU and dom0.

 

[ 1817.684097] ------------[ cut here ]------------
[ 1817.684113] WARNING: at kernel/lockdep.c:2323 trace_hardirqs_on_caller+0x12f/0x190()
[ 1817.684119] Hardware name: ProLiant DL380 G6
[ 1817.684122] Modules linked in: xt_physdev ipv6 osa_mfgdom0 xenfs xen_gntdev fbcon tileblit font bitblit softcursor xen_evtchn xen_pciback radeon ttm drm_kms_helper tun drm i2c_algo_bit ipmi_si i2c_core ipmi_msghandler joydev serio_raw hpwdt hpilo bridge stp llc usbhid hid cciss usb_storage
[ 1817.684190] Pid: 11, comm: xenwatch Not tainted 2.6.32.21-xenoprof-1 #1
[ 1817.684195] Call Trace:
[ 1817.684197]  <IRQ>  [<ffffffff810aa18f>] ? trace_hardirqs_on_caller+0x12f/0x190
[ 1817.684209]  [<ffffffff8106bed0>] warn_slowpath_common+0x80/0xd0
[ 1817.684217]  [<ffffffff815f2b80>] ? _spin_unlock_irq+0x30/0x40
[ 1817.684223]  [<ffffffff8106bf34>] warn_slowpath_null+0x14/0x20
[ 1817.684229]  [<ffffffff810aa18f>] trace_hardirqs_on_caller+0x12f/0x190
[ 1817.684234]  [<ffffffff810aa1fd>] trace_hardirqs_on+0xd/0x10
[ 1817.684240]  [<ffffffff815f2b80>] _spin_unlock_irq+0x30/0x40
[ 1817.684266]  [<ffffffff813c4fc5>] add_to_net_schedule_list_tail+0x85/0xd0
[ 1817.684271]  [<ffffffff813c6216>] netif_be_int+0x36/0x160
[ 1817.684278]  [<ffffffff810e10d0>] handle_IRQ_event+0x70/0x180
[ 1817.684284]  [<ffffffff810e36e9>] handle_edge_irq+0xc9/0x170
[ 1817.684291]  [<ffffffff813b8d7f>] __xen_evtchn_do_upcall+0x1bf/0x1f0
[ 1817.684297]  [<ffffffff813b92fd>] xen_evtchn_do_upcall+0x3d/0x60
[ 1817.684304]  [<ffffffff8101647e>] xen_do_hypervisor_callback+0x1e/0x30
[ 1817.684308]  <EOI>  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
[ 1817.684319]  [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
[ 1817.684325]  [<ffffffff813bce54>] ? xb_write+0x1e4/0x290
[ 1817.684330]  [<ffffffff813bd8ca>] ? xs_talkv+0x6a/0x1f0
[ 1817.684336]  [<ffffffff813bd8d8>] ? xs_talkv+0x78/0x1f0
[ 1817.684341]  [<ffffffff813bdbcd>] ? xs_single+0x4d/0x60
[ 1817.684346]  [<ffffffff813be502>] ? xenbus_read+0x52/0x80
[ 1817.684352]  [<ffffffff813c87fc>] ? frontend_changed+0x48c/0x770
[ 1817.684358]  [<ffffffff813bf76d>] ? xenbus_otherend_changed+0xdd/0x1b0
[ 1817.684365]  [<ffffffff8101122f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 1817.684371]  [<ffffffff810ac830>] ? lock_release+0xb0/0x230
[ 1817.684376]  [<ffffffff813bfae0>] ? frontend_changed+0x10/0x20
[ 1817.684382]  [<ffffffff813bd4f5>] ? xenwatch_thread+0x55/0x160
[ 1817.684389]  [<ffffffff81093400>] ? autoremove_wake_function+0x0/0x40
[ 1817.684394]  [<ffffffff813bd4a0>] ? xenwatch_thread+0x0/0x160
[ 1817.684400]  [<ffffffff81093086>] ? kthread+0x96/0xb0
[ 1817.684405]  [<ffffffff8101632a>] ? child_rip+0xa/0x20
[ 1817.684410]  [<ffffffff81015c90>] ? restore_args+0x0/0x30
[ 1817.684415]  [<ffffffff81016320>] ? child_rip+0x0/0x20

 

-Bruce

 

 


Thanks
--jyh


>-----Original Message-----
>From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
>Sent: Tuesday, September 28, 2010 7:54 AM
>To: Konrad Rzeszutek Wilk
>Cc: xen-devel@xxxxxxxxxxxxxxxxxxx

>Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>

>On Mon, Sep 27, 2010 at 12:54 PM, Konrad Rzeszutek Wilk
><konrad.wilk@xxxxxxxxxx> wrote:
>> On Mon, Sep 27, 2010 at 12:16:50PM -0700, Bruce Edge wrote:
>>> On Mon, Sep 27, 2010 at 10:24 AM, Konrad Rzeszutek Wilk
>>> <konrad.wilk@xxxxxxxxxx> wrote:
>>> >
>>> > On Mon, Sep 27, 2010 at 08:52:39AM -0700, Bruce Edge wrote:
>>> > > One of our developers who is working on a tachyon driver is
>>> > > complaining that the pvops domU kernel is not working for these MSI
>>> > > interrupts.
>>> > > This is using the current head of xen/2.6.32.x on both a single
>>> > > Nehalem 920 and a dual E5540. This behavior is consistent with Xen
>>> > > 4.0.1, 4.0.2.rc1-pre and 4.1.
>>> > >
>>> > > Here are his comments:
>>> > >
>>> > > - the driver has no problem to enable msi interrupt and request the
>>> > > interrupt through kernel functions pci_enable_msi & request_irq
>>> >
>>> > What shows up in the Xen console when you send the 'q' key? Does it
>>> > show that the vector is assigned to the appropriate guest?
>>>
>>> The Xen console q key shows that the domU is assigned:
>>>
>>> (XEN)     Interrupts { 32, 41-42, 47 }
>>
>> Aha!
>>
>>>
>>> but the domU thinks it has:
>>>
>>> 124/125/126/127
>>>
>>> Is there some mapping that's taking place, or is this plain wrong?
>>
>> That looks wrong. The IRQ numbers (even though they are MSI vectors) are
>> set up as IRQ numbers in the DomU guest. You should have seen
>>
>> 32:
>> 41:
>> 42:
>> 47:
>> in your /proc/interrupts on your DomU guest.
>>
>> I wonder what broke - can you use
>> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>> devel/xen-pcifront-0.5 (or pv/pcifront-2.6.32)?
>
>Please forgive the git ignorance.
>
>Is this the right syntax?
>
>git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad:pv/pcifront-2.6.32
>linux-2.6.32-pv-pcifront
>
>Initialized empty Git repository in
>/import/kaan/bedge/src/xen/kernel/pv-ops/linux-2.6.32-pv-pcifront/.git/
>fatal: The remote end hung up unexpectedly
>
>Or:
>
> git clone  git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>
>Initialized empty Git repository in
>/import/kaan/bedge/src/xen/kernel/pv-ops/xen/.git/
>remote: error: Could not read 59eab2f8f04147c5aadc99f2034ca7e5b81e890f
>remote: fatal: Failed to traverse parents of commit
>979e121cb348add17ed8171bf447b27a3a9d1be3
>remote: aborting due to possible repository corruption on the remote side.
>fatal: early EOF
>fatal: index-pack failed
>
>>
>> It has the latest pcifront driver but without the PVonHVM enhancements
>> so we can try to eliminate the PVonHVM logic from the picture.
>>
>>>
>>> >
>>> > > - the interrupt does happen. But the interrupt service routine of
>>> > > tachyon driver doesn't detect any interrupt status related to this
>>> > > interrupt, which inhibits the tachyon chip from coming on-line. And
>>> > > there are high count of tachyon interrupt in /proc/interrupts
>>> >
>>> > Is it checking the PCI_STATUS_INTERRUPT or the appropriate register
>>> > in the MMIO BAR?
>>> >
>>>
>>> The driver would check the appropriate register (tachyon registers) in
>>> the MMIO to determine the source of interrupts.
>>
>> OK, so that isn't it. Is there anything at these vectors:
>> 7c, 7d, 7e, and 7f? When you use xen debug-keys 'i' or 'q' it should give you
>> an inkling what device this is set for.
>
>When I run a distro kernel in hvm mode, I get the expected irq mappings:
>
>'i' - Note 66 - 69
>(XEN)    IRQ:  66 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:3a
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=10:127(----),
>(XEN)    IRQ:  67 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:42
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=10:126(----),
>(XEN)    IRQ:  68 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:4a
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=10:125(----),
>(XEN)    IRQ:  69 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:52
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=10:124(----)
>
>
>'q'
>(XEN)     Interrupts { 32, 41-42, 47, 124-127 }
>
>
>The same data with pv-ops kernel shows:
>
>'i'
>IRQ numbers stop at 65, no 66 - 69 present:
>
>(XEN)    IRQ:  63 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:91
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=0:289(----),
>(XEN)    IRQ:  64 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:99
>type=PCI-MSI         status=00000002 mapped, unbound
>(XEN)    IRQ:  65 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:b1
>type=PCI-MSI         status=00000010 in-flight=0
>domain-list=0:287(----),
>(XEN) IO-APIC interrupt information:
>
>'q'
>(XEN)     Interrupts { 32, 41-42, 47 }
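
As an aside on the two 'q' outputs quoted above, the difference between the hvm and pv-ops runs can be checked mechanically. The following is a small Python sketch, not a Xen tool. The helper name `parse_irq_set` is hypothetical, and it assumes the `Interrupts { ... }` line format shown in this thread.

```python
import re


def parse_irq_set(line):
    """Expand a line like 'Interrupts { 32, 41-42, 47 }' into a set of ints."""
    match = re.search(r"\{([^}]*)\}", line)
    if match is None:
        return set()
    irqs = set()
    for item in match.group(1).split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            # A range such as "41-42" is inclusive on both ends.
            low, high = item.split("-")
            irqs.update(range(int(low), int(high) + 1))
        else:
            irqs.add(int(item))
    return irqs


# The two 'q' lines quoted in this thread (hvm distro kernel vs. pv-ops kernel).
hvm = parse_irq_set("(XEN)     Interrupts { 32, 41-42, 47, 124-127 }")
pv = parse_irq_set("(XEN)     Interrupts { 32, 41-42, 47 }")

# IRQs Xen lists for the domain under hvm but not under pv-ops.
missing_in_pv = sorted(hvm - pv)
print(missing_in_pv)
```

The difference is the 124-127 range, which matches the IRQ numbers the pv-ops domU reports for HW_TACHYON in /proc/interrupts.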
>
>>
>>>
>>> > >
>>> > > kaan-18-dpm:~# cat /proc/interrupts | grep TACH
>>> > > 124:     760415          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
>>> > > 125:     762234          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
>>> > > 126:     764180          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
>>> > > 127:     764164          0          0          0          0          0          0          0          0          0          0          0          0          0  xen-pirq-pcifront-msi  HW_TACHYON
>>> >
>>> > Can you provide the full dmesg output?
>>>
>>> Attached.
>>>
>>> Some possibly related messages on dom0 console:
>>>
>>> [ 1882.269778] pciback 0000:07:00.0: enabling device (0000 -> 0003)
>>> [ 1882.269800] xen: registering gsi 32 triggering 0 polarity 1
>>> [ 1882.269827] xen_allocate_pirq: returning irq 32 for gsi 32
>>> [ 1882.269834] xen: --> irq=32
>>> [ 1882.269841] Already setup the GSI :32
>>> [ 1882.269847] pciback 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
>>> [ 1882.269866] pciback 0000:07:00.0: setting latency timer to 64
>>> [ 1882.270463] pciback 0000:07:00.0: Driver tried to write to a
>>> read-only configuration space field at offset 0x62, size 2. This may
>>> be harmless, but if you have problems with your device:
>>
>> Uhhh, for that I think you need to do 'lspci -vvv -xxx -s 07:00.00'
>> to find out what is at the configuration space. You could enable
>> it using the permissive attribute.
>>
>>> [ 1882.270465] 1) see permissive attribute in sysfs
>>> [ 1882.270467] 2) report problems to the xen-devel mailing list along
>>> with details of your device obtained from lspci.
>>> [ 1882.270615]   alloc irq_desc for 478 on node 0
>>> [ 1882.270625]   alloc kstat_irqs on node 0
>>
>> So for 478: what do you see? xen-pciback I presume?
>>> [ 1882.348411] pciback 0000:07:00.1: enabling device (0000 -> 0003)
>>> [ 1882.348433] xen: registering gsi 42 triggering 0 polarity 1
>>> [ 1882.348440] xen_allocate_pirq: returning irq 42 for gsi 42
>>> [ 1882.348445] xen: --> irq=42
>>> [ 1882.348472] Already setup the GSI :42
>>> [ 1882.348479] pciback 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
>>> [ 1882.348497] pciback 0000:07:00.1: setting latency timer to 64
>>> [ 1882.349063] pciback 0000:07:00.1: Driver tried to write to a
>>> read-only configuration space field at offset 0x62, size 2. This may
>>> be harmless, but if you have problems with your device:
>>> [ 1882.349066] 1) see permissive attribute in sysfs
>>> [ 1882.349067] 2) report problems to the xen-devel mailing list along
>>> with details of your device obtained from lspci.
>>> [ 1882.349205]   alloc irq_desc for 477 on node 0
>>> [ 1882.349215]   alloc kstat_irqs on node 0
>>> [ 1882.402893] pciback 0000:07:00.2: enabling device (0000 -> 0003)
>>> [ 1882.402908] xen: registering gsi 47 triggering 0 polarity 1
>>> [ 1882.402913] xen_allocate_pirq: returning irq 47 for gsi 47
>>> [ 1882.402916] xen: --> irq=47
>>> [ 1882.402921] Already setup the GSI :47
>>> [ 1882.402925] pciback 0000:07:00.2: PCI INT C -> GSI 47 (level, low) -> IRQ 47
>>> [ 1882.402938] pciback 0000:07:00.2: setting latency timer to 64
>>> [ 1882.403280] pciback 0000:07:00.2: Driver tried to write to a
>>> read-only configuration space field at offset 0x62, size 2. This may
>>> be harmless, but if you have problems with your device:
>>> [ 1882.403282] 1) see permissive attribute in sysfs
>>> [ 1882.403282] 2) report problems to the xen-devel mailing list along
>>> with details of your device obtained from lspci.
>>> [ 1882.403380]   alloc irq_desc for 476 on node 0
>>> [ 1882.403386]   alloc kstat_irqs on node 0
>>> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
>>> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0]
>>> fault addr e6f80000, iommu reg = ffff82c3fff57000
>>> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
>>> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = e6f80
>>> (XEN)     root_entry = ffff83019ff70000
>>> (XEN)     root_entry[7] = 19cf52001
>>> (XEN)     context = ffff83019cf52000
>>> (XEN)     context[0] = 102_706dc005
>>> (XEN)     l4 = ffff8300706dc000
>>> (XEN)     l4_index = 0
>>> (XEN)     l4[0] = 706db003
>>> (XEN)     l3 = ffff8300706db000
>>> (XEN)     l3_index = 3
>>> (XEN)     l3[3] = 702b6003
>>> (XEN)     l2 = ffff8300702b6000
>>> (XEN)     l2_index = 137
>>> (XEN)     l2[137] = 0
>>> (XEN)     l2[137] not present
>>> (XEN) traps.c:466:d0 Unhandled nmi fault/trap [#2] on VCPU 0 [ec=0000]
>>
>> That is not good. What changed from your earlier emails that this was triggered?
>
>Nothing
>> Or was it triggered all along?
>
>Yes, I just included it for completeness
>
>> What happens if you run the system without the iommu enabled?
>
>Haven't tried yet. Will check that next.
>
>-Bruce
>

>_______________________________________________
>Xen-devel mailing list
>Xen-devel@xxxxxxxxxxxxxxxxxxx
>http://lists.xensource.com/xen-devel

 

