WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: xen-swiotlb

Hi Konrad,

It seems i have a working setup now, passing through a xhci controller to a 
guest, and keep grabbing video, it has ran for 2 days now.
But i have changed a lot of parameters, so I will try to find out which one 
seems to be the problem, or that everything is miraculously fixed :-)

(it's on other hardware (intel platform) now, as the only VM, newer dom0 
kernel, newer domU kernel, no swiotlb=force anymore, the boot option pasi 
suggested, no irq balance etc.)

Will post again when i know something more !

--

Sander




Thursday, August 26, 2010, 11:43:21 PM, you wrote:

>> > Looking at the output, the physical addresses that DMA-ed are:
>> 
>> > 0x1f2962dc0
>> > 0x1f24f2e68
>> 
>> > and they look to be called quite often. In fact, there looks to be a
>> > loop that does something like this:
>> 
>> > again:
>> >   p = kmalloc(..)
>> 
>> >   dma = pci_map_single(p)
>> >   pci_dma_mapping_error(dma);
>> >   /* get some data.. */
>> >   /* parse the: (pipe 0x80000280): IN:  c0 00 00 00 0c 00 01 00 */
>> >   pci_unmap_sg(dma);
>> >   goto again;
>> 
>> > As the virtual address sent to pci_map_single looks to be sequentially
>> > increasing.

> I am not sure what virtual address I was thinking off. Looking again
> at the dmesg output it looks actually correct. SWIOTLB is giving out
> chunks, increasing its own index, and then unmaping those pages right
> away.


> This stack-trace in the middle of the log:
>  
> Aug 15 23:14:02 security kernel: [  103.646497] em28xx #0 em28xx_isoc_copy 
> :Video frame 1, length=960, odd
> Aug 15 23:14:02 security kernel: [  103.650177] xhci_hcd 0000:08:00.0: Miss 
> service interval error, set skip flag
> Aug 15 23:14:02 security kernel: [  103.651371] em28xx #0 em28xx_irq_callback 
> :urb completition error -18.
> Aug 15 23:14:02 security kernel: [  103.651406] em28xx #0 print_err_status 
> :URB status -18 [Unknown].
> Aug 15 23:14:02 security kernel: [  103.651432] em28xx #0 em28xx_isoc_copy 
> :Video frame 2, length=2888, even
> Aug 15 23:14:02 security kernel: [  103.651454] em28xx #0 get_next_buf :No 
> active queue to serve
> Aug 15 23:14:02 security kernel: [  103.651477] em28xx #0 print_err_status 
> :URB packet 63, status -18 [Unknown].
> Aug 15 23:14:02 security kernel: [  103.653143] motion: page allocation 
> failure. order:1, mode:0x20
> Aug 15 23:14:02 security kernel: [  103.653177] Pid: 1418, comm: motion Not 
> tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14
> Aug 15 23:14:02 security kernel: [  103.653205] Call Trace:
> Aug 15 23:14:02 security kernel: [  103.653219]  <IRQ>  [<ffffffff810a7ebf>] 
> __alloc_pages_nodemask+0x667/0x6c7
> Aug 15 23:14:02 security kernel: [  103.653272]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [  103.653300]  [<ffffffff810ca32f>] 
> cache_alloc_refill+0x3b3/0x851
> Aug 15 23:14:02 security kernel: [  103.653326]  [<ffffffff81006b4d>] ? 
> xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:02 security kernel: [  103.653352]  [<ffffffff81007202>] ? 
> check_events+0x12/0x20
> Aug 15 23:14:02 security kernel: [  103.653375]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [  103.653401]  [<ffffffff810ce9bd>] ? 
> create_object+0x2e/0x2a1
> Aug 15 23:14:02 security kernel: [  103.653425]  [<ffffffff810c9e3f>] 
> kmem_cache_alloc+0xf7/0x1c4
> Aug 15 23:14:02 security kernel: [  103.653451]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [  103.653475]  [<ffffffff810ce9bd>] 
> create_object+0x2e/0x2a1
> Aug 15 23:14:02 security kernel: [  103.653499]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [  103.653523]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [  103.653548]  [<ffffffff810c8f68>] ? 
> cache_alloc_debugcheck_after+0x1bf/0x239
> Aug 15 23:14:02 security kernel: [  103.653575]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [  103.653602]  [<ffffffff8143e7d7>] 
> kmemleak_alloc+0x21/0x3e
> Aug 15 23:14:02 security kernel: [  103.653625]  [<ffffffff810c9edf>] 
> kmem_cache_alloc+0x197/0x1c4
> Aug 15 23:14:02 security kernel: [  103.653649]  [<ffffffff8135e085>] 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [  103.653675]  [<ffffffff81340450>] 
> usb_hcd_submit_urb+0xb1f/0xbf5
> Aug 15 23:14:02 security kernel: [  103.653701]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [  103.653726]  [<ffffffff81006b4d>] ? 
> xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:02 security kernel: [  103.653751]  [<ffffffff81007202>] ? 
> check_events+0x12/0x20
> Aug 15 23:14:02 security kernel: [  103.653777]  [<ffffffff812b6267>] ? 
> vt_console_print+0x43/0x346
> Aug 15 23:14:02 security kernel: [  103.653802]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [  103.653830]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [  103.653857]  [<ffffffff8106d5b0>] ? 
> lock_release+0x1e0/0x1ed
> Aug 15 23:14:03 security kernel: [  103.653882]  [<ffffffff81340d45>] 
> usb_submit_urb+0x249/0x260
> Aug 15 23:14:03 security kernel: [  103.653919]  [<ffffffffa0060615>] 
> em28xx_irq_callback+0xc9/0xfd [em28xx]
> Aug 15 23:14:03 security kernel: [  103.653946]  [<ffffffff8133f16a>] 
> usb_hcd_giveback_urb+0x84/0xb8
> Aug 15 23:14:03 security kernel: [  103.653972]  [<ffffffff8136480a>] 
> xhci_handle_event+0xced/0xdf7
> Aug 15 23:14:03 security kernel: [  103.653998]  [<ffffffff813648a8>] 
> xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [  103.654024]  [<ffffffff813648a8>] 
> xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [  103.654050]  [<ffffffff81364a31>] 
> xhci_irq+0x11d/0x1af
> Aug 15 23:14:03 security kernel: [  103.654074]  [<ffffffff81364ade>] 
> xhci_msi_irq+0x1b/0x1d
> Aug 15 23:14:03 security kernel: [  103.654103]  [<ffffffff8108e4aa>] 
> handle_IRQ_event+0x20/0xc7
> Aug 15 23:14:03 security kernel: [  103.654128]  [<ffffffff810902a4>] 
> handle_level_irq+0x98/0xf3
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff812997a8>] 
> __xen_evtchn_do_upcall+0xe1/0x183
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8129a423>] 
> xen_evtchn_do_upcall+0x2a/0x3c
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8100a9ee>] 
> xen_do_hypervisor_callback+0x1e/0x30
> Aug 15 23:14:03 security kernel: [  103.654151]  <EOI>  [<ffffffff810012eb>] 
> ? hypercall_page+0x2eb/0x1006
> Aug 15 23:14:03 security kernel: [  103.654151] Mem-Info:
> Aug 15 23:14:03 security kernel: [  103.654151] DMA per-cpu:
> Aug 15 23:14:03 security kernel: [  103.654151] CPU    0: hi:    0, btch:   1 
> usd:   0
> Aug 15 23:14:03 security kernel: [  103.654151] DMA32 per-cpu:
> Aug 15 23:14:03 security kernel: [  103.654151] CPU    0: hi:  186, btch:  31 
> usd: 168
> Aug 15 23:14:03 security kernel: [  103.654151] active_anon:4503 
> inactive_anon:4767 isolated_anon:0
> Aug 15 23:14:03 security kernel: [  103.654151]  active_file:4071 
> inactive_file:9087 isolated_file:0
> Aug 15 23:14:03 security kernel: [  103.654151]  unevictable:0 dirty:1081 
> writeback:0 unstable:0
> Aug 15 23:14:03 security kernel: [  103.654151]  free:1584 
> slab_reclaimable:5991 slab_unreclaimable:153956
> Aug 15 23:14:03 security kernel: [  103.654151]  mapped:3401 shmem:48 
> pagetables:1452 bounce:0
> Aug 15 23:14:03 security kernel: [  103.654151] DMA free:3992kB min:60kB 
> low:72kB high:88kB active_anon:84kB inactive_anon:436kB active_file:24kB 
> inactive_file:148kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
> present:15712kB mlocked:0kB dirty:20kB writeback:0kB mapped:16kB shmem:0kB 
> slab_reclaimable:264kB slab_unreclaimable:8932kB kernel_stack:0kB 
> pagetables:384kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 
> all_unreclaimable? no
> Aug 15 23:14:03 security kernel: [  103.654151] lowmem_reserve[]: 0 994 994 
> 994
> Aug 15 23:14:03 security kernel: [  103.654151] DMA32 free:2344kB min:4000kB 
> low:5000kB high:6000kB active_anon:17928kB inactive_anon:18632kB 
> active_file:16260kB inactive_file:36200kB unevictable:0kB isolated(anon):0kB 
> isolated(file):0kB present:1018080kB mlocked:0kB dirty:4304kB writeback:0kB 
> mapped:13588kB shmem:192kB slab_reclaimable:23700kB 
> slab_unreclaimable:606892kB kernel_stack:720kB pagetables:5424kB unstable:0kB 
> bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> Aug 15 23:14:03 security kernel: [  103.654151] lowmem_reserve[]: 0 0 0 0
> Aug 15 23:14:03 security kernel: [  103.654151] DMA: 88*4kB 1*8kB 1*16kB 
> 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3992kB
> Aug 15 23:14:03 security kernel: [  103.654151] DMA32: 406*4kB 0*8kB 1*16kB 
> 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2344kB
> Aug 15 23:14:03 security kernel: [  103.654151] 13236 total pagecache pages
> Aug 15 23:14:03 security kernel: [  103.654151] 21 pages in swap cache
> Aug 15 23:14:03 security kernel: [  103.654151] Swap cache stats: add 32, 
> delete 11, find 27/27
> Aug 15 23:14:03 security kernel: [  103.654151] Free swap  = 524200kB
> Aug 15 23:14:03 security kernel: [  103.654151] Total swap = 524284kB
> Aug 15 23:14:03 security kernel: [  103.654151] 262128 pages RAM
> Aug 15 23:14:03 security kernel: [  103.654151] 73203 pages reserved
> Aug 15 23:14:03 security kernel: [  103.654151] 21103 pages shared
> Aug 15 23:14:03 security kernel: [  103.654151] 105944 pages non-shared
> Aug 15 23:14:03 security kernel: [  103.654151] kmemleak: Cannot allocate a 
> kmemleak_object structure
> Aug 15 23:14:03 security kernel: [  103.654151] Pid: 1418, comm: motion Not 
> tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14
> Aug 15 23:14:03 security kernel: [  103.654151] Call Trace:
> Aug 15 23:14:03 security kernel: [  103.654151]  <IRQ>  [<ffffffff810ce9d8>] 
> create_object+0x49/0x2a1
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810c8f68>] ? 
> cache_alloc_debugcheck_after+0x1bf/0x239
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8135e085>] ? 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8143e7d7>] 
> kmemleak_alloc+0x21/0x3e
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810c9edf>] 
> kmem_cache_alloc+0x197/0x1c4
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8135e085>] 
> xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81340450>] 
> usb_hcd_submit_urb+0xb1f/0xbf5
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81006b4d>] ? 
> xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81007202>] ? 
> check_events+0x12/0x20
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff812b6267>] ? 
> vt_console_print+0x43/0x346
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810071ef>] ? 
> xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8106d5b0>] ? 
> lock_release+0x1e0/0x1ed
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81340d45>] 
> usb_submit_urb+0x249/0x260
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffffa0060615>] 
> em28xx_irq_callback+0xc9/0xfd [em28xx]
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8133f16a>] 
> usb_hcd_giveback_urb+0x84/0xb8
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8136480a>] 
> xhci_handle_event+0xced/0xdf7
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff813648a8>] 
> xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff813648a8>] 
> xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81364a31>] 
> xhci_irq+0x11d/0x1af
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff81364ade>] 
> xhci_msi_irq+0x1b/0x1d
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8108e4aa>] 
> handle_IRQ_event+0x20/0xc7
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff810902a4>] 
> handle_level_irq+0x98/0xf3
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff812997a8>] 
> __xen_evtchn_do_upcall+0xe1/0x183
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8129a423>] 
> xen_evtchn_do_upcall+0x2a/0x3c
> Aug 15 23:14:03 security kernel: [  103.654151]  [<ffffffff8100a9ee>] 
> xen_do_hypervisor_callback+0x1e/0x30
> Aug 15 23:14:03 security kernel: [  103.654151]  <EOI>  [<ffffffff810012eb>] 
> ? hypercall_page+0x2eb/0x1006
> Aug 15 23:14:03 security kernel: [  103.654151] kmemleak: Kernel memory leak 
> detector disabled
> Aug 15 23:14:03 security kernel: [  103.654151] xhci_hcd 0000:08:00.0: Found 
> td. Clear skip flag.
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 0, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 1, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 2, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 3, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 4, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 5, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 6, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.654151] em28xx #0 print_err_status 
> :URB packet 7, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.659299] xhci_hcd 0000:08:00.0: Miss 
> service interval error, set skip flag
> Aug 15 23:14:03 security kernel: [  103.659307] xhci_hcd 0000:08:00.0: Found 
> td. Clear skip flag.
> Aug 15 23:14:03 security kernel: [  103.661748] kmemleak: Automatic memory 
> scanning thread ended
> Aug 15 23:14:03 security kernel: [  103.755620] em28xx #0 print_err_status 
> :URB packet 6, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [  103.755630] em28xx #0 print_err_status 
> :URB packet 7, status -18 [Unknown].


> The first fault is kmemleaker not being able to allocate its structure,
> and then it throws an error too.

> Then the em28xx gets horribly confused that it could not send an URB
> anymore and from there one things get worst.

> There is something causing this memory problem,and I think we can narrow
> it down if you output the kmemleaker output. You might have to
> unload/reload the em28xx driver to get an idea of where the leak is.

>> So what i have as domU kernel is:
>> - from Linus tree 2.6.36-pre with latest commit 
>> 8357422d4bf33bc2c35884d4016c3fc9efbbc1d2
>> - pulled your devel/xen-pcifront-0.5 tree, and fixed the merge conflicts due 
>> to the pv on hvm patches, in the same way Jeremy did for his 2.6.36 branch.
>>     Conflicts:
>>         drivers/xen/events.c
>>         include/xen/events.h
>> 
>> - Added a patch for xhci isoc that fixes another bug. (attached 
>> 0001-xHCI-update-ring-dequeue-pointer-when-process-missed.patch)
>> - Added a patch that shows extra debug info for xhci from the author of the 
>> xhci-isoc pathes (attached isoc_length5.patch)
>> 
>> - Changed some debug lines from dbg to warn level, (just enabling xhci-debug 
>> in kernel config floods the logs too fast, so i just enabled them in 
>> xhci-mem.c)
>>   (Attached a patch with all my changes to xhci*, including the 2 patches 
>> above applied.)
>> 
>> - Added some printk's to swiotlb-xen.c to see which functions were used, and 
>> let some of them print the address as well, in the hope i could find some 
>> debug info there.(attached)

> So the other thing I realized is that xhci can actually do 64bit
> transfers. Try booting without the 'swiotlb=force' (but still have
> iommu=soft).

>> 
>> Apart from fixing the xhci in the end, is there a way to prevent xen from 
>> freezing altogether without leaving a trace ?

> I am still puzzled as why Xen hypervisor would freeze. It _should_ work
> just fine.

>> Even an Oops is much easier to debug than a freeze. Due to the nature of DMA 
>> that could perhaps be difficult, although there is an DMA API ...
>> 
>> Had a fruitful LinuxCon ?

> Quite so!
>> 
>> --
>> Sander








-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>