This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Nouveau on dom0

> >> page-table directory so that when the GPU accesses the addresses, it
> >> gets the real bus address. I wonder if it fails at that thought -
> >> meaning that the addresses that are written to the page table are
> >> actually the guest page numbers (gpfn) instead of the machine page numbers 
> >> (mfn).
> >
> > No, I don't think that's how it works. The user-space write triggers an
> > aio-write -
> which triggers do_page_fault, handle_mm_fault, do_linear_fault, __do_fault
> and finally ttm_bo_vm_fault.
> ttm_bo_fault returns VM_FAULT_NOPAGE

VM_FAULT_NOPAGE means "retry the fault". In other words: I've fixed the
PTE to point to the right PFN.
>  - but xen-boot keeps on re-triggering the same fault.

Which probably means that something is not OK with the PTE. What is the
vma->vm_page_prot value before the vm_insert_mixed? (and maybe even

Try also reading the true value of the PTE and seeing what it shows
before and after the vm_insert_mixed.

I've attached a simple patch I wrote some time ago to get the real MFNs
and their page protection. I think you can adapt it (the print_data function
in particular) to peek at the PTE and its protection values.

There is an extra flag that the PTE can have when running under Xen: 
This signifies that the PFN is actually the MFN. In this case, though,
it shouldn't be enabled b/c the memory is actually gathered from
alloc_page. But if it is, it might be the culprit.
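
To make the raw PTE dump easier to read, here is a minimal user-space
sketch that splits a PTE value into its frame number and protection bits.
It assumes the x86 PAE/64-bit entry layout (frame in bits 12..51, NX in
bit 63). The names decode_pte and struct pte_info are made up for this
illustration and are not from the attached patch. Bits 9..11 are ignored by
the MMU and left to the OS, so any extra software flag would show up in the
"soft" field; which of those bits your kernel uses is something to check in
its headers.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 PTE bits (PAE and 64-bit layouts). */
#define PTE_PRESENT    (1ULL << 0)
#define PTE_RW         (1ULL << 1)
#define PTE_USER       (1ULL << 2)
#define PTE_NX         (1ULL << 63)
/* Bits 9..11 are ignored by the MMU and available to software. */
#define PTE_SOFT_SHIFT 9
#define PTE_SOFT_MASK  (7ULL << PTE_SOFT_SHIFT)
/* The frame number occupies bits 12..51. */
#define PTE_FRAME_MASK 0x000FFFFFFFFFF000ULL

/* Hypothetical container for the decoded fields. */
struct pte_info {
    uint64_t frame;    /* frame number as stored in the entry */
    unsigned soft;     /* software-defined bits 9..11, as a 3-bit value */
    int present, writable, user, nx;
};

struct pte_info decode_pte(uint64_t pte)
{
    struct pte_info info;

    info.frame    = (pte & PTE_FRAME_MASK) >> 12;
    info.soft     = (unsigned)((pte & PTE_SOFT_MASK) >> PTE_SOFT_SHIFT);
    info.present  = (pte & PTE_PRESENT) != 0;
    info.writable = (pte & PTE_RW) != 0;
    info.user     = (pte & PTE_USER) != 0;
    info.nx       = (pte & PTE_NX) != 0;
    return info;
}

/* Print one line per PTE so "before" and "after" values can be compared. */
void print_pte(const char *tag, uint64_t pte)
{
    struct pte_info i = decode_pte(pte);

    printf("%s: raw=%#llx frame=%#llx soft=%u P=%d RW=%d US=%d NX=%d\n",
           tag, (unsigned long long)pte, (unsigned long long)i.frame,
           i.soft, i.present, i.writable, i.user, i.nx);
}
```

Feed it the raw values printed before and after the vm_insert_mixed, e.g.
print_pte("before", val). A write keeps faulting if P or RW is clear, so
those are the first two fields to compare.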

> when vm_fault calls ttm_tt_get_page, the page is already there, and
> the handler does another vm_insert_page (i changed vm_insert_mixed
> vm_insert_page/pfn based on io_mem, now the only patch, and it works on
> bare machine) on and on and on.
> What can possibly cause the fault-handler to repeat endlessly?

The VM_FAULT_NOPAGE short-circuits most of the fault-handler and makes it
return back. The application is resumed and retries the operation that
caused the fault - in this case an attempt to write to an address that
was not present. Obviously the second attempt at writing to the address
should have worked without problems.
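
The retry behaviour can be modelled in a few lines of plain C. This is a
toy user-space model, not kernel code: struct toy_pte, the two handlers and
faults_until_write_succeeds are all invented for the illustration, and the
"hardware" check is reduced to "present and writable". It only shows why a
handler that keeps returning NOPAGE without leaving a usable PTE behind
produces the same fault again and again.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the kernel's fault return codes. */
enum fault_result { FAULT_NOPAGE, FAULT_SIGBUS };

/* Toy page-table entry: just the fields the model needs. */
struct toy_pte {
    bool present;
    bool writable;
    uint64_t frame;
};

typedef enum fault_result (*fault_handler)(struct toy_pte *pte);

/* Handler that installs a usable entry, then asks for a retry. */
enum fault_result good_handler(struct toy_pte *pte)
{
    pte->present = true;
    pte->writable = true;
    pte->frame = 0x1234;   /* arbitrary frame number for the model */
    return FAULT_NOPAGE;
}

/* Handler that also asks for a retry, but leaves the entry read-only. */
enum fault_result bad_handler(struct toy_pte *pte)
{
    pte->present = true;
    pte->writable = false;
    pte->frame = 0x1234;
    return FAULT_NOPAGE;
}

/*
 * Model a write to the address covered by *pte.
 * Returns the number of faults taken before the write went through,
 * -1 if max_faults was reached (the "endless" case, capped here),
 * -2 if the handler reported an error instead of asking for a retry.
 */
int faults_until_write_succeeds(struct toy_pte *pte, fault_handler handler,
                                int max_faults)
{
    int faults = 0;

    for (;;) {
        if (pte->present && pte->writable)
            return faults;          /* the retried write succeeds */
        if (faults == max_faults)
            return -1;              /* still faulting: give up */
        faults++;
        if (handler(pte) != FAULT_NOPAGE)
            return -2;
    }
}
```

With good_handler the write succeeds after a single fault. With bad_handler
every retry faults again until the cap is hit, which is the pattern being
described in this thread.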

> If a wrong page is backed at the user-address, it should create bad_access or
> some other subsequent events - but the system is running fine minus all local
> consoles! If the insertion is to a wrong place, this can happen; but
> the top-level trap is the only provider of the address - and the fault
> address and vma address match, and the same code works fine on bare-boot.

So you see this fault handler being called endlessly while the machine
is still running and other pieces of code work just fine, right?

> ttm_tt_get_page calls alloc in a loop - so it may allocate multiple pages from
> start/end depending on Highmem memory or not - implying asynchronous
> allocation and mapping.

I thought it had some logic to figure out that it already handled this
page and would return an already allocated page?

> All I want now is *ptr = (uint32_t)data to work as of now!

You are doing a great job at this head-spinning detective work. Much

Attachment: debug-print-pte.patch
Description: Text document
