On Tuesday 12 July 2005 12:18 pm, Scott Parish wrote:
> I've been slowly working on the DMA problem I ran into; I thought I
> was making progress, but I think I'm up against a wall, so more
> discussion and ideas might be helpful.
>
> The problem was that on x86_32 PAE and x86_64, our physical address
> size is greater than 32 bits, yet many (most?) I/O devices can only
> address the first 4GB of memory. So if/when we try to do DMA to an
> address that has bits set above bit 31 (call these high addresses),
> the DMA ends up hitting the wrong address due to truncation.
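Just to make the failure mode concrete, here is a minimal sketch (plain
userspace C, made-up address) of what a device that only latches 32
address bits ends up seeing:

    /* Illustration only: truncating a >4GB bus address to 32 bits. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t bus_addr = 0x17f300000ULL;       /* page above 4GB */
        uint32_t dev_addr = (uint32_t)bus_addr;   /* upper bits lost */

        printf("cpu uses %#llx, device writes to %#x\n",
               (unsigned long long)bus_addr, dev_addr);
        return 0;
    }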
>
> I saw this problem on x86_64 with 6GB of RAM: if I made dom0 too big,
> the allocator put it in high memory. The Linux kernel booted fine, but
> the partition scan failed and it couldn't mount root.
Why not have the allocator force all driver domains to be in memory < 4GB?
> My original solution was to add another type to the Xen zoneinfo
> array to divide memory between high and low, and then to allocate low
> memory only when a domain needs to do DMA or when high memory is
> exhausted. This was an easy patch that worked fine; I can provide it
> if anyone wants it.
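A minimal sketch of that allocation policy (hypothetical names, not the
actual Xen allocator code) -- keep separate high and low pools and only
dip into the low pool when the caller needs DMA-able memory or high
memory is gone:

    /* Hypothetical sketch of a split high/low allocation policy.
     * "low" means machine pages below 4GB, reachable by 32-bit DMA. */
    struct page;
    struct page_pool;

    extern struct page_pool high_pool, low_pool;
    extern struct page *pool_alloc(struct page_pool *pool, unsigned int order);

    struct page *alloc_domain_pages(unsigned int order, int needs_dma)
    {
        struct page *pg = NULL;

        /* Prefer high memory so low pages stay free for DMA users. */
        if (!needs_dma)
            pg = pool_alloc(&high_pool, order);

        /* Use the low pool when DMA is required or high is exhausted. */
        if (pg == NULL)
            pg = pool_alloc(&low_pool, order);

        return pg;
    }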
>
> On the Linux side of things, my first approach was to try to use
> Linux zones to divide up memory. Currently under Xen, all memory is
> placed in the DMA zone. I was hoping I could loop over memory
> somewhere, check the machine address of each page, and place it in
> the proper zone. The first problem with this approach is that Linux
> zones are designed more for dealing with the even smaller ISA address
> space. That aside, the zone code seems to make large assumptions about
> memory being (mostly) contiguous, and it most frequently deals with
> "start" and "size" rather than arrays of pages. I started looking at
> the code, thinking that I might change that, but at some point I
> finally realized that, on an abstract level, what I was fundamentally
> doing was the exact reason the pfn/mfn mapping exists---teaching Linux
> about non-contiguous machine memory looks fairly non-trivial.
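The per-page classification itself is easy enough -- something like the
sketch below, using the existing pfn_to_mfn() translation (the 4GB
cutoff constant is made up). The hard part is that the zone code wants
contiguous [start, start + size) ranges, and two adjacent pfns can be
backed by machine frames on opposite sides of that cutoff:

    /* Sketch: decide which zone a pseudo-physical page would belong to,
     * based on where its backing machine frame actually lives.
     * pfn_to_mfn() is the existing Xen translation; the cutoff macro is
     * hypothetical. */
    #define LOW_MFN_LIMIT   (0x100000000ULL >> PAGE_SHIFT)   /* 4GB */

    static int page_is_dma_able(unsigned long pfn)
    {
        unsigned long mfn = pfn_to_mfn(pfn);   /* machine frame number */
        return mfn < LOW_MFN_LIMIT;
    }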
>
> The next approach I started on was to have Xen reback memory with low
> pages when it went to do DMA. dma_alloc_coherent() makes a call to
> xen_contig_memory(), which forces a range of memory to be backed by
> machine-contiguous pages by freeing the buffer to Xen and then asking
> for it back[1]. I tried adding another hypercall to request that
> DMA'able pages be returned. This worked great for the network cards,
> but disk was another story. First off, there were several code paths
> that do DMA but don't end up calling xen_contig_memory() (which right
> now is fine because it's only ever called on single pages). I started
> down the path of finding those, but in the meantime realized that for
> disk we could be DMA'ing to any memory. Additionally, Michael Hohnbaum
> reminded me of page flipping. Between these two, it seems reasonable
> to think that the pool of free DMA memory could eventually become
> exhausted.
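For reference, the rebacking idea looks roughly like this (names such as
free_machine_pages() and HYPERVISOR_request_dma_pages() are made up for
illustration; the real code goes through the free-to-Xen/ask-back dance
that xen_contig_memory() already does):

    /* Hypothetical sketch: swap the machine pages backing a range for
     * pages below 4GB before handing the range to a device. */
    static int reback_range_with_dma_pages(unsigned long vstart,
                                           unsigned int order)
    {
        unsigned long nr = 1UL << order;

        /* 1. Give the current (possibly high) machine pages back to Xen. */
        if (free_machine_pages(vstart, nr) != 0)
            return -ENOMEM;

        /* 2. Ask Xen for nr replacement pages below 4GB and remap the
         *    pseudo-physical range onto them. */
        if (HYPERVISOR_request_dma_pages(vstart, nr) != 0)
            return -ENOMEM;

        return 0;
    }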
Running out of DMA'able memory happens. Perf sucks, but it shouldn't kill
your system. What's the problem?
> That is the wall.
>
> Footnote: this will not be a problem on all machines. AMD x86_64 has
> an IOMMU, which should make this a non-problem (if the kernel chooses
> to use it). Unfortunately, from what I understand, EM64T is not so
> blessed.
AMD64 has a hardware IOMMU; EM64T only has a software IOMMU. Whenever I get
the IOMMU working on x86-64, this should solve your problem.
> sRp
>
> 1| Incidentally, it seems to me that xen_contig_memory() should
> optimally just return early if order == 0.
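Something like this, presumably (sketch only, assuming the current
(vstart, order) signature):

    /* Sketch: a single page (order == 0) is trivially machine-contiguous,
     * so the exchange with Xen can be skipped entirely. */
    void xen_contig_memory(unsigned long vstart, unsigned int order)
    {
        if (order == 0)
            return;

        /* ... existing free-to-Xen / ask-back exchange ... */
    }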
Thanks,
Jon
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel