This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] high memory dma update: up against a wall

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] high memory dma update: up against a wall
From: Jon Mason <jdmason@xxxxxxxxxx>
Date: Tue, 12 Jul 2005 18:21:54 -0500
Cc: Scott Parish <srparish@xxxxxxxxxx>
Delivery-date: Tue, 12 Jul 2005 23:20:36 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20050712171809.GB4224@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: IBM
References: <20050712171809.GB4224@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.7.2
On Tuesday 12 July 2005 12:18 pm, Scott Parish wrote:
> I've been slowly working on the DMA problem I ran into; I thought I was
> making progress, but I think I'm up against a wall, so more discussion
> and ideas might be helpful.
> The problem is that on x86_32 PAE and x86_64, our physical address size
> is greater than 32 bits, yet many (most?) I/O devices can only address
> the first 32 bits of memory. So if/when we try to do DMA to an address
> that has bits above 32 set (call these high addresses), truncation
> causes the DMA to end up at the wrong address.
> I saw this problem on x86_64 with 6GB of RAM: if I made dom0 too big, the
> allocator put it in high memory; the Linux kernel booted fine, but the
> partition scan failed and it couldn't mount root.

Why not have the allocator force all driver domains to be in memory < 4GB?

> My original solution was to add another type to the Xen zoneinfo array
> to divide memory between high and low, and then to allocate low memory
> only when a domain needs to do DMA or when high memory is exhausted. This
> was an easy patch that worked fine. I can provide it if anyone wants it.
> On the Linux side of things, my first approach was to try to use Linux
> zones to divide up memory. Currently under Xen, all memory is placed in
> the DMA zone. I was hoping I could loop over memory somewhere, check
> the machine address of each page, and place it in the proper zone. The
> first problem with this approach is that Linux zones are designed more
> for dealing with the even smaller ISA address space. That aside, the
> zone code seems to make large assumptions about memory being (mostly)
> contiguous, and it most frequently deals with "start" and "size" rather
> than arrays of pages. I started looking at the code, thinking that I
> might change that, but at some point finally realized that, on an
> abstract level, what I was fundamentally doing was the exact reason the
> pfn/mfn mapping exists---teaching Linux about non-contiguous memory
> looks fairly non-trivial.
> The next approach I started on was to have Xen reback memory with
> low pages when it went to do DMA. dma_alloc_coherent() makes a call
> to xen_contig_memory(), which forces a range of memory to be backed
> by machine-contiguous pages by freeing the buffer to Xen and then
> asking for it back[1]. I tried adding another hypercall to request that
> DMA'able pages be returned. This worked great for the network cards, but
> disk was another story. First off, there were several code paths that
> do DMA without ever calling xen_contig_memory (which right now is
> fine because it's only ever used on single pages). I started down the
> path of finding those, but in the meantime realized that, for disk, we
> could be DMA'ing to any memory. Additionally, Michael Hohnbaum reminded
> me of page flipping. Between these two, it seems reasonable to think
> that the pool of free DMA memory could eventually become exhausted.

Running out of DMA'able memory happens.  Perf sucks, but it shouldn't kill 
your system.  What's the problem?

> That is the wall.
> Footnote: this will not be a problem on all machines. AMD x86_64 has an
> IOMMU, which should make this a non-problem (if the kernel chooses to
> use it). Unfortunately, from what I understand, EM64T is not so blessed.

AMD64 has IOMMU HW acceleration.  EM64T has software IOMMU.  Whenever I get 
IOMMU working on x86-64, this should solve your problem.

> sRp
> 1| incidentally, it seems to me that optimally xen_contig_memory()
> should just return if order==0.


Xen-devel mailing list