xen-devel
Re: [Xen-devel] [PATCH] x86: add SSE-based copy_page()
On 12/01/2009 23:29, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> I finally got around to measuring this. On my two machines,
> an Intel "Weybridge" box and an Intel TBD quadcore box,
> the new sse2 code was at best nearly the same for cold cache
> and much worse for warm cache.
>
> I can't explain the sampling variation as I have interrupts off,
> a lock held, and pre-warmed TLB... I suppose maybe another
> processor could be causing rare TLB misses? But in any case
> the min number is probably best for comparison.
>
> I'm guessing the gcc optimizer for the memcpy code was tuned
> for an Intel pipeline... Jan, were you measuring on an
> AMD processor?
>
> I've included the raw data and measurement code below.
Seems like unless we dynamically choose the copy routine, we're better off
without the SSE2 alternative. Shall I revert it then?
-- Keir
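
For illustration, the "dynamically choose the copy routine" option mentioned above could be sketched roughly as follows. This is a minimal user-space sketch, not Xen code: the names copy_page_fn, copy_page_bytes, min_time and select_copy_page are hypothetical, PAGE_SIZE is assumed to be 4096, the byte loop is only a portable stand-in for the SSE2 variant, and clock() stands in for whatever cycle counter a real measurement would use. As suggested in the quoted message, the minimum over several runs is used for the comparison.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define PAGE_SIZE 4096   /* assumed page size */
#define RUNS      5      /* timed runs per candidate (arbitrary) */
#define ITERS     1000   /* page copies per timed run (arbitrary) */

typedef void (*copy_page_fn)(void *dst, const void *src);

/* Candidate 1: the compiler/library memcpy. */
static void copy_page_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, PAGE_SIZE);
}

/* Candidate 2: hypothetical stand-in for the SSE2 routine (plain byte loop). */
static void copy_page_bytes(void *dst, const void *src)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < PAGE_SIZE; i++)
        d[i] = s[i];
}

/* Return the smallest elapsed clock() ticks seen over RUNS timed runs. */
static double min_time(copy_page_fn fn, void *dst, const void *src)
{
    double best = -1.0;
    for (int r = 0; r < RUNS; r++) {
        clock_t t0 = clock();
        for (int i = 0; i < ITERS; i++)
            fn(dst, src);
        double t = (double)(clock() - t0);
        if (best < 0.0 || t < best)
            best = t;
    }
    return best;
}

/* Routine used by callers; defaults to memcpy until selection has run. */
static copy_page_fn copy_page = copy_page_memcpy;

/* Time both candidates once (e.g. at boot) and keep the faster one. */
static void select_copy_page(void *scratch_dst, const void *scratch_src)
{
    double t_memcpy = min_time(copy_page_memcpy, scratch_dst, scratch_src);
    double t_bytes  = min_time(copy_page_bytes, scratch_dst, scratch_src);
    copy_page = (t_bytes < t_memcpy) ? copy_page_bytes : copy_page_memcpy;
}
```

A caller would run select_copy_page() once on two scratch pages and then go through the copy_page pointer. The indirect call adds its own cost, which would need to be weighed against any gain from picking the faster routine.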
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel