RE: [Xen-devel] [PATCH] x86: add SSE-based copy_page()

from [Jan Beulich]

To:	"Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Subject:	RE: [Xen-devel] [PATCH] x86: add SSE-based copy_page()
From:	"Jan Beulich" <jbeulich@xxxxxxxxxx>
Date:	Thu, 13 Nov 2008 08:37:26 +0000
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date:	Thu, 13 Nov 2008 00:37:16 -0800
References:	<491AFDC2.76E4.0078.0@xxxxxxxxxx> <6a20476a-f9f8-447b-bcdb-65009f38fcce@default>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

>>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 12.11.08 18:17 >>>
>Hmmm... I'm working on a project that does extensive page-copying
>so was eager to give it a spin on two test machines, one a Core 2 Duo
>("Weybridge"), the other an as-yet-unreleased Intel box.  I measured
>the routine with rdtsc, took many thousands of samples, and
>looked at the smallest measurement.  The hypervisor measured is
>64-bit so "cpu_has_xmm2" appears to always be true.
>
>On the first machine, the change to use sse2 instructions
>made no difference.  On the second machine, using sse2 actually
>made copy_page() *worse* (by 30-40%).

This very much depends on whether the page(s) are in any caches - in
the general case (e.g. when dealing with large sets of data, or data
just read from disk), you'd expect both pages (source and destination)
not to be in any cache. This is where using the streaming instructions
helps.
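
To illustrate the distinction, here is a minimal C sketch of a page copy
using SSE2 non-temporal ("streaming") stores next to a plain cached copy.
It is not the code from the patch: the function names, COPY_PAGE_SIZE, and
the use of compiler intrinsics are assumptions made for this example, and
both pointers are assumed to be 16-byte aligned (page-aligned buffers are).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_load_si128, _mm_stream_si128 */
#endif

#define COPY_PAGE_SIZE 4096   /* assumed page size for this sketch */

/* Ordinary copy: the destination lines are allocated in the cache. */
static void copy_page_cached(void *dst, const void *src)
{
    memcpy(dst, src, COPY_PAGE_SIZE);
}

/* Streaming copy: non-temporal stores write around the cache. */
static void copy_page_streaming(void *dst, const void *src)
{
    /* Both the aligned loads and the streaming stores need 16-byte alignment. */
    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    /* Move 64 bytes (four 16-byte vectors) per iteration. */
    for (size_t i = 0; i < COPY_PAGE_SIZE / sizeof(__m128i); i += 4) {
        __m128i a = _mm_load_si128(s + i);
        __m128i b = _mm_load_si128(s + i + 1);
        __m128i c = _mm_load_si128(s + i + 2);
        __m128i e = _mm_load_si128(s + i + 3);
        _mm_stream_si128(d + i, a);
        _mm_stream_si128(d + i + 1, b);
        _mm_stream_si128(d + i + 2, c);
        _mm_stream_si128(d + i + 3, e);
    }
    /* Order the weakly-ordered streaming stores before later stores. */
    _mm_sfence();
#else
    /* Fallback so the sketch still builds where SSE2 is unavailable. */
    memcpy(dst, src, COPY_PAGE_SIZE);
#endif
}
```

Both routines produce the same destination contents; they differ only in
whether the destination ends up in the cache afterwards.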

However, when dealing with a small set of pages (or even just a single
source/destination pair), you can easily end up running entirely out of
L1 or L2 cache, and in that case the non-streaming instructions certainly
perform better.
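
A measurement that repeatedly copies one source/destination pair and keeps
the smallest sample lands in exactly this cache-hot case. The following C
sketch shows such a harness; it is an assumption about the general shape of
that kind of test, not the code that was actually run. The names read_ticks,
min_ticks_same_pair and BENCH_PAGE_SIZE are made up for this example, and
rdtsc is read without any serializing instruction, so the numbers are only
approximate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc() on GCC and Clang */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Fallback for non-x86 builds: coarse CPU time instead of the TSC. */
static uint64_t read_ticks(void) { return (uint64_t)clock(); }
#endif

#define BENCH_PAGE_SIZE 4096   /* assumed page size for this sketch */

typedef void (*copy_fn)(void *dst, const void *src);

/* Routine under test; swap in any other page-copy implementation. */
static void copy_page_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, BENCH_PAGE_SIZE);
}

/*
 * Copy the same pair `samples` times and return the smallest tick count.
 * After the first iteration both pages are normally cache-resident, so
 * the minimum reflects the cache-hot cost, not the cold-memory cost.
 */
static uint64_t min_ticks_same_pair(copy_fn fn, void *dst, const void *src,
                                    unsigned samples)
{
    assert(samples > 0);
    uint64_t best = UINT64_MAX;

    for (unsigned i = 0; i < samples; i++) {
        uint64_t t0 = read_ticks();
        fn(dst, src);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

To approximate the cold case instead, one would cycle through a set of
pages much larger than the last-level cache, so that each copy touches a
pair that is no longer cached.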

>I'm poor enough with the x86 instruction set that I can't explain
>my results, but thought I would report them.  I'm not doubting that
>you saw improvements on your box, just noting that YMMV.
>
>Perhaps someone from Intel familiar with the microarchitectures
>might be able to explain (and can query me offlist to identify
>the as-yet-unreleased box).

The above is not to say that there aren't other reasons why this would
perform worse on the as-yet-unreleased hardware (which I wasn't able
to test on and hence can't say anything about).

Jan

