xen-ppc-devel

Re: [XenPPC] PATCH: Inline assembler for clear_page() and copy_page()

> I would expect to see dcbtst in here, no?

Nah, dcbtst is expensive (it causes some non-cheap bus
transactions) and not needed at all; dcbz is much better
(but can only be used if you kill the whole cache line;
which is true here).
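
To make the dcbz point concrete, here is a minimal sketch of a page clear that zeroes one cache line per dcbz. It is not the code from the patch under discussion. The function name, the 4096-byte page size, and the 128-byte cache line size are assumptions for illustration; if the real line size is smaller, dcbz clears less than this loop assumes. On non-PowerPC builds the sketch falls back to memset so it still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration; real code must take these from the platform. */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_CACHE_LINE 128u

/* Hypothetical helper: zero one page, one whole cache line at a time. */
static void clear_page_sketch(void *page)
{
    /* dcbz works on whole cache lines, so the page must be line-aligned. */
    assert(((uintptr_t)page % SKETCH_CACHE_LINE) == 0);
#if defined(__powerpc64__)
    char *p = page;
    for (size_t off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_CACHE_LINE) {
        /* Zero the entire line in cache; the old contents are never fetched. */
        __asm__ volatile("dcbz 0,%0" : : "r"(p + off) : "memory");
    }
#else
    /* Portable fallback so the sketch builds on other architectures. */
    memset(page, 0, SKETCH_PAGE_SIZE);
#endif
}
```

This only works because the whole line is being killed: dcbz on a line that still holds live data would destroy it.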

> Both functions (copy and clear) could stand a little loop unrolling.

ldu ; stdu ; bdnz is not the best loop possible, esp. not on
970/POWER4/POWER5.  You guys have Macs, so use Shark (go to the code
browser, cmd-shift-M, select "show 970 dispatch groups" and "show 970
details drawer").  In most cases the time spent in the loop will
be dominated by memory (cache) speed, of course, but still.
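
As a rough illustration of the unrolling idea, the sketch below copies four doublewords per iteration instead of one, which cuts the number of loop branches and gives the scheduler independent loads and stores to work with. It is plain C rather than the inline assembler from the patch; the function name, the 4096-byte page size, and the unroll factor of four are assumptions, and whether this particular factor helps on a 970 is something to check in Shark rather than take on faith.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration. */
#define SKETCH_PAGE_BYTES 4096u

/* Hypothetical helper: copy one page, four 64-bit words per iteration. */
static void copy_page_unrolled(uint64_t *restrict dst,
                               const uint64_t *restrict src)
{
    const size_t words = SKETCH_PAGE_BYTES / sizeof(uint64_t);

    /* words is 512 here, a multiple of 4, so no remainder loop is needed. */
    for (size_t i = 0; i < words; i += 4) {
        /* Do the four loads first, then the four stores. */
        uint64_t a = src[i];
        uint64_t b = src[i + 1];
        uint64_t c = src[i + 2];
        uint64_t d = src[i + 3];
        dst[i]     = a;
        dst[i + 1] = b;
        dst[i + 2] = c;
        dst[i + 3] = d;
    }
}
```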

> I can understand if you're not *really* trying to optimize these, but
> in that case why do you want to add dcbz? Is there a noticeable
> performance improvement?

Yes, dcbz is (should be) a huge improvement.


Segher

