On 05/18/2010 10:34 AM, John Morrison wrote:
> Hi,
>
> Over the last year we have tried many times to get acceptable performance
> from pv_ops kernels.
>
> Tests were done with 1, 2, 4 and 8 cores. The more cores, the lower the score.
>
> Inside the domU it shows all cores, top -s shows all cores in use.
> xentop in dom0 never shows over 99% cpu.
>
> 2.6.18.8-xenU kernel shows over 700% cpu and the scores are about 8 x the
> pv_ops score.
>
> Any ideas ?
>
Well, I guess some kind of bad serialization is going on in there, and
it should be fairly obvious with a bit of examination.

Have you tried building your own pvops domU kernels? Does enabling PV
spinlocks make any difference? Also, enabling some of the lock
debugging/profiling/contention monitoring stuff may give useful results.

Can you post the corresponding 2.6.18 results? Are there specific
sub-tests which show the effect more strongly than the others?

How does the 2.6.32 kernel fare when booted native?
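
For what it's worth, a quick way to see whether a given domU kernel was
built with PV spinlocks and lock statistics is to look at its config file.
The following is only a rough sketch: it assumes the config is installed as
/boot/config-<release> (the usual Ubuntu convention, which may not hold on
your image), and the helper name option_states is made up for illustration.
CONFIG_PARAVIRT_SPINLOCKS and CONFIG_LOCK_STAT are the Kconfig symbols I
would check first.

```python
import platform
import re

# Kconfig symbols of interest for the PV spinlock / lock contention question.
OPTIONS = ("CONFIG_PARAVIRT_SPINLOCKS", "CONFIG_LOCK_STAT")


def option_states(config_text, options=OPTIONS):
    """Return {option: value} for a kernel .config text.

    The value is whatever follows '=' (typically 'y' or 'm'), 'n' for an
    explicit '# ... is not set' line, or 'missing' if the symbol never appears.
    """
    states = {name: "missing" for name in options}
    for raw in config_text.splitlines():
        line = raw.strip()
        set_match = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if set_match and set_match.group(1) in states:
            states[set_match.group(1)] = set_match.group(2)
            continue
        unset_match = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if unset_match and unset_match.group(1) in states:
            states[unset_match.group(1)] = "n"
    return states


if __name__ == "__main__":
    # Assumed location of the running kernel's config; adjust if needed.
    path = "/boot/config-" + platform.release()
    try:
        with open(path) as fh:
            for name, state in option_states(fh.read()).items():
                print(name, state)
    except OSError as exc:
        print("could not read", path, exc)
```

If CONFIG_LOCK_STAT turns out to be unset, rebuilding with it enabled is
what makes the contention counters available in the first place.
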
Thanks,
J
>
> John
>
>
> 1 core
>
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066476 132875660 1% /
>
> Start Benchmark Run: Tue May 18 13:54:54 BST 2010
> 13:54:54 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:06:12 BST 2010
> 14:06:12 up 11 min, 2 users, load average: 11.48, 5.20, 2.43
>
>
> INDEX VALUES
> TEST                                      BASELINE       RESULT   INDEX
>
> Dhrystone 2 using register variables      376783.7    8950813.0   237.6
> Double-Precision Whetstone                    83.1       2103.7   253.2
> Execl Throughput                             188.3       1568.4    83.3
> File Copy 1024 bufsize 2000 maxblocks       2672.0      64198.0   240.3
> File Copy 256 bufsize 500 maxblocks         1077.0      17781.0   165.1
> File Read 4096 bufsize 8000 maxblocks      15382.0     643717.0   418.5
> Pipe-based Context Switching               15448.6      85379.4    55.3
> Pipe Throughput                           111814.6     478490.1    42.8
> Process Creation                             569.3       3329.6    58.5
> Shell Scripts (8 concurrent)                  44.8        380.7    85.0
> System Call Overhead                      114433.5     498712.3    43.6
>                                                               =========
> FINAL SCORE                                                       114.1
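
For anyone reading these tables: the figures are consistent with each INDEX
being 10 * RESULT / BASELINE and the FINAL SCORE being the geometric mean of
the eleven indexes. That is inferred from the arithmetic of the numbers
posted here, not taken from the benchmark's source, so treat the sketch
below as a sanity check only. The function names are made up for
illustration; the data is copied from the 1-core table above.

```python
import math

# (baseline, result) pairs copied from the 1-core run quoted above.
ONE_CORE = {
    "Dhrystone 2 using register variables": (376783.7, 8950813.0),
    "Double-Precision Whetstone": (83.1, 2103.7),
    "Execl Throughput": (188.3, 1568.4),
    "File Copy 1024 bufsize 2000 maxblocks": (2672.0, 64198.0),
    "File Copy 256 bufsize 500 maxblocks": (1077.0, 17781.0),
    "File Read 4096 bufsize 8000 maxblocks": (15382.0, 643717.0),
    "Pipe-based Context Switching": (15448.6, 85379.4),
    "Pipe Throughput": (111814.6, 478490.1),
    "Process Creation": (569.3, 3329.6),
    "Shell Scripts (8 concurrent)": (44.8, 380.7),
    "System Call Overhead": (114433.5, 498712.3),
}


def index(baseline, result):
    """Index value as it appears in the table: ten times result over baseline."""
    return 10.0 * result / baseline


def final_score(rows):
    """Geometric mean of the per-test indexes (assumed aggregation rule)."""
    logs = [math.log(index(b, r)) for b, r in rows.values()]
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    for name, (b, r) in ONE_CORE.items():
        print(f"{index(b, r):7.1f}  {name}")
    # Should land close to the reported FINAL SCORE of 114.1.
    print(f"final score: {final_score(ONE_CORE):.1f}")
```

Because the aggregate is a geometric mean, a single sub-test that collapses
pulls the final score down even when the other ten barely move.
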
>
> 2-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066548 132875588 1% /
>
> Start Benchmark Run: Tue May 18 14:07:27 BST 2010
> 14:07:27 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:18:04 BST 2010
> 14:18:04 up 10 min, 1 user, load average: 12.78, 5.53, 2.49
>
>
> INDEX VALUES
> TEST                                      BASELINE       RESULT   INDEX
>
> Dhrystone 2 using register variables      376783.7   10124838.6   268.7
> Double-Precision Whetstone                    83.1       1188.7   143.0
> Execl Throughput                             188.3       1596.2    84.8
> File Copy 1024 bufsize 2000 maxblocks       2672.0      58323.0   218.3
> File Copy 256 bufsize 500 maxblocks         1077.0      17776.0   165.1
> File Read 4096 bufsize 8000 maxblocks      15382.0     568217.0   369.4
> Pipe-based Context Switching               15448.6      86111.3    55.7
> Pipe Throughput                           111814.6     469957.8    42.0
> Process Creation                             569.3       3298.1    57.9
> Shell Scripts (8 concurrent)                  44.8        378.9    84.6
> System Call Overhead                      114433.5     532828.4    46.6
>                                                               =========
> FINAL SCORE                                                       107.9
>
> 4-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066628 132875508 1% /
>
> Start Benchmark Run: Tue May 18 14:19:17 BST 2010
> 14:19:17 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:29:53 BST 2010
> 14:29:53 up 10 min, 1 user, load average: 13.59, 6.35, 2.97
>
>
> INDEX VALUES
> TEST                                      BASELINE       RESULT   INDEX
>
> Dhrystone 2 using register variables      376783.7   10185429.8   270.3
> Double-Precision Whetstone                    83.1        759.8    91.4
> Execl Throughput                             188.3       1386.2    73.6
> File Copy 1024 bufsize 2000 maxblocks       2672.0      62331.0   233.3
> File Copy 256 bufsize 500 maxblocks         1077.0      16492.0   153.1
> File Read 4096 bufsize 8000 maxblocks      15382.0     563402.0   366.3
> Pipe-based Context Switching               15448.6      87176.0    56.4
> Pipe Throughput                           111814.6     481068.1    43.0
> Process Creation                             569.3       3128.9    55.0
> Shell Scripts (8 concurrent)                  44.8        394.9    88.1
> System Call Overhead                      114433.5     539996.1    47.2
>                                                               =========
> FINAL SCORE                                                       102.6
>
> 8-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2, 8 threads)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC
> 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066680 132875456 1% /
>
> Start Benchmark Run: Tue May 18 14:30:59 BST 2010
> 14:30:59 up 0 min, 1 user, load average: 0.07, 0.02, 0.00
>
> End Benchmark Run: Tue May 18 14:42:52 BST 2010
> 14:42:52 up 12 min, 1 user, load average: 25.56, 10.84, 4.96
>
>
> INDEX VALUES
> TEST                                      BASELINE       RESULT   INDEX
>
> Dhrystone 2 using register variables      376783.7    9972130.3   264.7
> Double-Precision Whetstone                    83.1        755.2    90.9
> Execl Throughput                             188.3       1584.7    84.2
> File Copy 1024 bufsize 2000 maxblocks       2672.0      58981.0   220.7
> File Copy 256 bufsize 500 maxblocks         1077.0      16904.0   157.0
> File Read 4096 bufsize 8000 maxblocks      15382.0     557735.0   362.6
> Pipe-based Context Switching               15448.6      80738.2    52.3
> Pipe Throughput                           111814.6     450891.2    40.3
> Process Creation                             569.3       2948.5    51.8
> Shell Scripts (8 concurrent)                  44.8        378.1    84.4
> System Call Overhead                      114433.5     537443.2    47.0
>                                                               =========
> FINAL SCORE                                                       100.9
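
On the question of which sub-tests show the effect most strongly, the four
runs above can be lined up per test. The sketch below simply divides each
test's 8-core index by its 1-core index and sorts the result; the index
values are copied from the tables in this mail, and the names CORES,
INDEX_BY_TEST and scaling are made up for illustration.

```python
# Core counts of the four runs, in the order the indexes are listed below.
CORES = (1, 2, 4, 8)

# Per-test INDEX values from the 1-, 2-, 4- and 8-core tables above.
INDEX_BY_TEST = {
    "Dhrystone 2 using register variables": (237.6, 268.7, 270.3, 264.7),
    "Double-Precision Whetstone": (253.2, 143.0, 91.4, 90.9),
    "Execl Throughput": (83.3, 84.8, 73.6, 84.2),
    "File Copy 1024 bufsize 2000 maxblocks": (240.3, 218.3, 233.3, 220.7),
    "File Copy 256 bufsize 500 maxblocks": (165.1, 165.1, 153.1, 157.0),
    "File Read 4096 bufsize 8000 maxblocks": (418.5, 369.4, 366.3, 362.6),
    "Pipe-based Context Switching": (55.3, 55.7, 56.4, 52.3),
    "Pipe Throughput": (42.8, 42.0, 43.0, 40.3),
    "Process Creation": (58.5, 57.9, 55.0, 51.8),
    "Shell Scripts (8 concurrent)": (85.0, 84.6, 88.1, 84.4),
    "System Call Overhead": (43.6, 46.6, 47.2, 47.0),
}


def scaling(index_by_test):
    """Return (test, ratio) pairs, worst first; ratio = last index / first index."""
    ratios = [(name, vals[-1] / vals[0]) for name, vals in index_by_test.items()]
    return sorted(ratios, key=lambda item: item[1])


if __name__ == "__main__":
    print(f"index at {CORES[-1]} cores relative to {CORES[0]} core")
    for name, ratio in scaling(INDEX_BY_TEST):
        print(f"{ratio:5.2f}  {name}")
```

On these numbers Double-Precision Whetstone stands out: its index falls to
roughly 0.36 of the 1-core value, while every other test stays within about
15% of its 1-core index.
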
>
>
>
> --
> Professional hosting without compromise
> www.clustered.net
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
>