Mark Williamson wrote:
> Sounds a bit weird.  How many CPUs in this box?  My memories of the benchmarks
> suggest this should be better than you're seeing, but maybe your workload is
> tickling some bad cases or something...

Hi Mark:
We have a quad-core processor and 8GB RAM. I gave the domU 
all 4 VCPUs (didn't know you could do that! -- you can't do 
it with memory). I assigned 7.5GB RAM to the domU. 
This was using CentOS-5 x86_64 and mysql 5.1.20-beta.

> I know this sounds really weird but when comparing to native performance you
> do need to test on the same area of the disk, have you done this?  Portions
> of the disk nearer to the outside edge of the platter can have significantly
> higher transfer rates due to moving at a higher linear velocity.

No, we didn't specifically test the same area of the disk. 
We have a hardware RAID-10 (PERC 5 on Dell PE 2950) using 
128K chunks. But I think we did enough tests to see a 
pattern emerging. 
Granted, we also used sql-bench and the results for the domU 
were actually okay. But we believe our own sql tests more 
specifically focus on the disk I/O. But I also ran unixbench 
since that's a bit more standardized, and it seems to 
indicate some of the same results. I don't know how helpful 
it is, but I went ahead and enclosed those results below. 
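
On the point about testing the same area of the disk: a minimal sketch of how one might time a sequential read at a chosen offset, so the native and domU runs can be pointed at the same region. The function name `read_throughput` and the example offsets are hypothetical, and the sketch assumes the page cache has been dropped beforehand (otherwise a repeated read measures memory, not the disk).

```python
import os
import time

def read_throughput(path, offset, length, block_size=1024 * 1024):
    """Sequentially read `length` bytes starting at `offset`.

    Returns (bytes_read, megabytes_per_second). Hypothetical helper,
    not part of any benchmark suite mentioned in this thread.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        remaining = length
        total = 0
        start = time.perf_counter()
        while remaining > 0:
            chunk = os.read(fd, min(block_size, remaining))
            if not chunk:          # hit end of file/device
                break
            total += len(chunk)
            remaining -= len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    mb = total / (1024 * 1024)
    rate = mb / elapsed if elapsed > 0 else float("inf")
    return total, rate

# Example (needs read access to the device; the offsets are assumptions):
#   read_throughput("/dev/sda", 0, 1 << 30)           # start of the disk
#   read_throughput("/dev/sda", 200 * (1 << 30), 1 << 30)  # further in
```

Running it with the same offset and length on the physical machine and inside the domU would at least rule out platter position as the cause of a difference.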

> XenEnterprise and friends add optimised disk drivers for Windows, but if
> you're running paravirtualised Linux then you've already got optimised disk
> drivers, so the commercial product probably won't help.

Ok thanks for clarifying.
We haven't yet tested Xen 3.1 -- do you know if that 
includes any I/O performance enhancements? 

johnn

Here are the unixbench results for CentOS-5 x86_64 on a
physical machine (non-Xen):
======================================================
  BYTE UNIX Benchmarks (Version 4.1.0)
System -- Linux sql1 2.6.18-8.1.8.el5 #1 SMP Tue Jul 10 06:39:17 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
  Start Benchmark Run: Wed Aug  8 18:15:57 EDT 2007
   1 interactive users.
   18:15:57 up 9 min,  1 user,  load average: 0.01, 0.04, 0.00
  lrwxrwxrwx 1 root root 4 Aug  8 16:41 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/sda1              8123168   1189948   6513928  16% /
Dhrystone 2 using register variables      8893810.7 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   1781.2 MWIPS (9.9 secs, 10 samples)
System Call Overhead                       473046.7 lps   (10.0 secs, 10 samples)
Pipe Throughput                            492008.6 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching                96405.6 lps   (10.0 secs, 10 samples)
Process Creation                             8338.6 lps   (30.0 secs, 3 samples)
Execl Throughput                             2372.1 lps   (29.8 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks      844117.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks     478062.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks      283041.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks        235142.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks       126786.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks         80142.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks     1755004.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks     931432.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks      601062.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                 5410.3 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                 1721.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                 925.7 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            1243892.9 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              1265520.4 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)              326020.1 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)             925118.2 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)            512452.7 lps   (10.0 secs, 3 samples)
Arithoh                                 227641190.3 lps   (10.0 secs, 3 samples)
C Compiler Throughput                         978.7 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places            84417.7 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi              81500.4 lps   (20.0 secs, 3 samples)
                     INDEX VALUES
TEST                                      BASELINE        RESULT     INDEX
Dhrystone 2 using register variables      116700.0     8893810.7     762.1
Double-Precision Whetstone                    55.0        1781.2     323.9
Execl Throughput                              43.0        2372.1     551.7
File Copy 1024 bufsize 2000 maxblocks       3960.0      283041.0     714.8
File Copy 256 bufsize 500 maxblocks         1655.0       80142.0     484.2
File Copy 4096 bufsize 8000 maxblocks       5800.0      601062.0    1036.3
Pipe Throughput                            12440.0      492008.6     395.5
Process Creation                             126.0        8338.6     661.8
Shell Scripts (8 concurrent)                   6.0        1721.7    2869.5
System Call Overhead                       15000.0      473046.7     315.4
                                                                 =========
FINAL SCORE                                                          640.2
======================================================
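
For anyone reading the INDEX VALUES table: the numbers above are consistent with each index being the result divided by the baseline, times 10, and the final score being the geometric mean of the ten indexes. The sketch below only checks that arithmetic against the native run; the function names are mine, and the formula is inferred from the figures rather than taken from the unixbench source.

```python
import math

# (baseline, result) pairs copied from the native INDEX VALUES table.
native = {
    "Dhrystone 2 using register variables": (116700.0, 8893810.7),
    "Double-Precision Whetstone": (55.0, 1781.2),
    "Execl Throughput": (43.0, 2372.1),
    "File Copy 1024 bufsize 2000 maxblocks": (3960.0, 283041.0),
    "File Copy 256 bufsize 500 maxblocks": (1655.0, 80142.0),
    "File Copy 4096 bufsize 8000 maxblocks": (5800.0, 601062.0),
    "Pipe Throughput": (12440.0, 492008.6),
    "Process Creation": (126.0, 8338.6),
    "Shell Scripts (8 concurrent)": (6.0, 1721.7),
    "System Call Overhead": (15000.0, 473046.7),
}

def index_value(baseline, result):
    """Index as it appears in the table: result relative to baseline, times 10."""
    return result / baseline * 10.0

def final_score(indexes):
    """Geometric mean of the per-test indexes."""
    indexes = list(indexes)
    return math.exp(sum(math.log(i) for i in indexes) / len(indexes))

native_indexes = {name: index_value(b, r) for name, (b, r) in native.items()}
native_score = final_score(native_indexes.values())

for name, idx in native_indexes.items():
    print(f"{name:<40} {idx:8.1f}")
print(f"{'FINAL SCORE':<40} {native_score:8.1f}")
```

Because the score is a geometric mean, one very slow test pulls the total down less than it would in an arithmetic average, so the per-test indexes are worth reading alongside the final number.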
And here are the results for the exact same setup on a domU
using the same physical hardware:
  BYTE UNIX Benchmarks (Version 4.1.0)
System -- Linux store-nyc377-vmsql02.limewire.com 2.6.18-8.1.8.el5xen #1 SMP Tue Jul 10 07:06:45 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
  Start Benchmark Run: Thu Aug  9 14:34:36 EDT 2007
   1 interactive users.
   14:34:36 up 1 day, 20:41,  1 user,  load average: 0.16, 0.04, 0.01
  lrwxrwxrwx 1 root root 4 Jul 17 17:54 /bin/sh -> bash
  /bin/sh: symbolic link to `bash'
  /dev/xvda1             6092360   3795932   1981960  66% /
Dhrystone 2 using register variables      8931957.1 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                   1788.3 MWIPS (9.9 secs, 10 samples)
System Call Overhead                       174628.7 lps   (10.0 secs, 10 samples)
Pipe Throughput                            211195.6 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching                54065.7 lps   (10.0 secs, 10 samples)
Process Creation                             2336.0 lps   (30.0 secs, 3 samples)
Execl Throughput                              948.3 lps   (29.7 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks      404623.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks     254555.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks      145873.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks        105386.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks        65650.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks         39343.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks     1103909.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks     645148.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks      398281.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)                 2693.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                  638.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)                 335.3 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)            1244421.3 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)              1267033.4 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)              326565.1 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)             926358.0 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)            513037.0 lps   (10.0 secs, 3 samples)
Arithoh                                 227902041.2 lps   (10.0 secs, 3 samples)
C Compiler Throughput                         761.3 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places            37142.6 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi              81742.8 lps   (20.0 secs, 3 samples)
                     INDEX VALUES
TEST                                      BASELINE        RESULT     INDEX
Dhrystone 2 using register variables      116700.0     8931957.1     765.4
Double-Precision Whetstone                    55.0        1788.3     325.1
Execl Throughput                              43.0         948.3     220.5
File Copy 1024 bufsize 2000 maxblocks       3960.0      145873.0     368.4
File Copy 256 bufsize 500 maxblocks         1655.0       39343.0     237.7
File Copy 4096 bufsize 8000 maxblocks       5800.0      398281.0     686.7
Pipe Throughput                            12440.0      211195.6     169.8
Process Creation                             126.0        2336.0     185.4
Shell Scripts (8 concurrent)                   6.0         638.0    1063.3
System Call Overhead                       15000.0      174628.7     116.4
                                                                 =========
FINAL SCORE                                                          324.3
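
To make the two runs easier to compare, here is a small sketch that divides each domU result by the corresponding native result and lists the tests from worst to best. The figures are copied from the two runs above; the dictionary layout and the name `domu_ratio` are just for illustration.

```python
# (native result, domU result) copied from the two unixbench runs.
results = {
    "Dhrystone 2 using register variables": (8893810.7, 8931957.1),
    "Double-Precision Whetstone": (1781.2, 1788.3),
    "Execl Throughput": (2372.1, 948.3),
    "File Copy 1024 bufsize 2000 maxblocks": (283041.0, 145873.0),
    "File Copy 256 bufsize 500 maxblocks": (80142.0, 39343.0),
    "File Copy 4096 bufsize 8000 maxblocks": (601062.0, 398281.0),
    "Pipe Throughput": (492008.6, 211195.6),
    "Process Creation": (8338.6, 2336.0),
    "Shell Scripts (8 concurrent)": (1721.7, 638.0),
    "System Call Overhead": (473046.7, 174628.7),
    "FINAL SCORE": (640.2, 324.3),
}

def domu_ratio(native, domu):
    """Fraction of native throughput achieved inside the domU."""
    return domu / native

ratios = {name: domu_ratio(n, d) for name, (n, d) in results.items()}

# Print the slowest-relative-to-native tests first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name:<40} {ratio:6.1%}")
```

On these numbers the pure CPU tests (Dhrystone, Whetstone) are at parity, while process creation, system calls, pipes and exec land at roughly 28% to 43% of native, and the file copy tests at roughly 49% to 66%. The File Read rates are probably higher than the array itself delivers, so those tests may be exercising the buffer cache more than the disks.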