I will be re-doing the whole test with a 4GB hard limit on the memory (for
both the native case and the HVM case) and also with 2 VCPUs allocated for
Dom0.
 
I don’t know what the difference
is in our environments. Maybe he was looking at overall time spent? I am
interested in the maximum jobs/minute which can be an indication of how much
horsepower we can get out of a guest VM.
 
Any answers for my other
questions?
 
Best regards,
 
---Kayvan
 
From: Xu, Anthony
[mailto:anthony.xu@xxxxxxxxx] 
Sent: Thursday, January 31, 2008 5:11 PM
To: Kayvan Sylvan; xen-ia64-devel
Subject: RE: [Xen-ia64-devel] HVM Multi-Processor Performance followup
 
 
 
You can see the drop in performance starts to get really bad at about 9 CPUs
and beyond

If you increase the number of guest vCPUs, the bottleneck may become the
number of dom0 vCPUs (dom0 has only 1 vCPU).

You can try configuring two or four vCPUs for dom0; the performance may
recover.

Alex said there is ~70% degradation on RE-AIM7.

Your test results seem much better than his.

What is different about your test environment?
 
 
From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Kayvan Sylvan
Sent: February 1, 2008 8:57
To: xen-ia64-devel
Subject: [Xen-ia64-devel] HVM Multi-Processor Performance followup

Hi everyone,
 
A follow-up on the multiprocessor performance benchmark on
HVM guests.
 
We ran the RE-AIM7 benchmarks on a 5-cell (40 CPU) machine
and a single-cell 8-CPU NEC machine.
 
Here are the jobs per minute maximums.
 
You can see the drop in performance starts to get really bad
at about 9 CPUs and beyond.
 
Questions:
 
1. What can I do to help improve this situation?
2. Are there any other experiments I can run?
3. What tools/profilers will help to gather more data here?
 
I am very interested in helping to solve this problem!
Thanks for your ideas and suggestions.
 
Best regards,
 
---Kayvan
 
 
 
Xen performance comparison on 5-Cell NEC machine (each cell with 4 dual-core Itaniums)

CPUs   Native Jobs/Min   HVM Jobs/Min   Overhead
   1              2037           1791     12.08%
   2              4076           3615     11.31%
   3              6090           5221     14.27%
   4              8118           6839     15.76%
   5             10119           8404     16.95%
   6             12037           9949     17.35%
   7             14106          11095     21.35%
   8             15953          12360     22.52%
   9             18059          13201     26.90%
  10             20170          13742     31.87%
  11             21896          13694     37.46%
  12             24079          13331     44.64%
  13             25992          12374     52.39%
  14             28072          11684     58.38%
  15             29931          11032     63.14%
  16             31696          10451     67.03%

The guest OS was CentOS-4.6 with 2GB of memory, running under a Dom0 that was
limited to 1 VCPU.
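
For anyone who wants to re-derive the Overhead column or locate the HVM peak, here is a minimal Python sketch. It assumes Overhead means (native - HVM) / native, which the message does not state explicitly but which agrees with the rows spot-checked. The names ROWS, overhead_pct and hvm_peak are illustrative only and are not part of any benchmark tooling.

```python
# (CPUs, native jobs/min, HVM jobs/min) copied from the 5-cell table above.
ROWS = [
    (1, 2037, 1791), (2, 4076, 3615), (3, 6090, 5221), (4, 8118, 6839),
    (5, 10119, 8404), (6, 12037, 9949), (7, 14106, 11095), (8, 15953, 12360),
    (9, 18059, 13201), (10, 20170, 13742), (11, 21896, 13694),
    (12, 24079, 13331), (13, 25992, 12374), (14, 28072, 11684),
    (15, 29931, 11032), (16, 31696, 10451),
]


def overhead_pct(native, hvm):
    # Assumed definition: throughput lost under HVM, as a percentage of native.
    return (native - hvm) / native * 100.0


def hvm_peak(rows):
    # Return the (cpus, hvm_jobs_per_min) pair with the highest HVM throughput.
    cpus, _native, hvm = max(rows, key=lambda row: row[2])
    return cpus, hvm


if __name__ == "__main__":
    for cpus, native, hvm in ROWS:
        print(f"{cpus:>2} CPUs: overhead {overhead_pct(native, hvm):.2f}%")
    print("HVM peak (cpus, jobs/min):", hvm_peak(ROWS))
```

On this data the HVM throughput is highest at 10 CPUs (13742 jobs/min) and falls with every CPU added after that, while the native figures keep rising.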
 
 
Xen performance comparison on 1-Cell NEC machine (4 dual-core Itanium Montecito)

CPUs   Native Jobs/Min   HVM Jobs/Min   Overhead
   1              2037           1779     12.67%
   2              4067           3619     11.02%
   3              6097           5344     12.35%
   4              8112           7004     13.66%
   5             10145           8663     14.61%
   6             12023          10213     15.05%
   7             14083          11249     20.12%
   8             16182          12969     19.86%

The guest OS was CentOS-4.6 with 2GB of memory, running under a Dom0 that was
limited to 1 VCPU.
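
Another way to read the 1-cell numbers is scaling efficiency: the throughput at N CPUs divided by N times the single-CPU throughput. This metric is not used in the message itself; it is offered here only as a rough sketch, and the names ROWS_1CELL and scaling_efficiency are hypothetical.

```python
# (CPUs, native jobs/min, HVM jobs/min) copied from the 1-cell table above.
ROWS_1CELL = [
    (1, 2037, 1779), (2, 4067, 3619), (3, 6097, 5344), (4, 8112, 7004),
    (5, 10145, 8663), (6, 12023, 10213), (7, 14083, 11249), (8, 16182, 12969),
]


def scaling_efficiency(jobs, cpus, single_cpu_jobs):
    # 1.0 means throughput grew perfectly linearly from the 1-CPU result.
    return jobs / (cpus * single_cpu_jobs)


if __name__ == "__main__":
    native_1 = ROWS_1CELL[0][1]  # native jobs/min on 1 CPU
    hvm_1 = ROWS_1CELL[0][2]     # HVM jobs/min on 1 CPU
    for cpus, native, hvm in ROWS_1CELL:
        native_eff = scaling_efficiency(native, cpus, native_1)
        hvm_eff = scaling_efficiency(hvm, cpus, hvm_1)
        print(f"{cpus} CPUs: native {native_eff:.3f}, HVM {hvm_eff:.3f}")
```

At 8 CPUs this gives roughly 0.993 for native and 0.911 for HVM, so on the single-cell machine the guest still scales reasonably close to linearly within the range tested.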