xen-devel
Re: [Xen-devel] Poor HVM performance with 8 vcpus 
Hi Keir,
thanks for the quick answer.
Keir Fraser wrote:
> Hi Juergen,
> 
> Tim Deegan is the man for this stuff (cc'ed) - you don't want to get too
> involved in the shadow code without syncing with him first. My
:-)
> understanding, however, is that shadow code is currently designed with
> scalability up to only about 4 VCPUs in mind. The expectation is that, as
> users want to scale wider than that, they will typically be upgrading to
> modern many-core processors with hardware assistance (Intel EPT, AMD NPT).
Okay. We plan to do this as soon as Nehalem-EX is available. Right now we are
testing on 4 socket Dunnington systems (24 cores) and found the issue.
This will be a problem for us if Nehalem-EX is available much later than
planned now. So I wanted to check for possible enhancements in XEN before
this might happen.
> 
> If you don't fit into that scenario, perhaps we can find you some
> lowish-hanging fruit to improve parallelism. Big changes in shadow code
> could be scary for us due to the likely nasty bug tail!
I understand this. Let's see if some rather local changes could improve the
performance.
Juergen
> On 07/10/2009 07:55, "Juergen Gross" <juergen.gross@xxxxxxxxxxxxxx> wrote:
> 
>> Hi,
>>
>> we've got massive performance problems running an 8 vcpu HVM-guest (BS2000)
>> under XEN (xen 3.3.1).
>>
>> With a specific benchmark producing a rather high load on memory management
>> operations (lots of process creation/deletion and memory allocation) the 8
>> vcpu performance was worse than the 4 vcpu performance. On other platforms
>> (/390, MIPS, SPARC) this benchmark scaled rather well with the number of 
>> cpus.
>>
>> The result of the usage of the software performance counters of XEN seemed
>> to point to the shadow lock being the reason. I modified the Hypervisor to
>> gather some lock statistics (patch will be sent soon) and found that the
>> shadow lock is really the bottleneck. On average 4 vcpus are waiting to get
>> the lock!
>>
>> Is this a known issue?
>> Is there a chance to split the shadow lock into sub-locks or to use a
>> reader/writer lock instead?
>> I just wanted to ask before trying to understand all of the shadow code :-)
>>
>>
>> Juergen
> 
> 
> 
> 
-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 636 47950
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Otto-Hahn-Ring 6                        Internet: ts.fujitsu.com
D-81739 Muenchen                 Company details: ts.fujitsu.com/imprint.html
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel