xen-devel
[Xen-devel] Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT
 
To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: [Xen-devel] Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT
From: Ingo Molnar <mingo@xxxxxxx>
Date: Tue, 20 Jan 2009 21:56:53 +0100
Cc: Nick Piggin <npiggin@xxxxxxx>, zach@xxxxxxxxxx, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, jeremy@xxxxxxxxxxxxx, rusty@xxxxxxxxxxxxxxx, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, chrisw@xxxxxxxxxxxx, hpa@xxxxxxxxx, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 20 Jan 2009 12:58:01 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <49763806.5090009@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20090120110542.GE19505@xxxxxxxxxxxxx> <20090120112634.GA20858@xxxxxxx> <20090120140324.GA26424@xxxxxxx> <49763806.5090009@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
 
 
* Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> Ingo Molnar wrote:
>> * Ingo Molnar <mingo@xxxxxxx> wrote:
>>
>>   
>>>> Times I believe are in nanoseconds for lmbench, anyway lower is  
>>>> better.
>>>>
>>>> non pv   AVG=464.22 STD=5.56
>>>> paravirt AVG=502.87 STD=7.36
>>>>
>>>> Nearly 10% performance drop here, which is quite a bit... hopefully 
>>>> people are testing the speed of their PV implementations against  
>>>> non-PV bare metal :)
>>>>       
>>> Ouch, that looks unacceptably expensive. All the major distros turn  
>>> CONFIG_PARAVIRT on. paravirt_ops was introduced in x86 with the 
>>> express promise to have no measurable runtime overhead.
>>>     
>>
>> Here are some more precise stats done via hw counters on a perfcounters 
>> kernel using 'timec', running a modified version of the 'mmap 
>> performance stress-test' app i made years ago.
>>
>> The MM benchmark app can be downloaded from:
>>
>>    http://redhat.com/~mingo/misc/mmap-perf.c
>>
>> timec.c can be picked up from:
>>
>>    http://redhat.com/~mingo/perfcounters/timec.c
>>
>> mmap-perf conducts 1 million mmap()/munmap()/mremap() calls, and 
>> touches the mapped area as well with a certain chance. The patterns are 
>> pseudo-random and the random seed is initialized to the same value so  
>> repeated runs produce the exact same mmap sequence.
>>
>> I ran the test with a single thread and bound to a single core:
>>
>>   # taskset 2 timec -e -5,-4,-3,0,1,2,3 ./mmap-perf 1
>>
>> [ I ran it as root - so that kernel-space hardware-counter statistics 
>> are included as well. ]
>>
>> The results are quite surprisingly candid about the true costs of  
>> paravirt_ops on the native kernel's overhead (CONFIG_PARAVIRT=y):
>>
>> -----------------------------------------------
>> | Performance counter stats for './mmap-perf' |
>> -----------------------------------------------
>> |                |
>> |  x86-defconfig |   PARAVIRT=y          
>> |------------------------------------------------------------------
>> |
>> |    1311.554526 |  1360.624932  task clock ticks (msecs)    +3.74%
>> |                |
>> |              1 |            1  CPU migrations
>> |             91 |           79  context switches
>> |          55945 |        55943  pagefaults
>> |    ............................................
>> |     3781392474 |   3918777174  CPU cycles                  +3.63%
>> |     1957153827 |   2161280486  instructions               +10.43%
>>   
>
> !!
>
>> |       50234816 |     51303520  cache references            +2.12%
>> |        5428258 |      5583728  cache misses                +2.86%
>>   
>
> Is this I or D, or combined?

That's last-level-cache references+misses (L2 cache):

 Bit Position   Event Name      UMask   Event Select
 CPUID.AH.EBX
 3              LLC Reference   4FH     2EH
 4              LLC Misses      41H     2EH

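The two rows above can be turned into raw event codes. What follows is only a minimal sketch, assuming the usual x86 event-select register layout (Event Select in bits 7:0, UMask in bits 15:8) and the CPUID leaf 0AH convention that a set EBX bit means the event is NOT available. The names ARCH_EVENTS, raw_event_code and event_available are hypothetical helpers invented for this illustration; they are not part of timec or of any kernel interface.

```python
# Sketch only: pack the UMask / Event Select pairs from the table into a
# 16-bit raw event code. Helper names here are hypothetical.

# Values copied from the table in the mail.
ARCH_EVENTS = {
    "LLC Reference": {"cpuid_ebx_bit": 3, "umask": 0x4F, "event_select": 0x2E},
    "LLC Misses": {"cpuid_ebx_bit": 4, "umask": 0x41, "event_select": 0x2E},
}


def raw_event_code(umask: int, event_select: int) -> int:
    """Return (umask << 8) | event_select, i.e. UMask in bits 15:8."""
    if not (0 <= umask <= 0xFF and 0 <= event_select <= 0xFF):
        raise ValueError("umask and event_select must each fit in 8 bits")
    return (umask << 8) | event_select


def event_available(cpuid_0ah_ebx: int, bit: int) -> bool:
    """A set bit in CPUID.0AH:EBX flags the event as NOT available."""
    return (cpuid_0ah_ebx >> bit) & 1 == 0


if __name__ == "__main__":
    for name, ev in ARCH_EVENTS.items():
        code = raw_event_code(ev["umask"], ev["event_select"])
        print(f"{name}: raw event code 0x{code:04x}")
```

The EBX value itself has to come from executing CPUID with EAX=0AH on the machine in question; the sketch takes it as a plain integer argument rather than reading it.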
        Ingo
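
For reference, the percentage column in the quoted counter table is the relative change of the PARAVIRT=y run against the x86-defconfig run. A small sketch of that arithmetic follows, using counter values copied from the table. The instructions-per-cycle figure is a derived number that does not appear in the original mail, and the function names are made up for this illustration.

```python
# Sketch only: recompute the relative overheads from the quoted table.
# Each entry is (x86-defconfig, PARAVIRT=y), copied from the mail.
COUNTERS = {
    "task clock (msecs)": (1311.554526, 1360.624932),
    "CPU cycles": (3781392474, 3918777174),
    "instructions": (1957153827, 2161280486),
    "cache misses": (5428258, 5583728),
}


def relative_change_pct(baseline: float, measured: float) -> float:
    """Relative change of `measured` over `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


def ipc(instructions: float, cycles: float) -> float:
    """Instructions per cycle (derived; not reported in the mail)."""
    return instructions / cycles


if __name__ == "__main__":
    for name, (native, paravirt) in COUNTERS.items():
        print(f"{name:>20}: {relative_change_pct(native, paravirt):+.2f}%")

    native_ipc = ipc(COUNTERS["instructions"][0], COUNTERS["CPU cycles"][0])
    paravirt_ipc = ipc(COUNTERS["instructions"][1], COUNTERS["CPU cycles"][1])
    print(f"IPC native={native_ipc:.3f} paravirt={paravirt_ipc:.3f}")
```

Because the instruction count grows faster than the cycle count, the derived IPC comes out higher for the PARAVIRT=y run even though that run takes longer overall.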
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel