Re: [Xen-devel] HYBRID: PV in HVM container

On Wed, Jul 27, 2011 at 06:58:28PM -0700, Mukesh Rathor wrote:
> Hi folks,
> 
> Well, I did some benchmarking and found interesting results. Following
> runs are on a westmere with 2 sockets and 10GB RAM.  Xen was booted
> with maxcpus=2 and entire RAM. All guests were started with 1vcpu and 2GB 
> RAM. dom0 started with 1 vcpu and 704MB. Baremetal was booted with 2GB 
> and 1 cpu.  HVM guest has EPT enabled. HT is on.

Is this PVonHVM? Or is it real HVM without _any_ PV enablement? Ah, the
.config tells me it is PVonHVM - so the IRQ callbacks and timers are actually
PV.
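
(For context on what has to happen before those PV bits kick in: the guest
first has to discover Xen at runtime via the hypervisor CPUID leaves. The
snippet below is just a userspace illustration of that probe - the 0x40000000
leaf and the "XenVMMXenVMM" signature are the standard Xen CPUID interface,
the rest is my own sketch, not the kernel's detection code.)

/* Minimal sketch: probe CPUID for the Xen signature from inside a guest.
 * Illustration only, not the kernel's detection path. Build with gcc on x86. */
#include <cpuid.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    char sig[13];

    /* Hypervisors expose their signature at CPUID leaf 0x40000000;
     * Xen may also show up at 0x40000100 when Viridian is enabled. */
    for (unsigned int leaf = 0x40000000; leaf <= 0x40000100; leaf += 0x100) {
        __cpuid(leaf, eax, ebx, ecx, edx);
        memcpy(sig + 0, &ebx, 4);
        memcpy(sig + 4, &ecx, 4);
        memcpy(sig + 8, &edx, 4);
        sig[12] = '\0';
        if (!strcmp(sig, "XenVMMXenVMM")) {
            printf("Xen signature at leaf 0x%x, max leaf 0x%x\n", leaf, eax);
            return 0;
        }
    }
    printf("No Xen signature found (no PV-on-HVM enlightenments)\n");
    return 1;
}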

> 
> So, unless the NUMA'ness interfered with results (using some memory on 
> remote socket), it appears HVM does very well. To the point that it
> seems a hybrid is not going to be worth it. I am currently running
> tests on a single socket system just to be sure.

The xm toolstack has some NUMA placement capability while xl does not. Did you
use xm or xl to run this?
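
(On the remote-socket question: one way to take NUMA out of the picture, at
least for the baremetal run, is to pin the benchmark and its memory to a single
node before timing anything. A rough libnuma sketch - node 0 is an arbitrary
choice, and this is only an illustration, not what lmbench does:)

/* Rough sketch: pin execution and memory allocation to one NUMA node so a
 * remote socket cannot skew the numbers. Illustrative only; link with -lnuma. */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma: NUMA not available on this system\n");
        return 1;
    }

    /* Run only on the CPUs of node 0 and prefer memory from node 0. */
    if (numa_run_on_node(0) != 0) {
        perror("numa_run_on_node");
        return 1;
    }
    numa_set_preferred(0);

    /* ... run the benchmark workload here; pages come from node 0
     * unless that node runs out of memory ... */
    printf("pinned to node 0 of %d nodes\n", numa_max_node() + 1);
    return 0;
}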

> 
> I am attaching my diff's in case any one wants to see what I did. I used
> xen 4.0.2 and linux 2.6.39. 

Wow. That is a surprisingly compact set of changes to the Linux kernel.
Good job.
> 
> thanks,
> Mukesh
> 
>                  L M B E N C H  3 . 0   S U M M A R Y
> 
> Processor, Processes - times in microseconds - smaller is better
> ------------------------------------------------------------------------------
> Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
>                              call  I/O stat clos TCP  inst hndl proc proc proc
> --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
> PV        Linux 2.6.39f 2639 0.65 0.88 2.14 4.59 3.77 0.79 3.62 535. 1294 3308
> Hybrid    Linux 2.6.39f 2639 0.13 0.21 0.89 1.96 3.08 0.24 1.10 529. 1294 3246

Hm, so it follows baremetal until fork/exec/sh, at which point it is as bad as
PV.
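
(The fork column is basically the cost of duplicating the whole address space,
which under PV means a stream of pagetable-update hypercalls. A crude stand-in
for what lmbench is timing there - not the real lat_proc, just to show the
shape of the test:)

/* Crude stand-in for the fork latency test: time fork()+exit()+wait() in a
 * loop. Under PV each fork copies page tables via hypercalls, which is where
 * the time goes. Illustrative only; link with -lrt on older glibc. */
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const int iters = 1000;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);           /* child exits immediately */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        waitpid(pid, NULL, 0);  /* parent reaps the child */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("fork+exit: %.1f us per iteration\n", us / iters);
    return 0;
}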

> HVM       Linux 2.6.39f 2639 0.12 0.21 0.64 1.76 3.04 0.24 3.37 113. 354. 1324

<blinks> So HVM is better than baremetal?

> Baremetal Linux 2.6.39+ 2649 0.13 0.23 0.74 1.93 3.46 0.28 1.58 127. 386. 1434
> 
> Basic integer operations - times in nanoseconds - smaller is better
> -------------------------------------------------------------------
> Host                 OS  intgr intgr  intgr  intgr  intgr
>                           bit   add    mul    div    mod
> --------- ------------- ------ ------ ------ ------ ------
> PV        Linux 2.6.39f 0.3800 0.0100 0.1700 9.1000 9.0400
> Hybrid    Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0300
> HVM       Linux 2.6.39f 0.3800 0.0100 0.1700 9.1100 9.0600
> Baremetal Linux 2.6.39+ 0.3800 0.0100 0.1700 9.0600 8.9800
> 
> Basic float operations - times in nanoseconds - smaller is better
> -----------------------------------------------------------------
> Host                 OS  float  float  float  float
>                          add    mul    div    bogo
> --------- ------------- ------ ------ ------ ------
> PV        Linux 2.6.39f 1.1300 1.5200 5.6200 5.2900
> Hybrid    Linux 2.6.39f 1.1300 1.5200 5.6300 5.2900
> HVM       Linux 2.6.39f 1.1400 1.5200 5.6300 5.3000
> Baremetal Linux 2.6.39+ 1.1300 1.5100 5.6000 5.2700
> 
> Basic double operations - times in nanoseconds - smaller is better
> ------------------------------------------------------------------
> Host                 OS  double double double double
>                          add    mul    div    bogo
> --------- ------------- ------  ------ ------ ------
> PV        Linux 2.6.39f 1.1300 1.9000 8.6400 8.3200
> Hybrid    Linux 2.6.39f 1.1400 1.9000 8.6600 8.3200
> HVM       Linux 2.6.39f 1.1400 1.9000 8.6600 8.3300
> Baremetal Linux 2.6.39+ 1.1300 1.8900 8.6100 8.2800
> 
> Context switching - times in microseconds - smaller is better
> -------------------------------------------------------------------------
> Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
>                          ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
> --------- ------------- ------ ------ ------ ------ ------ ------- -------
> PV        Linux 2.6.39f 5.2800 5.7600 6.3600 6.3200 7.3600 6.69000 7.46000
> Hybrid    Linux 2.6.39f 4.9200 4.9300 5.2200 5.7600 6.9600 6.12000 7.31000

So the diff between PV and Hybrid looks to be 8%.

And then a ~50% difference between Hybrid and baremetal. So the syscall path is
only causing an 8% drop in performance - what is the other 42%?
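
(For reference on what that 2p/0K number measures: lat_ctx bounces a token
between processes over pipes, so every hop is a forced context switch. A
stripped-down sketch of that idea - not lmbench itself:)

/* Stripped-down version of the 2p/0K context-switch test: two processes
 * ping-pong one byte over a pair of pipes, forcing a context switch per hop.
 * Illustrative only. */
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const int iters = 100000;
    int p2c[2], c2p[2];         /* parent->child and child->parent pipes */
    char tok = 'x';
    struct timespec t0, t1;

    if (pipe(p2c) || pipe(c2p)) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {             /* child: echo the token back */
        for (int i = 0; i < iters; i++) {
            if (read(p2c[0], &tok, 1) != 1) _exit(1);
            if (write(c2p[1], &tok, 1) != 1) _exit(1);
        }
        _exit(0);
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        write(p2c[1], &tok, 1); /* wake the child ... */
        read(c2p[0], &tok, 1);  /* ... and sleep until it answers */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    waitpid(pid, NULL, 0);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("~%.2f us per round trip (two context switches)\n", us / iters);
    return 0;
}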

> HVM       Linux 2.6.39f 1.3100 1.2200 1.6200 1.9200 3.2600 2.23000 3.48000

This is really bizarre. HVM kicks baremetal butt?
> Baremetal Linux 2.6.39+ 1.5500 1.4100 2.0600 2.2500 3.3900 2.44000 3.38000
> 
> *Local* Communication latencies in microseconds - smaller is better
> ---------------------------------------------------------------------
> Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
>                         ctxsw       UNIX         UDP         TCP conn
> --------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
> PV        Linux 2.6.39f 5.280  16.6 21.3  25.9  33.7  34.7  41.8  87.
> Hybrid    Linux 2.6.39f 4.920  11.2 14.4  19.6  26.1  27.5  32.9  71.
> HVM       Linux 2.6.39f 1.310 4.416 6.15 9.386  14.8  15.8  20.1  45.
> Baremetal Linux 2.6.39+ 1.550 4.625 7.34  14.3  19.8  21.4  26.4  66.
> 
> File & VM system latencies in microseconds - smaller is better
> -------------------------------------------------------------------------------
> Host                 OS   0K File      10K File     Mmap    Prot   Page  100fd
>                         Create Delete Create Delete Latency Fault  Fault selct
> --------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
> PV        Linux 2.6.39f                               24.0K 0.746 3.55870 2.184
> Hybrid    Linux 2.6.39f                               24.6K 0.238 4.00100 1.480

Could the mmap and the pagetable creation be at fault (ha! a pun!) for the
sucky performance?  Perhaps running with autotranslate pagetables would
eliminate this?

Is the mmap doing small little 4K runs or something much bigger?
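
(A quick way to see whether the small-mapping case is the painful one is to
time mmap + first touch + munmap of a single 4K page, which exercises exactly
the pagetable-update and fault paths that differ between PV and HVM. A rough
sketch - not lmbench's test, which maps a file of varying size:)

/* Rough sketch: time mmap + first-touch fault + munmap of one 4K anonymous
 * page. Under classic PV every mapping change is a hypercall; under HVM/EPT
 * the guest updates its own page tables. Illustrative only. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

int main(void)
{
    const int iters = 100000;
    const size_t len = 4096;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        p[0] = 1;               /* first touch: take the page fault */
        munmap(p, len);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("mmap+fault+munmap: %.2f us per 4K page\n", us / iters);
    return 0;
}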


> HVM       Linux 2.6.39f                              4716.0 0.202 0.96600 1.468
> Baremetal Linux 2.6.39+                              6898.0 0.325 0.93610 1.620
> 
> *Local* Communication bandwidths in MB/s - bigger is better
> -----------------------------------------------------------------------------
> Host                OS  Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
>                              UNIX      reread reread (libc) (hand) read write
> --------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
> PV        Linux 2.6.39f 1661 2081 1041 3293.3 5528.3 3106.6 2800.0 4472 5633.
> Hybrid    Linux 2.6.39f 1974 2450 1183 3481.5 5529.6 3114.9 2786.6 4470 5672.
> HVM       Linux 2.6.39f 3232 2929 1622 3541.3 5527.5 3077.1 2765.6 4453 5634.
> Baremetal Linux 2.6.39+ 3320 2800 1666 3523.6 5578.9 3147.0 2841.6 4541 5752.
> 
> Memory latencies in nanoseconds - smaller is better
>     (WARNING - may not be correct, check graphs)
> ------------------------------------------------------------------------------
> Host                 OS   Mhz   L1 $   L2 $    Main mem    Rand mem    Guesses
> --------- -------------   ---   ----   ----    --------    --------    -------
> PV        Linux 2.6.39f  2639 1.5160 5.9170   29.7        97.5
> Hybrid    Linux 2.6.39f  2639 1.5170 7.5000   29.7        97.4
> HVM       Linux 2.6.39f  2639 1.5190 4.0210   29.8       105.4
> Baremetal Linux 2.6.39+  2649 1.5090 3.8370   29.2        78.0


OK, so once you have access to the memory, using it under PV is actually OK.
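
(For reference, the "Rand mem" column comes from a pointer-chasing load where
every read depends on the previous one, so it really is raw latency rather
than bandwidth. Something along these lines, roughly - not lmbench's
lat_mem_rd:)

/* Rough pointer-chasing sketch of what the "Rand mem" latency column measures:
 * each load depends on the previous one, so the CPU cannot overlap them and
 * the time per step approximates raw memory latency. Illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (64 * 1024 * 1024 / sizeof(size_t))   /* 64MB working set */

int main(void)
{
    size_t *chain = malloc(N * sizeof(size_t));
    if (!chain) return 1;

    /* Build one big cycle (Sattolo's shuffle): chain[i] holds the next index. */
    for (size_t i = 0; i < N; i++)
        chain[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = chain[i]; chain[i] = chain[j]; chain[j] = tmp;
    }

    struct timespec t0, t1;
    const long steps = 10 * 1000 * 1000;
    size_t idx = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long s = 0; s < steps; s++)
        idx = chain[idx];       /* dependent load: next index comes from memory */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (double)(t1.tv_nsec - t0.tv_nsec);
    printf("%.1f ns per dependent load (idx=%zu)\n", ns / steps, idx);
    return 0;
}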

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
