xen-devel

Re: [Xen-devel] MPI benchmark performance gap between native linux and domU

To: "Santos, Jose Renato G (Jose Renato Santos)" <joserenato.santos@xxxxxx>
Subject: Re: [Xen-devel] MPI benchmark performance gap between native linux and domU
From: xuehai zhang <hai@xxxxxxxxxxxxxxx>
Date: Tue, 05 Apr 2005 23:24:17 -0500
Cc: m+Ian.Pratt@xxxxxxxxxxxx, Xen-devel@xxxxxxxxxxxxxxxxxxx, Aravind Menon <aravind.menon@xxxxxxx>, G John Janakiraman <john@xxxxxxxxxxxxxxxxxxx>, "Turner, Yoshio" <yoshio_turner@xxxxxx>
Delivery-date: Wed, 06 Apr 2005 05:01:31 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <6C21311CEE34E049B74CC0EF339464B902FB2C@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <6C21311CEE34E049B74CC0EF339464B902FB2C@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
Jose,

Thank you for your help in diagnosing the problem!

I tend to agree with you that the problem is due to the network latency. The throughput reported by the SendRecv benchmark is in fact computed directly from the latency; the formula is the following (where #_of_messages is 2, message_size is in bytes, and latency is in microseconds):
        throughput = ((#_of_messages * message_size)/2^20)/(latency/10^6)
So the performance gap really comes from the higher latency in domU. It is true that PMB's SendRecv benchmark is sensitive to the round trip latency. I would very much like to hear Keir's comments on the behavior of event notifications in the inter-domain I/O channel for networking.
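For concreteness, the calculation can be written out as a small Python sketch (the function name is mine; PMB computes this internally):

    # PMB SendRecv throughput, per the formula above.
    # latency_usec: measured time per exchange, in microseconds.
    # n_messages:   2 for SendRecv (one message sent, one received).
    def throughput_mb_per_sec(message_size_bytes, latency_usec, n_messages=2):
        mbytes = (n_messages * message_size_bytes) / float(2**20)  # bytes -> MB
        seconds = latency_usec / float(10**6)                      # usec  -> sec
        return mbytes / seconds

    # Example from the SendRecv tables quoted below: 1024-byte messages on
    # native linux took 368.72 usec per exchange.
    print(throughput_mb_per_sec(1024, 368.72))   # -> ~5.29 MB/sec, as reported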

BTW, as I stated in my previous emails, besides the SendRecv benchmark I also have results for the 11 other PMB benchmarks on both native linux and domU. The following are the PingPing results (between 2 nodes) from my experiments. As you can see, the performance gap is not as big as for SendRecv, and the performance is very close in several test cases. Part of the reason may be that only two nodes are used and that only the one-way latency enters the calculation of the latency and throughput values.

Best,
Xuehai

P.S.
Note: each reported data point in the following tables is the
average over 10 runs of the same experiment, as with SendRecv.

PingPing Throughput (MB/sec)
Msg-size(bytes) #repetitions      native-linux    domU
            0         1000         0.00          0.00
            1         1000         0.01           0.00
            2         1000         0.01           0.01
            4         1000         0.02           0.01
            8         1000         0.04           0.02
           16         1000         0.09           0.04
           32         1000         0.17           0.09
           64         1000         0.33           0.17
          128         1000         0.65           0.33
          256         1000         1.19           0.62
          512         1000         1.95           1.06
         1024         1000         2.80           1.73
         2048         1000         3.74           2.52
         4096         1000         5.38           3.77
         8192         1000         6.49           4.79
        16384         1000         7.45           4.97
        32768         1000         6.74           5.27
        65536          640         5.89           3.07
       131072          320         5.27           3.11
       262144          160         5.09           3.88
       524288           80         5.00           4.84
      1048576           40         4.95           4.91
      2097152           20         4.94           4.89
      4194304           10         4.93           4.92

PingPing Latency/Startup (usec)
Msg-size(bytes) #repetitions      native-linux     domU
            0         1000       172.78          342.89
            1         1000       176.12          346.23
            2         1000       173.48          344.20
            4         1000       177.05          346.15
            8         1000       177.54          343.56
           16         1000       178.71          346.47
           32         1000       176.71          351.25
           64         1000       183.83          359.41
          128         1000       188.09          371.94
          256         1000       204.64          393.79
          512         1000       250.63          462.45
         1024         1000       349.20          565.03
         2048         1000       521.56          773.63
         4096         1000       726.62         1036.23
         8192         1000      1204.54         1630.43
        16384         1000      2097.42         3143.95
        32768         1000      4633.77         5930.04
        65536          640     10604.54        20335.55
       131072          320     23717.61        40174.68
       262144          160     49146.14        64505.20
       524288           80     99962.09       103390.30
      1048576           40    202000.30       203478.00
      2097152           20    404857.10       408950.55
      4194304           10    812047.60       813135.50
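As a sanity check, the two PingPing tables above are consistent under the same formula with a single message per exchange (one-way, as noted above); reusing throughput_mb_per_sec() from the sketch earlier:

    # Cross-check a few PingPing rows: (size, native latency, domU latency)
    for size, lat_native, lat_domu in [(1024, 349.20, 565.03),
                                       (16384, 2097.42, 3143.95)]:
        print("%7d  %.2f  %.2f" % (size,
              throughput_mb_per_sec(size, lat_native, n_messages=1),
              throughput_mb_per_sec(size, lat_domu, n_messages=1)))
    # ->    1024  2.80  1.73
    # ->   16384  7.45  4.97    (matching the throughput table)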

Santos, Jose Renato G (Jose Renato Santos) wrote:
  Xuehai,

  Thanks for posting your new results. In fact it seems that your
problem is not the same as the one we encountered.
I believe your problem is due to a higher network latency in Xen. Your
formula to compute throughput uses the inverse of round trip latency (if
I understood it correctly). This probably means that your application is
sensitive to the round trip latency. Your latency measurements show a
higher value for domainU and this is the reason for the lower
throughput.  I am not sure but it is possible that network interrupts or
event notifications in the inter-domain channel are being coalesced and
causing longer latency. Keir, do event notifications get coalesced in
the inter-domain I/O channel for networking?
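One way to check this outside MPI would be a bare TCP round-trip probe
between two nodes; a minimal sketch (the port is a placeholder, and the
peer hostname is passed as an argument):

    # Minimal TCP echo probe to compare round-trip latency on native
    # linux vs. domU without MPI in the path. Start echo_server() on one
    # node, then run probe("<peer-host>") on the other.
    import socket, time

    PORT = 9999   # placeholder

    def echo_server():
        s = socket.socket()
        s.bind(("", PORT))
        s.listen(1)
        conn, _ = s.accept()
        while True:
            data = conn.recv(1)
            if not data:
                break
            conn.sendall(data)    # echo the byte straight back

    def probe(host, n=1000):
        c = socket.create_connection((host, PORT))
        c.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # no Nagle delay
        start = time.time()
        for _ in range(n):
            c.sendall(b"x")
            c.recv(1)
        print("avg RTT: %.1f usec" % ((time.time() - start) / n * 1e6))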

  Renato

-----Original Message-----
From: xuehai zhang [mailto:hai@xxxxxxxxxxxxxxx]
Sent: Tuesday, April 05, 2005 3:23 PM
To: Santos, Jose Renato G (Jose Renato Santos); m+Ian.Pratt@xxxxxxxxxxxx
Cc: Xen-devel@xxxxxxxxxxxxxxxxxxx; Aravind Menon; Turner, Yoshio; G John Janakiraman
Subject: Re: [Xen-devel] MPI benchmark performance gap between native linux and domU


Hi Ian and Jose,

Based on your suggestions, I did two more experiments: one (tagged "domU-B" in the table below) changes the TCP advertised window scaling of domU to -2 (the default is 2), and the other (tagged "dom0") repeats the experiment in dom0, with only dom0 running. The following table contains the results from these two new experiments plus the two old ones (tagged "native-linux" and "domU-A") from my previous email.
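For reference, assuming the knob in question is the standard Linux sysctl net.ipv4.tcp_adv_win_scale, the change amounts to something like the following, run as root inside domU:

    # Set the TCP advertised-window scaling used for the "domU-B" run.
    # With tcp_adv_win_scale = -2 the kernel advertises 1/4 of the socket
    # buffer as window; the default of 2 advertises 3/4 of it.
    SYSCTL = "/proc/sys/net/ipv4/tcp_adv_win_scale"

    def set_adv_win_scale(value):
        with open(SYSCTL, "w") as f:    # requires root
            f.write(str(value))

    set_adv_win_scale(-2)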

I have the following observation from the results:

1. Decreasing the scaling of the TCP window ("domU-B") does not improve performance; it actually slows things down slightly (compared with "domU-A").

2. Generally, the performance of running the experiments in dom0 ("dom0" column) is very close to (slightly below) the performance on native linux ("native-linux" column). However, in certain cases dom0 outperforms native linux: for example, the throughput at a 64KB message size, and the latency at message sizes of 1, 2, 4, and 8 bytes.

3. The performance gap between domU and dom0 is large, similar to the gap between domU and native linux.

BTW, each reported data point in the following table is the average over 10 runs of the same experiment. I forgot to mention that in the experiments using user domains, the 8 domUs form a private network and each domU is assigned a private IP address (for example, 192.168.254.X).

Xuehai

*********************************
*SendRecv Throughput(Mbytes/sec)*
*********************************

Msg Size(bytes) native-linux    dom0          domU-A         domU-B
        0         0              0.00          0              0.00
        1         0              0.01          0              0.00
        2         0              0.01          0              0.00
        4         0              0.03          0              0.00
        8         0.04           0.05          0.01           0.01
       16         0.16           0.11          0.01           0.01
       32         0.34           0.21          0.02           0.02
       64         0.65           0.42          0.04           0.04
      128         1.17           0.79          0.09           0.10
      256         2.15           1.44          0.59           0.58
      512         3.4            2.39          1.23           1.22
     1024         5.29           3.79          2.57           2.50
     2048         7.68           5.30          3.5            3.44
     4096         10.7           8.51          4.96           5.23
     8192         13.35          11.06         7.07           6.00
    16384         14.9           13.60         3.77           4.62
    32768         9.85           11.13         3.68           4.34
    65536         5.06           9.06          3.02           3.14
   131072         7.91           7.61          4.94           5.04
   262144         7.85           7.65          5.25           5.29
   524288         7.93           7.77          6.11           5.40
  1048576         7.85           7.82          6.5            5.62
  2097152         8.18           7.35          5.44           5.32
  4194304         7.55           6.88          4.93           4.92

*********************************
*     SendRecv Latency(usec)    *
*********************************

Msg Size(bytes) native-linux    dom0          domU-A         domU-B
        0        1979.6         1920.83       3010.96        3246.71
        1        1724.16         397.27       3218.88        3219.63
        2        1669.65         297.58       3185.3         3298.86
        4        1637.26         285.27       3055.67        3222.34
        8         406.77         282.78       2966.17        3001.24
       16         185.76         283.87       2777.89        2761.90
       32         181.06         284.75       2791.06        2798.77
       64         189.12         293.93       2940.82        3043.55
      128         210.51         310.47       2716.3         2495.83
      256         227.36         338.13        843.94         853.86
      512         287.28         408.14        796.71         805.51
     1024         368.72         515.59        758.19         786.67
     2048         508.65         737.12       1144.24        1150.66
     4096         730.59         917.97       1612.66        1516.35
     8192        1170.22        1411.94       2471.65        2650.17
    16384        2096.86        2297.19       8300.18        6857.13
    32768        6340.45        5619.56      17017.99       14392.36
    65536       24640.78       13787.31      41264.5        39871.19
   131072       31709.09       32797.52      50608.97       49533.68
   262144       63680.67       65174.67      94918.13       94157.30
   524288      125531.7       128116.73     162168.47      189307.05
  1048576      251566.94      252257.55     321451.02      361714.44
  2097152      477431.32      527432.60     707981         728504.38
  4194304      997768.35     1108898.61    1503987.61     1534795.56



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


