xen-devel

RE: [Xen-devel][PATCH][RFC] Using data polling mechanism in netfront to replace event notification between netback and netfront

To: James Harper <james.harper@xxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel][PATCH][RFC] Using data polling mechanism in netfront to replace event notification between netback and netfront
From: "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
Date: Thu, 10 Sep 2009 17:09:01 +0800
Accept-language: en-US
Cc: "Yang, Xiaowei" <xiaowei.yang@xxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>
Delivery-date: Thu, 10 Sep 2009 02:10:47 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AEC6C66638C05B468B556EA548C1A77D0177D1E4@trantor>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <EADF0A36011179459010BDF5142A457501CD8BF336@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <AEC6C66638C05B468B556EA548C1A77D0177D1E4@trantor>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acox6jFOUMf61+LiTw60WvyWTJXRuAAAiMpAAAIiU+A=
Thread-topic: [Xen-devel][PATCH][RFC] Using data polling mechanism in netfront to replace event notification between netback and netfront

Thanks for the comments!
The solution I presented in my last mail has the advantage that it can reduce the event channel notification frequency almost to zero, which saves a lot of CPU cycles, especially for the HVM PV driver.
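For clarity, that frontend-side idea roughly works like the sketch below (illustrative only, not the code in the patch from my last mail; netfront_process_rings(), netfront_rings_have_work() and the single global poll_timer are made-up names standing in for the driver's real ring handlers and per-device state):

#include <linux/timer.h>
#include <linux/jiffies.h>

struct netfront_info;                                           /* driver's per-device state */
static void netfront_process_rings(struct netfront_info *np);   /* drain TX/RX rings */
static int netfront_rings_have_work(struct netfront_info *np);  /* outstanding work? */
static struct timer_list poll_timer;                            /* one vif only, for brevity */

/* Poll the shared rings at ~1 kHz instead of waiting for an event
 * channel interrupt (classical timer, 2.6.18-style callback signature). */
static void netfront_poll_timer_fn(unsigned long data)
{
        struct netfront_info *np = (struct netfront_info *)data;

        /* Do what the event channel handler would normally do. */
        netfront_process_rings(np);

        /* Keep polling while traffic is flowing; otherwise let the timer
         * lapse.  The transmit path re-arms it when new work appears. */
        if (netfront_rings_have_work(np))
                mod_timer(&poll_timer, jiffies + msecs_to_jiffies(1));
}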

Regarding James's suggestion, we actually have another solution that works in that style; see the attachment. It modifies only netback and keeps netfront unchanged. The patch is based on the PV-ops Dom0, so the hrtimer is accurate. We set a timer in netback: when the timer elapses, or when there are RING_SIZE/2 data slots in the ring, netback notifies netfront (of course, we could modify the 'event' parameter to replace the check on the number of slots in the ring). The patch contains auto-adjustment logic for each netfront's event channel frequency, based on the packet rate and size within a timer period. The user can also assign a specific timer frequency to a particular netfront through the standard coalesce interface.
        If the event notification frequency is set to 1000 Hz, it also brings a large decrease in CPU utilization, similar to the previous test results. The detailed results for the two solutions are below. I think the two solutions could coexist, and we could use a macro to select which one is the default.
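Roughly, the backend-side logic works like the following sketch (illustrative only, not the attached netbk_lowdown_evtchn_freq.patch; struct xen_netif, netif->rx, netif->irq, NET_RX_RING_SIZE and the ring macros come from netback's own headers, while the coalesce_timer/coalesce_ns fields and netbk_pending_rx_responses() are invented names for this sketch):

#include <linux/kernel.h>
#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <xen/events.h>

/* Hypothetical additions to the per-vif structure:
 *   struct hrtimer coalesce_timer;   periodic notification timer
 *   u64            coalesce_ns;      interval, default 1 ms (1000 Hz)   */

static void netbk_notify_frontend(struct xen_netif *netif)
{
        int notify;

        /* Publish queued responses; 'notify' honours the frontend's
         * rsp_event hint. */
        RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netif->rx, notify);
        if (notify)
                notify_remote_via_irq(netif->irq);
}

/* Fires at the auto-adjusted (or coalesce-interface-configured) interval. */
static enum hrtimer_restart netbk_coalesce_timer_fn(struct hrtimer *timer)
{
        struct xen_netif *netif =
                container_of(timer, struct xen_netif, coalesce_timer);

        netbk_notify_frontend(netif);
        hrtimer_forward_now(timer, ns_to_ktime(netif->coalesce_ns));
        return HRTIMER_RESTART;
}

/* Called after queuing each RX response: bypass the timer once half of
 * the ring is occupied, so the frontend never stalls on a full ring. */
static void netbk_maybe_notify(struct xen_netif *netif)
{
        if (netbk_pending_rx_responses(netif) >= NET_RX_RING_SIZE / 2)
                netbk_notify_frontend(netif);
}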

In the tables below, "w/ FE patch" means the first (frontend) solution patch attached in my last mail is applied, and "w/ BE patch" means the second (backend) solution patch attached in this mail is applied.

VM receive results:

UDP Receive (Single Guest VM)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
50                   w/o patch    83.25              100.00%        26.10%
                     w/ FE patch  79.56              100.00%        23.80%
                     w/ BE patch  72.43              100.00%        21.90%
1472                 w/o patch    950.30             44.80%         22.40%
                     w/ FE patch  949.32             46.00%         17.90%
                     w/ BE patch  951.57             51.10%         18.50%
1500                 w/o patch    915.84             84.70%         42.40%
                     w/ FE patch  908.94             88.30%         28.70%
                     w/ BE patch  904.00             88.90%         27.30%

TCP Receive (Single Guest VM)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
50                   w/o patch    506.57             43.30%         70.30%
                     w/ FE patch  521.52             34.50%         57.70%
                     w/ BE patch  512.78             38.50%         54.40%
1472                 w/o patch    926.19             69.00%         32.90%
                     w/ FE patch  928.23             63.00%         24.40%
                     w/ BE patch  928.59             67.50%         24.80%
1500                 w/o patch    935.12             68.60%         33.70%
                     w/ FE patch  926.11             63.80%         24.80%
                     w/ BE patch  927.00             68.80%         24.60%

UDP Receive (Three Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    963.43             50.70%         41.10%
                     w/ FE patch  964.47             51.00%         25.00%
                     w/ BE patch  963.07             55.60%         27.80%
1500                 w/o patch    859.96             99.50%         73.40%
                     w/ FE patch  861.19             97.40%         39.90%
                     w/ BE patch  860.92             98.90%         40.00%

TCP Receive (Three Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    939.68             78.40%         64.00%
                     w/ FE patch  926.04             65.90%         31.80%
                     w/ BE patch  930.61             71.60%         34.80%
1500                 w/o patch    933.00             78.10%         63.30%
                     w/ FE patch  927.14             66.90%         31.90%
                     w/ BE patch  930.76             71.10%         34.80%

UDP Receive (Six Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    978.85             56.90%         59.20%
                     w/ FE patch  975.05             53.80%         33.50%
                     w/ BE patch  974.71             59.50%         40.00%
1500                 w/o patch    886.92             100.00%        87.20%
                     w/ FE patch  902.02             96.90%         46.00%
                     w/ BE patch  894.57             98.90%         49.60%

TCP Receive (Six Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    962.04             90.30%         104.00%
                     w/ FE patch  958.94             69.40%         43.70%
                     w/ BE patch  958.08             68.30%         48.00%
1500                 w/o patch    960.35             90.10%         103.70%
                     w/ FE patch  957.75             68.70%         42.80%
                     w/ BE patch  956.42             68.20%         48.50%

UDP Receive (Nine Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    987.91             60.50%         70.00%
                     w/ FE patch  988.30             56.60%         42.70%
                     w/ BE patch  986.58             61.80%         50.00%
1500                 w/o patch    953.48             100.00%        93.80%
                     w/ FE patch  904.17             98.60%         53.50%
                     w/ BE patch  905.52             100.00%        56.80%

TCP Receive (Nine Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    974.89             90.00%         110.60%
                     w/ FE patch  980.03             73.70%         55.40%
                     w/ BE patch  968.29             72.30%         60.20%
1500                 w/o patch    971.34             89.80%         109.60%
                     w/ FE patch  973.63             73.90%         54.70%
                     w/ BE patch  971.08             72.30%         61.00%

VM send results:

UDP Send (Single Guest VM)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    949.84             56.50%         21.70%
                     w/ FE patch  946.25             51.20%         20.10%
                     w/ BE patch  948.73             51.60%         19.70%
1500                 w/o patch    912.46             87.00%         26.60%
                     w/ FE patch  899.29             86.70%         26.20%
                     w/ BE patch  909.31             86.90%         25.90%

TCP Send (Single Guest VM)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    932.16             71.50%         35.60%
                     w/ FE patch  932.09             66.90%         29.50%
                     w/ BE patch  932.54             66.20%         25.30%
1500                 w/o patch    929.91             72.60%         35.90%
                     w/ FE patch  931.63             66.70%         29.50%
                     w/ BE patch  932.83             66.20%         26.20%

UDP Send (Three Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    972.66             57.60%         24.00%
                     w/ FE patch  970.07             56.30%         23.30%
                     w/ BE patch  971.05             59.10%         23.10%
1500                 w/o patch    943.87             93.50%         32.50%
                     w/ FE patch  933.61             93.90%         30.00%
                     w/ BE patch  937.08             95.10%         31.00%

TCP Send (Three Guest VMs)
Packet Size (bytes)  Test Case    Throughput (Mbps)  Dom0 CPU Util  Guest CPU Total Util
1472                 w/o patch    955.92             70.40%         36.10%
                     w/ FE patch  946.39             72.90%         32.90%
                     w/ BE patch  949.80             70.30%         33.20%
1500                 w/o patch    966.06             73.00%         38.00%
                     w/ FE patch  947.23             72.50%         33.60%
                     w/ BE patch  948.74             72.20%         34.50%

Best Regards,
-- Dongxiao
 
 
 


-----Original Message-----
From: James Harper [mailto:james.harper@xxxxxxxxxxxxxxxx]
Sent: Thursday, September 10, 2009 4:03 PM
To: Xu, Dongxiao; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-devel][PATCH][RFC] Using data polling mechanism in netfront to replace event notification between netback and netfront

> Hi,
>       This is a VNIF optimization patch, need for your comments. Thanks!
>
> [Background]:
>       One of the VNIF driver's scalability issues is the high event channel
> frequency. It's highly related to physical NIC's interrupt frequency in dom0,
> which could be 20K HZ in some situation. The high frequency event channel
> notification makes the guest and dom0 CPU utilization at a high value.
> Especially for HVM PV driver, it brings high rate of interrupts, which could
> cost a lot of CPU cycle.
>       The attached patches have two parts: one part is for netback, and the
> other is for netfront. The netback part is based on the latest PV-Ops Dom0,
> and the netfront part is based on the 2.6.18 HVM unmodified driver.
>       This patch uses a timer in netfront to poll the ring instead of event
> channel notification. If guest is transferring data, the timer will start
> working and periodicaly send/receive data from ring. If guest is idle and no
> data is transferring, the timer will stop working automatically. It will
> restart again once there is new data transferring.
>       We set a feature flag in xenstore to indicate whether the
> netfront/netback support this feature. If there is only one side supporting
> it, the communication mechanism will fall back to default, and the new feature
> will not be used. The feature is enabled only when both sides have the flag
> set in xenstore.
>       One problem is the timer polling frequency. This netfront part patch is
> based on 2.6.18 HVM unmodified driver, and in that kernel version, guest
> hrtimer is not accuracy, so I use the classical timer. The polling frequency
> is 1KHz. If rebase the netfront part patch to latest pv-ops, we could use
> hrtimer instead.
>

I experimented with this in Windows too, but the timer resolution is too
poor. I think you should also look at setting the 'event' parameter. The
current driver tells the backend to notify it as soon as there is a
single packet ready (np->rx.sring->rsp_event = np->rx.rsp_cons + 1), but
you could set it to a higher number and also use the timer, e.g. "tell
me when there are 32 ring slots filled, or when the timer elapses". That
way you should have fewer problems with overflows.

Also, I don't think you need to tell the backend to stop notifying you;
if you just don't set the 'event' field in the frontend, then
RING_PUSH_RESPONSES_AND_CHECK_NOTIFY in the backend will not report that
a notification is required.

James

Attachment: netbk_lowdown_evtchn_freq.patch
Description: netbk_lowdown_evtchn_freq.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel