This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel][Pv-ops][PATCH] Netback multiple tasklet support

To: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, Ian Campbell <Ian.Campbell@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: RE: [Xen-devel][Pv-ops][PATCH] Netback multiple tasklet support
From: "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
Date: Wed, 2 Dec 2009 18:17:07 +0800
Accept-language: en-US
Delivery-date: Wed, 02 Dec 2009 02:18:07 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4FA716B1526C7C4DB0375C6DADBC4EA342A7A7E95E@xxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <EADF0A36011179459010BDF5142A457501D006B913@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4FA716B1526C7C4DB0375C6DADBC4EA342A7A7E951@xxxxxxxxxxxxxxxxxxxxxxxxx> <EADF0A36011179459010BDF5142A457501D006BBAC@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4FA716B1526C7C4DB0375C6DADBC4EA342A7A7E95E@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcpvCRTacBm7g/TlQ5GetSSm6xA1EAAc2xzgAAA5DEAAK6iaMADCwX+w
Thread-topic: [Xen-devel][Pv-ops][PATCH] Netback multiple tasklet support
        Based on your feedback, I have revised my patches and am resending them now.
[PATCH 01]: Use multiple tasklet pairs to replace the current single pair in 
netback.
[PATCH 02]: Replace the tasklets with kernel threads. This may hurt 
performance, but improves responsiveness for userspace.

Test scenario:
We use ten 1G NIC interfaces to talk with 10 VMs (netfronts) on the server, so 
the total bandwidth is 10G.
On the host machine, bind each guest's netfront to its own NIC interface.
On the client machine, run netperf against each guest.

Test Case                   Throughput(Mbps)   Dom0 CPU Util   Guests CPU Util
w/o any patch               4304.30            400.33%         112.21%
w/ 01 patch                 9533.13            461.64%         243.81%
w/ 01 and 02 patches        7942.68            597.83%         250.53%

From the results we can see that the case "w/ 01 and 02 patches" did not reach 
or come near the total bandwidth. This is because some vcpus in dom0 are 
saturated by context switches with other tasks, which hurts performance. To 
verify this, I ran an experiment that sets the kernel threads to the SCHED_FIFO 
scheduling class, so that they cannot be preempted by normal tasks. The result 
is shown below, and the performance is good. However, as with tasklets, setting 
the kernel thread to a high priority also hurts userspace responsiveness, 
because userspace applications (for example, sshd) cannot preempt the netback 
kernel thread.

w/ hi-priority kthread          9535.74         543.56%         241.26%

Netchannel2 omits the grant copy in dom0; I haven't tried it yet. But I used 
xenoprof on the current netback system to get a feeling that grant copy 
occupies ~1/6 of dom0's CPU cycles (including Xen and the dom0 vmlinux).

BTW, the 02 patch is ported from the patch provided by Ian Campbell. You can 
add your Signed-off-by if you want. :)

Best Regards, 
-- Dongxiao

Ian Pratt wrote:
>> The domain lock is in grant_op hypercall. If the multiple tasklets
>> are fighting with each other for this big domain lock, it would
>> become a bottleneck and 
>> hurt the performance.
>> Our test system has 16 LPs in total, so we have 16 vcpus in dom0 by
>> default.
>> 10 of them are used to handle the network load. For our test case,
>> dom0's total vcpu utilization is ~461.64%, so each vcpu occupies
>> ~46%. 
> Having 10 VCPUs for dom0 doesn't seem like a good idea -- it really
> oughtn't to need that many CPUs to handle IO load. Have you got any
> results with e.g. 2 or 4 VCPUs?  
> When we switch over to using netchannel2 by default this issue should
> largely go away anyhow as the copy is not done by dom0. Have you done
> any tests with netchannel2?  
>> Actually the multiple tasklets in netback could already improve the
>> QoS of the system, therefore I think it can also help to get
>> better responsiveness for 
>> that vcpu.
>> I think I can try to write another patch which replaces the tasklet
>> with a kthread, because I think it is a different job from the
>> multi-tasklet netback support. (The kthread is used to guarantee the
>> responsiveness of userspace, whereas multi-tasklet netback is used to
>> remove dom0's cpu utilization bottleneck.) However I am not sure
>> whether the improvement in QoS by this change is needed in MP
>> systems?  
> Have you looked at the patch that xenserver uses to replace the
> tasklets by kthreads? 
> Thanks,
> Ian

Attachment: 0001-Netback-multiple-tasklets-support.patch
Description: 0001-Netback-multiple-tasklets-support.patch

Attachment: 0002-Use-Kernel-thread-to-replace-the-tasklet.patch
Description: 0002-Use-Kernel-thread-to-replace-the-tasklet.patch

Xen-devel mailing list