
Re: [Xen-devel] Xen dom0 network I/O scalability



On May 12, 2011, at 3:21 AM, Ian Campbell wrote:

> A long time ago the backend->frontend path (guest receive) operated using
> a page flipping mode. At some point a copying mode was added to this
> path which became the default some time in 2006. You would have to go
> out of your way to find a guest which used flipping mode these days. I
> think this is the copying you are referring to; it's so long ago that
> there was a distinction on this path that I'd forgotten all about it
> until now.

I was not referring to page flipping at all. I was only talking about the copies
in the receive path. 

> The frontend->backend path (guest transmit) has used a mapping
> (PageForeign) based scheme practically since forever. However when
> netback was upstreamed into 2.6.39 this had to be removed in favour of a
> copy based implementation (PageForeign has fingers in the mm subsystem
> which were unacceptable for upstreaming). This is the copying mode
> Konrad and I were talking about. We know the performance will suffer
> versus mapping mode, and we are working to find ways of reinstating
> mapping.

Hmm.. I did not know that the copying mode was introduced in the transmit path. 
But as I said above I was only referring to the receive path. 

> As far as I can tell you are running with the zero-copy path. Only
> mainline 2.6.39+ has anything different.

Again, I was only referring to the receive path! I assumed you were talking about 
re-introducing zero-copy in the receive path (aka page flipping). 
To be clear: 
- xen.git#xen/stable-2.6.32.x uses copying in the RX path and 
mapping (zero-copy) in the TX path. 
- 2.6.39+ uses copying in both the RX and TX paths (the TX mapping had to be 
dropped for upstreaming).

> I think you need to go into detail about your test setup so we can all
> get on the same page and stop confusing ourselves by guessing which
> modes netback has available and is running in. Please can you describe
> precisely which kernels you are running (tree URL and changeset as well
> as the .config you are using). Please also describe your guest
> configuration (kernels, cfg file, distro etc) and benchmark methodology
> (e.g. netperf options).
> 
> I'd also be interested in the actual numbers you are seeing,
> alongside specifics of the test scenario which produced them.
> 
> I'm especially interested in the details of the experiment(s) where you
> saw a 50% drop in throughput.

I agree. I plan to run the experiments again next week. I will get back to you 
with all the details. 

But these are the versions I am trying to compare:
1. http://xenbits.xensource.com/linux-2.6.18-xen.hg (single-threaded legacy 
netback)
2. xen.git#xen/stable-2.6.32.x (multi-threaded netback using tasklets)
3. xen.git#xen/stable-2.6.32.x (multi-threaded netback using kthreads)

And (1) outperforms both (2) and (3).

>> I mentioned packet copies only to explain the severity of this
>> problem. Let me try to clarify. Consider the following scenario: vcpu 1
>> performs a hypercall, acquires the domain_lock, and starts copying one or
>> more packets (in gnttab_copy). Now vcpu 2 also performs a hypercall, but it
>> cannot acquire the domain_lock until all the copies have completed and the
>> lock is released by vcpu 1. So the domain_lock could be held for a long time
>> before it is released.
> 
> But this isn't a difference between the multi-threaded/tasklet and
> single-threaded/tasklet version of netback, is it?
> 
> In the single threaded case the serialisation is explicit due to the
> lack of threading, and it would obviously be good to avoid for the
> multithreaded case, but the contention doesn't really explain why
> multi-threaded mode would be 50% slower. (I suppose the threading case
> could serialise things into a different order, perhaps one which is
> somehow pessimal for e.g. TCP)
> 
> It is quite easy to force the number of tasklets/threads to 1 (by
> forcing xen_netbk_group_nr to 1 in netback_init()). This might be an
> interesting experiment to see if the degradation is down to contention
> between threads or something else which has changed between 2.6.18 and
> 2.6.32 (there is an extent to which this is comparing apples to oranges
> but 50% is pretty severe...).

Hmm.. You are right.  I will run the above experiments next week.
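
For the single-group run, my understanding of the change you are suggesting is
roughly the following (a sketch only, paraphrasing netback_init() from memory
rather than quoting the actual code in xen.git#xen/stable-2.6.32.x):

    static int __init netback_init(void)
    {
            ...
            /*
             * Normally one netback group (tasklet pair or kthread) is
             * created per online CPU; forcing a single group should show
             * whether the degradation comes from contention between
             * netback workers or from something else that changed
             * between 2.6.18 and 2.6.32.
             */
            xen_netbk_group_nr = 1;
            ...
    }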

>> I think to properly scale netback we need more fine grained locking.
> 
> Quite possibly. It doesn't seem at all unlikely that the domain lock on
> the guest-receive grant copy is going to hurt at some point. There are
> some plans to rework the guest receive path to do the copy on the guest
> side; the primary motivation is to remove load from dom0 and to allow
> better accounting of work to the guests that request it, but a side-effect
> of this could be to reduce contention on dom0's domain_lock.
> 
> However I would like to get to the bottom of the 50% degradation between
> linux-2.6.18-xen.hg and xen.git#xen/stable-2.6.32.x before we move on to
> how we can further improve the situation in xen.git.

OK.

> An IRQ is associated with a VIF and multiple VIFs can be associated with
> a netbk.
> 
> I suppose we could bind the IRQ to the same CPU as the associated netbk
> thread but this can move around so we'd need to follow it. The tasklet
> case is easier since, I think, the tasklet will be run on whichever CPU
> scheduled it, which will be the one the IRQ occurred on.
> 
> Drivers are not typically expected to behave in this way. In fact I'm
> not sure it is even allowed by the IRQ subsystem and I expect upstream
> would frown on a driver doing this sort of thing (I expect their answer
> would be "why aren't you using irqbalanced?"). If you can make this work
> and it shows real gains over running irqbalanced we can of course
> consider it.

OK.
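
Just to make concrete the kind of thing I had in mind: the sketch below binds a
VIF's interrupt to the CPU the associated netbk thread is currently running on.
It is purely hypothetical; the structure and field names (struct xenvif,
vif->irq) are assumed, and irq_set_affinity() is not really meant to be called
from a driver, which is exactly the objection you raise:

    #include <linux/interrupt.h>
    #include <linux/cpumask.h>

    /*
     * Hypothetical: called from the netbk kthread after it has been
     * (re)scheduled, so the vif's interrupt follows the thread's CPU.
     * irq_set_affinity() is not exported to modules, so in practice this
     * would need the IRQ subsystem's blessing (or just irqbalanced).
     */
    static void netbk_follow_irq(struct xenvif *vif, int cpu)
    {
            irq_set_affinity(vif->irq, cpumask_of(cpu));
    }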

Thanks.

--Kaushik
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

