Re: [Xen-devel] Xen dom0 network I/O scalability
On May 11, 2011, at 4:31 AM, Ian Campbell wrote:
>>> - The inter-VM performance (throughput) is worse using both tasklets
>>> and kthreads as compared to the old version of netback (as in the
>>> linux-2.6-xen.hg repo). I observed about a 50% drop in throughput in my
>>> experiments. Has anyone else observed this? Is the new version yet to
>>> be optimized?
>> That is not surprising. The "new" version of netback copies pages. It
>> does not "swizzle" or "map" them between domains (so no zero copy).
> I think Kaushik is running a xen/2.6.32.x tree and the copying-only
> variant is only in mainline.
> A 50% drop in performance between linux-2.6-xen.hg and the xen.git
> 2.6.32 tree is slightly worrying but such a big drop sounds more like a
> misconfiguration, e.g. something like enabling debugging options in the
> kernel .config rather than a design or implementation issue in netback.
> (I actually have no idea what was in the linux-2.6-xen.hg tree, I don't
> recall such a tree ever being properly maintained, the last cset appears
> to be from 2006 and I recently cleaned it out of xenbits because no one
> knew what it was -- did you mean linux-2.6.18-xen.hg?)
I was referring to the single-threaded netback version in linux-2.6.18-xen.hg
(which, btw, also uses copying). I don't believe misconfiguration to be the reason.
As I mentioned previously, I profiled the code and found significant
overhead due to lock contention. The contention arises when two vcpus in
dom0 perform the grant copy hypercall concurrently and both try to acquire
the domain_lock.
I don't think re-introducing zero-copy in the receive path is a solution to
this problem. I mentioned packet copies only to explain the severity of this
problem. Let me try to clarify. Consider the following scenario: vcpu 1
performs a hypercall, acquires the domain_lock, and starts copying one or more
packets (in gnttab_copy). Now vcpu 2 also performs a hypercall, but it cannot
acquire the domain_lock until all the copies have completed and the lock is
released by vcpu 1. So the domain_lock could be held for a long time before
it is released.
I think that to scale netback properly we need more fine-grained locking.
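To make the scenario concrete, here is a minimal, self-contained sketch. It is
not the actual Xen or netback source; fake_domain, grant_copy_batch and
copy_one_packet are invented names. It only illustrates how one per-domain
lock held across a whole batch of copies serializes the two dom0 vcpus:

/*
 * Sketch only (not Xen code): a single coarse lock held for an entire
 * batch of packet copies, mimicking the gnttab_copy + domain_lock pattern
 * described above.
 */
#include <pthread.h>
#include <string.h>

struct fake_domain {
	pthread_mutex_t domain_lock;	/* stand-in for the per-domain lock */
	char page[4096];		/* stand-in for a granted target page */
};

static void copy_one_packet(struct fake_domain *d, const char *src, size_t len)
{
	if (len > sizeof(d->page))
		len = sizeof(d->page);
	memcpy(d->page, src, len);	/* the gnttab_copy-style data copy */
}

/*
 * One "hypercall": take the domain lock, copy every packet in the batch,
 * then release the lock.  While vcpu 1 is inside the loop, vcpu 2's call
 * blocks on pthread_mutex_lock() for the duration of the whole batch.
 */
static void grant_copy_batch(struct fake_domain *d, const char **pkts,
			     const size_t *lens, unsigned int nr)
{
	unsigned int i;

	pthread_mutex_lock(&d->domain_lock);
	for (i = 0; i < nr; i++)
		copy_one_packet(d, pkts[i], lens[i]);
	pthread_mutex_unlock(&d->domain_lock);
}

With finer-grained locking (for example a lock per receive queue or per copy
batch rather than one per domain), the second vcpu would only block while it
actually touches shared state.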
>>> - Two tasklets (rx and tx) are created per vcpu within netback. But
>>> in my experiments I noticed that only one vcpu was being used during
>>> the experiments (even with 4 VMs). I also observed that all the event
>>> channel notifications within netback are always sent to vcpu 0. So my
>>> conjecture is that since the tasklets are always scheduled by vcpu 0,
>>> all of them are run only on vcpu 0. Is this a BUG?
>> Yes. We need to fix 'irqbalance' to work properly. There is something
>> not working right.
> The fix is to install the "irqbalanced" package. Without it no IRQ
> balancing will occur in a modern kernel. (perhaps this linux-2.6-xen.hg
> tree was from a time when the kernel would do balancing on its own?).
> You can also manually balance the VIF IRQs under /proc/irq if you are so
> inclined.
Why can't the virq associated with each xen_netbk be bound to a different
vcpu during initialization? There is, after all, one struct xen_netbk per
vcpu in dom0. This seems like the simplest fix for this problem.
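In the meantime, the manual balancing Ian mentions can be scripted by writing
a CPU mask to /proc/irq/<irq>/smp_affinity. A rough sketch follows; the IRQ
numbers and masks below are placeholders, and the real VIF/netback IRQ
numbers have to be looked up in /proc/interrupts first:

/*
 * Sketch only: pins IRQs to CPUs by writing a hex CPU mask to
 * /proc/irq/<irq>/smp_affinity.  The IRQ numbers are placeholders.
 */
#include <stdio.h>

static int set_irq_affinity(unsigned int irq, unsigned int cpu_mask)
{
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%x\n", cpu_mask);	/* e.g. 0x2 == CPU 1 only */
	return fclose(f);
}

int main(void)
{
	/* Hypothetical: spread four VIF IRQs across four dom0 vcpus. */
	unsigned int irqs[] = { 289, 290, 291, 292 };	/* placeholders */
	unsigned int i;

	for (i = 0; i < sizeof(irqs) / sizeof(irqs[0]); i++)
		if (set_irq_affinity(irqs[i], 1u << i))
			perror("smp_affinity");
	return 0;
}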
>>> - A smaller source of overhead is when the '_lock' is acquired
>>> within netback in netif_idx_release(). Shouldn't this lock be per
>>> struct xen_netbk instead of being global (declared as static within
>>> the function)? Is this a BUG?
>> Ian, what is your thought?
> I suspect the _lock could be moved into the netbk; I expect it was just
> missed in the switch to multi-threading because it was static in the
> function instead of a normal global var located with all the others.
Yes, it has to be moved into struct xen_netbk.
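Something along these lines, I think; this is only a sketch of the move (the
field name idx_release_lock is mine), not a tested patch:

/*
 * Sketch of the change.  Surrounding names are illustrative; keep whatever
 * spin_lock variant (irqsave etc.) the current netif_idx_release() uses.
 */
struct xen_netbk {
	/* ... existing fields ... */
	spinlock_t idx_release_lock;	/* replaces the function-local
					 * static DEFINE_SPINLOCK(_lock) */
};

/* in the per-netbk initialisation path: */
	spin_lock_init(&netbk->idx_release_lock);

/* and in netif_idx_release(struct xen_netbk *netbk, ...): */
	spin_lock(&netbk->idx_release_lock);
	/* ... existing body unchanged ... */
	spin_unlock(&netbk->idx_release_lock);

That way each xen_netbk serializes only its own index releases instead of all
of them contending on one global lock.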
Also, which git repo/branch should I be using if I would like to experiment
with the latest dom0 networking?