Re: [Xen-API] Flood test with xen/openvswitch
On Wed, Sep 07, 2011 at 07:04:27PM +0400, George Shuklin wrote:
> I think this can be a real issue.
>
> We have a bunch of highly loaded hosts with multiple CPUs in dom0 (used to
> reduce network latency in load peaks).
>
With XCP or with Debian+xapi?
> So I'm thinking the issue may not be with openvswitch, but with
> hardware/compatibility with Xen...
>
Upstream Linux 3.0 as dom0 might/will behave differently from the XCP/XenServer
XenLinux dom0 kernel.
-- Pasi
> On Wed, 07/09/2011 at 10:43 -0400, Andres E. Moya wrote:
> > Wouldn't this give us the crashing issue that has been occurring in Xen?
> >
> > Recently I had to run this command:
> >
> > echo "NR_DOMAIN0_VCPUS=1" > /etc/sysconfig/unplug-vcpus
> >
> > to stop Xen from crashing; it's been running for 24 hours now.
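> >
> > A quick way to double-check the result (assuming a standard Linux dom0, so
> > plain sysfs/procfs; not something specific to XCP) is:
> >
> > cat /sys/devices/system/cpu/online        # e.g. prints "0" if only cpu0 is up
> > grep -c ^processor /proc/cpuinfo          # number of CPUs dom0 currently sees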
> >
> > Moya Solutions, Inc.
> > amoya@xxxxxxxxxxxxxxxxx
> > 0 | 646-918-5238 x 102
> > F | 646-390-1806
> >
> > ----- Original Message -----
> > From: "George Shuklin" <george.shuklin@xxxxxxxxx>
> > To: xen-api@xxxxxxxxxxxxxxxxxxx
> > Sent: Wednesday, September 7, 2011 9:59:13 AM
> > Subject: Re: [Xen-API] Flood test with xen/openvswitch
> >
> > Temporary solution: add more active CPUs to dom0:
> > echo 1 >/sys/devices/system/cpu/cpu1/online
> > echo 1 >/sys/devices/system/cpu/cpu2/online
> > echo 1 >/sys/devices/system/cpu/cpu3/online
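> >
> > A more general sketch of the same idea (my generalisation, untested on XCP):
> > bring online every extra vCPU dom0 was booted with, instead of hard-coding
> > cpu1..cpu3:
> >
> > for c in /sys/devices/system/cpu/cpu[1-9]*/online; do
> >     [ -e "$c" ] || continue   # skip if the glob matched nothing
> >     echo 1 > "$c"             # bring each additional dom0 vCPU online
> > done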
> >
> > On Wed, 07/09/2011 at 13:29 +0200, Sébastien Riccio wrote:
> > > Hi,
> > >
> > > I just did a test to see how openvswitch handles a flood from a virtual
> > > machine on a Xen host using it as the networking layer.
> > >
> > > I just issued:
> > >
> > > vm1# hping3 -S -L 0 -p 80 -i u100 192.168.1.1
> > >
> > > The options I used are:
> > >   -S       set the SYN TCP flag
> > >   -L 0     set the TCP ack field to 0
> > >   -p 80    destination port
> > >   -i u100  interval between packets, in microseconds (here 100)
> > >
> > > This results in CPU usage of up to 97% by the ovs-vswitchd process in dom0.
> > > Letting it run for a few minutes makes the whole Xen host unresponsive to
> > > network access; it must then be accessed from the local console.
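> > >
> > > To watch this from dom0 while the flood runs, something like the following
> > > should do (assuming the usual procps tools are available in dom0):
> > >
> > > top -b -d 1 -p "$(pidof ovs-vswitchd)"    # per-second CPU usage of ovs-vswitchd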
> > >
> > > Is that expected behaviour? I know the test is quite aggressive, but any
> > > customer could issue such a flood and render the whole host unreachable.
> > > Are there workarounds?
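> > >
> > > One thing I could imagine (untested, and "vif1.0" below is only a placeholder
> > > for the guest's actual VIF) is rate-limiting the offending interface with OVS
> > > ingress policing, though I don't know whether it really helps against this
> > > kind of SYN flood:
> > >
> > > ovs-vsctl set interface vif1.0 ingress_policing_rate=10000 ingress_policing_burst=1000   # rate in kbit/s, burst in kbit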
> > >
> > > Thanks for your help.
> > >
> > > Best regards,
> > > Sébastien
> > >
> > > Part of the openvswitch log while the flood was running:
> > >
> > > Sep 07 13:13:26|01523|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:26|01524|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:27|01525|poll_loop|WARN|Dropped 5136 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:27|01526|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:27|01527|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:28|01528|poll_loop|WARN|Dropped 5815 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:28|01529|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:28|01530|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (85% CPU usage)
> > > Sep 07 13:13:29|01531|poll_loop|WARN|Dropped 8214 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:29|01532|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:29|01533|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:30|01534|poll_loop|WARN|Dropped 5068 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:30|01535|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:30|01536|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:31|01537|poll_loop|WARN|Dropped 5008 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:31|01538|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:31|01539|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (82% CPU usage)
> > > Sep 07 13:13:32|01540|poll_loop|WARN|Dropped 4841 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:32|01541|poll_loop|WARN|wakeup due to 40-ms timeout at ../ofproto/ofproto-dpif.c:622 (83% CPU usage)
> > > Sep 07 13:13:32|01542|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (83% CPU usage)
> > > Sep 07 13:13:33|01543|poll_loop|WARN|Dropped 92 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:33|01544|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (83% CPU usage)
> > > Sep 07 13:13:33|01545|poll_loop|WARN|wakeup due to [POLLIN] on fd 21 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (83% CPU usage)
> > > Sep 07 13:13:34|01546|poll_loop|WARN|Dropped 27 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
> > > Sep 07 13:13:34|01547|poll_loop|WARN|wakeup due to 53-ms timeout at ../lib/mac-learning.c:294 (83% CPU usage)
> > > Sep 07 13:13:34|01548|poll_loop|WARN|wakeup due to [POLLIN] on fd 18 (NETLINK_GENERIC<->NETLINK_GENERIC) at ../lib/netlink-socket.c:668 (83% CPU usage)
> > >
_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xen-api