In my experience, small packet performance (header congestion) is a critical issue on all networking gear. An example is an Ethernet switch that needs to apply and strip 802.1q VLAN and CoS tags, or to remark DSCP. Small-business-class switches don’t support monitoring of the switching CPU, making it nearly impossible to gauge whether your gear is suffering from this. I am a dealer for a remote performance management suite of testing tools geared towards monitoring the performance of hybrid physical and virtual networks; it can detect (with a high degree of certainty) whether your network has gear that is susceptible to small packet header congestion.
"Small
packet congestion detected"
Summary
Congestion caused by
densely arriving packet headers has been detected.
Recommended action
·
Identify
devices such as switches, gateways, etc. associated with the Layer 3 hop where
the loss first appears.
·
Assess
the impact of the problem, i.e. determine whether you expect to have dense
small packet bursts or streams across that segment.
·
If
possible, perform intrusive flooding tests across the segment to isolate the
device or software responsible.
·
Upgrade
hardware or software of limiting device and/or turn off the software feature
that is responsible.
Detailed explanation
This diagnostic involves a specific form of small packet loss that is attributed to some devices having difficulties handling densely arriving packet headers. Unlike regular congestion, which is sensitive to the amount of data rather than the number of headers, this "header congestion" condition affects applications that rely on small packets, such as real-time voice and video, but only when there are many densely aggregated streams. A single voice stream is unlikely to generate this condition. The NIC, or some other device in the path, is unable to process headers at sufficiently high rates, and packet loss/corruption is the consequence.

Small packet congestion is distinct from regular congestion, which is attributed more to large packets filling queues/buffers at store-and-forward devices (e.g. routers) or receiving NICs.
Possible secondary messages
· "Limiting network processor or other small packet sensitive constriction detected"
· "May impact real-time traffic such as voice"
Effectively, Xen makes ‘virtual switches’ to connect the VMs. It’s quite likely that performance will suffer versus bare metal, since network traffic needs to traverse several layers of virtual bridging to reach the VM and to get back out.
I don’t know whether this might be fixable by increasing Dom0’s CPU access or by giving higher priority to the network processes (I’m not sure what they are named). Something along the lines of the sketch below might be a starting point.
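To be clear, this is only a rough sketch, assuming a Linux dom0 with the credit scheduler and the xl toolstack (older installs would use the equivalent xm commands); the weight and pinning values are illustrative rather than tuned recommendations.

#!/usr/bin/env python
# Hedged sketch: show the bridge layout a guest's traffic traverses, raise
# Domain-0's credit-scheduler weight, and pin its first vCPU, so backend
# (netback/bridging) work is less likely to be starved of CPU.
import subprocess

def run(cmd):
    print("running: " + " ".join(cmd))
    subprocess.check_call(cmd)

# Show which bridges exist in dom0 and which interfaces/vifs are attached.
run(["brctl", "show"])

# Double dom0's scheduler weight (the credit scheduler default is 256).
run(["xl", "sched-credit", "-d", "Domain-0", "-w", "512"])

# Pin dom0's vCPU 0 to physical CPU 0 so it is not migrated between cores.
run(["xl", "vcpu-pin", "Domain-0", "0", "0"])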
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Carl Byström
Sent: Monday, May 23, 2011 12:31 PM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Bad TCP accept performance
I've been running some simple tests trying to find out why the TCP accept() rate has been so low on my Xen guest.
The rate at which I can accept new TCP connections is about five times better on a bare-metal machine compared to my guest.
I've been using netperf with the TCP_CRR test to simulate this behavior.
After a suggestion from a user there (on Server Fault), I decided to try this list.
Judging from the number of views the question received at Server Fault and its top-3 vote ranking at Hacker News, I presume this issue is something a lot of users care about.
One user at HN also reported that this apparently is a known issue due to small packet performance, affecting both Xen and KVM.
After collecting feedback from SF and HN users, my question is: what can you do to improve small packet performance in Xen?
Is this a fundamentally difficult problem to solve with Xen, or is there a "quick fix"?