WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-users

Re: [Xen-users] Slow network performance between HVM guests

To: "Petersson, Mats" <Mats.Petersson@xxxxxxx>, xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Slow network performance between HVM guests
From: Martin Goldstone <m.j.goldstone@xxxxxxxxxxxxxxx>
Date: Wed, 23 May 2007 17:17:06 +0100
Delivery-date: Wed, 23 May 2007 09:15:30 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <907625E08839C4409CE5768403633E0B02561D21@xxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <907625E08839C4409CE5768403633E0B02561D21@xxxxxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.0 (Windows/20070326)

Petersson, Mats wrote:
-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx 
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
Martin Goldstone
Sent: 23 May 2007 16:05
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Slow network performance between HVM guests

Hi all,

We're running Xen 3.1 on CentOS 5 here (x86_64, kernel 2.6.18), and
we're seeing some odd networking issues with our Windows HVM guests.
Perhaps someone here has some ideas they can offer.

Briefly, the specs of the host are: Dual Xeon 3 GHz (dual core + HT),
4 GB RAM, 73 GB HDD. Network is e1000.

The Windows HVM guests are Windows Server 2003 R2 Standard x86-64,
with 2 VCPUs, 1 GB RAM, a 16 GB HDD (in a file), and an rtl8139
network device.

Basically, file transfer speeds between guests (both on the same
bridge) on the same host are approximately 10% (if not less) of the
speed of file transfers from an HVM guest to another system
(virtualised or not) away from that host. Any ideas, or is this
normal behaviour?

Most likely it's because the network access is going through QEMU-DM,
which means that Dom0 has to "emulate" the network device. With BOTH
devices being emulated on the same Dom0, you get latency added at both
ends, and there's less opportunity for any "overlap" between them.
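For reference, the emulated NIC is selected per-vif in the HVM guest's domain config, which qemu-dm reads. A minimal sketch of the relevant line, assuming a guest config under /etc/xen and a bridge named xenbr0 (the MAC and names here are placeholders, not taken from this thread):

```
# Fragment of an HVM guest config (e.g. /etc/xen/winguest) -- illustrative only.
# 'type=ioemu' routes the vif through qemu-dm; 'model=' picks the emulated NIC.
vif = [ 'type=ioemu, mac=00:16:3e:00:00:01, bridge=xenbr0, model=rtl8139' ]
```

Note that changing model= (e.g. to e1000, where the qemu-dm build supports it) only changes which NIC is emulated; the traffic still passes through qemu-dm in Dom0 either way.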

Do you by any chance also restrict the Dom0 to a single core? 

  
Although we had Dom0 set to take all the CPUs (basically, the default), I've just checked and it was only using 2 VCPUs (i.e. one core); I guess it got knocked down to that by my starting up other guests, and it didn't take the CPUs back when they were shut down. I've now set it to have 4 VCPUs (i.e. a whole CPU) and pinned it to the first CPU (cores 0-3).
Also, are the two guest domains running on the same or different cores?
If both domains use the same core, they would obviously "stop each
other" from running. 


  
I had forgotten to pin the VCPUs (2 in each guest), so it's possible they started blocking each other. I've pinned them now to separate cores, and I am seeing better performance (about double the speed I was getting before, perhaps even faster than that).
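For the archives, the pinning described above can be done with the xm tool shipped with Xen 3.x; the domain names and core numbers below are illustrative, not the actual domains from this thread:

```
# Give Dom0 four VCPUs and pin them all to the first physical CPU (cores 0-3).
xm vcpu-set Domain-0 4
xm vcpu-pin Domain-0 all 0-3

# Pin each guest's two VCPUs to separate cores so they can't contend.
xm vcpu-pin win2k3-a 0 4
xm vcpu-pin win2k3-a 1 5
xm vcpu-pin win2k3-b 0 6
xm vcpu-pin win2k3-b 1 7

# Check the resulting placement.
xm vcpu-list
```

Pinning done this way doesn't survive a reboot; a cpus= line in the guest's domain config file makes it permanent.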


  
This is affecting us with Xen 3.0.3 and 3.0.4.1 as well, so it's not
just a 3.1 thing. I've disabled iptables on dom0 to see if that makes
a difference (it doesn't). We've tried the 32-bit version of Windows,
and we've reduced the number of VCPUs to 1 and increased it to 4, all
without success. We originally thought it might have something to do
with another issue we were experiencing (ping times reported in
Windows were very strange, showing latency of several thousand ms,
and showing negative latency times as well), but that issue
disappeared after setting the number of VCPUs to 1. (Apparently there
is a bug in the Windows Multiprocessor ACPI HAL; incidentally, does
anyone know if this bug affects anything other than the displayed
latency times?)

The negative latency is probably the one reported in the internal Intel
bug-tracker here:
http://losvmm-bridge.sh.intel.com/bugzilla/show_bug.cgi?id=991
But it's hard to be sure, since it's not accessible from outside Intel
(or at least not from the AMD network; I doubt it's really a "hide this
from AMD" attempt, more likely the link is just an internal Intel site).

My guess would be that the time is measured using timestamp counting,
and it fails because it's taking the TSC from two different processor
cores at different times, which leads to varying results. But that's
speculation, and not based on any real understanding of why this is. 

  
Hmm, that Intel site does appear to be internal only, oh well. I guess I'll just have to keep an eye on it to see whether it's affecting anything else (apart from SiSoftware Sandra, which is registering about 20% packet loss in its performance index tests).

Thanks very much for your help

Mart
--
Mats
  
We haven't had any success tracking this down so far.  Any ideas?

Thanks in advance for any help,

Martin



    



_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
  

Attachment: m.j.goldstone.vcf
Description: Vcard
