xen-devel

[Xen-devel] compute performance problem

To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] compute performance problem
From: David Becker <becker@xxxxxxxxxxx>
Date: Sat, 23 Apr 2005 10:52:07 -0400
One of my users discovered large deviations in execution time for his
MPI jobs on xenUs.  I can reproduce the problem by running his job on
a single VM.  On a native Linux box the job completes in 64 secs, give
or take a second or so.  On a xenU, it completes in anywhere between 64
and 250 secs.  This is true on 2.0.5 (2.6.10-xenU) and 2.0-testing
(2.6.11-xenU).  I tried xen-unstable, but every task seemed to take
4 times as long as on 2.0, so I guess it's still too unstable.
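
A minimal sketch of how the run-to-run spread could be collected the
same way on native Linux and inside a xenU, assuming the job can be
started from a single command line.  The helper name time_runs and the
stand-in command JOB_CMD are hypothetical; the real mpirun invocation
is not given in this message and would need to be substituted.

```python
import subprocess
import sys
import time


def time_runs(cmd, repeats):
    """Run cmd `repeats` times; return the wall-clock seconds of each run."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the job exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


# Hypothetical stand-in for the real job; replace with the actual command.
JOB_CMD = [sys.executable, "-c", "sum(range(10**6))"]

if __name__ == "__main__":
    for elapsed in time_runs(JOB_CMD, repeats=5):
        print(f"Run Time  = {elapsed:10.3f}")
```

Running the identical script in both environments keeps the measurement
method itself from contributing to any difference between them.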

Any suggestions I can try?


Software is Debian sarge with lam4-7.1.1 on xen-2.0-testing (Apr 22).
Stracing mpirun and lamd shows no system calls being made during the
computation phase, and that phase is where the extra time goes.
Starting and stopping do not cause the delay.  Xen is running the default
BVT scheduler at default settings.  Raising the priority of the xenU made
no difference.  The only domains on the box are an idle xen0 and the xenU
running the app.  /lib/tls is moved to tls.disabled on both domains, and
on native Linux.

Hardware is a Dell PowerEdge 1650 (dual CPU sockets but only one CPU
installed, 2GB mem).  The app itself uses 375MB of mem.  The xenU kernel
was configured for HIGHMEM4GB, but the domain was created with 640MB.
No swap space is consumed on the system.  I saw similar compute time
variation running this job on a dual IBM x335.

Raw results (seconds) for 2.0-testing, 2.6.11-xenU Linux:
Run Time  =    104.590
Run Time  =    247.370
Run Time  =     89.050
Run Time  =     64.090
Run Time  =     63.430
Run Time  =     80.360
Run Time  =     64.410
Run Time  =    131.070
Run Time  =    236.850
Run Time  =     75.470
Run Time  =    134.570
Run Time  =     65.350
Run Time  =     65.480
Run Time  =     64.970
Run Time  =    202.650


Raw results (seconds) for native 2.6.10 Linux:
Run Time  =     64.120
Run Time  =     63.170
Run Time  =     63.540
Run Time  =     64.670
Run Time  =     64.990
Run Time  =     64.070
Run Time  =     64.930
Run Time  =     64.640
Run Time  =     64.030
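
To put numbers on the spread, the two sets of raw results above can be
summarized as follows.  This is only a sketch: the run times are copied
from this message, while the names XENU_TIMES, NATIVE_TIMES and
summarize are chosen here for illustration.

```python
import statistics

# Run times in seconds, copied from the raw results in this message.
XENU_TIMES = [104.590, 247.370, 89.050, 64.090, 63.430, 80.360, 64.410,
              131.070, 236.850, 75.470, 134.570, 65.350, 65.480, 64.970,
              202.650]
NATIVE_TIMES = [64.120, 63.170, 63.540, 64.670, 64.990, 64.070, 64.930,
                64.640, 64.030]


def summarize(times):
    """Return count, min, max, mean and sample standard deviation."""
    return {
        "n": len(times),
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),
    }


if __name__ == "__main__":
    for label, times in (("xenU", XENU_TIMES), ("native", NATIVE_TIMES)):
        s = summarize(times)
        print(f"{label:>6}: n={s['n']:2d}  min={s['min']:7.2f}  "
              f"max={s['max']:7.2f}  mean={s['mean']:7.2f}  "
              f"stdev={s['stdev']:6.2f}")
```

The fastest xenU runs match native (about 63 to 65 secs), so the
difference shows up in the mean and the spread rather than in the
best case: roughly 112.6 secs mean on xenU against 64.2 secs native.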




