> I had initially tried BVT; however, at that time it would
> crash the system if I tried to change any of the settings. I
> do not know if this is still an issue.
Any scheduling period more than 50ms really shouldn't result in any
significant context switch overhead.
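As a rough back-of-the-envelope check of that claim, here is a minimal sketch. It assumes 8 domains sharing one CPU, each run once per period (as in the scheduling test quoted further down), and a purely hypothetical cost of 10 us per context switch. That cost is an assumption for illustration, not a measurement of Xen, and the function name is made up for the example.

```python
# Rough estimate of the CPU time lost to context switches under a
# periodic scheduler. All inputs are illustrative assumptions.

def switch_overhead(period_s, n_domains=8, switch_cost_s=10e-6):
    """Fraction of each period spent switching, assuming every domain
    runs once per period and each switch costs switch_cost_s seconds."""
    return n_domains * switch_cost_s / period_s

for period_ms in (1, 10, 50, 100, 1000):
    frac = switch_overhead(period_ms / 1000.0)
    print(f"period {period_ms:>5} ms -> direct switch overhead {frac:.3%}")
```

Under these assumptions the direct switching cost at a 50 ms period is a small fraction of a percent. This only counts the switches themselves, so it says nothing about indirect effects.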
I suspect what's going on in your tests is that dom0 (that is doing all
the IO) is being CPU starved. Putting it on a separate hyperthread
will certainly help confirm this diagnosis. [hyperthreading is very
helpful to Xen, and by default it dedicates the first hyperthread of the
first CPU to dom0]
> Also, if you have time, could you elaborate on my WebBench results?
It would be useful if you could explain a bit more about your webbench
setup, e.g. are you testing the clients rather than the web server?
> -----Original Message-----
> From: Ian Pratt [mailto:m+Ian.Pratt@xxxxxxxxxxxx]
> Sent: Wednesday, August 03, 2005 6:48 PM
> To: Wolinsky, David; xen-devel@xxxxxxxxxxxxxxxxxxx
> Cc: ian.pratt@xxxxxxxxxxxx
> Subject: RE: [Xen-devel] Benchmarking Xen (results and questions)
> Which xen version is this? I'm guessing unstable.
> Is this with sedf or bvt? I'm guessing sedf since you're playing around
> with periods.
> It would be interesting to retry a couple of datapoints with sched=bvt
> on the xen command line.
> Also, I'd definitely recommend enabling HyperThreading and dedicating
> one of the logical CPUs to dom0.
> Also, are you sure the drop-off in performance isn't just caused by
> the reduced memory size when you have more VMs? It's probably better
> to do such experiments with the same memory size throughout.
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> > David_Wolinsky@xxxxxxxx
> > Sent: 04 August 2005 00:21
> > To: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: [Xen-devel] Benchmarking Xen (results and questions)
> > Hi all,
> > Here are some benchmarks that I've done using Xen.
> > However, before I get started, let me explain some of configuration
> > details...
> > Xen Version SPECjbb
> > WebBench
> > Linux Distribution Debian 3.1
> > HT disabled
> > Linux Kernel 220.127.116.11
> > Host Patch CK3s
> > Here are the initial benchmarks
> >         SPECJBB     WebBench
> >         1 Thread    1 Client   2 Clients  4 Clients  8 Clients
> >         BOPS        TPS        TPS        TPS        TPS
> > Host    32403.5     213.45     416.86     814.62     1523.78
> > 1 VM    32057       205.4      380.91     569.24     733.8
> > 2 VM    24909.25    NA         399.29     695.1      896.04
> > 4 VM    17815.75    NA         NA         742.78     950.63
> > 8 VM    10216.25    NA         NA         NA         1002.81
> > (and some more notes.... BOPS - business operations per second, TPS -
> > transactions per second...
> > SPECjbb tests CPU and Memory
> > WebBench (the way we configured it) tests Network I/O and Disk I/O
> > Values = AVG * VM count
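Reading the "Values = AVG * VM count" note as meaning that each table entry is the per-VM average multiplied by the number of VMs (an interpretation, not something confirmed in the thread), the per-VM SPECjbb figures can be recovered with a short sketch like this. The names are made up for the example; the numbers are copied from the table above.

```python
# SPECjbb aggregate BOPS copied from the benchmark table in the message.
HOST_BOPS = 32403.5
AGGREGATE_BOPS = {1: 32057.0, 2: 24909.25, 4: 17815.75, 8: 10216.25}

def per_vm_average(aggregate, vm_count):
    """Undo 'Values = AVG * VM count' to get the average per VM."""
    return aggregate / vm_count

for vms, total in AGGREGATE_BOPS.items():
    avg = per_vm_average(total, vms)
    share = total / HOST_BOPS
    print(f"{vms} VM: per-VM avg {avg:.2f} BOPS, aggregate {share:.1%} of host")
```

Expressing each aggregate as a share of the host figure makes the decline across VM counts easier to compare.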
> > Domain configurations
> > 1 VM - 1660 MB - SPECJBB 1500MB
> > 2 VM - 1280 MB - SPECJBB - 1024MB
> > 4 VM - 640 MB - SPECJBB - 512 MB
> > 8 VM - 320 MB - SPECJBB - 256 MB
> > Seeing how the SPECjbb numbers declined so bizarrely, I did some
> > scheduling tests and found this out...
> > Test1: Examine Xen's scheduling to determine if context
> > switching is causing the overhead
> >                 Period    Slice      BOPs
> > Modified 8 VM   1 ms      125 us     6858
> >          8 VM   10 ms     1.25 ms    14287
> >          8 VM   100 ms    12.5 ms    18912
> >          8 VM   1 Sec     .125 Sec   20695
> >          8 VM   2 Sec     .25 Sec    21072
> >          8 VM   10 Sec    1.25 Sec   21797
> >          8 VM   100 Sec   12.5 Sec   11402
> > I later learned that there was a period limit of 4 seconds, thus
> > invalidating 10 and 100 seconds. However, this graph suggests that
> > Xen needs some load and scheduling balancing done.
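For reference, the rows of the scheduling test can be filtered against the 4 second period limit with a small sketch like the one below. In every row of the table the slice equals the period divided by 8, so the slice is recomputed here instead of being stored. The names are hypothetical, and the BOPS values are copied from the table as posted.

```python
MAX_PERIOD_S = 4.0  # period limit mentioned in the message
N_VMS = 8           # each slice in the table is period / 8

# (period in seconds, measured BOPS) from the scheduling test table
RESULTS = [(0.001, 6858), (0.01, 14287), (0.1, 18912), (1.0, 20695),
           (2.0, 21072), (10.0, 21797), (100.0, 11402)]

def valid_results(results, max_period=MAX_PERIOD_S):
    """Keep only rows whose period is within the scheduler's limit."""
    return [(p, b) for p, b in results if p <= max_period]

for period, bops in valid_results(RESULTS):
    slice_s = period / N_VMS
    print(f"period {period:g} s, slice {slice_s:g} s -> {bops} BOPS")
```

With the 10 and 100 second rows dropped, the remaining data points rise steadily with the period.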
> > I also did a memory test to determine if that could be the issue... I
> > made a custom stream to run for a 2 minute period...
> > and got these numbers
> >                 Copy      Scale     Add       Triad
> > Host            3266.4    3215.47   3012.28   3021.79
> > Modified 1 VM   3262.34   3220.34   3016.13   3025.28
> > So we can see memory is not the issue...
> > Now onto WebBench - After comparing the WebBench to the SPECjbb
> > results, we get something interesting... NUMBERS increase as we
> > increase the virtual machine count... So I would really like some idea
> > on why this is. My understanding is this... When using the shared
> > memory network drivers, there must be a local buffer, and when the
> > buffer fills up, it puts the remaining into a global buffer, and when
> > that fills up it puts it into a disk buffer? (These are all
> > assumptions please correct me...) If that is the case is there an
> > easy way to increase the local buffer to attempt to get better
> > numbers? I also am looking into doing some tests that deal with
> > multiple small transactions and 1 large transaction... I ran these
> > all against a physical and image backed disk.
> > Please any suggestions.
> > (Note... I was running this on a 1 gigabit switch with only
> > running)...
> > If there are any questions, I would be glad to respond.
> > Thanks,
> > David