On Tuesday May 06 2008 03:21:29 am Pasi Kärkkäinen wrote:
> OK. I think measuring pv domU is worth trying too :)
Ok, let's try a few things. Repeating my original 0.8.9 numbers, with the new
processor:
pattern 4k, 50% read, 0% random
dynamo on?   |  io/s  | MB/s | Avg. i/o time(ms) | max i/o time(ms) | %CPU
domu w/gplpv |  501.7 | 1.96 |              2.90 |                0 | 31.68
domu w/qemu  |  187.5 | 0.73 |              5.87 |                0 | 29.89
dom0 w/4Gb   | 1102.3 | 4.31 |              0.91 |            445.5 |     0
dom0 w/4Gb   | 1125.8 | 4.40 |              0.89 |            332.1 |     0
(2nd set of dom0 numbers is from when booted w/o gplpv)

pattern 32k, 50% read, 0% random
domu w/gplpv |  238.3 | 7.45 |              4.09 |                0 | 22.48
domu w/qemu  |  157.4 | 4.92 |              6.35 |                0 | 20.51
dom0 w/4Gb   |   52.5 | 1.64 |             19.05 |           1590.0 |     0
dom0 w/4Gb   |   87.8 | 2.74 |             11.39 |           1286.4 |     0
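Aside: the patterns above are Iometer access specs, with dynamo as Iometer's load generator. For anyone wanting to approximate the same load on a Linux guest without Iometer, something like the following fio invocation should come close -- the filename, file size and runtime here are placeholders, not anything from my setup:

  # 4k pattern: 4KB blocks, 50% read / 50% write, sequential (0% random)
  fio --name=4k-mix --filename=/iobw.tst --size=4g --bs=4k \
      --rw=rw --rwmixread=50 --ioengine=sync --direct=1 \
      --runtime=60 --time_based
  # 32k pattern: same command with --bs=32k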
Now, those numbers were with all workers running on the domu and dom0 simultaneously. Let's try one worker at a time. First the hvm w/gplpv, running the 4k pattern and then the 32k pattern, with dom0 running the 'idle' task:
4k pattern | 1026.6 | 4.01 | 39.37 | 0 | 49.70
32k pattern | 311.1 | 9.72 | 45.33 | 0 | 26.21
Now test dom0, with the hvm running the 'idle' task:
4k pattern | 1376.7 | 5.38 | 0.73 | 365.7 | 0
32k pattern | 165.9 | 5.19 | 6.02 | 226.6 | 0
As expected, all numbers are significantly faster than in the simultaneous runs. Compare this with 'dd', which creates the 4GB /iobw.tst file on dom0 at about 22MB/s.
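For reference, creating that test file with dd is just a big sequential write, along the lines of the following (block size and count here are illustrative, not necessarily the exact invocation I used):

  dd if=/dev/zero of=/iobw.tst bs=1M count=4096   # ~4GB test file on dom0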
Now, to test a fedora pv domU: since space is tight on my fedora xen server, I just 'xm block-attach'-ed dom0's /iobw.tst to the domu as a new block device, then in the domu ran mkfs, mounted it, and created a new /iobw.tst on that partition (commands sketched below). Results:
4k pattern | 1160.5 | 4.53 | 0.86 | 247.1 | 0
32k pattern | 284.1 | 8.88 | 3.52 | 326.4 | 0
The numbers are very similar to the hvm's, including the 32k pattern being faster than dom0, which you pointed out is due to caching. This compares with 'dd' creating the 3.7GB iobw.tst on the newly mounted partition at about 18MB/s.
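For completeness, the attach-and-format sequence was roughly the following; 'fedora' and xvdb are stand-ins for the actual domU name and device node:

  # on dom0: hand dom0's /iobw.tst to the pv domU as a writable block device
  xm block-attach fedora file:/iobw.tst xvdb w
  # in the domU: make a filesystem, mount it, and create the test file on it
  mkfs.ext3 /dev/xvdb
  mount /dev/xvdb /mnt
  dd if=/dev/zero of=/mnt/iobw.tst bs=1M count=3700   # ~3.7GB; sizes illustrative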
> Configure dom0 for 1 vcpu and domU for 1 vcpu and pin the domains to have a
> dedicated core. This way you're not sharing any pcpu's between the domains.
> I think this is the "recommended" setup from xen developers for getting
> maximum performance.
>
> I think the performance will be worse when you have more vcpus in use than
> your actual pcpu count..
Now I rebooted dom0, after editing xend-config.sxp to include '(dom0-cpus 1)',
and then did the following pins:
[576] > xm create winxp
Using config file "/etc/xen/winxp".
Started domain winxp
root@Insp6400 05/06/08 10:32PM:~
[577] > xm vcpu-pin 0 all 0
root@Insp6400 05/06/08 10:32PM:~
[578] > xm vcpu-pin winxp all 1
root@Insp6400 05/06/08 10:32PM:~
[579] > xm vcpu-list
Name                              ID VCPU CPU State   Time(s) CPU Affinity
Domain-0                           0    0   0   r--     228.7  0
Domain-0                           0    1   -   --p      16.0  0
winxp                              5    0   1   r--      36.4  1
Note that I also had to set vcpus=1 in the domU config, because with two vcpus I was again getting that extremely sluggish response in my hvm.
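For reference, the equivalent in /etc/xen/winxp looks something like the lines below; I set vcpus there, and the cpus line is the config-file way of expressing the same pin as 'xm vcpu-pin winxp all 1' above, for anyone who would rather not re-pin by hand after every boot:

  vcpus = 1       # one virtual cpu, as noted above
  cpus  = "1"     # keep that vcpu on pcpu 1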
Going back to simultaneous execution of all workers, to compare against the
numbers at the top of this post, I got:
pattern 4k, 50% read, 0% random
dynamo on?   |  io/s  | MB/s | Avg. i/o time(ms) | max i/o time(ms) | %CPU
domu w/gplpv |  286.4 | 1.12 |              3.49 |            564.9 | 36.97
dom0 w/4Gb   | 1173.9 | 4.59 |              0.85 |            507.3 |     0

pattern 32k, 50% read, 0% random
domu w/gplpv |  217.9 | 6.81 |              4.57 |           1633.5 | 22.93
dom0 w/4Gb   |   63.3 | 1.97 |             15.85 |           1266.5 |     0
which is somewhat slower across the board. Recommendations of the xen developers aside, my experience is that letting xen schedule any vcpu on any pcpu is the most efficient arrangement.
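For anyone following along at home, undoing the pins without a reboot should just be a matter of (0-1 being both cores on this box):

  xm vcpu-pin Domain-0 all 0-1
  xm vcpu-pin winxp all 0-1

plus reverting the dom0-cpus line in xend-config.sxp and giving the domU back its second vcpu.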