|
|
|
|
|
|
|
|
|
|
xen-devel
Re: [Xen-devel] weird hvm performance issue.
On Mon, 29 Jan 2007, Keir Fraser wrote:
On 29/1/07 18:08, "Luke S. Crawford" <lsc@xxxxxxxxx> wrote:
The CPU usage of qemu-dm correlates with the speed of the scroll; exiting
from
the xm console does not appear to effect the cpu usage of qemu-dm- it
will drop to 1% after several minutes.
any ideas what is going on here? any ideas how I could debug this
further?
This kind of thing can be a pain to debug. Perhaps instrument qemu-dm and
find out interesting things like how often its select() call returns due to
an event on a file descriptor rather than due to hitting the timeout?
What tools should I use for that? I don't see how to make oprofile give
me the return values of select- in fact oprofile seems to be mostly
tailored to profiling CPU usage, and CPU usage does not appear to be the
problem here. My oprofile results (both opannotate and opreport) seem to
be largely the same in both fast and slow cases; the main difference
being that the samples accumulate much faster in 'fast' mode than in
'slow' mode.
Should I be ignoring these tools and adding logging to the code itself?
Here is some of what I got using strace:
fast:
[root@1950-2 xen-3.0.4-testing.hg]# strace -p 4013 -c
Process 4013 attached - interrupt to quit
Process 4013 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
66.58 0.018072 1 27020 write
14.73 0.003997 0 74281 clock_gettime
8.81 0.002391 0 26481 select
3.77 0.001024 0 24815 read
2.70 0.000733 0 7869 munlock
2.49 0.000677 0 32441 ioctl
0.91 0.000248 0 7869 mlock
0.00 0.000000 0 496 lseek
------ ----------- ----------- --------- --------- ----------------
100.00 0.027142 201272 total
slow:
[root@1950-2 xen-3.0.4-testing.hg]# strace -p 4013 -c
Process 4013 attached - interrupt to quit
Process 4013 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
29.89 0.011048 0 54389 read
27.61 0.010202 0 57631 write
15.52 0.005734 0 158975 clock_gettime
11.93 0.004408 0 73270 ioctl
11.17 0.004127 0 56038 select
2.31 0.000853 0 20786 mlock
1.58 0.000585 0 20786 munlock
0.00 0.000000 0 784 lseek
0.00 0.000000 0 6 gettimeofday
------ ----------- ----------- --------- --------- ----------------
100.00 0.036957 442665 total
It's still not what you are asking for, but somewhat interesting, I hope.
I do notice that the 'fast' spends a whole lot more time in write.
(strace seems to be rather high-overhead; running strace, the 'fast' is
much slower than the normal 'slow', and I'm not seeing much consistancy
here, so perhaps strace is the wrong tool for the job?)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|
|
|
|
|