[Xen-devel] Yet another Xen performance monitoring tool

To: xen-devel@xxxxxxxxxxxxxxxxxxx, xen-tools@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Yet another Xen performance monitoring tool
From: Rob Gardner <rob.gardner@xxxxxx>
Date: Thu, 18 Aug 2005 18:51:57 -0600
Folks,

Here is yet another performance monitoring tool for Xen. Instead of using
hypervisor calls to get domain information, we use the xentrace facility to
provide fine-grained monitoring of various metrics (see README below -- a copy
is present in the tarball too). We are providing a tarball of the initial
release.

************* IMPORTANT ****************
Currently, xenmon does NOT support SMP.
****************************************

It has three components:

o patches: there are some minor patches against changeset 684d81933442,
mostly to add additional trace calls. All of the patches are fairly trivial,
and some of them are optional (for instance, we found that we needed to
increase the size allocated for the trace buffers).

o xenbaked: xenbaked reads the raw data available in the trace buffers and
"cooks" or "bakes" (that is, aggregates) it into a form that is more useful
for analysis. The README has more details on where we see xenbaked going in
the future.

o xenmon: this is a curses frontend for live monitoring. It reads information
from xenbaked and displays it in a convenient manner. We provide xenmon as
just an example of how the aggregated information can be used.

Note that while we provide patches against the latest Xen 3.0 unstable branch,
this is simply to make it easier for people to test. There is nothing in
xenmon that is exclusive to 3.0; it can easily be backported to work with the
2.x series.

Further instructions can be found in the README. The tarball also includes a
screenshot of the curses interface. We welcome any suggestions and feedback on
how to make this more useful.


Diwaker Gupta   <diwaker.gupta@xxxxxx>
Rob Gardner     <rob.gardner@xxxxxx>
Lucy Cherkasova <lucy.cherkasova.hp.com>

============ README ================
(this file is in the tar-ball too)

Xen Performance Monitor
-----------------------

The xenmon tools make use of the existing Xen tracing feature to provide
fine-grained reporting of various domain-related metrics. It should be
stressed that the xenmon.py script included here is just an example of the
data that may be displayed. The xenbaked daemon keeps a large amount of
history in a shared memory area that may be accessed by tools such as xenmon.

For each domain, xenmon reports various metrics. One part of the display is a
group of metrics that have been accumulated over the last second, while
another part of the display shows data measured over 10 seconds. Other
measurement intervals are possible, but we have just chosen 1s and 10s as an
example.
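As a rough illustration of the two measurement intervals, the sketch below
keeps one sample per second and sums the most recent 1 or 10 of them. The
class name History and its methods are hypothetical and are not taken from
xenbaked or xenmon.py; the actual history lives in xenbaked's shared memory
area and may be organised quite differently.

```python
from collections import deque


class History:
    """Hypothetical per-domain history holding one CPU-time sample per second."""

    def __init__(self, max_samples=10):
        # Older samples fall off automatically once max_samples is reached.
        self.samples = deque(maxlen=max_samples)

    def add(self, cpu_ns):
        """Record the CPU time (in nanoseconds) used during the last second."""
        self.samples.append(cpu_ns)

    def total(self, last_n):
        """Sum the most recent last_n samples (the 1s or 10s view)."""
        if last_n <= 0:
            return 0
        return sum(list(self.samples)[-last_n:])


history = History()
for second in range(1, 13):
    # Made-up data: 1 ms of CPU in second 1, 2 ms in second 2, and so on.
    history.add(second * 1_000_000)

print("last 1s :", history.total(1), "ns")
print("last 10s:", history.total(10), "ns")
```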

************* IMPORTANT ****************
Currently, xenmon does NOT support SMP.
****************************************

Execution Count
---------------
o The number of times that a domain was scheduled to run (i.e., dispatched)
over the measurement interval


CPU usage
---------
o Total time used over the measurement interval
o Usage expressed as a percentage of the measurement interval
o Average cpu time used during each execution of the domain
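The three CPU usage figures above follow from two raw quantities, the total
CPU time and the execution count. A minimal sketch, assuming times are kept in
nanoseconds; the function name cpu_usage_summary is hypothetical and the
numbers are made up:

```python
def cpu_usage_summary(total_ns, interval_ns, executions):
    """Return (total time, percent of interval, average time per execution)."""
    percent = 100.0 * total_ns / interval_ns
    # A domain that was never dispatched has no meaningful average.
    average_ns = total_ns / executions if executions else 0.0
    return total_ns, percent, average_ns


# A domain dispatched 40 times, using 250 ms of CPU in a 1 s interval.
total, percent, average = cpu_usage_summary(250_000_000, 1_000_000_000, 40)
print(f"{total} ns total, {percent:.1f}% of interval, {average:.0f} ns/execution")
```

The same calculation applies to waiting time, with the total time spent
runnable in place of the total CPU time.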


Waiting time
------------
This is how much time the domain spent waiting to run or, put another way,
the amount of time the domain spent in the "runnable" state (or on the run
queue) but not actually running. Xenmon displays:

o Total time waiting over the measurement interval
o Wait time expressed as a percentage of the measurement interval
o Average waiting time for each execution of the domain

Blocked time
------------
This is how much time the domain spent blocked (or sleeping). Put another
way, it is the amount of time the domain spent not needing or wanting the CPU
because it was waiting for some event (e.g., I/O). Xenmon reports:

o Total time blocked over the measurement interval
o Blocked time expressed as a percentage of the measurement interval
o Blocked time per I/O (see I/O count below)

Allocation time
---------------
This is how much CPU time was allocated to the domain by the scheduler. This
is distinct from CPU usage, since the "time slice" given to a domain is
frequently cut short for one reason or another, e.g., the domain requests I/O
and blocks. Xenmon reports:

o Average allocation time per execution (i.e., time slice)
o Min and Max allocation times
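The allocation figures can be derived from the list of time slices granted
during one measurement interval. A minimal sketch with made-up values; the
function name allocation_summary is hypothetical:

```python
def allocation_summary(slices_ns):
    """Return (average, min, max) of the time slices granted in one interval."""
    if not slices_ns:
        return 0.0, 0, 0
    return sum(slices_ns) / len(slices_ns), min(slices_ns), max(slices_ns)


# Four time slices granted by the scheduler, in nanoseconds.
allocated = [30_000_000, 30_000_000, 10_000_000, 50_000_000]
avg_ns, min_ns, max_ns = allocation_summary(allocated)
print(f"avg {avg_ns:.0f} ns, min {min_ns} ns, max {max_ns} ns")
```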

I/O Count
---------
This is a rough measure of I/O requested by the domain. The number of page
exchanges (or page "flips") between the domain and dom0 is counted. The
number of pages exchanged may not accurately reflect the number of bytes
transferred to/from a domain due to partial pages being used by the network
protocols, etc. But it does give a good sense of the magnitude of I/O being
requested by a domain. Xenmon reports:

o Total number of page exchanges during the measurement interval
o Average number of page exchanges per execution of the domain
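The I/O figures, together with the "blocked time per I/O" metric listed under
Blocked time, can be sketched as follows. The function name io_summary and the
4 KiB page size are assumptions made for illustration; the byte figure is only
an upper bound because, as noted above, pages may be partially used.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def io_summary(page_flips, executions, blocked_ns):
    """Return (flips per execution, blocked ns per flip, upper bound on bytes)."""
    per_execution = page_flips / executions if executions else 0.0
    blocked_per_io = blocked_ns / page_flips if page_flips else 0.0
    max_bytes = page_flips * PAGE_SIZE
    return per_execution, blocked_per_io, max_bytes


# Made-up interval: 200 page flips, 40 executions, 600 ms spent blocked.
per_execution, blocked_per_io, max_bytes = io_summary(200, 40, 600_000_000)
print(f"{per_execution:.1f} flips/execution, "
      f"{blocked_per_io:.0f} ns blocked per flip, "
      f"at most {max_bytes} bytes")
```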

Installation
------------
- The patches can be found in the patches directory. They are against
  changeset 684d8193344209f3bbce4b07977f9d51ec48f63e, but they are fairly
  trivial and so should be easy to apply manually should a patch fail
- the patches are in unified diff format, and should apply from the root of
  your xen source tree
- The tools are in the tools directory. Do a 'make' to build and a 'make install'
  to install them. By default, the executables go in /usr/local/sbin
- the tools directory does NOT need to be in your xen source tree. You can put
  it anywhere and it should compile as long as Xen is properly installed on
  your machine.

Usage Notes and issues
----------------------
- Start xenmon by simply running xenmon.py. The xenbaked daemon is started and
  stopped automatically by xenmon.
- IMPORTANT: xenbaked uses the processor frequency to convert cycle counts to
  timestamps. Therefore, you *MUST* specify the correct frequency for your
  machine; otherwise the reported times will be wrong (and utilization will
  not add up to 100%). The frequency can be specified by passing the
  --cpu_freq parameter to xenmon (see xenmon -h). The frequency should be
  specified in MHz.
- To see the various options for xenmon, run xenmon -h. Ditto for xenbaked
- xenmon also has an option (-n) to output log data to a file instead of the
  curses interface
- Lost trace records are indicated by a blue line at the bottom of the screen.
  When this appears, it means that the data being shown may not be reliable.
  If the number of lost records is excessive (or if the data just seems wrong
  or doesn't add up) then you may need to increase the memory allocated for
  trace buffers. This is done by changing opt_tbuf_size in
  xen/common/trace.c.
- NDOMAINS is defined to be 8, but can be changed by recompiling xenbaked
- Xenmon.py appears to create 1-2% cpu overhead. Part of this is just the
  overhead of the python interpreter. Part of it may be the number of trace
  records being generated. The number of trace records generated can be
  limited by setting the trace mask (with a dom0 Op), which controls which
  events cause a trace record to be emitted.
- if your terminal screen is not wide enough, xenmon will generate an error
- To exit xenmon, type 'q'
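
To see why the processor frequency matters, here is a minimal sketch of a
cycle-count-to-time conversion. The function name cycles_to_ns is
hypothetical and this is not xenbaked's actual code; the sketch also assumes
that the measurement interval itself comes from wall-clock time.

```python
def cycles_to_ns(cycles, cpu_mhz):
    """Convert a cycle count to nanoseconds for a CPU running at cpu_mhz MHz."""
    # At f MHz, one cycle lasts 1/f microseconds, i.e. 1000/f nanoseconds.
    return cycles * 1000.0 / cpu_mhz


INTERVAL_NS = 1_000_000_000      # 1 s measurement interval
busy_cycles = 2_400_000_000      # made-up count: one full second at 2400 MHz

for mhz in (2400, 2000):         # correct frequency, then a wrong one
    busy_ns = cycles_to_ns(busy_cycles, mhz)
    print(f"{mhz} MHz: {100.0 * busy_ns / INTERVAL_NS:.0f}% utilization")
```

With the correct frequency the domain shows as fully busy; with the wrong
one, the same cycle count is reported as more than 100% of the interval.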

Future Work
-----------
o RPC interface to allow external entities to programmatically access processed data
o I/O Count batching to reduce number of trace records generated
o Compute processor frequency automatically

Attachment: xenmon.tar
Description: Unix tar archive
