Hi,
Maybe some developers and other interested people here have comments on
the mail below, which I sent to xen-users without getting any feedback yet.
I have since got a bit further with my investigations and thoughts:
Since the service quality of domains can be, and in most circumstances
probably is best, monitored from the outside, I'm mainly thinking about
how an administrator can find out whether a Xen host has enough
resources to run more VMs (e.g. as a migration target), or whether VMs
need to be migrated away from that machine because some or all resources
are fully used, assuming the ideal goal is to use all resources at 100%,
no more, no less. I'm not sure if this is too theoretical?!
The newest XenMon/Shareguard paper seems to have interesting information
on giving service guarantees, but as far as I can see it also only works
in the direction of guaranteeing a share of the available resources,
from a management perspective internal to the machine, and not yet
from the perspective of external monitoring and decision-making.
Henning
---------- Forwarded message ----------
Subject: performance and resource monitoring and statistics
To: Xen users mailing list <xen-users@xxxxxxxxxxxxxxxxxxx>
Hi,
Apart from the usual monitoring of service availability and quality, and
the measuring of resources on a system as one would do for any normal
machine, I am thinking about additionally monitoring Xen-specific data
and creating one or more Nagios plugins for this.
One idea is that I want to know when CPU, network and disk I/O on a Xen
host are saturated, which could, depending on specific needs and
SLAs, make it necessary to add resources to the host, migrate VMs
to other hosts on which these resources aren't saturated yet, or
take other measures.
As far as I understand it, CPU scheduling and traffic shaping are
highly useful for setting rules that allocate a given share of the
available resources to specific VMs, and for setting minimum and maximum
amounts for these shares. In some cases, though, it might be desirable
to get more information and to be warned.
As a result, I have started to analyze (with a Nagios plugin)
different sources of Xen runtime data, beginning with the output of
xentop -b -i 2, and will go on to look deeper into libxenstat,
XenMon and xenoprof (of which I am not yet sure whether it is suitable
for analyzing production runtime data, or whether it is more the kind of
profiling one does in non-production environments).
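To make the xentop part more concrete, here is a minimal Python sketch of
how a plugin could read the batch output. It is only an illustration under
assumptions: the column layout in SAMPLE is what I would expect from
xentop -b, but it can differ between Xen versions, the numbers in SAMPLE
are made up, and parse_xentop/read_xentop are hypothetical helper names.
Only the last iteration is kept, on the assumption that the first sample
has no meaningful CPU(%) yet. Domain names containing spaces are not
handled.

```python
import subprocess

# Made-up sample of two xentop batch iterations (illustrative values only).
SAMPLE = """\
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR
  Domain-0 -----r        120    0.0     524288   25.0   no limit       n/a     2    4        0        0    0        0        0        0
       vm1 --b---         40    0.0     262144   12.5     262144      12.5     1    1      100      200    1        0       10       20
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR
  Domain-0 -----r        121   12.5     524288   25.0   no limit       n/a     2    4        0        0    0        0        0        0
       vm1 --b---         41   30.0     262144   12.5     262144      12.5     1    1      150      260    1        0       12       25
"""


def read_xentop():
    """Run xentop in batch mode for two iterations and return its text output."""
    result = subprocess.run(
        ["xentop", "-b", "-i", "2"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_xentop(text):
    """Return {domain: {column: value}} for the last iteration found in text.

    Values are kept as strings; the caller converts what it needs.
    """
    iterations = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("NAME"):
            # Every iteration starts with a fresh header line.
            header = line.split()
            iterations.append({})
            continue
        if header is None:
            continue
        # "no limit" would otherwise split into two whitespace fields.
        fields = line.replace("no limit", "no_limit").split()
        if len(fields) < len(header):
            continue  # does not look like a domain row
        iterations[-1][fields[0]] = dict(zip(header[1:], fields[1:]))
    return iterations[-1] if iterations else {}
```

On a real host one would call parse_xentop(read_xentop()); with the sample
above, parse_xentop(SAMPLE)["vm1"]["CPU(%)"] gives the string "30.0".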
Getting the CPU share and seeing when the CPU is fully loaded is no big
deal. Getting useful information on net and disk I/O saturation requires
a lot of math and measuring (what is the maximum possible net/disk I/O on
that machine under the given configuration?). Both depend on the
overall hardware, CPU scheduling and a lot of other factors, and I am
really not sure if this is worth the trouble.
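The difference between the two cases can be sketched roughly like this in
Python. Everything here is an assumption for illustration: the function
names and the 80/95 thresholds are hypothetical, the per-domain CPU
figures are assumed to be percentages where 100 means one physical CPU
fully used, the net/disk figures are assumed to be cumulative counters,
and measured_max is a value one would have to benchmark on the machine
first, which is exactly the hard part.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def cpu_used_percent(domain_cpu_percent, physical_cpus):
    """Share of the host's total CPU capacity in use, as a percentage.

    domain_cpu_percent: per-domain CPU percentages (100 = one full CPU).
    """
    return sum(domain_cpu_percent) / physical_cpus


def counter_rate(prev, curr, seconds):
    """Average units per second between two cumulative counter samples.

    Returns None if the counter went backwards (e.g. a domain restarted).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = curr - prev
    if delta < 0:
        return None
    return delta / seconds


def io_used_percent(rate, measured_max):
    """Observed I/O rate as a percentage of a separately measured maximum."""
    return 100.0 * rate / measured_max


def nagios_status(used_percent, warn=80.0, crit=95.0):
    """Map a utilization percentage to a Nagios exit code."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK
```

For example, two domains at 12.5% and 30.0% on a two-CPU host give
cpu_used_percent([12.5, 30.0], 2) = 21.25, which is OK with these
thresholds. For I/O, the result is only as good as measured_max, and that
value changes with hardware, scheduling and configuration.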
At the same time I am working on implementations and looking at
information and publications on the topic, such as the several papers
available on XenMon.
Has anybody else thought about this, or does anybody have comments on
whether this is the right direction to think in, or on better or more
concrete data to collect and look at?
Henning
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel