On Thu, May 25, 2006 at 09:44:15AM -0700, Santos, Jose Renato G wrote:
> >> -----Original Message-----
> >> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> >> Ray Bryant
> >> Sent: Thursday, May 25, 2006 7:21 AM
> >> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> >> Cc: Steve Dobbelstein; Eranian, Stephane
> >> Subject: Re: [Xen-devel] Xenoprof in an HVM domain
> ... <stuff deleted> ....
> >> This assumes, of course, that one can figure out
> >> how to virtualize the performance counters, and then the
> >> hypervisor would have to sort out conflicts, etc, between
> >> domains that wanted to use the performance counters (perhaps
> >> these would be a resource the hypervisor could dynamically
> >> allocate to a domain, by, for
> >> example, some kind of "xm" command).
> I don't think this is how performance counters should be virtualized.
> Virtualizing performance counters should save/restore the values of
> active perf counters on every VCPU/domain context switch. There should
> be no need for such an "xm" command.
> Performance counter virtualization is not currently supported in Xen,
> although it would be nice to have it. With counter virtualization, guest
> domains would be able to profile themselves with unmodified oprofile.
> This would be useful to enable users to profile their applications on
> Xen guests in the same way they are used to doing on vanilla Linux.
I think we need to clearly identify and prioritize the needs.
The first thing to do is to ensure that guest OSes using the PMU when
running native can continue to do so when running virtualized. That holds
true for both para-virtualized and fully virtualized (Pacifica/VT) guests.
This is the highest priority because some OSes do rely on performance
counters. Without such support, they cannot provide the same kernel-level
API to their applications. In other words, certain applications would lose
functionality when running in a guest.
The second need is what XenOprofile is addressing, which is how to get a
"global view" of what is going on in the guests and in the VMM. To me this
is a lower-priority need because the system can function without it. Yet I
recognize it is important for tuning the VMM.
Those two distinct needs are not specific to Xen; in fact, they map exactly
onto what you need to provide in a native OS. The perfmon2 subsystem does
this: the global view is equivalent to "system-wide" monitoring and the
per-guest PMU is equivalent to the per-thread mode.
To support per-guest monitoring, the PMU must be virtualized. The counters
must be saved/restored on domain switch. A similar operation is done on
thread switch in the Linux kernel for perfmon2. In general, performance
counters are quite expensive to read, ranging from 35 cycles on Itanium 2
to thousands of cycles on some IA-32 processors. As indicated by
Ray/Renato, you can be smart about that. In perfmon2 we do lazy
save/restore of performance counters. This has worked fairly well. I
expect domain switches to happen less frequently than thread switches,
anyway, and many measurements use only a limited number of PMU registers.
Another important point is that I do not think that per-guest measurements
should include VMM-level execution, unlike a system-wide measurement. That
is true for both para-virtualized and fully virtualized (VT/Pacifica)
guests. This is important for sampling. I am not sure tools would know
what to do with samples they cannot attribute to code they know about.
Furthermore, the goal of virtualization is to HIDE from guest applications
the fact that they run virtualized. Why would we make an exception for
monitoring tools? Note that this implies that the VMM must turn monitoring
off/on upon entry/exit.
For system-wide monitoring, you do need visibility into the VMM. Yet
monitoring is driven from a guest domain, most likely domain0. On counter
overflow, the VMM receives the PMU interrupt and the corresponding
interrupted IP (IIP). That sample must somehow be conveyed to the
monitoring tool. It is not possible to simply forward the interrupt to
domain0 (the controlling domain for the monitoring session). To solve this
problem, XenOprofile uses an in-VMM buffer where the "samples" are first
saved. Then there needs to be a communication channel with the controlling
domain to send a notification when the buffer becomes full. There needs to
be one such buffer per virtual CPU, but the buffers only need to be
visible to domain0. The whole mechanism should NOT require any special
code in the guest domains, except for domain0. That way it would work with
para-virtualized and fully virtualized guests, be they Linux, Windows, or
anything else. In XenOprofile, I understand the buffer is shared via
remapping. I think the interface to setup/control the buffer needs to be
more generic. For instance, certain measurements may need to record in the
buffer more than just the IIP; they may need to also save certain counter
values. The controlling domain needs some interface to express what needs
to be recorded in each sample. Furthermore, it also needs to specify how
to resume after an overflow, i.e., what sampling period to reload into the
overflowed counter. All this information must be passed to the VMM because
there is no intervention from the controlling domain until the buffer
fills up. Once again, this is not something new. We have the equivalent
mechanism in perfmon2, simply because we support an in-kernel sampling
buffer.
The next step is to see how the PMU can be shared between a system-wide
usage and a per-guest usage. On some PMU models, this may not be possible
due to hardware limitations, i.e., the counters are not fully independent.
This gets into a new level of complexity which has to be managed by the
VMM. Basically, this requires a VMM PMU register allocator per
virtual-CPU. This also implies that consumers cannot expect to
systematically have access to the full PMU each time they ask for it. Note
that it may be acceptable for the time being to say that system-wide and
per-guest monitoring are mutually exclusive.
Hope this helps.
> The current model supported by Xenoprof is system-wide profiling, where
> counters are used to profile the collection of domains and Xen together.
> This is useful for Xen developers to optimize Xen and para-virtualized
> kernels running on Xen.
> Ideally we would like to have support for both system-wide profiling
> (for Xen developers) and independent guest profiling with perf counter
> virtualization (for Xen users). Adding perf counter virtualization is on
> our to-do list. If anybody is interested in working on this, please let
> me know.
> We would appreciate any help we could get.