On 3/3/07 00:58, "Mathieu Desnoyers" <compudj@xxxxxxxxxxxxxxxxxx> wrote:
> From what I see, I cannot use get_s_time from an NMI handler, because it
> can race with the tsc_scale update in local_time_calibration. Do you
> have any plan to support this?
We're open to suggestions. We could have a protocol for NMI context to grab
snapshots of the timestamp info when it is safe to do so (maybe using the
protocol already employed by guest kernels).
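The protocol employed by guest kernels is a version-counter (seqcount-style) snapshot: the updater makes the counter odd while rewriting the record, and a lock-free reader retries if it observes an odd or changed version. A minimal sketch, assuming illustrative names rather than Xen's actual structures:

```c
#include <stdint.h>

/*
 * Hypothetical sketch of the version-based snapshot protocol used by
 * guest kernels: the updater bumps a version counter before and after
 * rewriting the time record, so a lock-free reader (even in NMI
 * context) can detect a concurrent update and retry.  Field and
 * function names are illustrative, not Xen's real ones.
 */
struct time_info {
    volatile uint32_t version;   /* odd while an update is in progress */
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_ns_mul;
    int8_t   tsc_shift;
};

/* Updater side (runs on the owning CPU). */
static void update_time_info(struct time_info *t, uint64_t tsc,
                             uint64_t ns, uint32_t mul, int8_t shift)
{
    t->version++;                 /* now odd: update in progress */
    __sync_synchronize();         /* order the version write first */
    t->tsc_timestamp = tsc;
    t->system_time   = ns;
    t->tsc_to_ns_mul = mul;
    t->tsc_shift     = shift;
    __sync_synchronize();
    t->version++;                 /* even again: record is consistent */
}

/* Reader side: safe from any context, including an NMI handler. */
static struct time_info read_time_info(const struct time_info *t)
{
    struct time_info snap;
    uint32_t v;
    do {
        do {
            v = t->version;       /* spin while an update is in flight */
        } while (v & 1);
        __sync_synchronize();
        snap = *t;                /* copy the whole record */
        __sync_synchronize();
    } while (snap.version != v);  /* retry if it changed under us */
    return snap;
}
```

The same scheme would let NMI context grab a snapshot safely, since the reader never blocks the updater and never observes a half-written record.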
> local_time_calibration is called from the timer interrupt, which seems
> to have the highest priority at least on x86 and x86_64. Therefore, why
> are you disabling interrupts explicitly in this function? (since you
> know they are already disabled).
As in Linux, there is no concept of priorities among interrupts.
Specifically in this case the timer interrupt is usually EOIed early, after
which any other interrupt can be delivered to the CPU. So the explicit
irq-disable is required.
> Do you offer any method for the Linux kernel in dom0 and domUs to read
> this timer (similar interface to vsyscall in Linux?). This can be very
> useful for system-wide tracing.
Which timer do you mean?
> I am a bit concerned about the performance impact of calling
> scale_delta() at each timestamp read. Have you measured how many cycles
> it takes?
No, but it only involves a few cheap instructions, including one or two MUL
instructions (which are not much more expensive than most other integer ALU
ops). I'd say it's innocent until proven guilty.
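For reference, the operation in question is a fixed-point multiply of a TSC delta. A portable sketch of what such a scale_delta() amounts to (not Xen's exact code; names and the 0.32 fixed-point convention are assumptions):

```c
#include <stdint.h>

/*
 * Minimal sketch (not Xen's actual implementation) of scaling a TSC
 * delta to nanoseconds with a pre-computed shift and a 0.32 fixed-point
 * multiplier.  On x86 the wide multiply compiles to one or two MUL
 * instructions; a 128-bit type is used here for portability.
 */
struct time_scale {
    int      shift;      /* pre-shift applied to the delta */
    uint32_t mul_frac;   /* multiplier, as a 0.32 fixed-point fraction */
};

static uint64_t scale_delta(uint64_t delta, const struct time_scale *s)
{
    if (s->shift < 0)
        delta >>= -s->shift;
    else
        delta <<= s->shift;
    /* (delta * mul_frac) >> 32, computed without intermediate overflow */
    return (uint64_t)(((unsigned __int128)delta * s->mul_frac) >> 32);
}
```

With mul_frac = 0x80000000 (i.e. 0.5) and shift = 0, a delta of 100 cycles scales to 50 ns. The whole path is a shift, a multiply, and another shift.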
> Your interpolation system between the timer interrupt and the TSC uses
> tsc_scale to make sure there is no time jump when the master
> oscillator goes slower than the local time. However, I see that there is
> a forward time jump when the local time lags behind the TSC. Is there
> any reason for not using a scale factor to smoothly accelerate the
> frequency instead?
Not really. It just simplifies things and if the system is stable then any
inaccuracy of TSC estimate vs. timer interrupt should be minuscule and the
'jump' hardly detectable.
> Why are you interpolating between the timer interrupt and the TSC? I
> guess this is useful to support Intel SpeedStep and AMD PowerNow, but I
> want to be sure.
We want the guest to use the TSC because it is fast to access and available
on CPUs that Xen supports. And yes, we want to deal with the cases where the
TSC is scaled for power management, or even big-iron systems where the TSCs
do not increment in lock-step.
> I guess you are aware that you change the TSC's precision by doing so:
> it will suffer, in the worst case, a drift of the IRQ latency of the
> system, which depends on the longest critical sections with IRQs disabled.
I don't believe this is true. First, bear in mind we only sync TSCs to PIT
if there is no better time source (e.g., reliable HPET). Even when we do use
the PIT, the code to read platform time reads the PIT directly -- it only
uses the timestamp from an infrequent PIT interrupt because the PIT counter
width is only 16 bits, so we need to track higher-order bits in software.
The PIT interrupt is part of that mechanism.
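The software-tracking mechanism described above can be sketched as follows. This is a simplified, hypothetical model (the real PIT counts down and must be latched over the ISA bus; for clarity the sketch treats it as an up-counting 16-bit value), showing how the periodic interrupt folds wraparounds into a full-width count:

```c
#include <stdint.h>

/*
 * Illustrative sketch: extending a 16-bit hardware counter to 64 bits
 * in software.  Each read (including the one in the periodic PIT
 * interrupt handler) folds the elapsed raw ticks into a 64-bit
 * accumulator; as long as reads happen more often than one full wrap
 * (~55 ms for the PIT), no ticks are lost.
 */
struct pit_state {
    uint64_t accumulated;   /* full-width count, maintained in software */
    uint16_t last_raw;      /* last raw 16-bit reading */
};

/* Called from the PIT interrupt and from direct platform-time reads. */
static uint64_t pit_read_64(struct pit_state *st, uint16_t raw)
{
    /* 16-bit subtraction handles a single wraparound automatically */
    uint16_t elapsed = (uint16_t)(raw - st->last_raw);
    st->last_raw = raw;
    st->accumulated += elapsed;
    return st->accumulated;
}
```

The interrupt is thus only a guarantee that the accumulator is refreshed often enough; the actual timestamp always comes from reading the counter directly.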
> Since the TSCs, on the CPUs populating a physical machine, can differ
> by up to the IRQ latency in the worst case, you could have
> timestamps taken exactly at the same moment differing by this amount.
The significant latency here will be accessing the PIT across the legacy ISA
bus. I don't believe IRQ latencies are an issue.
> Do you have some latency measurements regarding the hypervisor?
IRQ latencies? Given that Xen does hardly any work in interrupt context
there are no substantial interrupt-context or interrupts-disabled critical
regions. I very much doubt we have any that run longer than a few
Xen-devel mailing list