|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 1/2] x86/time: use native TSC scaling factors when TSC is not scaled
On 16.04.2026 14:51, Roger Pau Monné wrote:
> On Thu, Apr 16, 2026 at 01:28:11PM +0200, Jan Beulich wrote:
>> On 14.04.2026 12:33, Roger Pau Monne wrote:
>>> When running HVM guest in native TSC mode avoid using the recalculated vTSC
>>> scaling factors based on the cpu_khz value. Using the kHz based frequency
>>> leads to the TSC scaling values possibly not being the same as the ones
>>> used by the per CPU cpu_time->tsc_scale field, which introduces skew
>>> between the guest and Xen's calculations of the system time.
>>>
>>> On a 2gHz system, where the frequency is possibly detected as 1999999999Hz
>>> (note this is a worse-case scenario), the cpu_khz variable will be set to
>>> 1999999kHz, and hence 999Hz cycles will be not accounted for per second.
>>> Over a second (the time synchronization period), this leads to a skew of:
>>>
>>> cycles * 1 / (Hz freq) = 999 / 1999999999 = 499,5ns
>>>
>>> So far this has gone unnoticed because the time synchronization rendezvous
>>> forces the update of the tsc_timestamp and system_time fields in the vCPU
>>> time info area, and hence the skew only accumulates up to the rendezvous
>>> period. Attempting to remove the rendezvous causes the skew to grow
>>> unbounded.
>>>
>>> Fix by using the native TSC scaling values (as used by Xen) when the guest
>>> TSC is not scaled.
>>>
>>> Fixes: eab8a90be723 ("x86/time: scale host TSC in pvclock properly")
>>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> ---
>>> I'm worried about the usage of cpu_khz beyond simple printing it for
>>> informational purposes. Overall I think it would be safer to store the
>>> frequency in Hz, as to avoid losing the least significant digits.
>>>
>>> In any case, that's a different change.
>>
>> I'm not quite sure - improving accuracy is of course a good thing, but will
>> we ever be able to do any such calculations error free, when already the
>> detected frequency isn't exactly precise?
>>
>>> --- a/xen/arch/x86/time.c
>>> +++ b/xen/arch/x86/time.c
>>> @@ -1710,17 +1710,25 @@ static void collect_time_info(const struct vcpu *v,
>>> else
>>> {
>>> if ( is_hvm_domain(d) && hvm_tsc_scaling_supported )
>>> - {
>>> tsc_stamp = hvm_scale_tsc(d, t->stamp.local_tsc);
>>
>> This is a potentially imprecise calculation. How likely is it that its result
>> will indeed ...
>>
>>> - u->tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
>>> - u->tsc_shift = d->arch.vtsc_to_ns.shift;
>>> - }
>>> else
>>> - {
>>> tsc_stamp = t->stamp.local_tsc;
>>> +
>>> + /*
>>> + * HVM guests using the native TSC ratio should use the same
>>> per-CPU
>>> + * scaling factors as Xen. This ensures time keeping is always in
>>> sync
>>> + * between Xen and the guest.
>>> + */
>>> + if ( tsc_stamp == t->stamp.local_tsc )
>>
>> ... exactly match t->stamp.local_tsc? Don't we possibly need a (small) error
>> margin? (In which case of course the next question would be: How to establish
>> such a margin?)
>
> hvm_scale_tsc() has:
>
> if ( ratio == hvm_default_tsc_scaling_ratio )
> return tsc;
>
> So when using no scaling the input value is the output value, and
> hence tsc_stamp will match exactly t->stamp.local_tsc.
Ouch. I did look at the function, but managed to have all my attention drawn
to the asm() there. I'm sorry for the noise. As it's strictly an improvement:
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
The other, earlier remark remains applicable, though.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |