Re: [PATCH v3 00/41] x86: Try to wrangle PV clocks vs. TSC



On Fri, 2026-05-15 at 12:19 -0700, Sean Christopherson wrote:
> Dave/Thomas/Peter/Boris, what's the going rate for bribes to take something
> like this through the tip tree?
> 
> The bulk of the changes are in kvmclock and TSC, but pretty much every
> hypervisor's guest-side code gets touched at some point.  I am reasonably
> confident in the correctness of the KVM changes.  Michael tested Hyper-V in
> v2, and while there were conflicts when rebasing, they were largely
> superficial (and I've just jinxed myself).  For all other hypervisors, assume
> the code is compile-tested only, but those changes are all quite small and
> straightforward.
> 
> The only changes that are questionable/contentious are the last two patches,
> which have KVM-as-a-guest use CPUID 0x16 to get the CPU frequency, even on
> AMD (that's the dubious part).  I very deliberately put them last, so that
> they can be dropped at will (I don't care terribly if those patches land).
> To merge them, I would want explicit Acks from Paolo and David W.
> 
> So, except for the last two patches, to get the stuff I really care about
> landed, I think/hope it's just the TSC and guest-side CoCo changes that need
> reviews/acks?
> 
> The primary goal of this series is (or at least was, when I started) to
> fix flaws with SNP and TDX guests where a PV clock provided by the untrusted
> hypervisor is used instead of the secure/trusted TSC that is controlled by
> trusted firmware.
> 
> The secondary goal is to draft off of the SNP and TDX changes to slightly
> modernize running under KVM.  Currently, KVM guests will use TSC for
> clocksource, but not sched_clock.  And they ignore Intel's CPUID-based TSC
> and CPU frequency enumeration, even when using the TSC instead of kvmclock.
> And if the host provides the core crystal frequency in CPUID.0x15, then KVM
> guests can use that for the APIC timer period instead of manually calibrating
> the frequency.
> 
> The tertiary goal is to clean up all of the PV clock code to deduplicate logic
> across hypervisors, and to hopefully make it all easier to maintain going
> forward.

I booted this in QEMU with -cpu host,+invtsc,+vmware-cpuid-freq

I was expecting to see it eschew the kvmclock and use *only* the TSC.
Is there even any need for 'tsc-early' given that it's *told* the TSC
frequency in CPUID? Shouldn't it have detected that the TSC frequency
is known before init_tsc_clocksource() runs?
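
To illustrate what being "told" the frequency amounts to, here is a minimal
sketch of the arithmetic only. It is not the kernel's code and the function
names are hypothetical. It assumes the documented layout of CPUID.0x15
(EAX = ratio denominator, EBX = ratio numerator, ECX = crystal frequency in
Hz) and, as far as I understand QEMU's vmware-cpuid-freq option, a
hypervisor timing leaf at 0x40000010 whose EAX holds the TSC frequency in
kHz. The register values are passed in as plain arguments so that no CPUID
instruction is needed to try it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the TSC frequency in Hz from the three
 * CPUID.0x15 registers. Returns 0 when any field is not enumerated,
 * in which case a caller would have to fall back to another source.
 */
uint64_t tsc_hz_from_cpuid_0x15(uint32_t eax_denominator,
                                uint32_t ebx_numerator,
                                uint32_t ecx_crystal_hz)
{
    if (eax_denominator == 0 || ebx_numerator == 0 || ecx_crystal_hz == 0)
        return 0;

    /* TSC Hz = crystal Hz * numerator / denominator, computed in 64 bits. */
    return (uint64_t)ecx_crystal_hz * ebx_numerator / eax_denominator;
}

/*
 * Hypothetical helper: convert the kHz value assumed to be in EAX of
 * the hypervisor timing leaf (0x40000010) to Hz.
 */
uint64_t tsc_hz_from_hv_timing_leaf(uint32_t eax_tsc_khz)
{
    return (uint64_t)eax_tsc_khz * 1000;
}
```

With a 25 MHz crystal and a 192/2 ratio, the first helper yields 2.4 GHz,
which would match the "Detected 2400.000 MHz processor" line in the log
below. The crystal and ratio values are made up for the example; they were
not read from this guest.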

And then it even spent some time at boot actually using the kvmclock as
clocksource... when ideally I don't think it would even have *enabled*
it at all?

[    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] tsc: Detected 2400.000 MHz processor
[    0.008205] TSC deadline timer available
[    0.008270] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.159085] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[    0.164074] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
[    0.229087] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.337095] clocksource: Switched to clocksource kvm-clock
[    0.345246] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.356201] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
[    0.360560] clocksource: Switched to clocksource tsc
