One big clue: Looking at /proc/interrupts inside
the PV guest, the number of timer0 interrupts is
Not remembering well how timer interrupts are handled
in a PV guest... could this high frequency be happening
because the Linux-side PV code is setting a timer
or because the Xen-side interrupt delivery code is
> -----Original Message-----
> From: Dan Magenheimer
> Sent: Friday, November 20, 2009 4:45 PM
> To: Keir Fraser; Jeremy Fitzhardinge
> Cc: Xen-Devel (E-mail)
> Subject: Bizarre pv kernel ultra-high frequency rdtsc?!?
> Hi Jeremy/Keir (and any other PV time experts out there) --
> Working on tsc_mode stuff I've run into a roadblock where
> there is some time-related interaction between xen and a
> PV kernel that I don't understand. I'm hoping you
> might provide a clue. There's also a reasonable chance
> that this is uncovering a significant bug that has been
> around a while but was never noticed as anything more
> than a vague, barely noticeable slowdown because rdtsc
> was unemulated (and "fast").
> The problem:
> In order to preserve TSC across save/restore/migrate, I
> have implemented a "tsc offset" (and also a "tsc scale"
> but that isn't used yet).
> The result is that the PV kernel starts doing rdtsc's at
> a VERY high frequency (1 MILLION / sec). I suspect this
> may be a variation of what Jeremy reported at one point
> when emulated rdtsc was first in-tree, but seemed to go away.
> By adding some debug code (and confirmed with xenctx)
> I can see that the millions of rdtsc's are half in
> get_nsec_offset() and half in do_gettimeofday() (presumably
> inlined from get_usec_offset()). This is a 32-bit 2.6.18-based
> PV kernel, not upstream. Poring through the 2.6.18 PV time
> code, I can find several places where an essentially infinite
> loop might happen if the version fields are wacko, but
> none where the timestamp contents make any difference
> in control flow, so I don't see how modifying these
> values (by adding the offset) could cause a behavioral
> change in Linux, but obviously a big change is happening!
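
For readers unfamiliar with the "version fields" mentioned above, here is a minimal sketch of the begin/end versioning pattern, using the two macros from the attached patch. The struct layout and the demo_* names are hypothetical and heavily simplified, and the memory barriers a real implementation needs are left out. It is only meant to show that the reader's retry loop is controlled by the version field alone, never by the timestamp values.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the time record shared between
 * Xen and a PV guest. Only the fields discussed in this thread appear. */
struct demo_time_info {
    volatile uint32_t version;  /* odd while an update is in progress */
    uint64_t tsc_timestamp;
    uint64_t system_time;
};

/* Same definitions as in the attached patch context. */
#define version_update_begin(v) (((v) + 1) | 1)
#define version_update_end(v)   ((v) + 1)

/* Writer side (illustrative): make the version odd, write, make it even. */
static void demo_publish(struct demo_time_info *u,
                         uint64_t tsc, uint64_t stime)
{
    u->version = version_update_begin(u->version);
    /* a real implementation needs a write barrier here */
    u->tsc_timestamp = tsc;
    u->system_time = stime;
    /* ... and another write barrier here */
    u->version = version_update_end(u->version);
}

/* Reader side (illustrative): retry until the version is even and did not
 * change while the fields were copied. Returns the number of passes made.
 * If the version were stuck at an odd value, this would never terminate. */
static unsigned demo_snapshot(const struct demo_time_info *u,
                              uint64_t *tsc, uint64_t *stime)
{
    uint32_t v;
    unsigned passes = 0;

    do {
        v = u->version;
        *tsc = u->tsc_timestamp;
        *stime = u->system_time;
        passes++;
    } while ((v & 1) || v != u->version);

    return passes;
}
```

With no concurrent writer, demo_snapshot() completes in a single pass regardless of what values the timestamps hold.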
> I can reproduce the problem with a very simple patch
> on xen-unstable that adds a fake fixed offset in the
> three places I add the "tsc offset", see attached.
> By changing BIG_OFFSET to 0 in this patch, the
> frequency of rdtsc's becomes normal again.
> Suspecting some interaction with wallclock time, I
> tried shutting off ntpd and with/without independent
> wallclock in the PV guest. No difference.
> I also added debug code to see if the Xen-side code
> was churning through version numbers... it is not.
> Any ideas? (And, sorry, I know you're on a trans-
> hemisphere trip right now.)
> diff -r bec27eb6f72c xen/arch/x86/time.c
> --- a/xen/arch/x86/time.c Sat Nov 14 10:32:59 2009 +0000
> +++ b/xen/arch/x86/time.c Fri Nov 20 16:58:18 2009 -0500
> @@ -813,6 +813,8 @@ s_time_t get_s_time(void)
> #define version_update_begin(v) (((v)+1)|1)
> #define version_update_end(v) ((v)+1)
> +#define BIG_OFFSET 10000000000ULL
> static void __update_vcpu_system_time(struct vcpu *v, int force)
> struct cpu_time *t;
> @@ -827,7 +829,7 @@ static void __update_vcpu_system_time(st
> /* Don't bother unless timestamps have changed or we are forced. */
> if ( !force && (u->tsc_timestamp == (v->domain->arch.vtsc
> - ? t->stime_local_stamp
> + ? t->stime_local_stamp - BIG_OFFSET
> : t->local_tsc_stamp)) )
> @@ -835,8 +837,8 @@ static void __update_vcpu_system_time(st
> if ( v->domain->arch.vtsc )
> - _u.tsc_timestamp = t->stime_local_stamp;
> - _u.system_time = t->stime_local_stamp;
> + _u.tsc_timestamp = t->stime_local_stamp - BIG_OFFSET;
> + _u.system_time = t->stime_local_stamp - BIG_OFFSET;
> _u.tsc_to_system_mul = 0x80000000u;
> _u.tsc_shift = 1;
> @@ -1598,6 +1600,8 @@ void pv_soft_rdtsc(struct vcpu *v, struc
> + now -= BIG_OFFSET;
> regs->eax = (uint32_t)now;
> regs->edx = (uint32_t)(now >> 32);
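
As a worked check of the constants in the patch, the sketch below assumes the guest converts a TSC delta to nanoseconds as ((delta << tsc_shift) * tsc_to_system_mul) >> 32. Under that assumption, tsc_to_system_mul = 0x80000000 with tsc_shift = 1 is an identity scale, so one emulated TSC tick equals one nanosecond. The demo_* helpers are hypothetical names written for this illustration and are not taken from Xen or Linux.

```c
#include <stdint.h>

/* Illustrative scaling helper: ((delta << shift) * mul) >> 32.
 * The 64x32-bit multiply is split into two halves so that no 128-bit
 * type is needed. Overflow for very large deltas is not handled. */
static uint64_t demo_scale_delta(uint64_t delta, uint32_t mul, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    return ((delta >> 32) * mul) + (((delta & 0xffffffffu) * mul) >> 32);
}

/* Rebuild the 64-bit value that pv_soft_rdtsc() splits across edx:eax. */
static uint64_t demo_join_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}
```

For example, demo_scale_delta(12345, 0x80000000u, 1) yields 12345, and joining (uint32_t)(now >> 32) with (uint32_t)now gives back now.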
Xen-devel mailing list