Is that just the accuracy of the underlying TSC
on your test system, e.g. the skew of the TSC relative to an external
(NTP) source? Or is Xen (TSC-based) system time skewing that much
on an overcommitted system (and skewing much less than 0.03% on an
unloaded system)?
Running the following on dom0 on both an unloaded and an overcommitted
system (with ntpd off in dom0 and all guests) might be interesting:
# ntpdate $NTPSERVER; sleep 3600; ntpdate -q $NTPSERVER
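The offset reported by the final ntpdate -q can be turned into a drift
percentage by dividing it by the elapsed interval. A minimal sketch, assuming
the offset (in seconds) is read off the ntpdate output by hand; the helper
name drift_percent is made up for illustration and is not part of any tool:

```python
def drift_percent(offset_seconds: float, interval_seconds: float) -> float:
    """Return clock drift as a percentage of the elapsed interval.

    offset_seconds: offset reported after the wait (sign is ignored).
    interval_seconds: time between the initial sync and the query,
                      3600 for the one-hour sleep used above.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return abs(offset_seconds) / interval_seconds * 100.0


if __name__ == "__main__":
    # Illustrative offsets only, not measured values: over one hour,
    # 1.08 s corresponds to 0.03% drift and 3.6 s to 0.1% drift.
    for offset in (1.08, 3.6):
        print(f"{offset:5.2f} s over 3600 s -> {drift_percent(offset, 3600):.3f}%")
```

For scale, an 11% drift over the same hour would show up as an offset of
roughly 396 seconds.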
-----Original Message-----
From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
Sent: Saturday, June 07, 2008 3:21 PM
To: Keir Fraser; dan.magenheimer@xxxxxxxxxx
Cc: Ben Guthro; xen-devel; Dave Winchell
Subject: RE: [Xen-devel] [PATCH 0/2] Improve hpet accuracy
Possibly there are bugs in the hpet device model which are fixed by Dave's
patch. If this is actually the case, it would be nice to break those out as
separate patches, as I think an 11% drift must largely be due to
device-model bugs rather than relatively insignificant differences between
hvm_get_guest_time() and physical HPET main counter.
Hi Keir,
I tried an experiment on Friday where I short-circuited the missed ticks
policy code in the hpet.c patch, but used the physical hpet on each access.
The result for Linux was a drift of .1%, the same as the xen-unstable bits.
Conversely, I get very good drift numbers, i.e., under .03%, when using the
missed ticks policy code and running in simulated mode (layered on stime)
when stime uses hpet.
So clearly, the improvement from .1% to .03% is due to the policy code.
I haven't run the short-circuit test with the Windows policy, but I can do
that on Monday.
Note: For Windows and Linux I get < .03% drift using the policy code and
running in simulated mode, whether stime is using hpet or some other device.
regards,
Dave
-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
Sent: Fri 6/6/2008 6:34 PM
To: dan.magenheimer@xxxxxxxxxx; Dave Winchell
Cc: Ben Guthro; xen-devel
Subject: Re: [Xen-devel] [PATCH 0/2] Improve hpet accuracy
On 6/6/08 21:29, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
Although hwhpet=1 is a fine alternative in many cases, it may
be unavailable on some systems and may cause significant performance
issues on others. So I think we will still need to track down
the poor accuracy when hwhpet=0. And if for some reason
Xen system time can't be made accurate enough (< 0.05%), then
I think we should consider building Xen system time itself on
top of hardware hpet instead of TSC... at least when Xen discovers
a capable hpet.
Yes, this would be a sensible extra timer_mode: have hvm_get_guest_time()
call into the platform time read function, and bypass TSC altogether. This
would be cleaner than having only the vHPET code punch through to the
physical HPET: instead we have the boot-time chosen platform timesource used
by all virtual timers.
Or maybe there's a computation error somewhere in the hvm hpet
scaling code? Hmmm...
Possibly there are bugs in the hpet device model which are fixed by Dave's
patch. If this is actually the case, it would be nice to break those out as
separate patches, as I think an 11% drift must largely be due to
device-model bugs rather than relatively insignificant differences between
hvm_get_guest_time() and physical HPET main counter.
-- Keir