On Sat, Aug 09, 2008 at 02:55:33PM -0600, Dan Magenheimer wrote:
> > >> Again no guarantees but I think we are now under the magic
> > >> threshold where the skew is smaller than the time required
> > >> for scheduling a VCPU onto a different CPU. If so,
> > >> consecutive gethrtime's by the same thread in a domain
> > >> should always be monotonic.
> > >
> > > Right! That sounds positive.
> > It's an improvement, but I'm pretty sure it's still not sufficient for
> > Solaris. If I understand the change correctly, it seems to solve the
> > problem for single-vcpu guests on an SMP, but not for multi-vcpu
> > guests on an SMP. It sounds like the OS could reschedule a thread
> > from VCPU 0 to VCPU 1 and consecutive calls to gethrtime() could still
> > return non-monotonic results.
> How long does it take for Solaris to reschedule a thread from
> VCPU0 to VCPU1? It's certainly not zero time (and you also need
> to add the overhead of gethrtime).
> But, yes, the same "no guarantees" applies to this situation...
> if a Solaris thread continuously calls gethrtime(), there is a
> non-zero probability that, if the thread changes physical CPUs
> and the thread rescheduling code is "very fast",
> two consecutive calls could observe time going backwards.
It's only non-zero if we can indeed reschedule fast enough. If it's now
below the threshold, then we can consider it effectively fixed. Only
testing can really tell us that.
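
A test along these lines could be a tight loop that reads the clock
back to back and counts how often a reading is lower than the previous
one. The following is only a minimal sketch, not code from Solaris or
Xen: count_backward_steps() is a hypothetical helper, and Python's
time.perf_counter_ns() is an assumed stand-in for gethrtime(). Python
already guarantees that clock is monotonic, so a real test would have to
read the guest's raw high-resolution timer instead. The sketch also does
nothing to force the thread to migrate between VCPUs.

```python
import time


def count_backward_steps(clock=time.perf_counter_ns, iterations=1_000_000):
    """Read clock() back to back and report how often time went backwards.

    Returns (number of backward steps, largest backward step in clock units).
    'clock' is any zero-argument callable returning a comparable number;
    the default is only a stand-in for the timer under test.
    """
    backward = 0
    worst = 0
    prev = clock()
    for _ in range(iterations):
        now = clock()
        if now < prev:
            # Two consecutive reads observed time going backwards.
            backward += 1
            worst = max(worst, prev - now)
        # Always compare against the most recent reading.
        prev = now
    return backward, worst


if __name__ == "__main__":
    steps, worst_ns = count_backward_steps()
    print(f"backward steps: {steps}, worst: {worst_ns} ns")
```

A zero count from such a loop only shows that no regression was seen
during that run; it says nothing about runs where the thread was never
rescheduled onto another VCPU.
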
> But that's true of much recent-vintage hardware because TSCs
> sometimes skew, and so most OSes with high-res timers are able to
> deal with this.
> True of Solaris, John?
I'm not an expert on the relevant code, but I believe the solution to
TSC drift (as Solaris calls what I think you call skew) is to set
'tsc_gethrtime_enable' to zero, so we don't use the TSC for this
Xen-devel mailing list