Well, the good news is I'm now convinced that stime
is not going backwards... it was a false alarm. The
bad news is that skew is still 3x-4x worse than I had
previously thought.

The backwards stime was an artifact of the debug code.
I needed to put the spin_lock_irqsave/spin_unlock_irqrestore
pair around the ENTIRE call to get_s_time... I was
apparently measuring nested get_s_time calls (get_s_time
interrupted by an interrupt handler that itself calls
get_s_time), and the measurement code was seeing the
unwinding of the interrupt stack. Sorry for the noise.
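
To illustrate the difference between restoring the saved interrupt
state and unconditionally re-enabling interrupts, here is a small
user-space sketch. It is not the debug patch: irqs_enabled and the
sim_* helpers are hypothetical stand-ins for the real interrupt flag
and for local_irq_disable/local_irq_enable and the irqsave/irqrestore
pair, and the measure_* functions only mark where the measured
get_s_time call would sit.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag for one CPU (hypothetical stand-in). */
static bool irqs_enabled = true;

/* Hypothetical user-space stand-ins for the real primitives. */
static void sim_irq_disable(void)     { irqs_enabled = false; }
static void sim_irq_enable(void)      { irqs_enabled = true; }
static bool sim_irq_save(void)        { bool was = irqs_enabled; irqs_enabled = false; return was; }
static void sim_irq_restore(bool was) { irqs_enabled = was; }

/* Wrapper that turns interrupts back on unconditionally at exit. */
void measure_unconditional(void)
{
    sim_irq_disable();
    assert(!irqs_enabled);  /* the measured get_s_time() call would go here */
    sim_irq_enable();
}

/* Wrapper that restores whatever state its caller had on entry. */
void measure_save_restore(void)
{
    bool flags = sim_irq_save();
    assert(!irqs_enabled);  /* the measured get_s_time() call would go here */
    sim_irq_restore(flags);
}

/*
 * Model a caller that already runs with interrupts off and then calls
 * the wrapper. Returns true if interrupts are still off when the
 * wrapper returns, i.e. the caller's critical section was preserved.
 */
bool caller_stays_protected(void (*measure)(void))
{
    sim_irq_disable();
    measure();
    bool still_off = !irqs_enabled;
    sim_irq_enable();       /* end of the caller's own critical section */
    return still_off;
}
```

Calling caller_stays_protected(measure_unconditional) reports that the
caller's critical section was broken, while passing
measure_save_restore leaves it intact.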

For the bad skew, I'll look at some of the ideas proposed
in the other thread ("Xen system skew MUCH worse than tsc
skew").
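
For anyone wanting to try Keir's suggestion (quoted below) of saving
the previous tsc and time-record values and printing them when s_time
appears to go backwards, a rough user-space sketch follows. Everything
in it is an assumption for illustration: struct stime_sample, its field
names, compute_stime and check_backwards are hypothetical, and
scale_delta is only an assumed shift-then-multiply form, not code
copied from the Xen tree.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of everything that feeds one s_time computation. */
struct stime_sample {
    uint64_t tsc;               /* raw TSC at the time of the call */
    uint64_t local_tsc_stamp;   /* TSC stored in the current time record */
    uint64_t stime_local_stamp; /* system time (ns) stored in the time record */
    uint32_t mul_frac;          /* 32-bit fixed-point multiplier */
    int      shift;             /* pre-shift; negative means shift right */
    uint64_t stime;             /* resulting system time (ns) */
};

/* Assumed form: shift the delta, then multiply by mul_frac / 2^32. */
uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    /* Multiply high and low halves separately so nothing exceeds 64 bits. */
    return (delta >> 32) * mul_frac +
           (((delta & 0xffffffffu) * mul_frac) >> 32);
}

/* Fill in s->stime from the other fields of the sample. */
void compute_stime(struct stime_sample *s)
{
    s->stime = s->stime_local_stamp +
               scale_delta(s->tsc - s->local_tsc_stamp, s->mul_frac, s->shift);
}

/* Print every input of one sample so the arithmetic can be redone by hand. */
static void dump_sample(const char *tag, const struct stime_sample *s)
{
    printf("%s: tsc=%" PRIu64 " tsc_stamp=%" PRIu64 " stime_stamp=%" PRIu64
           " mul_frac=%#" PRIx32 " shift=%d stime=%" PRIu64 "\n",
           tag, s->tsc, s->local_tsc_stamp, s->stime_local_stamp,
           s->mul_frac, s->shift, s->stime);
}

/* Returns 1 and dumps both samples if 'now' is earlier than 'prev'. */
int check_backwards(const struct stime_sample *prev,
                    const struct stime_sample *now)
{
    if (now->stime >= prev->stime)
        return 0;
    dump_sample("prev", prev);
    dump_sample("now ", now);
    return 1;
}
```

Keeping the whole sample, rather than just the computed value, is what
makes it possible to tell a bad time record apart from a bad TSC read
or an arithmetic problem.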
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Saturday, July 26, 2008 9:14 AM
> To: dan.magenheimer@xxxxxxxxxx; Tian, Kevin; Xen-Devel (E-mail)
> Cc: Dave Winchell
> Subject: Re: s_time going backwards on same processor?
>
>
>
> Which clocksource was this with? It would be worth saving the previous tsc
> and struct cpu_time values so you can print those out when you see s_time
> go backwards. It'll make it much easier to see what actually went
> 'backwards' -- also we could run through the 'prev' and 'now' s_time
> calculation arithmetic manually to see if any of the arithmetic is at fault.
>
> -- Keir
>
> On 26/7/08 15:50, "Dan Magenheimer"
> <dan.magenheimer@xxxxxxxxxx> wrote:
>
> > Thanks Kevin for catching that. I fixed it using spin_lock_irqsave
> > and spin_unlock_irqrestore and have already seen stime going
> > backwards... I haven't retested all the clocksources and SMP,
> > but it doesn't appear that the incorrect enabling of
> > interrupts was the problem.
> >
> > It's interesting that the problem occurs even with one processor.
> > Hopefully that will make it easier to debug.
> >
> > Updated debug patch attached.
> >
> > Thanks,
> > Dan
> >
> >> -----Original Message-----
> >> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >> Sent: Saturday, July 26, 2008 12:51 AM
> >> To: Tian, Kevin; dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> >> Cc: Dave Winchell
> >> Subject: Re: s_time going backwards on same processor?
> >>
> >>
> >> The code should be using spin_lock_irqsave/spin_unlock_irqrestore.
> >>
> >> As it is, it's incorrect and could cause odd behaviour. I don't know whether
> >> that would extend to seeing time go backwards as often as Dan reports, but
> >> obviously the test has to be re-run.
> >>
> >> -- Keir
> >>
> >> On 26/7/08 03:23, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >>
> >>> At a quick glance, your measure_stime_skew unconditionally
> >>> enables interrupts at exit, instead of restoring the previous
> >>> setting. Could that be the trouble-maker? I don't have the latest
> >>> code at hand, but at least from what I can check:
> >>>
> >>> in local_time_calibration:
> >>>
> >>> local_irq_disable();
> >>> curr_master_stime = read_platform_stime();
> >>> curr_local_stime = get_s_time();
> >>> rdtscll(curr_tsc);
> >>> local_irq_enable();
> >>>
> >>> With your patch, interrupts are enabled after get_s_time, so
> >>> curr_tsc may then be read at a different point in time...
> >>>
> >>> Thanks,
> >>> Kevin
> >>>
> >>>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >>>> Sent: July 26, 2008 9:47
> >>>>
> >>>> OK, I am definitely recording stime going backwards
> >>>> observed with the attached patch. I have recorded dozens
> >>>> over a few hours. It appears to have no
> >>>> obvious pattern between reports, but one thing
> >>>> is consistent: In all cases, t->tsc_scale.shift
> >>>> is -1. I'll try to run some more tests over the
> >>>> weekend (e.g. with different clocksources... this
> >>>> is with clocksource=hpet), but thought I'd report
> >>>> what I have seen. I'm running on a dual-core single
> >>>> socket ("Conroe").
> >>>>
> >>>> Dan
> >>>>
> >>>> P.S. If you try the patch, ensure you set
> >>>> the boot parameter "measurestime".
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >>>>> Sent: Wednesday, July 23, 2008 1:06 AM
> >>>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
> >>>>> Cc: Dave Winchell
> >>>>> Subject: Re: s_time going backwards on same processor?
> >>>>>
> >>>>>
> >>>>> On 22/7/08 22:58, "Dan Magenheimer"
> >>>>> <dan.magenheimer@xxxxxxxxxx> wrote:
> >>>>>
> >>>>>> I *do* know that get_s_time() on different processors
> >>>>>> can have this behavior and I know it is possible for
> >>>>>> hvm_get_guest_time() to go backwards (timer_mode=0),
> >>>>>> but I thought s_time was monotonically non-decreasing
> >>>>>> on any given processor and that read_platform_stime()
> >>>>>> is also monotonically non-decreasing.
> >>>>>>
> >>>>>> Does dom0 maybe have direct hardware access to the hardware
> >>>>>> platform timer that xen system time is dependent on?
> >>>>>
> >>>>> No matter what happens to the underlying platform timer, it should be
> >>>>> impossible for Xen system time to go backwards on any given processor. The
> >>>>> calibration function never sets the TSC and system timestamps for the next
> >>>>> time record any earlier than current TSC value and current computed system
> >>>>> time value. Hence it should be impossible for system time to be computed as
> >>>>> earlier than that time record.
> >>>>>
> >>>>> -- Keir
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>
> >>
> >>
>
>
> _______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel