RE: [Xen-devel] RE: [PATCH] record max stime skew (was RE: [PATC

To:	"Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject:	RE: [Xen-devel] RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time)
From:	"Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Date:	Wed, 9 Jul 2008 18:24:44 -0600
Cc:	Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
In-reply-to:	<C4943EFB.1A911%keir.fraser@xxxxxxxxxxxxx>
Organization:	Oracle Corporation
Reply-to:	"dan.magenheimer@xxxxxxxxxx" <dan.magenheimer@xxxxxxxxxx>

> > Well one suspicion I had was that very long hpet reads were
> > getting serialized, but I tried clocksource=acpi and
> > clocksource=pit and get similar skew range results.
> > In fact pit shows a max of >17000 vs hpet and acpi closer
> > to 11000.  (OTOH, I suppose it IS possible that this is
> > roughly how long it takes to read each of these platform
> > timers.)
> 
> That ought to be easy to check. I would expect that the PIT, 
> for example,
> could take a couple of microseconds to access.
> 
>  -- Keir

(I haven't seen the patch applied... since it just collects
data, it would be nice if it was applied so others could
try it.)

To follow up on this, I tried a number of tests but wasn't
able to identify the problem and have given up (for now).
In case someone else starts looking at this (or if any of
my tests suggest a solution to someone), I thought I'd
document what I tried.

PROBLEM: In Xen, the system time skew between a processor's
local time and platform time is generally "small" but
"sometimes" gets quite "large".  This is important because
the larger the skew, the more likely an hvm guest will
experience time stopping or (in some cases) time going
backwards.
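
To illustrate how skew turns into stopped time, here is a
minimal sketch. It assumes the guest time path clamps each
reading to the last value it returned. The names guest_clock
and guest_time_monotonic are hypothetical and are not taken
from Xen or from the patch, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-guest state: last time value handed to the guest (ns). */
struct guest_clock {
    uint64_t last_returned_ns;
};

/*
 * Return a guest-visible time that never goes backwards.
 * cpu_stime_ns is system time as computed on the CPU doing the read;
 * it may lag the value computed on another CPU by the skew.
 * Not thread-safe: a real implementation would need a lock or cmpxchg.
 */
static uint64_t guest_time_monotonic(struct guest_clock *gc,
                                     uint64_t cpu_stime_ns)
{
    assert(gc != NULL);

    if (cpu_stime_ns <= gc->last_returned_ns) {
        /* Reading CPU is behind: hold time still rather than step back. */
        return gc->last_returned_ns;
    }

    gc->last_returned_ns = cpu_stime_ns;
    return cpu_stime_ns;
}
```

With a clamp like this, a read on a CPU that lags by 11 usec
returns the old value, so the guest sees time stand still
until that CPU catches up. Without the clamp, the same read
would appear to go backwards.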

On my box, "small" is under 1 usec, "large" is 9-18 usec,
and "sometimes" is about one out of 500 measurements.  Note
that my box is a recent vintage Intel single-socket dual-core
("Conroe").

I suspect that, periodically, either something waits a long
time on a lock or an unexpected interrupt occurs, but I
didn't find anything through code reading or experiments.

TEST METHOD: The patch I sent on this thread collects data
whenever local_time_calibration() is run (which is 1Hz on
each processor) and "xm debug-key t" prints this data
so it can be seen with "xm dmesg".  To see the problem,
one need only boot dom0 and run xm debug-key and xm dmesg.
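
For readers without the patch at hand, the kind of
bookkeeping involved can be sketched as follows. This is not
the patch itself. The struct, the function name, and the
9000 ns threshold (taken from the "large" range above) are
assumptions for illustration, with one instance assumed per
processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed threshold for a "large" skew, in nanoseconds. */
#define LARGE_SKEW_NS 9000

/* Hypothetical per-processor statistics. */
struct skew_stats {
    uint64_t samples;         /* number of calibration passes seen */
    uint64_t large;           /* passes with skew >= LARGE_SKEW_NS */
    int64_t  max_abs_skew_ns; /* largest absolute skew seen so far */
};

/* Record one measurement of local time vs. platform time (both in ns). */
static void skew_record(struct skew_stats *st,
                        int64_t local_ns, int64_t platform_ns)
{
    int64_t skew;

    assert(st != NULL);

    skew = local_ns - platform_ns;
    if (skew < 0)
        skew = -skew;

    st->samples++;
    if (skew >= LARGE_SKEW_NS)
        st->large++;
    if (skew > st->max_abs_skew_ns)
        st->max_abs_skew_ns = skew;
}
```

Calling skew_record() once per calibration pass and dumping
the struct from a debug-key handler would give the max and
the "one out of N" ratio described above.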

1) CONJECTURE: Related to how long it takes to read the
   platform timer

The max skew (and distribution) are definitely different
depending on whether clocksource=hpet or clocksource=pit.
For hpet, I almost always see a max skew of 11000+ ns,
and with pit 17000+ ns.  ONCE (over many hours of runs) I
saw a skew of 15000 ns with hpet.  However, I added code in
the platform timer read routine (inside all locks but NOT
with interrupts off) to artificially lengthen a platform
timer read, and it made no difference in the measurements.
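
The experiment amounts to something like the sketch below,
which wraps a counter read in a busy-wait. All names here
are hypothetical, fake_counter_read stands in for the real
hpet/pit read, and placing the delay before the read is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*counter_read_fn)(void);

/* Stand-in for a hardware counter read: just counts calls. */
static uint64_t fake_counter_read(void)
{
    static uint64_t ticks;
    return ++ticks;
}

/*
 * Read the counter after burning extra_spins loop iterations,
 * to mimic a platform timer that is slow to access.
 */
static uint64_t read_counter_slowed(counter_read_fn read_fn,
                                    unsigned long extra_spins)
{
    volatile unsigned long i; /* volatile so the loop is not optimized away */

    assert(read_fn != NULL);

    for (i = 0; i < extra_spins; i++)
        ; /* burn time */

    return read_fn();
}
```

If read latency were the cause, raising extra_spins should
push the max skew up by about the added delay.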

2) CONJECTURE: Max skew only occurs on some processors (e.g.
   not on the one that does the platform calibration)

Nope.  If you wait long enough, max skew is fairly close
on all processors (though in some cases it seems to
take a long time... perhaps because of unbalanced load?)

3) CONJECTURE: Max skew occurs on platform timer overflow.

Possibly, but there is certainly not a 1-1 correspondence.
Sometimes there are more large skews than overflows and
sometimes fewer.
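
For context, "overflow" here means a narrow hardware counter
wrapping around. A minimal sketch of extending such a counter
to 64 bits while counting wraps is shown below. The names are
hypothetical, and it assumes at most one wrap between
updates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software extension of a narrow hardware counter. */
struct wide_counter {
    uint64_t mask;     /* e.g. 0xffff for a 16-bit counter */
    uint64_t last_raw; /* last raw reading, already masked */
    uint64_t total;    /* accumulated 64-bit count */
    uint64_t wraps;    /* number of wraps observed */
};

/* Fold a new raw reading into the 64-bit total and return it. */
static uint64_t wide_counter_update(struct wide_counter *wc, uint64_t raw)
{
    assert(wc != NULL);

    raw &= wc->mask;
    if (raw < wc->last_raw)
        wc->wraps++; /* counter went "backwards": it wrapped */

    /* Masked subtraction yields the true delta across a single wrap. */
    wc->total += (raw - wc->last_raw) & wc->mask;
    wc->last_raw = raw;
    return wc->total;
}
```

Comparing a wraps count like this against the number of
large skews over the same interval is one way to check the
correspondence.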

4) CONJECTURE: Artifact of ntpd running

Nope, same skews whether ntpd is running on dom0 or not.

5) CONJECTURE: Related to frequency changes or suspends

Nope, none of these happening on my box.

6) CONJECTURE: "Weirdness can happen" comment in time.c

Nope, this path isn't getting executed.

7) CONJECTURE: Result of natural skews between platform
   timer and tsc, plus jitter.  Unfixable.

Possible, but untested; I'm not sure how to test it.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
