xen-devel

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: Xen system skew MUCH worse than tsc skew (was RE: [Xen-devel] RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time))
From: "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Date: Fri, 11 Jul 2008 14:53:39 -0600
Cc: Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
Delivery-date: Fri, 11 Jul 2008 13:55:00 -0700
In-reply-to: <C49CD805.1AD83%keir.fraser@xxxxxxxxxxxxx>
Organization: Oracle Corporation
Reply-to: "dan.magenheimer@xxxxxxxxxx" <dan.magenheimer@xxxxxxxxxx>
> I didn't measure skew across CPUs. I measured jitter between one local
> TSC and the chosen platform timer for calibration (in my case I think
> this was the HPET). I did this because getting a consistent tick rate
> from the platform timer, and from each local TSC, is the basis for the
> calibration algorithm. The more jitter there is between them, the less
> well it will work.
>
> I implemented a user-space program to collect the required stats. It
> used CLI/STI to prevent getting interrupted when reading the timer pair.
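
As a rough illustration of the jitter measurement described in the quoted text, here is a minimal sketch. It assumes the (TSC, platform timer) pairs have already been collected with interrupts disabled; the function name and the parts-per-million metric are hypothetical choices, not taken from the program mentioned above.

```python
def tick_ratio_jitter_ppm(samples):
    """Spread of the TSC-ticks-per-platform-tick ratio, in parts per million.

    samples: list of (tsc, platform) counter pairs, each pair read
    back to back. Hypothetical helper for illustration only.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    ratios = []
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        dp = p1 - p0
        if dp <= 0:
            raise ValueError("platform timer must advance between samples")
        # TSC ticks elapsed per platform-timer tick over this interval
        ratios.append((t1 - t0) / dp)
    mean = sum(ratios) / len(ratios)
    # Peak-to-peak spread relative to the mean ratio
    return (max(ratios) - min(ratios)) / mean * 1e6
```

A perfectly steady pair of counters gives 0 ppm; the larger the value, the less well a calibration that relies on a consistent tick rate can be expected to work.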

Hmmm... if the TSC is known to be stable*, is there any reason to
calibrate against the platform timer?  If the TSC is stable, could
we instead just divide the TSC value by cpu_ghz in get_s_time() and
be done, with no periodic local_time_calibration() necessary?  Since
the TSC is stable on many newer platforms, it would be nice to use
this feature to decrease skew for guests (both PV and HVM).

* "stable" is the term used by Linux to mean that there's no
skew between the different TSCs in an SMP system

I gave this a try and it seems to work so far.  (Fortunately,
my CPU is 3GHz so I just had to divide by 3... I'm not sure
how to divide by a non-integer.)  Max skew for stime is holding
steady at 270 ns, more than 40x better than periodic calibration
with the HPET.
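
On dividing by a non-integer frequency: one common approach is to multiply by a precomputed fixed-point reciprocal instead of dividing. The sketch below is an illustration under stated assumptions, not the hypervisor's actual code; make_scale, tsc_to_ns, and the 32-bit fraction layout are hypothetical names and choices.

```python
NS_PER_SEC = 1_000_000_000

def make_scale(tsc_hz):
    """Return (mul_frac, shift) so that
    ns ~= ((ticks << shift) * mul_frac) >> 32.

    mul_frac is a 32-bit binary fraction; shift covers clocks slower
    than 1 GHz, where ns-per-tick is 1.0 or more.
    """
    if tsc_hz <= 0:
        raise ValueError("tsc_hz must be positive")
    shift = 0
    # Pick the smallest shift for which the fraction fits in 32 bits
    while (NS_PER_SEC << 32) // (tsc_hz << shift) >= (1 << 32):
        shift += 1
    mul_frac = (NS_PER_SEC << 32) // (tsc_hz << shift)
    return mul_frac, shift

def tsc_to_ns(ticks, mul_frac, shift):
    # Python integers do not overflow; in C the product needs a
    # wider-than-64-bit intermediate or a split multiply.
    return ((ticks << shift) * mul_frac) >> 32
```

With this scheme a 2.66GHz part is handled the same way as a 3GHz one: the scale is computed once from the measured frequency, and each conversion is a multiply and a shift. The result is truncated, so one second of ticks may convert to one nanosecond less than 10^9.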

If this sounds good, a design question:  Should this be
controlled:

1) by a boot option, or
2) by the TSC_CONSTANT cpu flag, or
3) when determined dynamically to be safe using code similar
   to arch/x86/kernel/tsc_sync.c in recent Linux kernels

(1) is by far the easiest (perhaps not too late for 3.3?)
while (3) is clearly the best for users but adds lots of
code (bloat/untested)
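
To give a feel for what option (3) involves, here is a small simulation of the basic idea behind a dynamic check: CPUs take turns reading their TSC under a shared lock, and any reading lower than the previous one counts as a "warp". This is a sketch of the idea only, not the Linux code; interleaved_reads and max_warp are hypothetical names, and the per-CPU offsets stand in for real hardware skew.

```python
def interleaved_reads(offsets, rounds, step):
    """Simulate CPUs taking turns reading their TSC.

    offsets: per-CPU TSC offset from a shared true time (simulated skew)
    step:    true-time ticks that elapse between consecutive reads
    Yields (cpu, tsc) in the global order the reads happen.
    """
    t = 0
    for _ in range(rounds):
        for cpu, off in enumerate(offsets):
            yield cpu, t + off
            t += step

def max_warp(readings):
    """Largest backwards jump seen between consecutive readings."""
    last = None
    worst = 0
    for _cpu, tsc in readings:
        if last is not None and tsc < last:
            # Time appeared to go backwards between two CPUs
            worst = max(worst, last - tsc)
        last = tsc
    return worst
```

For example, max_warp(interleaved_reads([1000, 500], rounds=2, step=100)) reports a warp of 400 ticks, while equal offsets report 0. A check like this can only see skew larger than the time between reads, which is part of why a real implementation needs care.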

Thanks,
Dan

