[Xen-devel] RE: TSC scaling and softtsc reprise, and PROPOSAL

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: TSC scaling and softtsc reprise, and PROPOSAL
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 20 Jul 2009 16:52:23 -0700 (PDT)
Cc: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>, John Levon <levon@xxxxxxxxxxxxxxxxx>
In-reply-to: <C68A9A0A.FF56%keir.fraser@xxxxxxxxxxxxx>
> So how bad is the non-softtsc default mode anyway?

A fair question.  To me, "bad" means that TSC going backwards
can be detected by an application that samples TSC in
different threads that have been synchronized through some
simple "ordering" semaphore.  I admit this is a difficult
goal to achieve and few applications in reality will depend
on this exactly, but it is certainly feasible for a database or
a tracing tool to timestamp ordered events this way and
expect to be able to replay them in timestamp order.
(For the sake of any further discussion, let's call the largest
skew that such an application cannot detect tsc-epsilon... if
the skew exceeds tsc-epsilon then the app might observe time
going backwards.)
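
To make tsc-epsilon concrete, here is a minimal sketch in Python of the ordered-sampling scenario. It is a toy model, not a measurement: the function name, the parameters skew_cycles and handoff_cycles, and the assumption that both samples derive from one idealized host counter are all hypothetical and chosen only for illustration.

```python
def observes_backwards(skew_cycles, handoff_cycles, host_tsc=1_000_000):
    """Return True if the later of two ordered samples reads lower than the earlier one.

    skew_cycles    -- how far ahead the first thread's TSC runs (hypothetical)
    handoff_cycles -- real cycles that elapse while the ordering semaphore
                      hands control to the second thread (hypothetical)
    """
    # First sample: taken on the vcpu whose TSC runs ahead by skew_cycles.
    stamp_first = host_tsc + skew_cycles
    # Second sample: taken handoff_cycles later on the vcpu with no skew.
    stamp_second = host_tsc + handoff_cycles
    # The application sees time go backwards only if the later event
    # carries the smaller timestamp.
    return stamp_second < stamp_first


if __name__ == "__main__":
    for skew in (100, 200, 500):
        print(skew, observes_backwards(skew, handoff_cycles=200))
```

In this model the skew becomes visible only once it exceeds the handoff latency, so tsc-epsilon can be read as roughly the fastest possible handoff between two threads, expressed in cycles.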

> Our default timer_mode has guest TSCs track host TSC (plus a fixed
> per-vcpu offset that defaults to having all vcpus of a domain aligned
> to vcpu0 boot = zero tsc).

Are you referring to c/s 19506?  It looks like this code
only runs on a physical machine on which tsc is already
well-behaved.  Is this because the X86_FEATURE_CONSTANT_TSC
bit is passed through unchanged to the guest so that
you are assuming guests "know" whether they can trust TSC
or not?  AFAIK, this bit is not particularly reliable (it reflects
the socket, not the system) and is not well-exposed to applications.
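
The default mode quoted above (guest TSC tracks host TSC plus a fixed per-vcpu offset) can be sketched as follows. This is an illustrative Python model only, not the code from c/s 19506; the class and function names are hypothetical, and the model assumes each physical CPU may return a slightly different host TSC value.

```python
class VcpuTsc:
    """Guest-visible TSC for one vcpu: host TSC plus a fixed offset."""

    def __init__(self, offset):
        self.offset = offset

    def read(self, host_tsc):
        # No trapping or scaling: the guest sees the host value, shifted.
        return host_tsc + self.offset


def align_to_vcpu0_boot(host_tsc_at_vcpu0_boot, num_vcpus):
    """Give every vcpu the same offset so that vcpu0 reads zero at boot."""
    offset = -host_tsc_at_vcpu0_boot
    return [VcpuTsc(offset) for _ in range(num_vcpus)]


if __name__ == "__main__":
    vcpus = align_to_vcpu0_boot(host_tsc_at_vcpu0_boot=5000, num_vcpus=2)
    # Suppose the two physical CPUs disagree by 300 cycles at the same instant.
    print(vcpus[0].read(5000), vcpus[1].read(5300))
```

Because the offset is identical on every vcpu, any disagreement between the physical CPUs' TSCs passes straight through to the guest. In this model the scheme is only as well-behaved as the host TSCs underneath it.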

> Looking at the email thread you cited, all I see is someone from Intel
> saying something about how their code to improve TSC consistency across
> migration avoids RDTSC exiting where possible (which I do not see -- if
> the TSC rates across the hosts do not match closely then RDTSC exiting
> is enabled forever for that domain), and, most bizarrely, that their
> 'solution' may have a tsc drift >10^5 cycles. Where did this huge number
> come from?

Yes, I don't know where that number comes from either.

> What solution is being talked about, and under what conditions might
> the claim hold? Who knows!
> I don't think we have really solid data on either the performance or
> the accuracy side of the debate. And that means we don't have much to
> argue over.

I'm concerned with correctness.  Although sufficient accuracy
provides correctness, I don't think we are anywhere near
tsc-epsilon.  So the only way to guarantee correctness is
via softtsc on all vcpus.
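
As a rough illustration of why emulating TSC reads on all vcpus can guarantee ordering, here is a minimal Python sketch. It is not Xen's softtsc implementation; the class name, the clock_ns callback, and the single shared lock are assumptions made for the example.

```python
import threading


class SoftTsc:
    """Toy emulated TSC shared by all vcpus of one guest (hypothetical)."""

    def __init__(self, guest_hz, clock_ns):
        self._guest_hz = guest_hz      # TSC rate presented to the guest
        self._clock_ns = clock_ns      # callable returning system time in ns
        self._last = 0                 # last value handed out to any vcpu
        self._lock = threading.Lock()

    def rdtsc(self):
        """Stand-in for a trapped RDTSC: scale system time, never go backwards."""
        with self._lock:
            scaled = self._clock_ns() * self._guest_hz // 1_000_000_000
            # Every read returns a value strictly greater than the previous
            # one, whichever vcpu issued it.
            self._last = max(self._last + 1, scaled)
            return self._last


if __name__ == "__main__":
    import time
    tsc = SoftTsc(guest_hz=2_000_000_000, clock_ns=time.monotonic_ns)
    first, second = tsc.rdtsc(), tsc.rdtsc()
    print(second > first)
```

The price in this sketch is that every read is serialized through one lock, which loosely mirrors the performance side of the debate: correctness comes from funnelling all reads through a single source rather than from trusting per-CPU counters.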
