To:	"Tian, Kevin" <kevin.tian@xxxxxxxxx>, "Andi Kleen" <andi@xxxxxxxxxxxxxx>
Subject:	RE: [Xen-devel] RE: Guest TSC and Xen (Intel and AMD feedback please)
From:	"Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
Date:	Wed, 2 Jul 2008 19:26:36 -0600
Cc:	"Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date:	Wed, 02 Jul 2008 18:27:18 -0700
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to:	<D470B4E54465E3469E2ABBC5AFAC390F024D94E9@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help:	<mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>
List-post:	<mailto:xen-devel@lists.xensource.com>
List-subscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization:	Oracle Corporation
Reply-to:	"dan.magenheimer@xxxxxxxxxx" <dan.magenheimer@xxxxxxxxxx>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index:	Acjbw3s0gd7IlJ/RRrm74nP7SYm8zwAK2ASQAC8NYtA=

Hi Kevin --

> a) user may mark TSC unstable to migrate among boxes with
> mismatching TSC bits (bus crystal, cpu freq impact, etc.)
>
> b) user may always use TSC as clocksource and then trap RDTSC
> when migrating to a box with mismatching TSC bits
>
> c) user may always use TSC as clocksource when migrating to a
> box with same TSC bits, by adjusting TSC offset
>
> d) migration may be prevented since no reliable methods to ensure
> a)'s effect. Such prevention then falls into generic CPUID comparison
> involved in migration

I think this is an excellent summary of the choices.
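
To make (b) and (c) concrete, here is a rough model in Python. It is only a sketch, not Xen code: the Host class, the function names and all the numbers are made up for illustration. It assumes the guest TSC is the host TSC plus an offset, so (c) works only when source and destination tick at the same rate, and (b) has to rescale every trapped read.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical model of one physical host's TSC (not a Xen structure)."""
    tsc_hz: int   # TSC ticks per second
    tsc_now: int  # current raw host TSC value


def guest_tsc(host: Host, offset: int) -> int:
    """Untrapped read: the guest sees host TSC plus a fixed offset."""
    return host.tsc_now + offset


def offset_after_migration(saved_guest_tsc: int, dest: Host) -> int:
    """Option (c): choose the offset so the guest TSC resumes where it stopped."""
    return saved_guest_tsc - dest.tsc_now


def emulated_guest_tsc(saved_guest_tsc: int, guest_hz: int,
                       dest: Host, dest_tsc_at_resume: int) -> int:
    """Option (b): trap RDTSC and rescale destination ticks to the guest rate."""
    elapsed = dest.tsc_now - dest_tsc_at_resume
    return saved_guest_tsc + elapsed * guest_hz // dest.tsc_hz


# Guest is saved on a 2 GHz source host.
src = Host(tsc_hz=2_000_000_000, tsc_now=5_000_000)
saved = guest_tsc(src, offset=1_000)

# (c) Destination has the same TSC rate: a new offset is enough.
same = Host(tsc_hz=2_000_000_000, tsc_now=90_000)
off = offset_after_migration(saved, same)
same.tsc_now += 2_000                      # 2000 ticks pass after resume
print(guest_tsc(same, off) - saved)        # 2000 guest ticks, as expected

# (b) Destination runs at 3 GHz: reads must be trapped and scaled.
fast = Host(tsc_hz=3_000_000_000, tsc_now=70_000)
resume = fast.tsc_now
fast.tsc_now += 3_000                      # same wall time as 2000 ticks at 2 GHz
print(emulated_guest_tsc(saved, src.tsc_hz, fast, resume) - saved)  # 2000
```

With only an offset on the 3 GHz box, the guest would see 3000 ticks for the same interval, which is the mismatch that (a), (b) and (d) each try to deal with.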

> >> I don't know why you want to single out TSC here.
> >
> >I'm singling it out because it is a per-cpu clock rather
> >than a platform timer... a platform timer can be (and indeed
> >is) offset'ed on migration and that is sufficient if it is
> >selected as the clocksource.
>
> The problem is not per-cpu vs platform, IMO. Instead, the problem
> is that guest TSC is currently conveyed as host TSC plus an offset,
> without a read trap. If you also virtualize a platform clocksource
> with a real one, like dedicating an HPET channel, the same concern
> arises.

Yes, you are right.

> >> That is what Linux is testing for anyways. If it decides it is
> >> ok it is fine.
> >
> >Not sure... if Linux thinks it is running on a uniprocessor,
> >but Xen reschedules this uniprocessor Linux guest on a different
> >processor on the same physical SMP system, does Xen adjust the
> >potential TSC difference?  I could be wrong, but I think not.
>
> Xen can and should do so, since an SMP system is driven by the same
> crystal and thus host TSC is synced. But I guess so far Xen hasn't
> done it, since the TSC drift (dozens of cycles) is smaller than
> the overhead to migrate a vcpu. Thus the guest won't observe a
> backward value in theory.

I'm not sure this is always the case (though the patch I posted
earlier today may indicate that something else is going on, and
that this is what led me to assume TSC was skewing by more than
dozens of cycles).
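
The argument can be written down as simple arithmetic. The sketch below is only a model with invented numbers (the function names and the cycle counts are assumptions, not measurements): a guest reads the TSC on one CPU, the vcpu is moved, and the next read happens on a CPU whose TSC differs by some skew.

```python
def next_read_after_move(last_read: int, move_overhead: int, skew: int) -> int:
    """TSC value seen on the new CPU right after the vcpu is moved.

    skew is the new CPU's TSC minus the old CPU's TSC at the same instant,
    so it is negative when the new CPU is behind.
    """
    return last_read + move_overhead + skew


def goes_backwards(move_overhead: int, skew: int) -> bool:
    """The guest sees time go backwards only if the skew outweighs the overhead."""
    return move_overhead + skew < 0


# Hypothetical numbers: moving a vcpu costs about 5000 cycles.
print(goes_backwards(move_overhead=5_000, skew=-30))      # False: dozens of cycles
print(goes_backwards(move_overhead=5_000, skew=-20_000))  # True: much larger skew
```

So the "in theory" above holds only while the skew really stays below the cost of moving the vcpu.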

> >> The reason why it is an advantage to try to make TSC btw
> >> is that it is *much* faster than any other timer and there
> >> are definitely workloads that are very timer intensive.
>
> Out of curiosity, how much slowdown would using a platform clock
> source cause for a time-intensive workload?

A good question.  We have a workload that spends >10% of its time
doing gettimeoffset_tsc()... not sure if that is realistic, but
it would be interesting to measure how that changes if it used
the HPET instead or if rdtsc were fully emulated.
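
One crude way to get a first number is a userspace loop that times a clock read. The Python sketch below is an assumption about how one might do that, not the workload mentioned above: cost_per_call_ns is a made-up helper, and the result includes interpreter overhead, so only the difference between runs (for example with a different clocksource selected in the guest) would be meaningful.

```python
import time


def cost_per_call_ns(clock, calls: int = 200_000) -> float:
    """Average wall-clock cost, in nanoseconds, of calling clock() once."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    return (time.perf_counter_ns() - start) / calls


for name, clock in [("time.time", time.time),
                    ("time.monotonic", time.monotonic),
                    ("time.perf_counter", time.perf_counter)]:
    print(f"{name}: {cost_per_call_ns(clock):.1f} ns/call")
```

Running the same loop once with the guest using the TSC and once with a platform timer would give a rough ratio, though a proper measurement would have to be done in C inside the guest.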

> >Yes, understood, but if a timer-intensive application makes
> >the assumption that TSC is synchronized and thus will never
> >go backwards, but TSC is not synchronized and it DOES (apparently)
> >go backwards due to Xen scheduler or migration, a slower timer
> >might have been preferred.
>
> Shouldn't this be a software bug instead?

If the application is smart enough to check the TSC bits when
it launches, but stability changes later due to migration/scheduling,
I'm not sure this is an application bug.  Or did I misunderstand
your question?
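
For what it's worth, an application that does not want to trust the launch-time check could defend itself with something like the sketch below. This is a hypothetical single-threaded guard written in Python for brevity (MonotonicGuard and read_raw are invented names); it clamps a backwards reading to the last value and counts how often that happened, at the price of extra work on every read.

```python
class MonotonicGuard:
    """Wrap a raw clock read so callers never observe a value going backwards."""

    def __init__(self, read_raw):
        self._read_raw = read_raw  # callable returning the raw counter value
        self._last = None          # highest value handed out so far
        self.regressions = 0       # number of backwards readings seen

    def read(self):
        raw = self._read_raw()
        if self._last is not None and raw < self._last:
            # Raw clock went backwards: count it and repeat the last value.
            self.regressions += 1
            return self._last
        self._last = raw
        return raw


# Simulated raw readings where the third one goes backwards.
samples = iter([100, 150, 140, 160])
guard = MonotonicGuard(lambda: next(samples))
print([guard.read() for _ in range(4)])  # [100, 150, 150, 160]
print(guard.regressions)                 # 1
```

A multi-threaded application would also need a lock or an atomic compare-and-swap around the last value, which eats into the speed advantage that made TSC attractive in the first place.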

Thanks,
Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
