This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

RE: [Xen-devel] write_tsc in a PV domain?

To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] write_tsc in a PV domain?
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Thu, 27 Aug 2009 20:29:18 -0700 (PDT)
Cc: "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Thu, 27 Aug 2009 20:29:50 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4A96DA35.2020109@xxxxxxxx>
> On 08/27/09 01:48, Alan Cox wrote:
> >> as part of its ABI.  It is not a general tsc.  You can't meaningfully
> >> execute "rdtsc" without also being (indirectly) aware of what pcpu its
> >> running on and applying the appropriate corrections to turn it into
> >> system monotonic time.  Executing rdtsc willy-nilly gets you useless
> >> results; fortunately no PV Xen kernel does that.
> >>
> > Actually for user space this isn't at all true. You can use rdtsc
> > directly and sample the data for things like profiling then correct for
> > things like spikes and skews from processor switches by filtering.
>
> If an app is sophisticated to do this correctly then it doesn't need any
> special assistance from a hypervisor to make the tsc well-behaved.  It
> should continue to work even in a Xen guest where both the process can
> skip between VCPUs and the VCPUs can skip between PCPUs.

No, I don't think this is true.  An enterprise app that binds processes
to fixed physical processors on a physical machine can make
assumptions about the results of rdtsc that aren't valid when
the vcpus can skip between pcpus.  Further, like Linux itself,
applications may test assumptions about tsc at startup that are
assumed to remain valid for the life of the app, which is
perfectly reasonable on a physical machine and a bad mistake
in a virtualized environment.
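
To make the disagreement concrete, here is a minimal sketch in C of the kind
of userland filtering described in the quoted text above. It is an
illustration only, not code from any application or from Xen. The function
name filter_tsc_deltas and the max_delta threshold are hypothetical. The
samples are assumed to be raw counter readings taken in program order (on
x86 with GCC or Clang they could come from __rdtsc() in <x86intrin.h>).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: walk raw counter readings taken in program order,
 * copy the plausible per-interval deltas into out[] and return how many
 * were kept.  A delta is rejected when the counter appears to go
 * backwards (cur < prev) or when it is larger than max_delta (a spike).
 * out[] must have room for at least n - 1 entries.
 */
size_t filter_tsc_deltas(const uint64_t *samples, size_t n,
                         uint64_t max_delta, uint64_t *out)
{
    size_t kept = 0;

    for (size_t i = 1; i < n; i++) {
        uint64_t prev = samples[i - 1];
        uint64_t cur = samples[i];

        if (cur < prev)
            continue;   /* backwards jump, e.g. a switch to another cpu */

        uint64_t delta = cur - prev;
        if (delta > max_delta)
            continue;   /* forward spike, e.g. preemption or a cpu switch */

        out[kept++] = delta;
    }
    return kept;
}
```

Note that max_delta is exactly the kind of assumption at issue: a threshold
chosen once, on a physical machine with the process bound to one processor,
is not guaranteed to stay meaningful when the vcpus can skip between pcpus.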

> >> No, write_tsc is meaningless, and anyone trying to execute it is not
> >> even wrong.
> >>
> > Writing to the tsc is perfectly reasonable providing the tsc is an
> > advertised feature. Being able to use the tsc becomes much more relevant
> > with newer processors which have sane tsc implementations in the
> > architecture however.
>
> Apparently on some large servers the tsc is only synced and sane within
> a NUMA node, and not globally across all processors, so any app which
> assumed sane tsc behaviour would break when the hardware gets scaled up.

True, but any app that tries to run on a NUMA machine without
being aware of the idiosyncrasies of a NUMA machine probably
has worse problems to deal with than tsc sync.  Further, there
are many, many apps that will likely never run on those
machines.  Are we going to penalize all apps all the time
because some might run some of the time on a machine where
tsc is not synced?

> But in this case I'm talking specifically about a Xen PV guest, where
> the tsc is claimed for use by the Xen clocksource ABI.

I just don't understand how you can say that a valid userland
instruction is "claimed for use" by Xen (or Linux or both).
