Let me attempt to summarize our disagreement and then
I'd like to stop arguing.
1) You think rdtsc is never safe for an app to use. I
think it is safe in many hardware/software environments
and that the number of safe environments will continue
to increase.
2) You think the performance hit from rdtsc-emulation
is horrible. I think it is significant but relatively
small and acceptable and, if there are cases where it
is not, administrators or virtual appliance providers
can make an informed choice to turn it off.
3) You think app developers can and should be told to
not use rdtsc because it is inherently unsafe and
so Xen doesn't need to ever be concerned with making it
safe. I think app developers will do what they
please, ignorant of the subtleties of rdtsc, and if
their app works on their hardware and on VMware but
not on Xen, they will blame Xen or Linux or their
OS provider or their cloud provider, and probably never
know that their app doesn't work because of rdtsc.
Do you agree that those are the key points of disagreement?
Thanks,
Dan
> -----Original Message-----
> From: Jeremy Fitzhardinge [mailto:jeremy@xxxxxxxx]
> Sent: Tuesday, September 29, 2009 1:02 PM
> To: Dan Magenheimer
> Cc: Xen-Devel (E-mail); Keir Fraser
> Subject: Re: [Xen-devel] [PATCH] replace rdtsc emulation-vs-native xen
> boot option with per-domain (hypervisor part)
>
>
> On 09/29/09 10:34, Dan Magenheimer wrote:
> >> The TSC is not, and has never been reliable.
> >>
> > Your data is stale. Please discuss this with processor
> > and system vendors (I have)
>
> I'm sure they would say that, as they frequently have in the past. And
> then it breaks again.
>
> Even then their guarantee only applies while the processor is powered up
> and hasn't been reset. But resets can occur while the system is
> "running" in the form of S3 suspend events, or even completely powered
> off when suspending to disk.
>
> Besides, the SDM makes no claims about tsc synchronization between CPUs,
> only that the tsc on a given CPU/core runs at a constant rate (at least
> from now on, promise!). At that point you're relying on motherboard/system
> design, which has a lot more scope for brokenness than just core CPUs.
> Large systems simply don't keep all their CPUs in the same clock domain,
> and certainly won't guarantee that for all future system designs.
>
> > and look at the latest upstream Linux.
> >
> The kernel does what it needs to do to make the tsc usable for itself.
> It does not make (and has never made) any guarantees about how the tsc
> appears in usermode (except for the purposes of implementing
> vgettimeofday). You won't find many Linux kernel developers who are
> sympathetic with the idea of making any hard guarantees for bare
> usermode tsc use.
>
> >> Except that it comes with a terrible cost...
> >> This is a massive regression...
> >>
> > It is certainly significant but "terrible" and "massive"
> > are a bit strong. Based on my measurements, the examples
> > you cite will degrade performance by a fraction of a percent.
> >
> How have you measured this? On what systems? Your patch introduces
> this regression on all systems for everyone; it isn't enough to measure
> it on a new Nehalem machine.
>
> >> The fact that you haven't named a single real app...
> >> Are you really arguing on the basis that "some apps
> >> might use tsc in a fragile way" or do you actually have a specific list
> >>
> > I have a (small) specific list. For various reasons,
> > I cannot go into further detail.
> >
>
> Well, that goes back to my point about spending a lot of effort on
> something that can only possibly benefit a (small) set of niche
> apps. Spending the effort on a vsyscall approach would be fast, correct
> and widely beneficial.
>
> You can default it on within Oracle, or even in Oracle's Xen distro.
> It's unreasonable to make this a global default when you're trying to
> solve a local problem. You haven't established this is something that
> anyone else need be concerned about.
>
> Besides, if they want a global sequence number, why not just keep a
> global counter? That's going to be much cheaper and more reliable than
> anything time-based.
>
> J
>
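
One way to read the "global counter" suggestion quoted above is a single
atomically incremented integer. The following is a minimal sketch in C11,
assuming every thread that needs a sequence number shares one address
space. The names global_seq and next_seq are hypothetical and do not come
from any app or patch discussed in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical process-wide sequence counter (illustrative name). */
static _Atomic uint64_t global_seq = 0;

/*
 * Return the next sequence number. atomic_fetch_add returns the value
 * held before the increment, so adding 1 yields 1, 2, 3, ... Each call
 * gets a distinct value no matter which CPU it runs on, and the result
 * does not depend on the TSC at all.
 */
static uint64_t next_seq(void)
{
    return atomic_fetch_add(&global_seq, 1) + 1;
}
```

Two consecutive calls from one thread return n and n+1, and no two callers
ever receive the same value. Under heavy contention the cache line holding
the counter moves between CPUs, so how its cost compares with a native or
emulated rdtsc is something that would have to be measured.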