This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Clock jumped 50 minutes in dom0 caused incorrect 2008 R2

To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Mark Adams <mark@xxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] Clock jumped 50 minutes in dom0 caused incorrect 2008 R2 domU time
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Tue, 26 Oct 2010 14:54:45 -0700 (PDT)
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 26 Oct 2010 14:58:11 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CC709CB.7090203@xxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20101006111618.GA31233@xxxxxxxxxxxxxxxxxx> <4CAC98BF.9010902@xxxxxxxx> <5e238400-51d4-4ed7-8f8b-1f3f44486d45@default> <20101026092254.GA2066@xxxxxxxxxxxxxxxxxx> <4CC709CB.7090203@xxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>  On 10/26/2010 02:22 AM, Mark Adams wrote:
> > On Thu, Oct 07, 2010 at 07:04:18AM -0700, Dan Magenheimer wrote:
> >> Hi Jeremy and Mark --
> >>
> >> Oddly, I saw that "clocksource tsc unstable" message myself
> >> on a busy 2.6.36-rc5 PV domain yesterday.  While it is possible
> >> that this reflects a hardware problem, the fact that you
> >> saw it on a Nehalem+ Intel processor makes it very unlikely.
> >> The "s" and "t" debug keys (the output of which can be seen via
> >> "xm debug-key s; xm dmesg | tail" in dom0) can help diagnose
> >> the problem if it is indeed a hardware problem or BIOS
> >> problem or the result of a CPU hot-add... all unlikely.
> >>
> >> It IS possible that the code that emulates tsc is broken
> >> somewhere, but I don't think tsc should be emulated by
> >> default for dom0 on a Nehalem+ box... and even if it is,
> >> it is directly based on Xen system time which, if it went
> >> awry, would probably cause major problems.
> >>
> >> Looking through the Linux code that prints that message (in
> >> kernel/time/clocksource.c) it appears that the message
> >> appears if the tsc deviates from the "watchdog clocksource",
> >> which in PV domains is "xen" (or more precisely pvclock
> >> I think).  So most likely, this is a symptom of a problem
> >> with pvclock or the watchdog code in the pvops kernel, not
> >> an indicator that the tsc is actually unstable.
> >>
> > Is there any more information I can provide to help with debugging this?
> > We haven't had the problem since. It could just be a coincidence but it
> > happened around the time that daylight savings occurred in the US (we
> > are in the UK).
> In Linux/Xen it shouldn't have any effect since the clocks are always
> maintained in UTC, then timezone details are applied much later in
> usermode.  But Windows has a bad habit of setting the hardware RTC to
> local time, and mucking about with it for DST changes - but that would
> only be relevant if you booted Windows on your host machine (I don't
> think there's any way for a Windows guest's time to leak into the
> host/dom0's timebase).
> Unfortunately these kinds of time problems can be notoriously hard to
> pin down and diagnose.

This seems to occur when one -- or possibly all -- vcpus
are "spinning" for an unexpectedly long period of time.  If so,
it may be possible to synthesize some kind of long-but-non-infinite
deadlock in a domU kernel, which might reproduce the problem.
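
For reference, the watchdog comparison described in the quoted text above
(the tsc being checked against the "watchdog clocksource") can be sketched
roughly as follows.  This is only an illustration in Python, not the kernel
code: the function names are hypothetical, and the threshold value is an
assumption modelled on WATCHDOG_THRESHOLD in kernel/time/clocksource.c from
kernels of this era.

```python
NSEC_PER_SEC = 1_000_000_000

# Assumed threshold (62.5 ms), modelled on WATCHDOG_THRESHOLD in kernels of
# this era; treat the exact value as an assumption, not a quoted constant.
WATCHDOG_THRESHOLD_NS = NSEC_PER_SEC >> 4


def is_unstable(cs_prev_ns, cs_now_ns, wd_prev_ns, wd_now_ns,
                threshold_ns=WATCHDOG_THRESHOLD_NS):
    """Return True if the clocksource under test (e.g. tsc) advanced by an
    amount that differs from the watchdog's advance by more than the
    threshold over the same interval."""
    cs_delta = cs_now_ns - cs_prev_ns   # elapsed time per the tested clock
    wd_delta = wd_now_ns - wd_prev_ns   # elapsed time per the watchdog
    return abs(cs_delta - wd_delta) > threshold_ns


def first_unstable_sample(samples, threshold_ns=WATCHDOG_THRESHOLD_NS):
    """samples is a list of (cs_ns, wd_ns) readings taken at each watchdog
    tick.  Return the index of the first reading whose interval deviates
    beyond the threshold, or None if every interval agrees."""
    for i in range(1, len(samples)):
        cs_prev, wd_prev = samples[i - 1]
        cs_now, wd_now = samples[i]
        if is_unstable(cs_prev, cs_now, wd_prev, wd_now, threshold_ns):
            return i
    return None
```

Note that the check only says the two clocks disagree; it cannot tell which
of them is wrong.  A misbehaving watchdog (pvclock here) would trigger the
same "clocksource tsc unstable" message as a genuinely unstable tsc.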
