xen-devel

RE: [Xen-devel] Clock jumped 50 minutes in dom0 caused incorrect 2008 R2

from [Dan Magenheimer]

To:	Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Mark Adams <mark@xxxxxxxxxxxxxxxxxx>
Subject:	RE: [Xen-devel] Clock jumped 50 minutes in dom0 caused incorrect 2008 R2 domU time
From:	Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date:	Tue, 26 Oct 2010 14:54:45 -0700 (PDT)
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to:	<4CC709CB.7090203@xxxxxxxx>
References:	<20101006111618.GA31233@xxxxxxxxxxxxxxxxxx> <4CAC98BF.9010902@xxxxxxxx> <5e238400-51d4-4ed7-8f8b-1f3f44486d45@default> <20101026092254.GA2066@xxxxxxxxxxxxxxxxxx> <4CC709CB.7090203@xxxxxxxx>

>  On 10/26/2010 02:22 AM, Mark Adams wrote:
> > On Thu, Oct 07, 2010 at 07:04:18AM -0700, Dan Magenheimer wrote:
> >> Hi Jeremy and Mark --
> >>
> >> Oddly, I saw that "clocksource tsc unstable" message myself
> >> on a busy 2.6.36-rc5 PV domain yesterday.  While it is possible
> >> that this reflects a hardware problem, the fact that you
> >> saw it on a Nehalem+ Intel processor makes it very unlikely.
> >> The "s" and "t" debug keys (the output of which can be seen via
> >> "xm debug-key s; xm dmesg | tail" in dom0) can help diagnose
> >> the problem if it is indeed a hardware problem or BIOS
> >> problem or the result of a CPU hot-add... all unlikely.
> >>
> >> It IS possible that the code that emulates tsc is broken
> >> somewhere, but I don't think tsc should be emulated by
> >> default for dom0 on a Nehalem+ box... and even if it is,
> >> it is directly based on Xen system time which, if it went
> >> awry, would probably cause major problems.
> >>
> >> Looking through the Linux code that prints that message (in
> >> kernel/time/clocksource.c) it appears that the message
> >> appears if the tsc deviates from the "watchdog clocksource",
> >> which in PV domains is "xen" (or more precisely pvclock
> >> I think).  So most likely, this is a symptom of a problem
> >> with pvclock or the watchdog code in the pvops kernel, not
> >> an indicator that the tsc is actually unstable.
> >>
> > Is there any more information I can provide to help with debugging
> > this?
> > We haven't had the problem since. It could just be a coincidence but
> > it happened around the time that daylight savings occurred in the US
> > (we are in the UK).
> 
> In Linux/Xen it shouldn't have any effect since the clocks are always
> maintained in UTC, then timezone details are applied much later in
> usermode.  But Windows has a bad habit of setting the hardware RTC to
> local time, and mucking about with it for DST changes - but that would
> only be relevant if you booted Windows on your host machine (I don't
> think there's any way for a Windows guest's time to leak into the
> host/dom0's timebase).
> 
> Unfortunately these kinds of time problems can be notoriously hard to
> pin down and diagnose.

This seems to occur when one -- or possibly all -- vcpus
are "spinning" for an unexpectedly long period of time.  If so,
it may be possible to synthesize some kind of long-but-non-infinite
deadlock in a domU kernel that might reproduce the problem.
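
To illustrate why a long spin could trip the watchdog even when the tsc
itself is fine, here is a rough user-space simulation. It is only a sketch
under stated assumptions, not the kernel's code: the function names are
made up for illustration, the threshold of NSEC_PER_SEC >> 4 (62.5 ms) is
my recollection of WATCHDOG_THRESHOLD in kernel/time/clocksource.c, and
the stall is assumed to land between the two clock reads of a single
watchdog check. Both simulated clocks track true time perfectly, so any
"unstable" verdict comes purely from the stall.

```python
# Illustrative simulation only; names and structure are hypothetical.
NSEC_PER_SEC = 1_000_000_000
# Assumption: mirrors the kernel's WATCHDOG_THRESHOLD (62.5 ms).
THRESHOLD_NS = NSEC_PER_SEC >> 4


def take_samples(n_checks, interval_ns, stall_at=None, stall_ns=0):
    """Return (watchdog_ns, clocksource_ns) pairs for n_checks checks.

    Both clocks read the same 'true' time. The only error source is an
    optional stall injected between the two reads of check `stall_at`.
    """
    samples = []
    true_ns = 0
    for i in range(n_checks):
        wd = true_ns                # read the watchdog clock (e.g. "xen")
        if i == stall_at:
            true_ns += stall_ns     # vcpu spins or is descheduled here
        cs = true_ns                # read the clocksource under test (tsc)
        samples.append((wd, cs))
        true_ns += interval_ns      # time until the next watchdog check
    return samples


def run_watchdog(samples, threshold_ns=THRESHOLD_NS):
    """Return the index of the first check that would flag the
    clocksource as unstable, or None if no check does."""
    prev_wd, prev_cs = samples[0]
    for i, (wd, cs) in enumerate(samples[1:], start=1):
        wd_delta = wd - prev_wd     # elapsed time per the watchdog
        cs_delta = cs - prev_cs     # elapsed time per the clocksource
        if abs(cs_delta - wd_delta) > threshold_ns:
            return i
        prev_wd, prev_cs = wd, cs
    return None


if __name__ == "__main__":
    half_second = 500_000_000
    clean = take_samples(10, half_second)
    stalled = take_samples(10, half_second, stall_at=3, stall_ns=200_000_000)
    print("clean run flagged at:", run_watchdog(clean))
    print("stalled run flagged at:", run_watchdog(stalled))
```

In this model a 200 ms stall is flagged at the check where it occurs,
while a 10 ms stall stays under the threshold and goes unnoticed. If the
real failure works along these lines, the deviation reported by the
watchdog would reflect how long the vcpu was stuck rather than any drift
in the tsc.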
