On 10/06/2010 04:16 AM, Mark Adams wrote:
> Hi Xen-Devel's
> Please see my note below regarding a serious issue where my clock jumped
> in dom0. I'm sending this through to the devel list as I haven't managed
> to glean any clear help from xen-users and the debian bug team are
> unsure what could have caused this.
> Can you confirm if the kernel or xen controls the clock in dom0? I also
> understand that this could be an underlying hardware issue but I have
> another system on exactly the same hardware which hasn't had this occur.
The kernel manages its own time, but it uses the Xen system clock as its
timebase. If the Xen system clock is unstable for some reason, then it
will affect the kernel's timekeeping.
Nothing should be using the tsc clocksource, so I'm not sure why it's
reporting any kind of message. No PV Xen domain can expect the raw
tsc to be stable.
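One way to check which clocksource a running kernel has actually selected is to read the standard Linux sysfs entries. The sketch below assumes the usual /sys/devices/system/clocksource/clocksource0 location; the helper name read_clocksources is made up for illustration, and it returns (None, []) if the files can't be read.

```python
from pathlib import Path

# Standard Linux sysfs location for clocksource information (assumed present).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if unreadable."""
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # sysfs not mounted, or the kernel does not expose these files.
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", " ".join(available))
```

On a PV dom0 one would expect the current clocksource to be "xen", matching the "Switching to clocksource xen" line in dmesg.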
But the tsc is the basis for the Xen clocksource, and if the tsc is
unstable in unexpected ways then it can affect Xen timekeeping. This
can be caused by certain power management modes.
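As a rough sanity check, the delta from the "Clocksource tsc unstable" line in the kernel log quoted below can be converted to minutes. It comes out very close to the 50 minutes the dom0 clock was reported to be off, which suggests (but doesn't prove) that the two are the same event. The variable names here are just for illustration.

```python
# Delta reported in the "Clocksource tsc unstable" kernel log line, in ns.
delta_ns = -2999660303788

delta_s = delta_ns / 1e9    # nanoseconds -> seconds
delta_min = delta_s / 60    # seconds -> minutes

print(f"{delta_s:.1f} s = {delta_min:.2f} min")
# prints: -2999.7 s = -49.99 min
```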
> Any advice on how to investigate further or ensure better clock
> stability across dom0 and domU would be appreciated.
What type of system is it? How many CPUs? What CPU vendor?
> Also is it correct behaviour for Xen to reboot a 2008 R2 HVM domU if
> the time moves this much? My guess is that the domU crashed when the
> time changed, and was thus rebooted automatically. Strangely the Windows
> 2003 server didn't get rebooted.
I don't think there would be any direct connection between the dom0 time
jump and Windows dying, but if the CPU's tsc and/or Xen's timekeeping is
unstable, then Windows might also see a similar time jump and react badly.
> If you need any more info to help please let me know.
> On Mon, Oct 04, 2010 at 01:00:51PM +0100, Mark Adams wrote:
>> On Mon, Oct 04, 2010 at 11:01:10AM +0100, Mark Adams wrote:
>>> Hi All,
>>> I'm running Xen 4.0.1-rc6 Debian squeeze with pvops 2.6.32-21 kernel.
>>> Today I noticed (when kerberos to the domain controllers stopped
>>> working..) that the clock was 50 minutes out in dom0 -- This caused the
>>> HVM windows domain controllers to have the wrong time.
>>> I'm not sure if this is a kernel issue or a xen issue, but the only
>>> thing related is I can see the following in the kernel log:
>>> Oct 2 18:50:33 havhost1 kernel: [623480.977748] Clocksource tsc unstable
>>> (delta = -2999660303788 ns)
>>> But I also see in the dmesg log that xen is using its own clock.
>>> [ 7.676563] Switching to clocksource xen
>>> I can't identify anything else in the logs to indicate when the time
>>> might have changed. I have a few other dom0 at the same level that
>>> haven't decided to change the time.
>>> Can anyone confirm whether xen controls the time or the kernel? Also
>>> when I corrected the time in dom0 it was still wrong in HVM domU -- How
>>> long does it take for this to propagate? (I rebooted the VMs to correct
>>> it immediately).
>>> Any other pointers on how to ensure stability of clocks from dom0 to
>>> domU HVM hosts (and pv for that matter..) would be appreciated.
>> Some further info on this: it appears the HVM domU (windows server 2008)
>> unexpectedly shut down at 18:51, after the unstable clocksource error.
>> qemu-dm logs show a reset "reset requested in cpu_handle_ioreq." and
>> xend.log shows a reboot
>> [2010-10-02 18:51:03 1759] INFO (XendDomainInfo:2088) Domain has shutdown:
>> name=ha-dc1 id=2 reason=reboot.
>> This is like someone issuing "xm reboot domain" is it not? Is it
>> possible that xen could have issued this reboot itself due to a crash? I
>> can't see any crash logs.
> Xen-devel mailing list