On 10/06/2010 09:15 AM, Mark Adams wrote:
> On Wed, Oct 06, 2010 at 08:41:51AM -0700, Jeremy Fitzhardinge wrote:
>> On 10/06/2010 04:16 AM, Mark Adams wrote:
>>> Hi Xen-Devel's
>>>
>>> Please see my note below regarding a serious issue where my clock jumped
>>> in dom0. I'm sending this through to the devel list as I haven't managed
>>> to glean any clear help from xen-users and the debian bug team are
>>> unsure what could have caused this.
>>>
>>> Can you confirm if the kernel or xen controls the clock in dom0? I also
>>> understand that this could be an underlying hardware issue but I have
>>> another system on exactly the same hardware which hasn't had this occur.
>> The kernel manages its own time, but it uses the Xen system clock as its
>> timebase. If the Xen system clock is unstable for some reason, then it
>> will affect the kernel's timekeeping.
>>
>> Nothing should be using the tsc clocksource, so I'm not sure why it's
>> reporting any kind of message. No PV Xen domain can expect the raw
>> tsc to be stable.
> The message was reported in dom0, not domU.
Dom0 is a normal PV domain. It just has a few more privileges than a
regular domU.
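To double-check which clocksource the dom0 kernel has actually selected,
something like the following could be run in dom0. It is only a sketch: it
assumes the usual Linux sysfs layout under
/sys/devices/system/clocksource/clocksource0, and the helper name
read_clocksources is made up for illustration.

```python
from pathlib import Path

# Assumption: the standard Linux sysfs location for clocksource information.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or (None, []) if unreadable.

    Hypothetical helper for illustration; not part of Xen or the kernel.
    """
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # Files missing or unreadable (e.g. not running on Linux).
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current clocksource:   ", current)
    print("available clocksources:", " ".join(available))
```

If the current clocksource is reported as xen, that would agree with the
"Switching to clocksource xen" line in the quoted dmesg output.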
>> But the tsc is the basis for the Xen clocksource, and if the tsc is
>> unstable in unexpected ways then it can affect Xen timekeeping. This
>> can be caused by certain power management modes.
>>
>>> Any advice on how to investigate further or ensure better clock
>>> stability across dom0 and domU would be appreciated.
>> What type of system is it? How many CPUs? What CPU vendor?
> It is a Tyan S7010AGM2NRF with 2 Intel quad-core Xeon E5620 CPUs.
I forget all the magic options that can affect timekeeping (cc:d Dan,
since this stuff is close to his heart).
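As a sanity check on the numbers in the quoted report, the delta in the
"Clocksource tsc unstable" message can be converted to minutes and compared
with the roughly 50-minute offset that was observed. A minimal sketch (the
helper name is hypothetical; the delta value is copied from the kernel log
line quoted below):

```python
NS_PER_SECOND = 1_000_000_000


def delta_ns_to_minutes(delta_ns):
    """Convert a clocksource delta in nanoseconds to minutes."""
    return delta_ns / NS_PER_SECOND / 60


# Delta taken from the "Clocksource tsc unstable" kernel log message.
delta_ns = -2999660303788
print(f"{delta_ns_to_minutes(delta_ns):.2f} minutes")  # prints "-49.99 minutes"
```

That is about 50 minutes in magnitude, which is consistent with the size of
the jump seen in dom0, though it doesn't by itself say what caused it.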
J
> Thanks,
> Mark
>
>>> Also is it correct behaviour for Xen to reboot a Windows 2008 R2 HVM domU if
>>> the time moves this much? My guess is that the domU crashed when the
>>> time changed, and was thus rebooted automatically. Strangely the Windows
>>> 2003 server didn't get rebooted.
>> I don't think there would be any direct connection between the dom0 time
>> jump and Windows dying, but if the CPU's tsc and/or Xen's timekeeping is
>> unstable, then Windows might also see a similar time jump and react badly.
>>
>> J
>>
>>> If you need any more info to help please let me know.
>>>
>>> Thanks,
>>> Mark
>>>
>>> On Mon, Oct 04, 2010 at 01:00:51PM +0100, Mark Adams wrote:
>>>> On Mon, Oct 04, 2010 at 11:01:10AM +0100, Mark Adams wrote:
>>>>> Hi All,
>>>>>
>>>>> I'm running Xen 4.0.1-rc6 on Debian squeeze with a pvops 2.6.32-21 kernel.
>>>>> Today I noticed (when kerberos to the domain controllers stopped
>>>>> working..) that the clock was 50 minutes out in dom0 -- This caused the
>>>>> HVM Windows domain controllers to have the wrong time.
>>>>>
>>>>> I'm not sure if this is a kernel issue or a xen issue, but the only
>>>>> thing related is I can see the following in the kernel log:
>>>>>
>>>>> Oct 2 18:50:33 havhost1 kernel: [623480.977748] Clocksource tsc unstable (delta = -2999660303788 ns)
>>>>>
>>>>> But I also see in the dmesg log that xen is using its own clock.
>>>>>
>>>>> [ 7.676563] Switching to clocksource xen
>>>>>
>>>>> I can't identify anything else in the logs to indicate when the time
>>>>> might have changed. I have a few other dom0 at the same level that
>>>>> haven't decided to change the time.
>>>>>
>>>>> Can anyone confirm whether xen controls the time or the kernel? Also
>>>>> when I corrected the time in dom0 it was still wrong in HVM domU -- How
>>>>> long does it take for this to propagate? (I rebooted the VMs to correct
>>>>> it immediately).
>>>>>
>>>>> Any other pointers on how to ensure stability of clocks from dom0 to
>>>>> domU HVM hosts (and pv for that matter..) would be appreciated.
>>>> Some further info on this: it appears the HVM domU (Windows Server 2008)
>>>> unexpectedly shut down at 18:51, after the unstable clocksource error.
>>>> The qemu-dm logs show a reset ("reset requested in cpu_handle_ioreq.") and
>>>> xend.log shows a reboot:
>>>>
>>>> [2010-10-02 18:51:03 1759] INFO (XendDomainInfo:2088) Domain has shutdown: name=ha-dc1 id=2 reason=reboot.
>>>>
>>>> This is like someone issuing "xm reboot domain" is it not? Is it
>>>> possible that xen could have issued this reboot itself due to a crash? I
>>>> can't see any crash logs.
>>>>
>>>> Cheers,
>>>> Mark
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel
>>>