xen-devel
RE: [Xen-devel] Xen 4 TSC problems
To: <George.Dunlap@xxxxxxxxxxxxx>, <dan.magenheimer@xxxxxxxxxx>
Subject: RE: [Xen-devel] Xen 4 TSC problems
From: <Philippe.Simonet@xxxxxxxxxxxx>
Date: Fri, 30 Sep 2011 06:33:00 +0000
Cc: olivier.hanesse@xxxxxxxxx, jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, keir@xxxxxxx, konrad.wilk@xxxxxxxxxx
Hi Xen developers,
I need some good tips to move forward with my TSC problem.
First, a quick summary of the problem:
- the clock jumps 50 minutes forward:
(xm dmesg)
(XEN) TSC is reliable, synchronization unnecessary
(XEN) Platform timer is 14.318MHz HPET
(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times
(syslog)
Sep 28 17:45:06 dnsit11 kernel: [1970548.356130] Clocksource tsc unstable (delta = -2999660112689 ns)
Sep 11 13:56:50 dnsit22 kernel: [571603.359863] Clocksource tsc unstable (delta = -2999662111513 ns)
- I can't reproduce or force the problem
- it happens on 2 different HP DL 385 G7 machines running Debian squeeze:
xen-hypervisor-4.0-amd64 4.0.1-2
dom0 : linux-image-2.6.32-5-xen-amd64 2.6.32-35
domUs : 5 -> 15 Debian machines
2 * 12-core AMD Opteron(tm) Processor 6174
- I have had this problem since the beginning of September; before that, the machines
had been running for 3 months without problems.
At the beginning of September, I did an upgrade (dom0 and domUs):
linux-image-2.6.32-5-xen-amd64:amd64 (2.6.32-31, automatic) ->
linux-image-2.6.32-5-xen-amd64:amd64 (2.6.32-31, 2.6.32-35)
- what is strange (I don't know whether it is linked to the problem):
/proc/cpuinfo in dom0 gives me:
cpu MHz : 3249880.888
-- or --
cpu MHz : 2300454.255
.... (different after each reboot)
In a domU this value is OK (cpu MHz : 2200.112), and the bogomips value is
also OK (bogomips : 4400.21).
If I start the machine in a non-Xen environment, the values are also OK.
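To relate the kernel's reported delta to the observed jump, here is a minimal Python sketch that extracts the delta from the two syslog lines quoted above and converts it to minutes. The regular expression is based only on the wording of those two messages, and the function name `delta_minutes` is made up for this example. Both deltas work out to roughly 50 minutes.

```python
import re

# Pattern built from the two syslog lines quoted in this mail; other kernel
# versions may word the message differently (assumption).
DELTA_RE = re.compile(r"Clocksource tsc unstable \(delta = (-?\d+) ns\)")


def delta_minutes(line):
    """Return the reported delta in minutes, or None if the line does not match."""
    match = DELTA_RE.search(line)
    if match is None:
        return None
    nanoseconds = int(match.group(1))
    return nanoseconds / 1e9 / 60  # ns -> s -> min


lines = [
    "Sep 28 17:45:06 dnsit11 kernel: [1970548.356130] Clocksource tsc "
    "unstable (delta = -2999660112689 ns)",
    "Sep 11 13:56:50 dnsit22 kernel: [571603.359863] Clocksource tsc "
    "unstable (delta = -2999662111513 ns)",
]

for line in lines:
    print(round(delta_minutes(line), 2), "minutes")
```

The two hosts report deltas that differ by only about 2 ms, which may be worth mentioning when asking whether a fixed-size counter wrap is involved.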
I now have an identical machine on which I can run some tests.
Could you give me some tips on what I could test or implement?
- hardware problem? hypervisor problem? dom0 problem?
- try another hypervisor version?
- try linux-image-3.0.0-1-amd64 3.0.0-3?
- try to reproduce the problem? (how? log it? ...)
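For the "log it" idea, here is a rough Python sketch of what could be recorded periodically on the test machine: the "cpu MHz" values from /proc/cpuinfo that look implausible, and the kernel's current clocksource. The 10000 MHz threshold and the function names are assumptions chosen for illustration; the sysfs path is the standard Linux location for the current clocksource, but whether it is exposed in this dom0 kernel has not been checked.

```python
import re

# Matches lines such as "cpu MHz : 2200.112" (tabs or spaces before the colon).
MHZ_RE = re.compile(r"^cpu MHz\s*:\s*([0-9.]+)", re.MULTILINE)

# Hypothetical sanity limit: no CPU discussed here runs anywhere near 10 GHz.
MAX_PLAUSIBLE_MHZ = 10_000.0

CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def implausible_mhz(cpuinfo_text, limit=MAX_PLAUSIBLE_MHZ):
    """Return every 'cpu MHz' value in the text that exceeds the limit."""
    values = (float(v) for v in MHZ_RE.findall(cpuinfo_text))
    return [v for v in values if v > limit]


def read_text(path):
    """Read a small text file, returning None if it is not available."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


# Values taken from this mail: one dom0 reading and one domU reading.
sample = "cpu MHz : 3249880.888\ncpu MHz : 2200.112\n"
print("implausible in sample:", implausible_mhz(sample))

# On a live system, the same check can run against the real files.
cpuinfo = read_text("/proc/cpuinfo")
if cpuinfo is not None:
    print("implausible on this host:", implausible_mhz(cpuinfo))
clocksource = read_text(CLOCKSOURCE_PATH)
if clocksource is not None:
    print("current clocksource:", clocksource.strip())
```

Run from cron together with a timestamp, this would at least show whether the odd MHz values or a clocksource switch precede a jump.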
Any help is welcome!
Many thanks,
Philippe
> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-
> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of George Dunlap
> Sent: Monday, September 19, 2011 12:40 PM
> To: Dan Magenheimer
> Cc: Keir Fraser; jeremy@xxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx; Philippe
> Simonet; Konrad Wilk
> Subject: Re: [Xen-devel] Xen 4 TSC problems
>
> On Thu, Sep 15, 2011 at 7:38 PM, Dan Magenheimer
> <dan.magenheimer@xxxxxxxxxx> wrote:
> >> I haven't been following this conversation, so I don't know if this
> >> is relevant, but I've just discovered this morning that the TSC warp
> >> check in Xen is done at the wrong time (before any secondary cpus are
> >> brought up), and thus always returns warp=0. I've submitted a patch
> >> to do the check after secondary CPUs are brought up; that should
> >> cause Xen to do periodic synchronization of TSCs when there is drift.
> >
> > Wow, nice catch, George! I wonder if this is the underlying bug for
> > many of the mysterious time problems that have been reported for a
> > year or two now... at least on certain AMD boxes.
> > Any idea when this was introduced? Or has it always been wrong?
>
> Well the comment in 20823:89907dab1aef seems to indicate that's where the
> "assume it's reliable on AMD until proven otherwise" started; that would be
> January 2010.
>
> I looked as far back as 20705:a74aca4b9386, and there the TSC reliability
> checks were again in init_xen_time(). Figuring out where things were before
> then is getting into archeology. :-)
>
> The comment at the top of init_xen_time() is correct now, but from the time
> it was first written through 4.1 it was just plain wrong -- it said
> init_xen_time() happened after all cpus were up, which has never been true.
>
> -George
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel