xen-devel
RE: [Xen-devel] Xen 4 TSC problems

To: <George.Dunlap@xxxxxxxxxxxxx>, <dan.magenheimer@xxxxxxxxxx>
Subject: RE: [Xen-devel] Xen 4 TSC problems
From: <Philippe.Simonet@xxxxxxxxxxxx>
Date: Fri, 30 Sep 2011 06:33:00 +0000
Accept-language: en-US, de-CH
Cc: olivier.hanesse@xxxxxxxxx, jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, keir@xxxxxxx, konrad.wilk@xxxxxxxxxx
 
 
 
 
Hi Xen developers,
I need some good tips to make progress on my TSC problem.
First, the problem in brief:
- The clock jumps about 50 minutes forward. From xm dmesg:
        (XEN) TSC is reliable, synchronization unnecessary
        (XEN) Platform timer is 14.318MHz HPET
        (XEN)  Platform timer appears to have unexpectedly wrapped 10 or more times
        (syslog)
        Sep 28 17:45:06 dnsit11 kernel: [1970548.356130] Clocksource tsc unstable (delta = -2999660112689 ns)
        Sep 11 13:56:50 dnsit22 kernel: [571603.359863] Clocksource tsc unstable (delta = -2999662111513 ns)
- I can't reproduce or force the problem.
- It occurs on 2 different HP DL 385 G7 machines running Debian squeeze:
        xen-hypervisor-4.0-amd64                4.0.1-2
        dom0 : linux-image-2.6.32-5-xen-amd64          2.6.32-35
        domUs : 5 -> 15 Debian machines
        2 * 12-core AMD Opteron(tm) Processor 6174
- I have had this problem since the beginning of September; before that, the
  machines had been running for 3 months without problems.
        At the beginning of September I upgraded dom0 and the domUs:
        linux-image-2.6.32-5-xen-amd64:amd64 (2.6.32-31, automatic)  -> linux-image-2.6.32-5-xen-amd64:amd64 (2.6.32-31, 2.6.32-35)
- What is strange (I don't know whether it is linked to the problem):
        /proc/cpuinfo in dom0 gives me : 
        cpu MHz         : 3249880.888
  -- or --
        cpu MHz         : 2300454.255
        ...            (different after each reboot)

        In a domU this value is OK (cpu MHz         : 2200.112), and the bogomips is also OK (bogomips        : 4400.21).
        If I boot the machine in a non-Xen environment, the values are also OK.
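
The "50 minutes" figure and the "wrapped 10 or more times" message can be checked against each other with a little arithmetic. The following is a minimal sketch, not a confirmed diagnosis. It assumes two things the message does not state: that the HPET main counter is read as a 32-bit value, and that it runs at the nominal 14.31818 MHz (the log only says 14.318MHz). The names HPET_HZ, COUNTER_BITS, wrap_period_s and delta_minutes are made up for this illustration.

```python
# Assumptions (not stated in the message): the HPET main counter is read as a
# 32-bit value and ticks at the nominal 14.31818 MHz.
HPET_HZ = 14_318_180
COUNTER_BITS = 32


def wrap_period_s(freq_hz=HPET_HZ, bits=COUNTER_BITS):
    """Seconds a free-running counter of the given width takes to wrap once."""
    return (1 << bits) / freq_hz


def delta_minutes(delta_ns):
    """Magnitude of a nanosecond delta, expressed in minutes."""
    return abs(delta_ns) / 1e9 / 60


# The two deltas reported in the syslog lines of the message.
deltas_ns = [-2999660112689, -2999662111513]

for d in deltas_ns:
    wraps = abs(d) / 1e9 / wrap_period_s()
    print(f"{d} ns = {delta_minutes(d):.2f} min = {wraps:.3f} wrap periods")
```

Under these assumptions one wrap takes just under 300 seconds, and both reported deltas come out at almost exactly ten wrap periods, which is roughly 50 minutes. That is at least consistent with the two symptoms sharing a cause, but it does not say what the cause is.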
        
I now have an identical machine on which I can run some tests.
Could you give me some tips on what I could test or implement?
        - Is it a hardware problem, a hypervisor problem, or a dom0 problem?
        - Should I try another hypervisor version?
        - Should I try linux-image-3.0.0-1-amd64 3.0.0-3?
        - Should I try reproducing the problem? (How? How to log it?)
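
On the "log it" question, one possible approach is a small watcher that samples the wall clock at a fixed interval and records any sample whose spacing is far from that interval. This is only a sketch under assumptions: the function names find_jumps and watch, the 1 second interval and the 5 second threshold are all hypothetical, and a deliberate clock step (for example by NTP) or a paused guest would be reported as well.

```python
import time


def find_jumps(timestamps, interval_s, threshold_s):
    """Return (index, excess_seconds) for each consecutive pair of samples
    whose spacing differs from the expected interval by more than threshold_s."""
    jumps = []
    for i in range(1, len(timestamps)):
        excess = (timestamps[i] - timestamps[i - 1]) - interval_s
        if abs(excess) > threshold_s:
            jumps.append((i, excess))
    return jumps


def watch(interval_s=1.0, threshold_s=5.0, iterations=10):
    """Sample the wall clock every interval_s seconds and print suspicious steps.

    Hypothetical helper: interval, threshold and output format are arbitrary.
    """
    prev = time.time()
    for _ in range(iterations):
        time.sleep(interval_s)
        now = time.time()
        for _, excess in find_jumps([prev, now], interval_s, threshold_s):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
            print(f"{stamp} clock stepped by {excess:+.3f} s")
        prev = now
```

Calling watch() with a large iterations value in dom0 and in a domU, with the output redirected to a file, would give timestamps to correlate with the xm dmesg and syslog messages.
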
All help is welcome!
Many thanks,
Philippe
> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of George Dunlap
> Sent: Monday, September 19, 2011 12:40 PM
> To: Dan Magenheimer
> Cc: Keir Fraser; jeremy@xxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx; Philippe
> Simonet; Konrad Wilk
> Subject: Re: [Xen-devel] Xen 4 TSC problems
> 
> On Thu, Sep 15, 2011 at 7:38 PM, Dan Magenheimer
> <dan.magenheimer@xxxxxxxxxx> wrote:
> >> I haven't been following this conversation, so I don't know if this
> >> is relevant, but I've just discovered this morning that the TSC warp
> >> check in Xen is done at the wrong time (before any secondary cpus are
> >> brought up), and thus always returns warp=0.  I've submitted a patch
> >> to do the check after secondary CPUs are brought up; that should
> >> cause Xen to do periodic synchronization of TSCs when there is drift.
> >
> > Wow, nice catch, George!  I wonder if this is the underlying bug for
> > many of the mysterious time problems that have been reported for a
> > year or two now... at least on certain AMD boxes.
> > Any idea when this was introduced?  Or has it always been wrong?
> 
> Well the comment in 20823:89907dab1aef seems to indicate that's where the
> "assume it's reliable on AMD until proven otherwise" started; that would be
> January 2010.
> 
> I looked as far back as 20705:a74aca4b9386, and there the TSC reliability
> checks were again in init_xen_time().  Figuring out where things were before
> then is getting into archeology. :-)
> 
> The comment at the top of init_xen_time() is correct now, but from the time
> it was first written through 4.1 it was just plain wrong -- it said
> init_xen_time() happened after all cpus were up, which has never been true.
> 
>  -George
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
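
The point in the quoted message about the warp check always returning warp=0 when it runs before secondary CPUs are up can be illustrated with a toy model. This is not Xen's code or algorithm: max_warp and the per-CPU offsets below are invented for the illustration, and warp is simplified to the spread between the largest and smallest TSC offset among the CPUs that are online when the check runs.

```python
def max_warp(tsc_offsets):
    """Toy warp measure: spread (in ticks) between the largest and smallest
    TSC offset among the CPUs passed in. Not how Xen measures warp."""
    if len(tsc_offsets) < 2:
        # With a single CPU there is nothing to compare against.
        return 0
    return max(tsc_offsets) - min(tsc_offsets)


# Hypothetical per-CPU TSC offsets in ticks, boot CPU first.
all_cpus = [0, 120, -35, 980]

early = max_warp(all_cpus[:1])  # check runs while only the boot CPU is online
late = max_warp(all_cpus)       # check runs after the secondary CPUs are up

print(f"warp seen before secondary CPUs are up: {early}")
print(f"warp seen after secondary CPUs are up:  {late}")
```

In this model the early check reports zero no matter how far apart the other CPUs are, which is the behaviour the quoted message describes; only the late check can see any difference.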