xen-devel

Re: [Xen-devel] Runaway real/sys time in newer paravirt domUs?

To: Jed Smith <jed@xxxxxxxxxx>
Subject: Re: [Xen-devel] Runaway real/sys time in newer paravirt domUs?
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue, 06 Jul 2010 12:05:01 -0700
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 06 Jul 2010 12:05:53 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <B64CF51D-D1D4-494C-BEA8-F5C6F0A926B6@xxxxxxxxxx>
References: <B64CF51D-D1D4-494C-BEA8-F5C6F0A926B6@xxxxxxxxxx>

On 07/06/2010 09:32 AM, Jed Smith wrote:
> Good morning,
>
> We've had a few reports from domU customers[1] - confirmed by myself - that
> CPU time accounting is very inaccurate in certain circumstances.  This issue
> seems to be limited to x86_64 domUs, starting around the 2.6.32 family (but I
> can't be sure of that).
>
> The symptoms of the flaw include top reporting hours and days of CPU consumed
> by a task which has been running for mere seconds of wall time, as well as the
> time(1) utility reporting hundreds of years in some cases.  In contrast, the
> /proc/stat timers on all four VCPUs increment at roughly the expected rate.
> Needless to say, this is puzzling.
>
> Ævar Arnfjörð Bjarmason has brought to our attention a test case which
> highlights the failure: a simple Perl script[2] that forks and executes
> numerous dig(1) processes.  At the end of his script, time(1) reports
> 268659840m0.951s of user and 38524003m13.072s of system time consumed.  I am
> able to confirm this demonstration using:
>
>  - Xen 3.4.1 on dom0 2.6.18.8-931-2
>  - Debian Lenny on domU 2.6.32.12-x86_64-linode12 [3]
>
> Running Ævar's test case looks like this, in that domU:
>
>   
>> real 0m30.741s
>> user 307399002m50.773s
>> sys 46724m44.192s
>>     
> However, a quick busyloop in Python seems to report the correct time:
>
>   
>> li21-66:~# cat doit.py 
>> for i in xrange(10000000):
>>  a = i ** 5
>>
>> li21-66:~# time python doit.py
>>
>> real 0m16.600s
>> user 0m16.593s
>> sys  0m0.006s
>>     
> I rebooted the domU, and the problem no longer exists.  It seems to be
> transient in nature, and difficult to isolate.  /proc/stat seems to increment
> normally:
>
>   
>> li21-66:/proc# cat stat | grep "cpu " && sleep 1 && cat stat | grep "cpu "
>> cpu  3742 0 1560 700180 1326 0 27 1282 0
>> cpu  3742 0 1562 700983 1326 0 27 1282 0
>>     
> I'm not sure where to begin with this one - any thoughts?
>   
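
To put the figures quoted above on a common scale, here is a minimal Python 3 sketch (the quoted doit.py is Python 2; this helper is not part of the original thread). The function names parse_time_field and cpu_delta are hypothetical. The field order for the aggregate "cpu" line is assumed to follow proc(5): user, nice, system, idle, iowait, irq, softirq, steal, guest.

```python
import re

# Field order of the aggregate "cpu" line, as assumed from proc(5).
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest")

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def parse_time_field(text):
    """Convert a shell-style time value such as '307399002m50.773s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError("unrecognised time value: %r" % text)
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_delta(before, after):
    """Return per-field tick deltas between two 'cpu ...' lines from /proc/stat."""
    old = [int(x) for x in before.split()[1:]]
    new = [int(x) for x in after.split()[1:]]
    return {name: b - a for name, a, b in zip(CPU_FIELDS, old, new)}


if __name__ == "__main__":
    # Values copied from the quoted message.
    user = parse_time_field("307399002m50.773s")
    print("user time: %.1f years" % (user / SECONDS_PER_YEAR))
    print(cpu_delta("cpu  3742 0 1560 700180 1326 0 27 1282 0",
                    "cpu  3742 0 1562 700983 1326 0 27 1282 0"))
```

On the quoted figures this gives roughly 584 years of user time for a 30-second run, while the two /proc/stat samples differ by only 2 system ticks and 803 idle ticks. For comparison, 2**64 nanoseconds is also about 584.5 years; whether that proximity is meaningful is not established in this thread.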

It would be helpful to identify what kernel version the change of
behaviour started in (ideally a git bisect down to a particular change,
but a pair of versions would be close enough).
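
When testing a pair of kernel versions (or stepping through a bisect by hand, since each candidate kernel has to be booted in the domU), a yes/no check for the symptom can help. The following is a rough Python 3 sketch, not something from the original thread: looks_runaway, measure and the slack factor are hypothetical, and the rule used is simply that CPU time charged to child processes should not exceed wall time multiplied by the number of CPUs.

```python
import os
import subprocess
import sys
import time


def looks_runaway(wall_seconds, cpu_seconds, ncpus, slack=2.0):
    """Return True if the charged CPU time is impossible for the elapsed wall time."""
    return cpu_seconds > wall_seconds * ncpus * slack


def measure(command):
    """Run command to completion and return (wall_seconds, child_cpu_seconds)."""
    before = os.times()
    start = time.monotonic()
    subprocess.call(command)
    wall = time.monotonic() - start
    after = os.times()
    # children_* only covers children that have been waited for,
    # which subprocess.call() has done by the time it returns.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return wall, cpu


if __name__ == "__main__":
    # Stand-in workload; the Perl test case from the quoted message
    # would be substituted here.
    workload = [sys.executable, "-c", "sum(i ** 5 for i in range(200000))"]
    wall, cpu = measure(workload)
    verdict = "BAD" if looks_runaway(wall, cpu, os.cpu_count() or 1) else "good"
    print("wall=%.3fs cpu=%.3fs -> %s" % (wall, cpu, verdict))
```

Applied to the numbers in the quoted message (four VCPUs), the check flags the dig(1) test case and passes the Python busy loop. Because the problem was reported as transient, a "good" result from a single run should be treated with caution.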

I think this is the same problem as
https://bugzilla.kernel.org/show_bug.cgi?id=16314

Thanks,
    J
