xen-devel
Re: [Xen-devel] Runaway real/sys time in newer paravirt domUs?
On 07/06/2010 09:32 AM, Jed Smith wrote:
> Good morning,
>
> We've had a few reports from domU customers[1] - confirmed by myself - that CPU
> time accounting is very inaccurate in certain circumstances. This issue seems
> to be limited to x86_64 domUs, starting around the 2.6.32 family (but I can't be
> sure of that).
>
> The symptoms of the flaw include top reporting hours and days of CPU consumed by
> a task which has been running for mere seconds of wall time, as well as the
> time(1) utility reporting hundreds of years in some cases. Contra-indicatively,
> the /proc/stat timers on all four VCPUs increment at roughly the expected rate.
> Needless to say, this is puzzling.
>
> A test case which highlights the failure has been brought to our attention by
> Ævar Arnfjörð Bjarmason, which is a simple Perl script[2] that forks and
> executes numerous dig(1) processes. At the end of his script, time(1) reports
> 268659840m0.951s of user and 38524003m13.072s of system time consumed. I am
> able to confirm this demonstration using:
>
> - Xen 3.4.1 on dom0 2.6.18.8-931-2
> - Debian Lenny on domU 2.6.32.12-x86_64-linode12 [3]
>
> Running Ævar's test case looks like this, in that domU:
>
>
>> real 0m30.741s
>> user 307399002m50.773s
>> sys 46724m44.192s
>>
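For a sense of scale, here is a minimal Python sketch (not part of the original report) that converts the bash-style time fields quoted above into seconds and years. The helper name `parse_time_field` and the 365.25-day year are assumptions made for illustration, and the comparison against 2^64 nanoseconds is only an observation about the numbers, not a diagnosis.

```python
import re

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year
U64_NS_IN_SECONDS = 2**64 / 1e9         # 2^64 nanoseconds, expressed in seconds


def parse_time_field(text):
    """Convert a bash `time` field such as '0m30.741s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the quoted output above.
real = parse_time_field("0m30.741s")
user = parse_time_field("307399002m50.773s")
sys_time = parse_time_field("46724m44.192s")

for label, value in (("real", real), ("user", user), ("sys", sys_time)):
    print(f"{label}: {value:.3f} s = {value / SECONDS_PER_YEAR:.2f} years")

# How far user + sys falls short of 2^64 ns, in seconds.
gap = U64_NS_IN_SECONDS - (user + sys_time)
print(f"2^64 ns - (user + sys) = {gap:.1f} s")
```

With these inputs the user figure works out to roughly 584 years, and user plus sys lands within about seven minutes of 2^64 ns. That might hint at an unsigned 64-bit wraparound somewhere in the accounting, but that is a guess, not something the report establishes.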
> However, a quick busyloop in Python seems to report the correct time:
>
>
>> li21-66:~# cat doit.py
>> for i in xrange(10000000):
>>     a = i ** 5
>>
>> li21-66:~# time python doit.py
>>
>> real 0m16.600s
>> user 0m16.593s
>> sys 0m0.006s
>>
> I rebooted the domU, and the problem no longer exists. It seems to be transient
> in nature, and difficult to isolate. /proc/stat seems to increment normally:
>
>
>> li21-66:/proc# cat stat | grep "cpu " && sleep 1 && cat stat | grep "cpu "
>> cpu 3742 0 1560 700180 1326 0 27 1282 0
>> cpu 3742 0 1562 700983 1326 0 27 1282 0
>>
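The two samples above can be compared field by field. The following is a minimal Python sketch, assuming the standard column order of the aggregate `cpu` line in /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal, guest); the function name `parse_cpu_line` is hypothetical and the counters are in clock ticks, not seconds.

```python
# Column order of the aggregate "cpu" line in /proc/stat (assumed, nine fields
# as in the quoted samples).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")


def parse_cpu_line(line):
    """Map the aggregate 'cpu' line of /proc/stat to named tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:])))


# Samples copied from the quoted output above, taken about one second apart.
before = parse_cpu_line("cpu 3742 0 1560 700180 1326 0 27 1282 0")
after = parse_cpu_line("cpu 3742 0 1562 700983 1326 0 27 1282 0")

# Per-field difference between the two samples, in ticks.
delta = {name: after[name] - before[name] for name in FIELDS}
changed = {name: ticks for name, ticks in delta.items() if ticks}
print(changed)  # {'system': 2, 'idle': 803}
```

Only the system and idle counters move between the two samples, and by small amounts, which matches the observation that /proc/stat increments normally while time(1) and top do not.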
> I'm not sure where to begin with this one - any thoughts?
>
It would be helpful to identify what kernel version the change of
behaviour started in (ideally a git bisect down to a particular change,
but a pair of versions would be close enough).
I think this is the same problem as
https://bugzilla.kernel.org/show_bug.cgi?id=16314
Thanks,
J