xen-devel

RE: [Xen-devel] cpuidle causing Dom0 soft lockups

To: Jan Beulich <JBeulich@xxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Subject: RE: [Xen-devel] cpuidle causing Dom0 soft lockups
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 9 Feb 2010 15:55:35 +0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, KeirFraser <keir.fraser@xxxxxxxxxxxxx>
>From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx] 
>Sent: February 8, 2010 16:46
>>>> "Tian, Kevin" <kevin.tian@xxxxxxxxx> 06.02.10 02:52 >>>
>>>From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx] 
>>>Sent: February 5, 2010 23:52
>>>
>>>>>> "Jan Beulich" <JBeulich@xxxxxxxxxx> 05.02.10 15:59 >>>
>>>>The next thing I'll do is try to detect when the duty CPU is
>>>>de-scheduled, and pass the duty on to one that is scheduled (i.e. one
>>>>that is currently executing timer_interrupt()).
>>>
>>>This improves the situation (the highest spike I saw so far was 2,000
>>>interrupts per CPU per second), but doesn't get it back the way it
>>>ought to be (apart from the spikes, as with the original version of the
>>>patch, interrupt activity is also generally too high, very erratic, and
>>>even during the more quiet periods doesn't go down to the original
>>>level).
>>>
>>
>>Could you send out your new patch? At the same time, tweaking singleshot
>
>Attached. After another refinement (in stop_hz_timer()) I didn't see
>spikes above 1,000 interrupts per CPU per second anymore. But it's
>still far from being as quiescent as without the patch.

Would you mind elaborating on what the refinement is and how it may
reduce the spikes? Understanding those variants may help draw the big
picture of the whole issue.
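
In the meantime, just to make sure we are reading the design the same
way, here is a minimal strawman of the duty hand-off as I currently
understand it. Every name below (timer_duty_cpu, duty_vcpu_is_running(),
xtime_lock_sketch, timer_interrupt_sketch()) is invented for
illustration and is not taken from your actual patch:

#include <linux/types.h>
#include <linux/seqlock.h>
#include <asm/atomic.h>

/* Invented stand-ins, not from the actual patch. */
static atomic_t timer_duty_cpu = ATOMIC_INIT(0);
static DEFINE_SEQLOCK(xtime_lock_sketch);

/* Stand-in for a runstate query; the real code would check whether
 * the duty vCPU is in RUNSTATE_running from Xen's point of view. */
static bool duty_vcpu_is_running(int duty)
{
        return true;
}

static void timer_interrupt_sketch(int cpu)
{
        int duty = atomic_read(&timer_duty_cpu);

        /* Non-duty vCPUs bail out while the duty vCPU looks runnable. */
        if (cpu != duty && duty_vcpu_is_running(duty))
                return;

        /* Duty vCPU is de-scheduled (or it is us): take over the duty. */
        if (cpu != duty)
                atomic_cmpxchg(&timer_duty_cpu, duty, cpu);

        write_seqlock(&xtime_lock_sketch);
        /* ... update jiffies / wallclock as the duty CPU ... */
        write_sequnlock(&xtime_lock_sketch);
}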

>
>What's also interesting is that there's an initial period (a minute
>or so) where the interrupt rate is really stable (though still not as
>low as previously), and only then does it start becoming erratic.
>

What is the average interrupt rate in the 'stable' and the 'erratic'
case? Is it close to the spike level (~1,000)?

>>timer stat from Xen side would be helpful as I said earlier. :-)
>
>Didn't get to do that yet.

This stat would be helpful, since it would give you the actual
singleshot timer trace instead of just speculation from inside dom0.

At the same time, you could possibly pin the dom0 vCPUs as a
simplified case.

BTW, with your current patch there is still a possibility of several
vCPUs contending for xtime_lock at the same time: the current duty
vCPU may be preempted inside the ISR, and another non-duty vCPU will
then note that it is not in RUNSTATE_running and designate itself to
take over the duty. This may not be a big issue compared to the
original always-contending style, but I am raising it here; please
make sure it is actually what you intended.
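
Spelled out against the strawman above (same invented names), the
window looks like this:

/*
 * vCPU0 (duty)                       vCPU1 (non-duty)
 * -------------------------------    -------------------------------
 * enters timer_interrupt_sketch()
 * passes the duty check
 * preempted by Xen inside the ISR
 *                                    enters timer_interrupt_sketch()
 *                                    sees vCPU0 not RUNSTATE_running
 *                                    claims the duty via cmpxchg
 *                                    takes xtime_lock
 * rescheduled, resumes in the ISR
 * takes xtime_lock (contention!)
 *
 * Both vCPUs now believe they hold the duty, and they serialize only
 * on xtime_lock itself; much rarer than the original always-contending
 * scheme, but not impossible.
 */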

Thanks,
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel