WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug

To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 30 Jan 2007 21:09:12 +0800
In-reply-to: <C1E4F121.80AB%Keir.Fraser@xxxxxxxxxxxx>
>From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
>Sent: January 30, 2007 20:57
>
>On 30/1/07 12:45 pm, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
>> Actually I'm a bit interested in this case, where the watchdog thread
>> depends on the timer interrupt to be woken, while the next timer
>> interval depends on the soft timer wheel. For the newly online cpu, all
>> the processes previously running on it were migrated to other cpus
>> before it went offline. Thus when it has just come back online, there
>> may be no meaningful timer wheel and little activity on that vcpu. In
>> this case, (LONG_MAX >> 1) may be returned as a very large timeout.
>
>Yeah, but the thread should get migrated back again (or recreated) in
>fairly short order. I think we can agree it should take rather less
>than 10 seconds. :-)

My test is on an 'idle' domain which does nothing. In this case, I'm
not sure whether processes other than the per-cpu kernel threads will
be migrated back while one cpu can still easily handle them. The
per-cpu kernel threads will be re-created, yes, but will they be
woken within 10s to do anything when there's no meaningful workload
on that cpu? Actually this bug may not show when the domain is under
heavy load...

>
>> So under this new watchdog model, simply walking the timer wheel is
>> not enough. Maybe we can force the max timeout value to 1s in
>> safe_halt to special-case this? I'll give this a try. But it will make
>> the current tick-less model a bit tick-ful again. :-)
>
>I'm sure this will fix the issue. But who knows what real underlying issue
>it might be hiding?
>
> -- Keir

I'm not sure whether it hides something. But the current situation
seems like a self-made trap to me: the watchdog thread waits for the
timer interrupt to wake it at 1s intervals, while the timer interrupt
deliberately schedules a longer interval without considering the
watchdog, and then blames the watchdog thread for not running within
10s. :-)
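
To make the clamping idea concrete, here is a rough standalone sketch.
It is not the actual safe_halt or timer-wheel code: HZ, the macro
names, next_timer_delta() and idle_timeout() are all hypothetical and
only model the behaviour discussed above (an empty wheel yielding
LONG_MAX >> 1, a 1s watchdog period, a 10s softlockup threshold).

```c
#include <limits.h>

/* All names and values below are illustrative assumptions,
 * not the real kernel definitions. */
#define HZ                100          /* assumed ticks per second */
#define WATCHDOG_PERIOD   (1 * HZ)     /* watchdog wants to run every 1s */
#define SOFTLOCKUP_THRESH (10 * HZ)    /* complaint threshold: 10s */

/* Hypothetical stand-in for walking the soft timer wheel: with no
 * pending timers, a very large delta comes back. */
static long next_timer_delta(int wheel_empty, long pending_delta)
{
    return wheel_empty ? (LONG_MAX >> 1) : pending_delta;
}

/* Timeout (in ticks) the idle path would program before halting.
 * With clamp set, the cpu never sleeps longer than one watchdog
 * period, so the watchdog thread still gets its wakeup. */
static long idle_timeout(int wheel_empty, long pending_delta, int clamp)
{
    long delta = next_timer_delta(wheel_empty, pending_delta);

    if (clamp && delta > WATCHDOG_PERIOD)
        delta = WATCHDOG_PERIOD;
    return delta;
}
```

In this model, idle_timeout(1, 0, 0) on an empty wheel far exceeds
SOFTLOCKUP_THRESH, while idle_timeout(1, 0, 1) is capped at
WATCHDOG_PERIOD. Shorter pending timers pass through unchanged, so
the cost of the clamp is at most one extra wakeup per second on an
otherwise idle cpu.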

Thanks,
Kevin
