xen-devel

RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug

To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 30 Jan 2007 20:11:44 +0800
In-reply-to: <C1E4C9B1.87F8%Keir.Fraser@xxxxxxxxxxxx>
>From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
>Sent: January 30, 2007 18:09
>
>On 30/1/07 09:54, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
>> If we don't take into account blocked time, maybe we have to disable
>> softlockup check. Say an idle process gets a timeout value larger than
>> 10s by next_timer_interrupt, and then blocked. If, unfortunately, there's
>> no other events happening before that timeout value, this vcpu will see
>> softlockup warning after that timeout immediately since this period is
>> not categorized into stolen time.
>
>Presumably softlockup threads are killed and re-created when VCPUs are
>offlined and onlined. Perhaps the re-creation is taking a long time? But

That should not be the case, since the softlockup warning keeps
appearing after the cpu has been brought online.

>10s would be a *very* long time. And once it is created and bound to the
>correct VCPU we should never see long timeouts when blocking (since
>softlockup thread timeout is never longer than a few seconds).

Yeah, I noticed this point just after sending out the mail.

>
>Perhaps there is a bug in our cpu onlining code -- a big timeout like that
>does need investigating. I don't think we can claim this bug is root-caused
>yet so it's premature to be applying patches.
>

Agreed. I'll investigate this point further. I just quickly compared
the watchdog thread between 2.6.18 and 2.6.16. In 2.6.16, the thread
sleeps with an explicit 1s schedule timeout, while 2.6.18 wakes up
the watchdog thread once per second from the timer interrupt
(softlockup_tick). One distinct difference resulting from this change
is that the watchdog thread in 2.6.16 has a soft timer registered,
while the one in 2.6.18 does not. I suspect this may affect the
decision made by next_timer_interrupt.
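
To make the suspicion concrete, here is a toy model of it, not kernel
code. It assumes the idle path blocks until the earliest pending soft
timer. The function names, the 30s timer and the 60s fallback are all
made up for illustration; only the 1s watchdog period and the 10s
warning threshold come from this thread.

```python
SOFTLOCKUP_THRESHOLD = 10.0  # seconds; the warning threshold discussed in this thread


def next_timer_timeout(now, soft_timers, fallback):
    """Return how long an idle CPU may block: the time until the earliest
    pending soft timer, or `fallback` (hypothetical) if none is registered."""
    pending = [expiry - now for expiry in soft_timers if expiry > now]
    return min(pending) if pending else fallback


def would_warn(block_time, threshold=SOFTLOCKUP_THRESHOLD):
    """True if blocking this long would trip the check, assuming the blocked
    period is not accounted as stolen time."""
    return block_time > threshold


now = 100.0

# 2.6.16 style: the watchdog sleeps with a 1s schedule timeout, so a soft
# timer is pending and bounds the block time.
t_2616 = next_timer_timeout(now, soft_timers=[now + 1.0, now + 30.0], fallback=60.0)

# 2.6.18 style: the watchdog is woken from the tick and registers no soft
# timer, so only the unrelated (hypothetical) 30s timer is pending.
t_2618 = next_timer_timeout(now, soft_timers=[now + 30.0], fallback=60.0)

print("2.6.16-style block time:", t_2616, "warn:", would_warn(t_2616))
print("2.6.18-style block time:", t_2618, "warn:", would_warn(t_2618))
```

If the model holds, the missing 1s soft timer is what lets the computed
timeout grow past the threshold. This still needs to be confirmed
against the real next_timer_interrupt path.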

By the way, do you think the scheduler may do something to
penalize a newly onlined vcpu? From the code I didn't see that, since
a newly woken vcpu is always boosted... In practice, however, I found
that the virtual timer interrupt count for that cpu increased slowly,
as shown by 'cat /proc/interrupts'. Sometimes it may even freeze for
dozens of seconds. But yes, this may be the symptom rather than the
cause. :-)
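
For anyone who wants to watch the per-cpu timer count rather than
eyeball repeated 'cat /proc/interrupts' output, a small sampling sketch
follows. It assumes the usual layout (a header row of CPU names, then
one row per interrupt with one count per cpu). The label used to pick
the timer row is an assumption and must be adjusted to whatever the
guest kernel actually prints.

```python
import time


def parse_irq_counts(text, label):
    """Return the per-CPU counts of the first /proc/interrupts row whose
    description contains `label`, or None if no row matches."""
    lines = text.splitlines()
    if not lines:
        return None
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        # fields[0] is the IRQ id, then one count per CPU, then the description.
        if len(fields) > ncpus and label in " ".join(fields[ncpus + 1:]):
            return [int(count) for count in fields[1:ncpus + 1]]
    return None


def deltas(prev, curr):
    """Per-CPU increase between two samples (truncated to the shorter one,
    which matters if a cpu is hotplugged between samples)."""
    return [c - p for p, c in zip(prev, curr)]


def watch(label, interval=1.0, samples=5, path="/proc/interrupts"):
    """Print the per-CPU increase of the matching row once per interval."""
    prev = None
    for _ in range(samples):
        with open(path) as f:
            curr = parse_irq_counts(f.read(), label)
        if curr is None:
            raise ValueError(f"no row matching {label!r} in {path}")
        if prev is not None:
            print(deltas(prev, curr))
        prev = curr
        time.sleep(interval)
```

Calling watch("timer") inside the guest would print one list of per-cpu
increments per second. A cpu whose entry stays at 0 for many
consecutive samples is showing the freeze described above.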

Thanks,
Kevin
