xen-devel

RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug

To: "Graham, Simon" <Simon.Graham@xxxxxxxxxxx>, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Wed, 31 Jan 2007 13:42:17 +0800
Delivery-date: Tue, 30 Jan 2007 21:42:16 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <342BAC0A5467384983B586A6B0B37671048F8B2B@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcdESFqDCWsISfq5RGeHgxcxVzRqmQACelaDAAAZiDAAAQnwRAATVWmwABVdeuA=
Thread-topic: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug
>From: Graham, Simon [mailto:Simon.Graham@xxxxxxxxxxx]
>Sent: January 31, 2007 3:29
>> On 30/1/07 09:54, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>>
>> > Another simple approach to trigger such a warning is to let
>> > __xen_suspend() jump to smp_resume immediately after
>> > smp_suspend, as a test case for suspend cancel. People can
>> > observe all vcpus except vcpu0 fall into that warning frequently.
>>
>> Do you know if this problem has been observed across many versions of
>> Xen or, e.g., only after the upgrade to 2.6.18?
>>
>
>I'm not sure but I think that we've been seeing something very similar
>when live migrating domains (with 3.0.3/2.6.16.29) -- my understanding is
>that the live migration code takes the domain down to UP, does the
>migration and then restores SMP -- we VERY often see soft lockup
>messages following this (several times per night in our regression
>testing) with stack traces identical to those posted by Kevin.
>
>I also added some instrumentation and in every single case, the 'stolen'
>time is > 5s when we see the soft lockup.
>
>Simon

Hi, Simon,
        Your case should be different from what I saw, and it may be
fixed by the original patch I posted, which however doesn't apply to
the latest tree. In the 2.6.16 version, it is do_timer that calls
softlockup_tick, instead of run_local_timers. So the check on
"stolen > 5s" comes a bit too late: the warning can still be printed
before the watchdog timestamp is adjusted. Could you try the attached
patch to see whether it fixes your live migration case?

Thanks,
Kevin

Attachment: fix_softlockup_2616.patch
Description: fix_softlockup_2616.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel