WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] soft lockups during live migrate..

To: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
Subject: Re: [Xen-devel] soft lockups during live migrate..
From: Pasi Kärkkäinen <pasik@xxxxxx>
Date: Fri, 23 Oct 2009 09:20:29 +0300
Cc: "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 22 Oct 2009 23:20:55 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20091022212149.32d73745@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20091022212149.32d73745@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.13 (2006-08-11)
On Thu, Oct 22, 2009 at 09:21:49PM -0700, Mukesh Rathor wrote:
> 
> 
> Trying to migrate a 64-bit PV guest with 64GB of memory running a medium to
> heavy load on Xen 3.4.0 produces a lot of soft lockups. The soft lockups are
> causing the cluster FS to reboot dom0. The hardware has 256GB and 32
> CPUs.
> 

Did you try with Xen 3.4.1 or the latest xen-3.4-testing.hg?

There have been a lot of fixes since 3.4.0.

-- Pasi

> Looking into the hypervisor through kdb, I see one CPU in sh_resync_all()
> while the other 31 appear to be spinning on the shadow_lock. I vaguely remember
> seeing a thread on this a while ago, but I can't seem to find it with
> Google now. I'm trying to figure out what could be done in the short run.
> 
> Now that guests are getting bigger in memory, bugs of this nature are slowly
> popping up under medium/heavy load. I've been thinking about what could be
> done to address those in the long run. One idea is to create a certain class
> of pages that, once migrated, are write-protected, with any write faults on
> them resolved on the target system. Incidentally, IBM took
> the reverse approach: the (VCPU) contexts are migrated and pages are
> pulled in.
> 
> 
> thanks,
> Mukesh
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
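
To illustrate the "reverse approach" mentioned in the quoted message (migrate the VCPU contexts first, then pull pages in), here is a toy sketch in Python. It is not Xen code and does not reflect any actual implementation; the class name PostCopyTarget, its methods, and the dict standing in for the source host's memory are all hypothetical, chosen only to show pages being fetched on first touch.

```python
class PostCopyTarget:
    """Toy model: the guest resumes on the target and pulls pages on first touch.

    Hypothetical illustration only; not how Xen or any other hypervisor does it.
    """

    def __init__(self, source_pages):
        self.source = source_pages  # stands in for memory still on the source host
        self.local = {}             # pages already present on the target
        self.faults = 0             # number of pages pulled from the source

    def _fault_in(self, pfn):
        # Simulated page fault: fetch the page from the source exactly once.
        if pfn not in self.local:
            self.faults += 1
            self.local[pfn] = self.source[pfn]

    def read(self, pfn):
        self._fault_in(pfn)
        return self.local[pfn]

    def write(self, pfn, value):
        # The write is resolved on the target; the source copy is never updated.
        self._fault_in(pfn)
        self.local[pfn] = value


source = {pfn: f"data-{pfn}" for pfn in range(8)}
target = PostCopyTarget(source)
target.write(3, "dirty")   # first touch of page 3: one fault, then a local write
first = target.read(3)     # already local: no new fault
second = target.read(5)    # first touch of page 5: second fault
```

In this model the source never has to track dirty pages after the switch-over, since every write lands on the target; the cost moves to the latency of each first-touch fetch.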
