This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] soft lockups during live migrate..

To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Subject: Re: [Xen-devel] soft lockups during live migrate..
From: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
Date: Thu, 5 Nov 2009 19:06:04 -0800
Cc: "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 05 Nov 2009 19:07:58 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20091023100936.GJ20579@xxxxxxxxxxxxxxxxxxxxxxx>
References: <20091022212149.32d73745@xxxxxxxxxxxxxxxxxxxx> <20091023100936.GJ20579@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Fri, 23 Oct 2009 11:09:36 +0100
Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:

> At 05:21 +0100 on 23 Oct (1256275309), Mukesh Rathor wrote:
> > Trying to live-migrate a 64-bit PV guest with 64GB of RAM, running a
> > medium to heavy load on Xen 3.4.0, I am seeing a lot of soft lockups.
> > The soft lockups are causing the cluster FS to reboot dom0. The
> > hardware has 256GB and 32 CPUs.
> > 
> > Looking into the hypervisor through kdb, I see one CPU in
> > sh_resync_all() while the other 31 appear to be spinning on the
> > shadow_lock.
> How many vcpus does the guest have?  Scalability issues in the OOS
> shadow code are more related to number of VCPUs than amount of RAM.
> > I vaguely remember
> > seeing a thread on this a while ago, but I can't seem to find it
> > on Google now. I'm trying to figure out what could be done in the
> > short run.
> The solution (for BS2000) was to plumb in a flag that disabled the OOS
> code for particular domains. 

OK, I'm confused. It appears the OOS disable flag is relevant for HVM
only... I'm running PV.

    if ( is_hvm_domain(d) && !d->arch.paging.shadow.oos_off )   <---
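
To spell out why that check looks HVM-only to me, here is a minimal, self-contained sketch. The names `mock_domain` and `oos_path_taken` are hypothetical stand-ins and not Xen code; only the shape of the condition is taken from the line above.

```c
#include <stdbool.h>

/* Hypothetical model of the two inputs to the quoted condition.
 * These fields are illustrative only; they are not Xen's structures. */
struct mock_domain {
    bool is_hvm;   /* stands in for is_hvm_domain(d) */
    bool oos_off;  /* stands in for d->arch.paging.shadow.oos_off */
};

/* Mirrors the quoted test: the OOS path is taken only for an HVM
 * domain that does not have the OOS-disable flag set. */
static bool oos_path_taken(const struct mock_domain *d)
{
    return d->is_hvm && !d->oos_off;
}
```

Under this model a PV domain never satisfies the condition, whatever the value of `oos_off`, so setting the flag would not change anything for a PV guest.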

> > Actually, things are fine with 32GB/32vcpus. Problem happens with
> > 64GB/32vcpus. Trying the unstable version now.
>
> Interesting.  Have you tried increasing the amount of shadow memory
> you give to the guest?  IIRC xend tries to pick a sensible default but
> if it's too low and you start thrashing things can get very slow indeed.

What do you recommend I start with for 32 VCPUs and 64GB?

