This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] soft lockups during live migrate..

To: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
Subject: Re: [Xen-devel] soft lockups during live migrate..
From: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Date: Fri, 6 Nov 2009 10:03:44 +0000
Cc: "Xen-Devel \(E-mail\)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 06 Nov 2009 02:04:09 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20091105190604.51cbe111@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20091022212149.32d73745@xxxxxxxxxxxxxxxxxxxx> <20091023100936.GJ20579@xxxxxxxxxxxxxxxxxxxxxxx> <20091105190604.51cbe111@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)

At 03:06 +0000 on 06 Nov (1257476764), Mukesh Rathor wrote:
> Ok, I'm confused. It appears oos disable is relevant for hvm only... I'm
> running PV.
> sh_update_paging_modes()
>     ....
>     if ( is_hvm_domain(d) && !d->arch.paging.shadow.oos_off )   <---
>     ...

Hmmm.  It looks like we never unsync for PV guests.  There's probably no
reason not to, but it probably wouldn't help all that much since we
already intercept all PV pagetable updates.

It's a bit surprising that sh_resync_all() is the function your CPU is
stopped in.  Is that consistently the case or was it just one example?
I suppose for 32 VCPUs it does a lot of locking and unlocking of the
shadow lock.  You could try adding

    if ( !d->arch.paging.shadow.oos_active )

at the top of that function and see if it helps.

> Also,
> >> Actually, things are fine with 32GB/32vcpus. Problem happens with
> >> 64GB/32vcpus. Trying the unstable version now.  
> >Interesting.  Have you tried increasing the amount of shadow memory
> >you give to the guest?  IIRC xend tries to pick a sensible default but
> >if it's too low and you start thrashing things can get very slow indeed.
> What do you recommend I start with for 32VCPUs and 64GB?

It really depends on the workload.  I think the default for a 64GiB
domain will be about 128MiB, so maybe try 256MiB and 512MiB and see if
it makes a difference.  This bit of python will let you change it on the
fly: run it with a domid and a shadow allocation in MiB.

#!/usr/bin/env python
# Set a domain's shadow-memory allocation on the fly.
# Usage: <script> <domid> <MiB>
import sys
import xen.lowlevel.xc
xc = xen.lowlevel.xc.xc()
# shadow_mem_control() returns the new allocation in MiB.
print "%i" % xc.shadow_mem_control(dom=int(sys.argv[1]), mb=int(sys.argv[2]))


Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]

Xen-devel mailing list
