WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops

To: Andreas Olsowski <andreas.olsowski@xxxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] slow live magration / xc_restore on xen4 pvops
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Wed, 2 Jun 2010 16:55:19 +0100
On 02/06/2010 16:46, "Andreas Olsowski" <andreas.olsowski@xxxxxxxxxxxxxxx>
wrote:

> One can see the time gap between the first and the following memory batch
> reads.
> After that, restoration works as expected.
> You might notice that there is "0%" and then "100%" with no steps in between,
> whereas xc_save does show intermediate steps. Is that intentional, or maybe
> another symptom of the same problem?

Does the log look similar for a restore on a fast system (except the
timestamps of course)?

> as for the read_exact stuff:
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H
> RDEXACT {} \;
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H
> rdexact {} \;
> 
> There are no RDEXACT/rdexact matches in my xen source code.

Ah, because you're using 4.0. Well, I wouldn't worry about it just now
anyway. It may be more fruitful to continue looking for a concrete
behavioural difference between a fast and a slow restore, apart from mere
timing, by inspecting logs.

 -- Keir

> In a few hours I will shut down all virtual machines on one of the hosts
> experiencing slow xc_restores, maybe reboot it, and check whether xc_restore
> is any faster without load or utilization on the machine.
> 
> I'll check in with results later.
> 
> 
> On Wed, 2 Jun 2010 08:11:31 +0100
> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
> 
>> Hi Andreas,
>> 
>> This is an interesting bug, to be sure. I think you need to modify the
>> restore code to get a better idea of what's going on. The file in the Xen
>> tree is tools/libxc/xc_domain_restore.c. You will see it contains many
>> DBGPRINTF and DPRINTF calls, some of which are commented out, and some of
>> which may 'log' at too low a priority level to make it to the log file. For
>> your purposes you might change them to ERROR calls as they will definitely
>> get properly logged. One area of possible concern is that our read function
>> (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to
>> have a select() call with a timeout of 1000ms. Do I entirely trust it? Not
>> when we have the inexplicable behaviour that you're seeing. So you might try
>> mapping RDEXACT() to read_exact() instead (which is what we already do when
>> building for __MINIOS__).
>> 
>> This all assumes you know your way around C code at least a little bit.
>> 
>>  -- Keir
> 
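
For readers following the RDEXACT suggestion in the quoted message above, here is a minimal sketch of the two read strategies being contrasted: a plain blocking "read exactly N bytes" loop, and a variant that waits in select() with a timeout before each read(). This is not the libxc code. The function names read_all_blocking and read_all_with_timeout and the timeout_ms parameter are hypothetical, chosen only for illustration; only the 1000ms figure comes from the message.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: block in read() until `size` bytes have arrived.
 * Returns 0 on success, -1 on error or premature end-of-file. */
static int read_all_blocking(int fd, void *buf, size_t size)
{
    size_t offset = 0;

    while (offset < size) {
        ssize_t len = read(fd, (char *)buf + offset, size - offset);
        if (len == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (len <= 0)
            return -1;              /* error, or EOF before `size` bytes */
        offset += (size_t)len;
    }
    return 0;
}

/* Hypothetical helper: same loop, but wait in select() for at most
 * `timeout_ms` milliseconds before each read().
 * Returns 0 on success, -1 on error, EOF, or timeout (errno = ETIMEDOUT). */
static int read_all_with_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    size_t offset = 0;

    while (offset < size) {
        fd_set rfds;
        struct timeval tv;
        int rc;
        ssize_t len;

        /* select() may modify both arguments, so rebuild them every pass. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (rc == -1)
            return -1;              /* select() itself failed */
        if (rc == 0) {
            errno = ETIMEDOUT;      /* no data arrived within the timeout */
            return -1;
        }

        len = read(fd, (char *)buf + offset, size - offset);
        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;
        offset += (size_t)len;
    }
    return 0;
}
```

Calling read_all_with_timeout(fd, buf, len, 1000) would correspond to the 1000ms wait mentioned in the message. Temporarily swapping in the plain blocking loop is one way to test whether the select() path plays any part in the slowdown.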


