This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] elapse time computing when restarting VM?

To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] elapse time computing when restarting VM?
From: Yufang Zhang <yuzhang@xxxxxxxxxx>
Date: Sun, 15 Aug 2010 08:27:24 -0400 (EDT)
Hi all,
Before restarting a VM, xend currently computes the time elapsed since the VM 
started. If the elapsed time is less than MINIMUM_RESTART_TIME (which is 60s), 
xend refuses to restart the VM and destroys it instead, to avoid restart loops. 
However, when a guest crashes at boot time and enable-dump is enabled, a core 
dump is taken before the guest is restarted, which may take quite a while 
(depending on the memory size of the guest). In this situation the computed 
elapsed time is inflated by the dump time, so it can exceed 
MINIMUM_RESTART_TIME and xend does not destroy the guest. The guest then drops 
into a crash-coredump-restart loop, which wastes both CPU time and *disk space* 
on Domain0.
I actually hit this problem when I upgraded a 2048M guest to a problematic 
kernel. The guest crashed at boot time, a core dump was taken, and then the 
guest rebooted and went through the same steps again. My Domain0 filled up 
with core dump files from that guest. So does it make sense to find a way to 
solve the problem other than just enlarging MINIMUM_RESTART_TIME? Is the 
following patch reasonable? 

diff -r 774dfc178c39 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py   Thu Aug 12 17:06:21 2010 +0100
+++ b/tools/python/xen/xend/XendDomainInfo.py   Mon Aug 16 12:16:45 2010 +0800
@@ -2060,7 +2060,7 @@
                 log.warn('Domain has crashed: name=%s id=%d.',
                          self.info['name_label'], self.domid)
                 self._writeVm(LAST_SHUTDOWN_REASON, 'crash')
+                self.info['crash_time'] = time.time()
                 restart_reason = 'crash'

@@ -2188,7 +2188,12 @@
         old_domid = self.domid
         self._writeVm(RESTART_IN_PROGRESS, 'True')

-        elapse = time.time() - self.info['start_time']
+        if xoptions.get_enable_dump() or self.get_on_crash() \
+               in ['coredump_and_destroy', 'coredump_and_restart']:
+            elapse = self.info['crash_time'] - self.info['start_time']
+        else:
+            elapse = time.time() - self.info['start_time']
         if elapse < MINIMUM_RESTART_TIME:
             log.error('VM %s restarting too fast (Elapsed time: %f seconds). '
                       'Refusing to restart to avoid loops.',

I have tested this situation with the patch, and it works well when the guest 
crashes at boot time.
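
To show the effect of the change outside of xend, here is a minimal standalone 
sketch. The helper names (elapsed_for_restart_check, refuses_restart) and the 
timings are made up for illustration and are not part of xend; only 
MINIMUM_RESTART_TIME and its value of 60s come from the description above.

```python
MINIMUM_RESTART_TIME = 60  # seconds, as described above


def elapsed_for_restart_check(start_time, now, crash_time=None,
                              dump_enabled=False):
    """Return the run time used for the restart-loop check.

    Hypothetical helper: when a core dump is taken, measure up to the
    moment of the crash so that the dump duration is not counted.
    """
    if dump_enabled and crash_time is not None:
        return crash_time - start_time
    return now - start_time


def refuses_restart(elapsed):
    """True if the guest ran for too short a time and should not be restarted."""
    return elapsed < MINIMUM_RESTART_TIME


# Illustrative timeline (seconds): boot at 0, crash at 20, and the core
# dump finishes at 140, which is when the restart check runs.
start, crash, now = 0.0, 20.0, 140.0

old_elapsed = elapsed_for_restart_check(start, now)
new_elapsed = elapsed_for_restart_check(start, now, crash_time=crash,
                                        dump_enabled=True)

print(old_elapsed, refuses_restart(old_elapsed))
print(new_elapsed, refuses_restart(new_elapsed))
```

With these made-up numbers, the old computation sees 140s and allows the 
restart, while the crash-time based computation sees 20s and refuses it.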

Best Regards.

