


To: xen-changelog@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-changelog] [xen-unstable] [IA64] Fix soft lock up caused by xen_timer_interrupt()
From: Xen patchbot-unstable <patchbot-unstable@xxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Jul 2007 02:55:21 -0700
Delivery-date: Fri, 27 Jul 2007 02:53:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-changelog-request@lists.xensource.com?subject=help>
List-id: BK change log <xen-changelog.lists.xensource.com>
List-post: <mailto:xen-changelog@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=unsubscribe>
Reply-to: xen-devel@xxxxxxxxxxxxxxxxxxx
Sender: xen-changelog-bounces@xxxxxxxxxxxxxxxxxxx
# HG changeset patch
# User Alex Williamson <alex.williamson@xxxxxx>
# Date 1183662240 21600
# Node ID 34f285b57b87e94948f9af22df80888452d85dce
# Parent  40608e5e394ee4bcc5b68a4cbf49973f39327981
[IA64] Fix soft lock up caused by xen_timer_interrupt()

This patch fixes a soft lockup caused by xen_timer_interrupt().
The lockup is caused by an inconsistency between
local_cpu_data->itm_next and stime_irq/itc_at_irq on CPU0 of the
hypervisor.  This patch sets stime_irq and itc_at_irq on every call
to xen_timer_interrupt() to avoid the soft lockup.

In other words, it is caused by a race on local_cpu_data->itm_next
and domain_itm between xen_timer_interrupt() and reprogram_timer()
(more specifically, vcpu_set_next_timer()).

For example:
 1) reprogram_timer() runs and sets local_cpu_data->itm_next, and sets
    domain_itm as the next itm.
 2) xen_timer_interrupt() is called, but the following condition is not
    satisfied:
    while (time_after(ia64_get_itc(), local_cpu_data->itm_next))
    so the updates of stime_irq and itc_at_irq are skipped.
 3) goto 1)
 4) Sometimes local_cpu_data->itm_next is rolled back, because
    ns_to_cycle() on IA64 represents almost 32 bits.
    (This occurred in reprogram_timer().)
 5) This causes the soft lockup.
 6) The hypervisor then resumes working (it does not hang).

To reproduce this issue, I used the following configuration:

 1) boot Xen with pcpu=4 and Dom0 with vcpu=4
 2) boot domU1 with its vcpus pinned (vcpu-pin) to 0-1
 3) boot domU2 with its vcpus pinned (vcpu-pin) to 0-1
 4) run 2 "yes > /dev/null" processes on domU1
 5) run nothing on domU2 (to check whether the soft lockup occurred)
 6) run a kernel compile with -j4 on Dom0 continuously
 7) wait 4 or 8 hours for the soft lockup to occur.

Signed-off-by: Atsushi SAKAI <sakaia@xxxxxxxxxxxxxx>
 xen/arch/ia64/xen/xentime.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff -r 40608e5e394e -r 34f285b57b87 xen/arch/ia64/xen/xentime.c
--- a/xen/arch/ia64/xen/xentime.c       Mon Jul 02 21:06:46 2007 -0600
+++ b/xen/arch/ia64/xen/xentime.c       Thu Jul 05 13:04:00 2007 -0600
@@ -126,9 +126,7 @@ xen_timer_interrupt (int irq, void *dev_
        new_itm = local_cpu_data->itm_next;
-       while (time_after(ia64_get_itc(), new_itm)) {
-               new_itm += local_cpu_data->itm_delta;
+       while (1) {
                if (smp_processor_id() == TIME_KEEPER_ID) {
                         * Here we are in the timer irq handler. We have irqs 
@@ -150,6 +148,10 @@ xen_timer_interrupt (int irq, void *dev_
                local_cpu_data->itm_next = new_itm;
+               if (time_after(new_itm, ia64_get_itc())) 
+                       break;
+               new_itm += local_cpu_data->itm_delta;
        if (!is_idle_domain(current->domain) && !VMX_DOMAIN(current)) {
