WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-ia64-devel

[Xen-ia64-devel] [PATCH] fix soft lock up caused by xen_timer_interrupt()

To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-ia64-devel] [PATCH] fix soft lock up caused by xen_timer_interrupt()
From: Atsushi SAKAI <sakaia@xxxxxxxxxxxxxx>
Date: Tue, 03 Jul 2007 13:44:23 +0900
Delivery-date: Mon, 02 Jul 2007 21:42:58 -0700
Hi

This patch fixes a soft lockup caused by xen_timer_interrupt().
The lockup arises when local_cpu_data->itm_next becomes inconsistent
with stime_irq and itc_at_irq on CPU0 of the hypervisor.
The patch sets stime_irq and itc_at_irq on every call to
xen_timer_interrupt() to avoid this soft lockup.

In other words, it is caused by xen_timer_interrupt() and
reprogram_timer() (more specifically, vcpu_set_next_timer())
competing over local_cpu_data->itm_next and domain_itm.
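
To make the difference concrete, here is a minimal sketch in C of the two behaviours described above. It is a toy model, not the actual Xen code or the attached patch: struct timer_state, timer_irq_before(), timer_irq_after() and the period field are hypothetical names introduced only for illustration, and time_after() is written in the style of the Linux macro of the same name.

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-safe "a is later than b", in the style of Linux time_after(). */
#define time_after(a, b) ((int64_t)((b) - (a)) < 0)

/* Hypothetical stand-in for the per-CPU timer state named in the mail. */
struct timer_state {
    uint64_t itm_next;   /* next programmed timer match value, in cycles */
    uint64_t itc_at_irq; /* cycle counter snapshot from the last interrupt */
    uint64_t stime_irq;  /* system time snapshot (ns) from the last interrupt */
    uint64_t period;     /* cycles per tick (assumed, must be non-zero) */
};

/* Model of the problem: snapshots are refreshed only when the loop runs. */
void timer_irq_before(struct timer_state *t, uint64_t itc, uint64_t now_ns)
{
    assert(t->period > 0); /* otherwise the loop below would never end */
    while (time_after(itc, t->itm_next)) {
        t->itm_next += t->period;
        t->itc_at_irq = itc;
        t->stime_irq = now_ns;
    }
}

/* Model of the fix: snapshots are refreshed on every call. */
void timer_irq_after(struct timer_state *t, uint64_t itc, uint64_t now_ns)
{
    assert(t->period > 0);
    while (time_after(itc, t->itm_next))
        t->itm_next += t->period;
    t->itc_at_irq = itc;
    t->stime_irq = now_ns;
}
```

In this model, if itm_next has just been moved forward by another path, timer_irq_before() leaves itc_at_irq and stime_irq stale, while timer_irq_after() still refreshes them.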

For example:
1) reprogram_timer() runs and sets local_cpu_data->itm_next,
   and sets domain_itm as the next itm.
2) xen_timer_interrupt() is called, but the following condition is not satisfied:
   while (time_after(ia64_get_itc(), local_cpu_data->itm_next))
   so the stime_irq and itc_at_irq update is skipped.
3) Go to 1).
4) Sometimes local_cpu_data->itm_next rolls back,
   because the ns_to_cycle() result on IA64 is represented in almost 32 bits.
   (This occurs in reprogram_timer().)
5) This causes a soft lockup.
6) The hypervisor resumes working (it does not hang).
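
Step 4 can be illustrated with a small C sketch. This is only one possible reading of the rollback, under the assumption that a nanosecond-to-cycle conversion keeps just the low 32 bits of its result. The functions ns_to_cycles_full(), ns_to_cycles_trunc32() and next_deadline() are hypothetical and are not the Xen implementation of ns_to_cycle().

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-safe "a is later than b", in the style of Linux time_after(). */
#define time_after(a, b) ((int64_t)((b) - (a)) < 0)

/* Hypothetical conversion: nanoseconds to cycles with a full 64-bit result. */
uint64_t ns_to_cycles_full(uint64_t ns, uint64_t cycles_per_sec)
{
    assert(cycles_per_sec > 0);
    /* Split into whole seconds and remainder to avoid early overflow. */
    return (ns / 1000000000ULL) * cycles_per_sec
         + (ns % 1000000000ULL) * cycles_per_sec / 1000000000ULL;
}

/* Assumed failure mode: only the low 32 bits of the result survive. */
uint64_t ns_to_cycles_trunc32(uint64_t ns, uint64_t cycles_per_sec)
{
    return (uint32_t)ns_to_cycles_full(ns, cycles_per_sec);
}

/* Next timer match value for a given counter reading and cycle delta. */
uint64_t next_deadline(uint64_t itc_now, uint64_t delta_cycles)
{
    return itc_now + delta_cycles;
}
```

With an assumed 1 GHz counter, a 5 second delta is 5,000,000,000 cycles, which truncates to 705,032,704. The deadline computed for 5 seconds then lands before the one computed for 4 seconds, so time_after() reports the newer deadline as earlier than the older one.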

To reproduce this issue, I used the following configuration:

1) Boot Xen with pcpu=4 and Dom0 with vcpu=4.
2) Boot domU1 with its vcpu pinned by vcpu-pin 0-1.
3) Boot domU2 with its vcpu pinned by vcpu-pin 0-1.
4) Run two "yes > /dev/null" processes on domU1.
5) Run nothing on domU2 (it is used to check whether the soft lockup occurs).
6) Run a kernel compile with -j4 on Dom0 continuously.
7) Wait 4 or 8 hours for the soft lockup to occur.

Signed-off-by: Atsushi SAKAI <sakaia@xxxxxxxxxxxxxx>

Thanks
Atsushi SAKAI

Attachment: fix_softlockup.patch
Description: Binary data
