xen-ppc-devel

[XenPPC] Overflow in decrementer restore

To: xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
Subject: [XenPPC] Overflow in decrementer restore
From: Amos Waterland <apw@xxxxxxxxxx>
Date: Fri, 1 Sep 2006 15:57:04 -0400
Just to provide background for this commit that went in today:

--- a/xen/arch/powerpc/powerpc64/domain.c
+++ b/xen/arch/powerpc/powerpc64/domain.c
@@ -55,7 +55,10 @@ void load_sprs(struct vcpu *v)
     /* adjust the DEC value to account for cycles while not
      * running this OS */
     timebase_delta = mftb() - v->arch.timebase;
-    v->arch.dec -= timebase_delta;
+    if (timebase_delta > v->arch.dec)
+        v->arch.dec = 0;
+    else
+        v->arch.dec -= timebase_delta;
 }

In the patch titled "Schedule idle domain on secondary processors", 
I mentioned that sometimes the entire system would freeze, so I didn't
want the patch to be considered for merging.

The problem turned out to be that we don't sync the timebases between
the processors.  So if load_sprs() is executed on a different CPU than
save_sprs() was, the delta computed from mftb() is bogus.  The
timebase_delta can overflow into a large unsigned value of up to 149
seconds on JS21.  So the domU was not wrecking the machine; the
decrementer was just being loaded with a huge value every time that
domU's vcpu was loaded on a particular physical CPU, including cpu0.
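
To make the wraparound concrete, here is a minimal standalone sketch of
the arithmetic before and after the patch.  It is not the Xen code: the
function names and the integer widths (a 64-bit timebase and a 32-bit
decrementer value) are assumptions made for illustration, and the
timebase readings are passed in as plain arguments instead of coming
from mftb().

```c
#include <stdint.h>

/* Hypothetical model of the old adjustment: subtract unconditionally.
 * If the delta is larger than dec, the unsigned result wraps around
 * to a huge value. */
uint32_t adjust_dec_unpatched(uint32_t dec, uint64_t tb_now, uint64_t tb_saved)
{
    uint64_t timebase_delta = tb_now - tb_saved;

    return dec - (uint32_t)timebase_delta;
}

/* Hypothetical model of the patched adjustment: clamp to zero when
 * the delta exceeds the remaining decrementer count. */
uint32_t adjust_dec_patched(uint32_t dec, uint64_t tb_now, uint64_t tb_saved)
{
    /* If tb_now was read on a CPU whose timebase is behind tb_saved,
     * this subtraction itself wraps to a very large unsigned value. */
    uint64_t timebase_delta = tb_now - tb_saved;

    if (timebase_delta > dec)
        return 0;
    return dec - (uint32_t)timebase_delta;
}
```

With dec = 1000, a saved timebase of 5000 and a current timebase of
7000, the unpatched version returns 4294966296 while the patched one
returns 0.  With a current timebase of 4000 (as if read on a CPU whose
timebase is behind), the patched version also returns 0, because the
wrapped delta is larger than dec.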

This patch also went in, to pin dom0 to cpu0:

--- a/xen/arch/powerpc/setup.c  Fri Sep 01 12:31:56 2006 -0400
+++ b/xen/arch/powerpc/setup.c  Fri Sep 01 12:37:29 2006 -0400
@@ -343,6 +343,10 @@ static void __init __start_xen(multiboot
     if (NULL == alloc_vcpu(dom0, 0, 0))
         panic("Error creating domain 0 vcpu 0\n");

+    /* The Interrupt Controller will route everything to CPU 0 so we
+     * need to make sure Dom0's vVCPU 0 is pinned to the CPU */
+    dom0->vcpu[0]->cpu_affinity = cpumask_of_cpu(0);
+

We are currently thinking about how best to sync the timebases.  Right
now it looks like pulling in Linux's implementation is the best option.
Any comments would be appreciated.

We did have a real memory controller hang, as discussed on this list in
response to my original post.  It only occurred on Maple, where PIBS
does not clear the HIOR for secondary CPUs, so their first exception was
delivered to 0xX00 + Y.  Hence this patch that went in yesterday:

+        cpu0_hior = 0;

+    mthior(cpu0_hior);

