This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] [PATCH] When flush tlb , we need consider the cpu_online_map

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH] When flush tlb , we need consider the cpu_online_map
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Mon, 29 Mar 2010 20:00:59 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
When flushing the TLB by CPU mask, we need to take cpu_online_map into
account. The same applies to the EPT flush.

We noticed that the system sometimes hangs during a CPU online/offline stress
test. The cause is that flush_tlb_mask(), called from __get_page_type(), loops
forever.

This is most likely caused by a small window during CPU offline.
For the to-be-offline CPU, cpu_online_map is updated and interrupts are
disabled in take_cpu_down().

However, __sync_lazy_execstate() is called from idle_task_exit() in
idle_loop() for the to-be-offline CPU. At that time, stop_machine_run has
already finished, and __get_page_type may be called on another CPU before the 

BTW, I noticed that cpu_clear(cpu, cpu_online_map) is called twice in
__cpu_disable. I will ask the owner which one should be removed.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@xxxxxxxxx>

diff -r f3db0ae08304 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c        Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/hvm/vmx/vmx.c        Mon Mar 29 17:49:51 2010 +0800
@@ -1235,6 +1235,9 @@ void ept_sync_domain(struct domain *d)
      * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
     d->arch.hvm_domain.vmx.ept_synced = d->domain_dirty_cpumask;
+    cpus_and(d->arch.hvm_domain.vmx.ept_synced,
+             d->arch.hvm_domain.vmx.ept_synced,
+             cpu_online_map);
                      __ept_sync_domain, d, 1);
diff -r f3db0ae08304 xen/arch/x86/smp.c
--- a/xen/arch/x86/smp.c        Sat Mar 27 16:01:35 2010 +0000
+++ b/xen/arch/x86/smp.c        Mon Mar 29 17:47:25 2010 +0800
@@ -229,6 +229,7 @@ void flush_area_mask(const cpumask_t *ma
         cpus_andnot(flush_cpumask, *mask, *cpumask_of(smp_processor_id()));
+        cpus_and(flush_cpumask, cpu_online_map, flush_cpumask);
         flush_va      = va;
         flush_flags   = flags;
         send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);

Attachment: flush_tlb_onlinemap.patch
Description: flush_tlb_onlinemap.patch
