This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] [Patch] x86: enforce strict memory order for x2apic

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] [Patch] x86: enforce strict memory order for x2apic
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Thu, 25 Sep 2008 14:55:43 +0800
Accept-language: en-US
Acceptlanguage: en-US
Delivery-date: Wed, 24 Sep 2008 23:56:21 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acke27gRRlDllztaSMSdaD3e4ujZ5Q==
Thread-topic: [Patch] x86: enforce strict memory order for x2apic
enforce strict memory order for x2apic.

x2apic allows using MSR to access APIC registers. For
efficient access, serializing semantics of WRMSR to
APIC regs are relaxed. This breaks some assumptions
made on old MMIO access style. In those places, wmb()
is used which is a nop on x86, as MMIO APIC regs are
mapped in uncacheable style which enforces program
order. However when x2apic is enabled, it's possible
for wrmsr reordered even before earlier global data
writes. We observed occasional crash in
smp_call_function_interrupt, where stale call_data
is referenced.

We kick a fix by simply using 'mb' instead of 'wmb'.
This can effectively enforce program order, as args
for WRMSR are loaded from memory. Those memory loads
are ordered by mb and so does WRMSR. Alternative is
to add a 'fence' parameter to send_IPI_mask and then
only use 'mb' in x2apic version. We think it's not
necessary to go with that complexity.

Signed-off-by Kevin Tian <kevin.tian@xxxxxxxxx>
Signed-off-by Yunhong Jiang <yunhong.jiang@xxxxxxxxx>

diff -r cfa7cabb56cb xen/arch/x86/smp.c
--- a/xen/arch/x86/smp.c        Thu Sep 25 05:33:35 2008 +0800
+++ b/xen/arch/x86/smp.c        Thu Sep 25 05:52:03 2008 +0800
@@ -186,6 +186,7 @@ void flush_area_mask(cpumask_t mask, con
         flush_cpumask = mask;
         flush_va      = va;
         flush_flags   = flags;
+        mb();
         send_IPI_mask(mask, INVALIDATE_TLB_VECTOR);
         while ( !cpus_empty(flush_cpumask) )
@@ -280,7 +281,7 @@ int on_selected_cpus(

     call_data = &data;
-    wmb();
+    mb();

     send_IPI_mask(selected, CALL_FUNCTION_VECTOR);

Attachment: x2apic-ipi.patch
Description: x2apic-ipi.patch

Xen-devel mailing list