Keir Fraser wrote:
> Folks,
> New release candidates are tagged:
> http://xenbits.xensource.com/xen-3.3-testing.hg tagged 3.3.2-rc3
> http://xenbits.xensource.com/xen-3.4-testing.hg tagged 3.4.1-rc4
> Please test! I hope to release later this week.
> -- Keir
We are putting our 3.3.2-rc3 based tree through some testing, and are
seeing what looks like a shadow page table issue.
This issue happens with Solaris (both Solaris 10 and more recent
OpenSolaris-based builds), running in an HVM domain. The domain has 3 or
more VCPUs (4 is the usual number). At some point, the Solaris kernel
will panic. The pattern always looks the same: the Solaris kernel
allocates some kmem, and then touches it shortly afterwards (e.g. to
zero it out, write a 0xbaddcafe debug pattern to it, etc). When it
touches the memory, it gets a fatal pagefault (page not present).
However, when inspecting the state of the guest page tables, they all
look fine. The page is mapped, as far as the guest is concerned. That
means that the shadow page table code must have gotten it wrong. An
additional data point that points in this direction is that on a system
that very reliably reproduces the problem, setting hap=1 makes the
problem go away (the problem is reproduced by doing a virt-install,
which doesn't set hap to 1 by default).
I'm trying to narrow the circumstances down to get some useful data out
of this; I'll try disabling the out of sync optimizations in the shadow
code, etc. I've tried to add more instrumentation to the shadow code,
but this often changes the timing just enough to avoid the bug.
Our tree has no local changes to the shadow page table code. We have
not yet tried to reproduce the problem on our 3.4-based tree (we need a
better way to reproduce it in a more controlled environment for that).
I've filed bug #1480 for this.
Oh, and speaking of the out of sync option, Xen doesn't compile if it's
disabled, because some #if checks use && instead of &. Patch attached.
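
To illustrate the difference, here is a minimal standalone sketch. The flag values and the SHOPT_EXAMPLE_OTHER name are made up for this example and are not Xen's actual definitions; only the shape of the #if test mirrors the patch. With out of sync left out of the mask, the && form still evaluates to true because both operands are nonzero, while the & form correctly evaluates to false.

```c
#include <stdio.h>

/* Hypothetical flag values, for illustration only (not Xen's real ones). */
#define SHOPT_EXAMPLE_OTHER  0x01   /* some other optimization (made-up name) */
#define SHOPT_OUT_OF_SYNC    0x02   /* the out of sync optimization bit */

/* A build with out of sync disabled: only the other bit is in the mask. */
#define SHADOW_OPTIMIZATIONS (SHOPT_EXAMPLE_OTHER)

/* Logical AND: true whenever both operands are nonzero, so the guarded
 * code is compiled in even though the out of sync bit is not set. */
#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC)
#define LOGICAL_AND_ENABLED 1
#else
#define LOGICAL_AND_ENABLED 0
#endif

/* Bitwise AND: true only if the out of sync bit is set in the mask. */
#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
#define BITWISE_AND_ENABLED 1
#else
#define BITWISE_AND_ENABLED 0
#endif

/* Small accessors so the result of each preprocessor test can be checked. */
int logical_and_enabled(void) { return LOGICAL_AND_ENABLED; }
int bitwise_and_enabled(void) { return BITWISE_AND_ENABLED; }

/* Print both results; call this from main() to see the difference. */
void print_guard_results(void)
{
    printf("&& guard compiled in: %d\n", logical_and_enabled());
    printf("&  guard compiled in: %d\n", bitwise_and_enabled());
}
```

In the real code, the && form pulls in statements that refer to out of sync symbols (such as the resync variable) which are not available when that optimization is turned off, which is presumably why the build breaks.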
- Frank
diff -r ed718c13f651 xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c Sun Jun 21 20:19:07 2009 -0700
+++ b/xen/arch/x86/mm/shadow/multi.c Tue Jun 23 09:09:11 2009 -0700
@@ -2048,7 +2048,7 @@
if ( r & SHADOW_SET_ERROR )
return NULL;
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
*resync |= 1;
#endif
@@ -2103,7 +2103,7 @@
if ( r & SHADOW_SET_ERROR )
return NULL;
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
*resync |= 1;
#endif
@@ -2200,7 +2200,7 @@
(void) shadow_l1_index(sl1mfn, guest_l1_table_offset(gw->va));
}
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
/* All pages walked are now pagetables. Safe to resync pages
in case level 4 or 3 shadows were set. */
if ( resync )
@@ -2482,7 +2482,7 @@
else
result |= SHADOW_SET_ERROR;
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
if ( mfn_valid(sl3mfn) )
shadow_resync_all(v, 0);
#endif
@@ -2539,7 +2539,7 @@
else
result |= SHADOW_SET_ERROR;
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
if ( mfn_valid(sl2mfn) )
shadow_resync_all(v, 0);
#endif
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel