>>> "Jan Beulich" <jbeulich@xxxxxxxxxx> 30.01.09 11:48 >>>
>This patch was moving the cpumask out of struct page_info into the page
>itself, but I just realized that this is not valid, since the purpose of the
>mask is to avoid a later TLB flush, hence the base assumption is that the
>page might still be in the TLB of some CPU, thus making it possible for a
>misbehaved guest to still write to that page.
>
>Sorry for the mis-numbering,
>Jan
Actually, after some more consideration I think the patch is correct
(minus an issue that has already been in place for a while anyway): any
access a domain may continue to have to a page it is freeing (through stale
TLB entries) is read/execute only. This is because when the type count of a
page (in particular a writeable one) drops to zero, a (domain-)global TLB
flush is performed anyway. Allowing the freeing domain (as well as the one
subsequently allocating the page) to read the cpumask therefore does not
present a problem in my opinion.
The issue mentioned above, which I think the current code has even without
that patch (though I may easily have missed some aspect here), is that
there is no enforcement of an immediate TLB flush when a writeable
page's type count drops to zero - instead the flush is deferred until the
end of the current operation or batch. With the removal of the use of
the per-domain lock in these code paths it is no longer valid to defer
the flush that much - it must be carried out before the page in question
gets unlocked.
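
To illustrate the required ordering (purely a standalone sketch, not Xen
code - the helper names are hypothetical stand-ins for the real flush and
page-lock primitives):

#include <stdbool.h>
#include <stdio.h>

struct fake_page {
    unsigned long type_count;
    bool locked;
};

/* Hypothetical stand-ins for the real primitives. */
static void flush_tlb_mask_stub(void)
{
    puts("TLB flush on all dirty CPUs");
}

static void page_unlock_stub(struct fake_page *pg)
{
    pg->locked = false;
}

static void put_writable_type_ref(struct fake_page *pg)
{
    if ( --pg->type_count == 0 )
        /* The flush must happen here, while the page is still locked,
         * rather than being deferred to the end of the batch. */
        flush_tlb_mask_stub();
    page_unlock_stub(pg);
}

int main(void)
{
    struct fake_page pg = { .type_count = 1, .locked = true };
    put_writable_type_ref(&pg);
    return 0;
}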
And I would think that invalidate_shadow_ldt() has a similar issue, plus
it seems questionable that it does a local flush when acting on the current
vCPU, but a global one for all others.
Jan
*******************************************************
Move the (potentially large) cpumask field of free pages into the page
itself, thus making sizeof(struct page_info) independent of the
configured number of CPUs, which avoids penalizing systems with few
CPUs just because the hypervisor is able to support many.
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>
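
For illustration only (this is not the real struct page_info; the field
set below is a hypothetical reduction), a small standalone program showing
how an embedded cpumask makes the per-page metadata grow with the
configured CPU count, whereas moving the mask out keeps it constant:

#include <stdio.h>

#define NR_CPUS 256   /* hypothetical build-time setting */

typedef struct { unsigned long bits[(NR_CPUS + 63) / 64]; } cpumask_t;

struct page_info_with_mask {
    unsigned long count_info;
    unsigned int  tlbflush_timestamp;
    union { struct { cpumask_t cpumask; } free; } u;
};

struct page_info_without_mask {
    unsigned long count_info;
    unsigned int  tlbflush_timestamp;
    /* free-page cpumask lives in the page itself instead */
};

int main(void)
{
    printf("with embedded mask:   %zu bytes per page\n",
           sizeof(struct page_info_with_mask));
    printf("mask moved into page: %zu bytes per page\n",
           sizeof(struct page_info_without_mask));
    return 0;
}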
--- 2009-01-27.orig/xen/common/page_alloc.c	2009-01-29 15:27:38.000000000 +0100
+++ 2009-01-27/xen/common/page_alloc.c	2009-01-29 15:33:09.000000000 +0100
@@ -376,7 +376,7 @@ static struct page_info *alloc_heap_page
BUG_ON(pg[i].count_info != 0);
/* Add in any extra CPUs that need flushing because of this page. */
- cpus_andnot(extra_cpus_mask, pg[i].u.free.cpumask, mask);
+ cpus_andnot(extra_cpus_mask, PFN_CPUMASK(&pg[i]), mask);
tlbflush_filter(extra_cpus_mask, pg[i].tlbflush_timestamp);
cpus_or(mask, mask, extra_cpus_mask);
@@ -425,11 +425,11 @@ static void free_heap_pages(
if ( (d = page_get_owner(&pg[i])) != NULL )
{
pg[i].tlbflush_timestamp = tlbflush_current_time();
- pg[i].u.free.cpumask = d->domain_dirty_cpumask;
+ PFN_CPUMASK(&pg[i]) = d->domain_dirty_cpumask;
}
else
{
- cpus_clear(pg[i].u.free.cpumask);
+ cpus_clear(PFN_CPUMASK(&pg[i]));
}
}
--- 2009-01-27.orig/xen/include/asm-ia64/mm.h	2009-01-30 08:26:33.000000000 +0100
+++ 2009-01-27/xen/include/asm-ia64/mm.h	2009-01-29 14:24:42.000000000 +0100
@@ -35,8 +35,11 @@ typedef unsigned long page_flags_t;
* Every architecture must ensure the following:
* 1. 'struct page_info' contains a 'struct list_head list'.
* 2. Provide a PFN_ORDER() macro for accessing the order of a free page.
+ * 3. Provide a PFN_CPUMASK() macro for accessing the mask of possibly-tainted
+ * TLBs.
*/
-#define PFN_ORDER(_pfn) ((_pfn)->u.free.order)
+#define PFN_ORDER(_pfn) ((_pfn)->u.free.order)
+#define PFN_CPUMASK(_pfn) ((_pfn)->u.free.cpumask)
#define PRtype_info "016lx"
--- 2009-01-27.orig/xen/include/asm-x86/mm.h	2009-01-29 14:03:37.000000000 +0100
+++ 2009-01-27/xen/include/asm-x86/mm.h	2009-01-30 08:28:27.000000000 +0100
@@ -14,8 +14,18 @@
* Every architecture must ensure the following:
* 1. 'struct page_info' contains a 'struct page_list_entry list'.
* 2. Provide a PFN_ORDER() macro for accessing the order of a free page.
+ * 3. Provide a PFN_CPUMASK() macro for accessing the mask of possibly-tainted
+ * TLBs.
*/
#define PFN_ORDER(_pfn) ((_pfn)->v.free.order)
+#ifdef __i386__
+# define PFN_CPUMASK(_pfn) ((_pfn)->u.free.cpumask)
+#else
+# define PFN_CPUMASK(_pfn) (*(cpumask_t *)page_to_virt(_pfn))
+# if NR_CPUS > PAGE_SIZE * 8
+# error Cannot handle this many CPUs.
+# endif
+#endif
/*
* This definition is solely for the use in struct page_info (and
@@ -72,8 +82,10 @@ struct page_info
/* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
struct {
+#ifdef __i386__
/* Mask of possibly-tainted TLBs. */
cpumask_t cpumask;
+#endif
} free;
} u;
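
As a toy model of the 64-bit PFN_CPUMASK() variant in the patch above (not
Xen code; the direct-map array and helper names are hypothetical): since a
free page's contents are unused, the page's own memory can hold the cpumask,
reached through a direct-map style pointer.

#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NR_CPUS   128

typedef struct { unsigned long bits[(NR_CPUS + 63) / 64]; } cpumask_t;

/* Stand-in for the hypervisor's direct mapping of machine pages. */
static unsigned long fake_direct_map[4][PAGE_SIZE / sizeof(unsigned long)];

struct fake_page_info { unsigned int order; };   /* no cpumask field */
static struct fake_page_info frame_table[4];

static void *page_to_virt_stub(struct fake_page_info *pg)
{
    return fake_direct_map[pg - frame_table];
}

#define PFN_CPUMASK(_pfn) (*(cpumask_t *)page_to_virt_stub(_pfn))

int main(void)
{
    struct fake_page_info *pg = &frame_table[2];

    memset(&PFN_CPUMASK(pg), 0, sizeof(cpumask_t));   /* cpus_clear() */
    PFN_CPUMASK(pg).bits[0] |= 1ul << 5;              /* mark CPU 5 dirty */
    printf("mask word 0 = %#lx\n", PFN_CPUMASK(pg).bits[0]);
    return 0;
}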
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel