
To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [RFC][PATCH] domheap optimization for NUMA
From: Andre Przywara <andre.przywara@xxxxxxx>
Date: Thu, 03 Apr 2008 12:39:14 +0200
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, "Edwin.Zhai" <edwin.zhai@xxxxxxxxx>
In-reply-to: <C419D386.15E31%keir.fraser@xxxxxxxxxxxxx>
References: <C419D386.15E31%keir.fraser@xxxxxxxxxxxxx>
User-agent: Thunderbird 1.5.0.10 (X11/20070409)
Keir,

> Yes, but it's a bad interface, particularly when the function is called
> alloc_domheap_pages_on_node(). Pass in a nodeid. Write a helper function to
> work out the nodeid from the domain*.
I was just looking at this code, too, so I fixed this. Eventually
alloc_heap_pages is called, which deals with nodes only, so I replaced
cpu with node everywhere else, too. Now __alloc_domheap_pages and
alloc_domheap_pages_on_node are almost the same (except for parameter
ordering), so I removed the former, since the naming of the latter is
better. Passing node numbers instead of cpu numbers requires cpu_to_node
and asm/numa.h; if you think there is a better way, I am all ears.

> That's fine. If you reference numa stuff then you need numa.h.

> But vcpu_to_node and domain_to_node as well as cpu_to_node, please. There's
> no need to be open-coding v->processor everywhere. Also in future we might
> care to pick node based on v's affinity map rather than just current
> processor value. And usage of d->vcpu[0] without checking for != NULL is
> asking to introduce edge-case bugs. We can easily do that NULL check in one
> place if we implement domain_to_node().
Ok, I did this. I provided NUMA_NO_NODE for the case that d->vcpu[0] is
NULL; this is resolved to the current node in alloc_heap_pages (at least
for now). By the way, can we solve the DMA_BITSIZE problem (your mail from
28th Feb) with this? If no node is specified, use the current behaviour of
preferring non-DMA zones; else stick to the given node.
If you agree, I will implement this.
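
For illustration, a standalone sketch of that fallback path (not Xen code:
the two-node topology and the stubbed smp_processor_id() are made up for
the example):

#include <stdio.h>

#define NUMA_NO_NODE 0xff

struct vcpu   { unsigned int processor; };
struct domain { struct vcpu *vcpu[1]; };

/* Made-up topology: cpus 0-1 on node 0, cpus 2-3 on node 1. */
static unsigned char cpu_to_node[4] = { 0, 0, 1, 1 };
/* Stub: pretend we are currently running on cpu 2. */
static unsigned int smp_processor_id(void) { return 2; }

static unsigned int domain_to_node(struct domain *d)
{
    return (d != NULL && d->vcpu[0] != NULL)
        ? cpu_to_node[d->vcpu[0]->processor]
        : NUMA_NO_NODE;
}

int main(void)
{
    unsigned int node = domain_to_node(NULL);  /* no vcpu[0]: don't care */

    /* ...which alloc_heap_pages() then resolves to the current node: */
    if (node == NUMA_NO_NODE)
        node = cpu_to_node[smp_processor_id()];

    printf("NULL domain ends up on node %u\n", node);  /* prints: node 1 */
    return 0;
}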

> And, while I'm thinking about the interfaces, let's just stick to
> alloc_domheap_page() and alloc_domheap_pages(). Let's add a flags parameter
> to the former (so it matches the latter in that respect) and let's add a
> MEMF_node() flag subtype (similar to MEMF_bits). Semantics will be that if
> MEMF_node(node) is provided then we try to allocate memory from node; else
> we try to allocate memory from a node local to specified domain; else if
> domain is NULL then we ignore locality.
Sounds reasonable. I changed this, too. If domain is NULL, domain_to_node
will return NUMA_NO_NODE, which eventually makes alloc_heap_pages ignore
locality.
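
In sketch form the three cases look like this (a hedged example with a
stubbed allocator, not the real alloc_domheap_pages(); the MEMF_node()
definition is the one from the patch below):

#include <stdio.h>

#define NUMA_NO_NODE 0xff
#define _MEMF_node   8
#define MEMF_node(n) ((((n) + 1) & 0xff) << _MEMF_node)

struct domain { int unused; };  /* dummy for this sketch */

/* Stub standing in for alloc_domheap_page(): just report which node
 * the decoded memflags would steer the allocation to. */
static void alloc_domheap_page(struct domain *d, unsigned int memflags)
{
    unsigned int node = (((memflags >> _MEMF_node) & 0xff) - 1) & 0xff;

    if (node != NUMA_NO_NODE)
        printf("try node %u first\n", node);
    else if (d != NULL)
        printf("try the domain's local node first\n");
    else
        printf("no locality preference\n");
}

int main(void)
{
    struct domain dom;

    alloc_domheap_page(NULL, MEMF_node(2));  /* explicit node request      */
    alloc_domheap_page(&dom, 0);             /* domain given, no flag      */
    alloc_domheap_page(NULL, 0);             /* NULL domain: no locality   */
    return 0;
}

(In the patch itself the domain-local case is handled at the call sites,
via MEMF_node(domain_to_node(d)), rather than inside the allocator.)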

> Since zero is probably a valid numa nodeid we can define MEMF_node() as
> something like ((((node)+1)&0xff)<<8). Then since NUMA_NO_NODE==0xff
> everything works nicely: MEMF_node(NUMA_NO_NODE) is equivalent to not
> specifying MEMF_node() at all, which is what we would logically expect.
Good idea.
> NUMA_NO_NODE probably needs to be pulled out of asm-x86/numa.h and made the
> official arch-neutral way to specify 'don't care' for numa nodes.
Is this really needed? I provided memflags=0 in all don't-care cases; this
should work and is more compatible. But beware: this silently assumes, in
page_alloc.c's alloc_domheap_pages, that NUMA_NO_NODE is 0xFF; otherwise
the trick will not work.
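
To make that assumption explicit, a small standalone test of the
encode/decode arithmetic (again not Xen code, just the two expressions from
the patch; the helper name memflags_to_node is made up here):

#include <assert.h>
#include <stdio.h>

#define NUMA_NO_NODE 0xff
#define _MEMF_node   8
#define MEMF_node(n) ((((n) + 1) & 0xff) << _MEMF_node)

/* The decode from alloc_domheap_pages(): undo the +1 bias.  The final
 * &0xff maps "flag absent" (0) back to 0xff -- i.e. it only yields
 * NUMA_NO_NODE because NUMA_NO_NODE happens to be 0xff. */
static unsigned int memflags_to_node(unsigned int memflags)
{
    return (((memflags >> _MEMF_node) & 0xff) - 1) & 0xff;
}

int main(void)
{
    assert(MEMF_node(NUMA_NO_NODE) == 0);        /* don't-care == no flag */
    assert(memflags_to_node(0) == NUMA_NO_NODE); /* no flag == don't-care */
    assert(memflags_to_node(MEMF_node(0)) == 0); /* node 0 round-trips    */
    assert(memflags_to_node(MEMF_node(42)) == 42);
    printf("MEMF_node() round trip OK\n");
    return 0;
}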

Attached again: a diff against my last version, and the full patch (for
some reason a missing bracket slipped into my last one, sorry for that).

This is only quick-tested (booted and created a guest on each node).

Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>

Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 277-84917
----to satisfy European Law for business letters:
AMD Saxony Limited Liability Company & Co. KG,
Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register Court Dresden: HRA 4896, General Partner authorized
to represent: AMD Saxony LLC (Wilmington, Delaware, US)
General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
diff -r 848a36114bb9 xen/arch/ia64/xen/mm.c
--- a/xen/arch/ia64/xen/mm.c    Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/ia64/xen/mm.c    Thu Apr 03 12:25:31 2008 +0200
@@ -822,7 +822,7 @@ __assign_new_domain_page(struct domain *
 
     BUG_ON(!pte_none(*pte));
 
-    p = alloc_domheap_page(d);
+    p = alloc_domheap_page(d, 0);
     if (unlikely(!p)) {
         printk("assign_new_domain_page: Can't alloc!!!! Aaaargh!\n");
         return(p);
@@ -2316,7 +2316,7 @@ steal_page(struct domain *d, struct page
         unsigned long new_mfn;
         int ret;
 
-        new = alloc_domheap_page(d);
+        new = alloc_domheap_page(d, 0);
         if (new == NULL) {
             gdprintk(XENLOG_INFO, "alloc_domheap_page() failed\n");
             return -1;
@@ -2603,7 +2603,7 @@ void *pgtable_quicklist_alloc(void)
 
     BUG_ON(dom_p2m == NULL);
     if (!opt_p2m_xenheap) {
-        struct page_info *page = alloc_domheap_page(dom_p2m);
+        struct page_info *page = alloc_domheap_page(dom_p2m, 0);
         if (page == NULL)
             return NULL;
         p = page_to_virt(page);
diff -r 848a36114bb9 xen/arch/ia64/xen/tlb_track.c
--- a/xen/arch/ia64/xen/tlb_track.c     Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/ia64/xen/tlb_track.c     Thu Apr 03 12:25:31 2008 +0200
@@ -48,7 +48,7 @@ tlb_track_allocate_entries(struct tlb_tr
                 __func__, tlb_track->num_entries, tlb_track->limit);
         return -ENOMEM;
     }
-    entry_page = alloc_domheap_page(NULL);
+    entry_page = alloc_domheap_page(NULL, 0);
     if (entry_page == NULL) {
         dprintk(XENLOG_WARNING,
                 "%s: domheap page failed. num_entries %d limit %d\n",
@@ -84,7 +84,7 @@ tlb_track_create(struct domain* d)
     if (tlb_track == NULL)
         goto out;
 
-    hash_page = alloc_domheap_page(NULL);
+    hash_page = alloc_domheap_page(NULL, 0);
     if (hash_page == NULL)
         goto out;
 
diff -r 848a36114bb9 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/domain.c     Thu Apr 03 12:25:31 2008 +0200
@@ -172,7 +172,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
     if ( !d->arch.mm_arg_xlat_l3 )
     {
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         d->arch.mm_arg_xlat_l3 = page_to_virt(pg);
@@ -190,7 +190,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
         if ( !l3e_get_intpte(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -199,7 +199,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         l2tab = l3e_to_l2e(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]);
         if ( !l2e_get_intpte(l2tab[l2_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -207,7 +207,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         }
         l1tab = l2e_to_l1e(l2tab[l2_table_offset(va)]);
         BUG_ON(l1e_get_intpte(l1tab[l1_table_offset(va)]));
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         l1tab[l1_table_offset(va)] = l1e_from_page(pg, PAGE_HYPERVISOR);
@@ -253,7 +253,7 @@ static void release_arg_xlat_area(struct
 
 static int setup_compat_l4(struct vcpu *v)
 {
-    struct page_info *pg = alloc_domheap_page(NULL);
+    struct page_info *pg = alloc_domheap_page(NULL, 0);
     l4_pgentry_t *l4tab;
     int rc;
 
@@ -478,8 +478,7 @@ int arch_domain_create(struct domain *d,
 
 #else /* __x86_64__ */
 
-    if ( (pg = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor))) == NULL )
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
             goto fail;
     d->arch.mm_perdomain_l2 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l2);
@@ -488,9 +487,8 @@ int arch_domain_create(struct domain *d,
             l2e_from_page(virt_to_page(d->arch.mm_perdomain_pt)+i,
                           __PAGE_HYPERVISOR);
 
-    if ( (pg = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor))) == NULL )
-            goto fail;
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
+        goto fail;
     d->arch.mm_perdomain_l3 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l3);
     d->arch.mm_perdomain_l3[l3_table_offset(PERDOMAIN_VIRT_START)] =
diff -r 848a36114bb9 xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c       Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/domain_build.c       Thu Apr 03 12:25:31 2008 +0200
@@ -630,7 +630,7 @@ int __init construct_dom0(
     }
     else
     {
-        page = alloc_domheap_page(NULL);
+        page = alloc_domheap_page(NULL, 0);
         if ( !page )
             panic("Not enough RAM for domain 0 PML4.\n");
         l4start = l4tab = page_to_virt(page);
diff -r 848a36114bb9 xen/arch/x86/hvm/stdvga.c
--- a/xen/arch/x86/hvm/stdvga.c Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/hvm/stdvga.c Thu Apr 03 12:25:31 2008 +0200
@@ -514,8 +514,8 @@ void stdvga_init(struct domain *d)
     
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
-        if ( (pg = alloc_domheap_page_on_node(NULL,
-            cpu_to_node(d->vcpu[0]->processor))) == NULL )
+        if ( (pg = alloc_domheap_page(NULL,
+            MEMF_node(domain_to_node(d)))) == NULL )
                 break;
         s->vram_page[i] = pg;
         p = map_domain_page(page_to_mfn(pg));
diff -r 848a36114bb9 xen/arch/x86/hvm/vlapic.c
--- a/xen/arch/x86/hvm/vlapic.c Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/hvm/vlapic.c Thu Apr 03 12:25:31 2008 +0200
@@ -917,7 +917,7 @@ int vlapic_init(struct vcpu *v)
 int vlapic_init(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
-    unsigned int memflags = 0;
+    unsigned int memflags = MEMF_node (vcpu_to_node(v));
 
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "%d", v->vcpu_id);
 
@@ -926,11 +926,10 @@ int vlapic_init(struct vcpu *v)
 #ifdef __i386__
     /* 32-bit VMX may be limited to 32-bit physical addresses. */
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        memflags = MEMF_bits(32);
+        memflags |= MEMF_bits(32);
 #endif
 
-    vlapic->regs_page = alloc_domheap_pages_on_node(NULL, 0, memflags,
-        cpu_to_node(v->processor));
+    vlapic->regs_page = alloc_domheap_page(NULL, memflags);
     if ( vlapic->regs_page == NULL )
     {
         dprintk(XENLOG_ERR, "alloc vlapic regs error: %d/%d\n",
diff -r 848a36114bb9 xen/arch/x86/mm/hap/hap.c
--- a/xen/arch/x86/mm/hap/hap.c Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/hap/hap.c Thu Apr 03 12:25:31 2008 +0200
@@ -136,8 +136,8 @@ static struct page_info *hap_alloc_p2m_p
          && mfn_x(page_to_mfn(pg)) >= (1UL << (32 - PAGE_SHIFT)) )
     {
         free_domheap_page(pg);
-        pg = alloc_domheap_pages_on_node(NULL, 0, MEMF_bits(32),
-            cpu_to_node(d->vcpu[0]->processor));
+        pg = alloc_domheap_page(NULL, MEMF_bits(32) |
+            MEMF_node(domain_to_node(d)));
         if ( likely(pg != NULL) )
         {
             void *p = hap_map_domain_page(page_to_mfn(pg));
@@ -201,8 +201,7 @@ hap_set_allocation(struct domain *d, uns
         if ( d->arch.paging.hap.total_pages < pages )
         {
             /* Need to allocate more memory from domheap */
-            pg = alloc_domheap_page_on_node(NULL,
-                cpu_to_node(d->vcpu[0]->processor));
+            pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)));
             if ( pg == NULL )
             {
                 HAP_PRINTK("failed to allocate hap pages.\n");
diff -r 848a36114bb9 xen/arch/x86/mm/paging.c
--- a/xen/arch/x86/mm/paging.c  Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/paging.c  Thu Apr 03 12:25:31 2008 +0200
@@ -100,8 +100,8 @@ static mfn_t paging_new_log_dirty_page(s
 static mfn_t paging_new_log_dirty_page(struct domain *d, void **mapping_p)
 {
     mfn_t mfn;
-    struct page_info *page = alloc_domheap_page_on_node(NULL,
-        cpu_to_node(d->vcpu[0]->processor));
+    struct page_info *page = alloc_domheap_page(NULL,
+        MEMF_node(domain_to_node(d)));
 
     if ( unlikely(page == NULL) )
     {
diff -r 848a36114bb9 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c   Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/mm/shadow/common.c   Thu Apr 03 12:25:31 2008 +0200
@@ -1250,8 +1250,7 @@ static unsigned int sh_set_allocation(st
         {
             /* Need to allocate more memory from domheap */
             sp = (struct shadow_page_info *)
-                alloc_domheap_pages_on_node(NULL, order, 0,
-                    cpu_to_node(d->vcpu[0]->processor));
+                alloc_domheap_pages(NULL, order, MEMF_node(domain_to_node(d)));
             if ( sp == NULL ) 
             { 
                 SHADOW_PRINTK("failed to allocate shadow pages.\n");
diff -r 848a36114bb9 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c  Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/arch/x86/x86_64/mm.c  Thu Apr 03 12:25:31 2008 +0200
@@ -59,7 +59,7 @@ void *alloc_xen_pagetable(void)
 
     if ( !early_boot )
     {
-        struct page_info *pg = alloc_domheap_page(NULL);
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
         BUG_ON(pg == NULL);
         return page_to_virt(pg);
     }
@@ -108,7 +108,7 @@ void __init paging_init(void)
     struct page_info *l1_pg, *l2_pg, *l3_pg;
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l3_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     l3_ro_mpt = page_to_virt(l3_pg);
     clear_page(l3_ro_mpt);
@@ -134,7 +134,7 @@ void __init paging_init(void)
                1UL << L2_PAGETABLE_SHIFT);
         if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
         {
-            if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+            if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
                 goto nomem;
             va = RO_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT);
             l2_ro_mpt = page_to_virt(l2_pg);
@@ -154,7 +154,7 @@ void __init paging_init(void)
                  l4_table_offset(HIRO_COMPAT_MPT_VIRT_START));
     l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(
         HIRO_COMPAT_MPT_VIRT_START)]);
-    if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     compat_idle_pg_table_l2 = l2_ro_mpt = page_to_virt(l2_pg);
     clear_page(l2_ro_mpt);
diff -r 848a36114bb9 xen/common/grant_table.c
--- a/xen/common/grant_table.c  Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/grant_table.c  Thu Apr 03 12:25:31 2008 +0200
@@ -1102,7 +1102,7 @@ gnttab_transfer(
             struct page_info *new_page;
             void *sp, *dp;
 
-            new_page = alloc_domheap_pages(NULL, 0, MEMF_bits(max_bitsize));
+            new_page = alloc_domheap_page(NULL, MEMF_bits(max_bitsize));
             if ( new_page == NULL )
             {
                 gop.status = GNTST_address_too_big;
diff -r 848a36114bb9 xen/common/memory.c
--- a/xen/common/memory.c       Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/memory.c       Thu Apr 03 12:25:32 2008 +0200
@@ -38,19 +38,13 @@ struct memop_args {
     int          preempted;  /* Was the hypercall preempted? */
 };
 
-static unsigned int select_local_node(struct domain *d)
-{
-    struct vcpu *v = d->vcpu[0];
-    return (v ? cpu_to_node(v->processor) : 0);
-}
-
 static void increase_reservation(struct memop_args *a)
 {
     struct page_info *page;
     unsigned long i;
     xen_pfn_t mfn;
     struct domain *d = a->domain;
-    unsigned int node = select_local_node(d);
+    unsigned int node = domain_to_node (d);
 
     if ( !guest_handle_is_null(a->extent_list) &&
          !guest_handle_okay(a->extent_list, a->nr_extents) )
@@ -68,8 +62,8 @@ static void increase_reservation(struct 
             goto out;
         }
 
-        page = alloc_domheap_pages_on_node (
-            d, a->extent_order, a->memflags, node);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -98,7 +92,7 @@ static void populate_physmap(struct memo
     unsigned long i, j;
     xen_pfn_t gpfn, mfn;
     struct domain *d = a->domain;
-    unsigned int node = select_local_node(d);
+    unsigned int node = domain_to_node(d);
 
     if ( !guest_handle_okay(a->extent_list, a->nr_extents) )
         return;
@@ -118,8 +112,8 @@ static void populate_physmap(struct memo
         if ( unlikely(__copy_from_guest_offset(&gpfn, a->extent_list, i, 1)) )
             goto out;
 
-        page = alloc_domheap_pages_on_node (
-            d, a->extent_order, a->memflags, node);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -299,7 +293,7 @@ static long memory_exchange(XEN_GUEST_HA
     unsigned long in_chunk_order, out_chunk_order;
     xen_pfn_t     gpfn, gmfn, mfn;
     unsigned long i, j, k;
-    unsigned int  memflags = 0, node;
+    unsigned int  memflags = 0;
     long          rc = 0;
     struct domain *d;
     struct page_info *page;
@@ -355,7 +349,7 @@ static long memory_exchange(XEN_GUEST_HA
     memflags |= MEMF_bits(domain_clamp_alloc_bitsize(
         d, exch.out.address_bits ? : (BITS_PER_LONG+PAGE_SHIFT)));
 
-    node = select_local_node(d);
+    memflags |= MEMF_node (domain_to_node(d));
 
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
@@ -404,8 +398,8 @@ static long memory_exchange(XEN_GUEST_HA
         /* Allocate a chunk's worth of anonymous output pages. */
         for ( j = 0; j < (1UL << out_chunk_order); j++ )
         {
-            page = alloc_domheap_pages_on_node(
-                NULL, exch.out.extent_order, memflags, node);
+            page = alloc_domheap_pages(
+                NULL, exch.out.extent_order, memflags);
             if ( unlikely(page == NULL) )
             {
                 rc = -ENOMEM;
diff -r 848a36114bb9 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/common/page_alloc.c   Thu Apr 03 12:25:32 2008 +0200
@@ -337,6 +337,7 @@ static struct page_info *alloc_heap_page
     cpumask_t extra_cpus_mask, mask;
     struct page_info *pg;
 
+    if ( node == NUMA_NO_NODE ) node = cpu_to_node(smp_processor_id());
     ASSERT(node >= 0);
     ASSERT(node < num_nodes);
     ASSERT(zone_lo <= zone_hi);
@@ -780,12 +781,12 @@ int assign_pages(
 }
 
 
-struct page_info *alloc_domheap_pages_on_node(
-    struct domain *d, unsigned int order, unsigned int memflags,
-    unsigned int node)
+struct page_info *alloc_domheap_pages(
+    struct domain *d, unsigned int order, unsigned int memflags)
 {
     struct page_info *pg = NULL;
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
+    unsigned int node = (((memflags >> _MEMF_node)&0xFF) - 1 ) &0xFF;
 
     ASSERT(!in_irq());
 
@@ -823,13 +824,6 @@ struct page_info *alloc_domheap_pages_on
     }
     
     return pg;
-}
-
-struct page_info *alloc_domheap_pages(
-    struct domain *d, unsigned int order, unsigned int flags)
-{
-    return alloc_domheap_pages_on_node (d, order, flags,
-        cpu_to_node (smp_processor_id());
 }
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
diff -r 848a36114bb9 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c       Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/drivers/passthrough/vtd/iommu.c       Thu Apr 03 12:25:32 2008 +0200
@@ -270,8 +270,8 @@ static struct page_info *addr_to_dma_pag
 
         if ( dma_pte_addr(*pte) == 0 )
         {
-            pg = alloc_domheap_page_on_node(NULL,
-                cpu_to_node(domain->vcpu[0]->processor));
+            pg = alloc_domheap_page(NULL,
+                MEMF_node(domain_to_node(domain)));
             vaddr = map_domain_page(page_to_mfn(pg));
             if ( !vaddr )
             {
diff -r 848a36114bb9 xen/include/asm-x86/numa.h
--- a/xen/include/asm-x86/numa.h        Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/include/asm-x86/numa.h        Thu Apr 03 12:25:32 2008 +0200
@@ -4,11 +4,16 @@
 #include <xen/cpumask.h>
 
 #define NODES_SHIFT 6
+#define NUMA_NO_NODE 0xff
 
 extern unsigned char cpu_to_node[];
 extern cpumask_t     node_to_cpumask[];
 
 #define cpu_to_node(cpu)               (cpu_to_node[cpu])
+#define domain_to_node(domain)  (((domain) != NULL && (domain)->vcpu[0] != NULL) ? \
+                                  cpu_to_node[(domain)->vcpu[0]->processor] : \
+                                  NUMA_NO_NODE)
+#define vcpu_to_node(vcpu)             (cpu_to_node[(vcpu)->processor])
 #define parent_node(node)              (node)
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
@@ -73,6 +78,5 @@ static inline __attribute__((pure)) int 
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif
 
-#define NUMA_NO_NODE 0xff
 
 #endif
diff -r 848a36114bb9 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h      Thu Apr 03 00:37:33 2008 +0200
+++ b/xen/include/xen/mm.h      Thu Apr 03 12:25:32 2008 +0200
@@ -54,15 +54,11 @@ void init_domheap_pages(paddr_t ps, padd
 void init_domheap_pages(paddr_t ps, paddr_t pe);
 struct page_info *alloc_domheap_pages(
     struct domain *d, unsigned int order, unsigned int memflags);
-struct page_info *alloc_domheap_pages_on_node(
-    struct domain *d, unsigned int order, unsigned int memflags,
-    unsigned int node_id);
 void free_domheap_pages(struct page_info *pg, unsigned int order);
 unsigned long avail_domheap_pages_region(
     unsigned int node, unsigned int min_width, unsigned int max_width);
 unsigned long avail_domheap_pages(void);
-#define alloc_domheap_page(d) (alloc_domheap_pages(d,0,0))
-#define alloc_domheap_page_on_node(d, n) (alloc_domheap_pages_on_node(d,0,0,n))
+#define alloc_domheap_page(d,f) (alloc_domheap_pages(d,0,f))
 #define free_domheap_page(p)  (free_domheap_pages(p,0))
 
 void scrub_heap_pages(void);
@@ -76,6 +72,8 @@ int assign_pages(
 /* memflags: */
 #define _MEMF_no_refcount 0
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
+#define _MEMF_node        8
+#define  MEMF_node(n)     ((((n)+1)&0xff)<<_MEMF_node)
 #define _MEMF_bits        24
 #define  MEMF_bits(n)     ((n)<<_MEMF_bits)
 
diff -r db943e8d1051 xen/arch/ia64/xen/mm.c
--- a/xen/arch/ia64/xen/mm.c    Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/ia64/xen/mm.c    Thu Apr 03 12:25:50 2008 +0200
@@ -822,7 +822,7 @@ __assign_new_domain_page(struct domain *
 
     BUG_ON(!pte_none(*pte));
 
-    p = alloc_domheap_page(d);
+    p = alloc_domheap_page(d, 0);
     if (unlikely(!p)) {
         printk("assign_new_domain_page: Can't alloc!!!! Aaaargh!\n");
         return(p);
@@ -2316,7 +2316,7 @@ steal_page(struct domain *d, struct page
         unsigned long new_mfn;
         int ret;
 
-        new = alloc_domheap_page(d);
+        new = alloc_domheap_page(d, 0);
         if (new == NULL) {
             gdprintk(XENLOG_INFO, "alloc_domheap_page() failed\n");
             return -1;
@@ -2603,7 +2603,7 @@ void *pgtable_quicklist_alloc(void)
 
     BUG_ON(dom_p2m == NULL);
     if (!opt_p2m_xenheap) {
-        struct page_info *page = alloc_domheap_page(dom_p2m);
+        struct page_info *page = alloc_domheap_page(dom_p2m, 0);
         if (page == NULL)
             return NULL;
         p = page_to_virt(page);
diff -r db943e8d1051 xen/arch/ia64/xen/tlb_track.c
--- a/xen/arch/ia64/xen/tlb_track.c     Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/ia64/xen/tlb_track.c     Thu Apr 03 12:25:50 2008 +0200
@@ -48,7 +48,7 @@ tlb_track_allocate_entries(struct tlb_tr
                 __func__, tlb_track->num_entries, tlb_track->limit);
         return -ENOMEM;
     }
-    entry_page = alloc_domheap_page(NULL);
+    entry_page = alloc_domheap_page(NULL, 0);
     if (entry_page == NULL) {
         dprintk(XENLOG_WARNING,
                 "%s: domheap page failed. num_entries %d limit %d\n",
@@ -84,7 +84,7 @@ tlb_track_create(struct domain* d)
     if (tlb_track == NULL)
         goto out;
 
-    hash_page = alloc_domheap_page(NULL);
+    hash_page = alloc_domheap_page(NULL, 0);
     if (hash_page == NULL)
         goto out;
 
diff -r db943e8d1051 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/domain.c     Thu Apr 03 12:25:50 2008 +0200
@@ -46,6 +46,7 @@
 #include <asm/debugreg.h>
 #include <asm/msr.h>
 #include <asm/nmi.h>
+#include <asm/numa.h>
 #include <xen/iommu.h>
 #ifdef CONFIG_COMPAT
 #include <compat/vcpu.h>
@@ -171,7 +172,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
     if ( !d->arch.mm_arg_xlat_l3 )
     {
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         d->arch.mm_arg_xlat_l3 = page_to_virt(pg);
@@ -189,7 +190,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
 
         if ( !l3e_get_intpte(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -198,7 +199,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         l2tab = l3e_to_l2e(d->arch.mm_arg_xlat_l3[l3_table_offset(va)]);
         if ( !l2e_get_intpte(l2tab[l2_table_offset(va)]) )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, 0);
             if ( !pg )
                 return -ENOMEM;
             clear_page(page_to_virt(pg));
@@ -206,7 +207,7 @@ int setup_arg_xlat_area(struct vcpu *v, 
         }
         l1tab = l2e_to_l1e(l2tab[l2_table_offset(va)]);
         BUG_ON(l1e_get_intpte(l1tab[l1_table_offset(va)]));
-        pg = alloc_domheap_page(NULL);
+        pg = alloc_domheap_page(NULL, 0);
         if ( !pg )
             return -ENOMEM;
         l1tab[l1_table_offset(va)] = l1e_from_page(pg, PAGE_HYPERVISOR);
@@ -252,7 +253,7 @@ static void release_arg_xlat_area(struct
 
 static int setup_compat_l4(struct vcpu *v)
 {
-    struct page_info *pg = alloc_domheap_page(NULL);
+    struct page_info *pg = alloc_domheap_page(NULL, 0);
     l4_pgentry_t *l4tab;
     int rc;
 
@@ -477,8 +478,8 @@ int arch_domain_create(struct domain *d,
 
 #else /* __x86_64__ */
 
-    if ( (pg = alloc_domheap_page(NULL)) == NULL )
-        goto fail;
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
+            goto fail;
     d->arch.mm_perdomain_l2 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l2);
     for ( i = 0; i < (1 << pdpt_order); i++ )
@@ -486,7 +487,7 @@ int arch_domain_create(struct domain *d,
             l2e_from_page(virt_to_page(d->arch.mm_perdomain_pt)+i,
                           __PAGE_HYPERVISOR);
 
-    if ( (pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)))) == NULL )
         goto fail;
     d->arch.mm_perdomain_l3 = page_to_virt(pg);
     clear_page(d->arch.mm_perdomain_l3);
diff -r db943e8d1051 xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c       Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/domain_build.c       Thu Apr 03 12:25:50 2008 +0200
@@ -630,7 +630,7 @@ int __init construct_dom0(
     }
     else
     {
-        page = alloc_domheap_page(NULL);
+        page = alloc_domheap_page(NULL, 0);
         if ( !page )
             panic("Not enough RAM for domain 0 PML4.\n");
         l4start = l4tab = page_to_virt(page);
diff -r db943e8d1051 xen/arch/x86/hvm/stdvga.c
--- a/xen/arch/x86/hvm/stdvga.c Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/hvm/stdvga.c Thu Apr 03 12:25:50 2008 +0200
@@ -32,6 +32,7 @@
 #include <xen/sched.h>
 #include <xen/domain_page.h>
 #include <asm/hvm/support.h>
+#include <asm/numa.h>
 
 #define PAT(x) (x)
 static const uint32_t mask16[16] = {
@@ -513,8 +514,9 @@ void stdvga_init(struct domain *d)
     
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
-        if ( (pg = alloc_domheap_page(NULL)) == NULL )
-            break;
+        if ( (pg = alloc_domheap_page(NULL,
+            MEMF_node(domain_to_node(d)))) == NULL )
+                break;
         s->vram_page[i] = pg;
         p = map_domain_page(page_to_mfn(pg));
         clear_page(p);
diff -r db943e8d1051 xen/arch/x86/hvm/vlapic.c
--- a/xen/arch/x86/hvm/vlapic.c Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/hvm/vlapic.c Thu Apr 03 12:25:50 2008 +0200
@@ -33,6 +33,7 @@
 #include <xen/sched.h>
 #include <asm/current.h>
 #include <asm/hvm/vmx/vmx.h>
+#include <asm/numa.h>
 #include <public/hvm/ioreq.h>
 #include <public/hvm/params.h>
 
@@ -916,7 +917,7 @@ int vlapic_init(struct vcpu *v)
 int vlapic_init(struct vcpu *v)
 {
     struct vlapic *vlapic = vcpu_vlapic(v);
-    unsigned int memflags = 0;
+    unsigned int memflags = MEMF_node (vcpu_to_node(v));
 
     HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "%d", v->vcpu_id);
 
@@ -925,10 +926,10 @@ int vlapic_init(struct vcpu *v)
 #ifdef __i386__
     /* 32-bit VMX may be limited to 32-bit physical addresses. */
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        memflags = MEMF_bits(32);
+        memflags |= MEMF_bits(32);
 #endif
 
-    vlapic->regs_page = alloc_domheap_pages(NULL, 0, memflags);
+    vlapic->regs_page = alloc_domheap_page(NULL, memflags);
     if ( vlapic->regs_page == NULL )
     {
         dprintk(XENLOG_ERR, "alloc vlapic regs error: %d/%d\n",
diff -r db943e8d1051 xen/arch/x86/mm/hap/hap.c
--- a/xen/arch/x86/mm/hap/hap.c Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/hap/hap.c Thu Apr 03 12:25:50 2008 +0200
@@ -38,6 +38,7 @@
 #include <asm/hap.h>
 #include <asm/paging.h>
 #include <asm/domain.h>
+#include <asm/numa.h>
 
 #include "private.h"
 
@@ -135,7 +136,8 @@ static struct page_info *hap_alloc_p2m_p
          && mfn_x(page_to_mfn(pg)) >= (1UL << (32 - PAGE_SHIFT)) )
     {
         free_domheap_page(pg);
-        pg = alloc_domheap_pages(NULL, 0, MEMF_bits(32));
+        pg = alloc_domheap_page(NULL, MEMF_bits(32) |
+            MEMF_node(domain_to_node(d)));
         if ( likely(pg != NULL) )
         {
             void *p = hap_map_domain_page(page_to_mfn(pg));
@@ -199,7 +201,7 @@ hap_set_allocation(struct domain *d, uns
         if ( d->arch.paging.hap.total_pages < pages )
         {
             /* Need to allocate more memory from domheap */
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL, MEMF_node(domain_to_node(d)));
             if ( pg == NULL )
             {
                 HAP_PRINTK("failed to allocate hap pages.\n");
diff -r db943e8d1051 xen/arch/x86/mm/paging.c
--- a/xen/arch/x86/mm/paging.c  Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/paging.c  Thu Apr 03 12:25:50 2008 +0200
@@ -26,6 +26,7 @@
 #include <asm/p2m.h>
 #include <asm/hap.h>
 #include <asm/guest_access.h>
+#include <asm/numa.h>
 #include <xsm/xsm.h>
 
 #define hap_enabled(d) (is_hvm_domain(d) && (d)->arch.hvm_domain.hap_enabled)
@@ -99,7 +100,8 @@ static mfn_t paging_new_log_dirty_page(s
 static mfn_t paging_new_log_dirty_page(struct domain *d, void **mapping_p)
 {
     mfn_t mfn;
-    struct page_info *page = alloc_domheap_page(NULL);
+    struct page_info *page = alloc_domheap_page(NULL,
+        MEMF_node(domain_to_node(d)));
 
     if ( unlikely(page == NULL) )
     {
diff -r db943e8d1051 xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c   Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/mm/shadow/common.c   Thu Apr 03 12:25:50 2008 +0200
@@ -36,6 +36,7 @@
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/shadow.h>
+#include <asm/numa.h>
 #include "private.h"
 
 
@@ -1249,7 +1250,7 @@ static unsigned int sh_set_allocation(st
         {
             /* Need to allocate more memory from domheap */
             sp = (struct shadow_page_info *)
-                alloc_domheap_pages(NULL, order, 0);
+                alloc_domheap_pages(NULL, order, MEMF_node(domain_to_node(d)));
             if ( sp == NULL ) 
             { 
                 SHADOW_PRINTK("failed to allocate shadow pages.\n");
diff -r db943e8d1051 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c  Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/arch/x86/x86_64/mm.c  Thu Apr 03 12:25:50 2008 +0200
@@ -59,7 +59,7 @@ void *alloc_xen_pagetable(void)
 
     if ( !early_boot )
     {
-        struct page_info *pg = alloc_domheap_page(NULL);
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
         BUG_ON(pg == NULL);
         return page_to_virt(pg);
     }
@@ -108,7 +108,7 @@ void __init paging_init(void)
     struct page_info *l1_pg, *l2_pg, *l3_pg;
 
     /* Create user-accessible L2 directory to map the MPT for guests. */
-    if ( (l3_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l3_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     l3_ro_mpt = page_to_virt(l3_pg);
     clear_page(l3_ro_mpt);
@@ -134,7 +134,7 @@ void __init paging_init(void)
                1UL << L2_PAGETABLE_SHIFT);
         if ( !((unsigned long)l2_ro_mpt & ~PAGE_MASK) )
         {
-            if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+            if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
                 goto nomem;
             va = RO_MPT_VIRT_START + (i << L2_PAGETABLE_SHIFT);
             l2_ro_mpt = page_to_virt(l2_pg);
@@ -154,7 +154,7 @@ void __init paging_init(void)
                  l4_table_offset(HIRO_COMPAT_MPT_VIRT_START));
     l3_ro_mpt = l4e_to_l3e(idle_pg_table[l4_table_offset(
         HIRO_COMPAT_MPT_VIRT_START)]);
-    if ( (l2_pg = alloc_domheap_page(NULL)) == NULL )
+    if ( (l2_pg = alloc_domheap_page(NULL, 0)) == NULL )
         goto nomem;
     compat_idle_pg_table_l2 = l2_ro_mpt = page_to_virt(l2_pg);
     clear_page(l2_ro_mpt);
diff -r db943e8d1051 xen/common/grant_table.c
--- a/xen/common/grant_table.c  Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/grant_table.c  Thu Apr 03 12:25:50 2008 +0200
@@ -1102,7 +1102,7 @@ gnttab_transfer(
             struct page_info *new_page;
             void *sp, *dp;
 
-            new_page = alloc_domheap_pages(NULL, 0, MEMF_bits(max_bitsize));
+            new_page = alloc_domheap_page(NULL, MEMF_bits(max_bitsize));
             if ( new_page == NULL )
             {
                 gop.status = GNTST_address_too_big;
diff -r db943e8d1051 xen/common/memory.c
--- a/xen/common/memory.c       Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/memory.c       Thu Apr 03 12:25:50 2008 +0200
@@ -21,6 +21,7 @@
 #include <xen/errno.h>
 #include <asm/current.h>
 #include <asm/hardirq.h>
+#include <asm/numa.h>
 #include <public/memory.h>
 #include <xsm/xsm.h>
 
@@ -37,19 +38,13 @@ struct memop_args {
     int          preempted;  /* Was the hypercall preempted? */
 };
 
-static unsigned int select_local_cpu(struct domain *d)
-{
-    struct vcpu *v = d->vcpu[0];
-    return (v ? v->processor : 0);
-}
-
 static void increase_reservation(struct memop_args *a)
 {
     struct page_info *page;
     unsigned long i;
     xen_pfn_t mfn;
     struct domain *d = a->domain;
-    unsigned int cpu = select_local_cpu(d);
+    unsigned int node = domain_to_node (d);
 
     if ( !guest_handle_is_null(a->extent_list) &&
          !guest_handle_okay(a->extent_list, a->nr_extents) )
@@ -67,7 +62,8 @@ static void increase_reservation(struct 
             goto out;
         }
 
-        page = __alloc_domheap_pages(d, cpu, a->extent_order, a->memflags);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -96,7 +92,7 @@ static void populate_physmap(struct memo
     unsigned long i, j;
     xen_pfn_t gpfn, mfn;
     struct domain *d = a->domain;
-    unsigned int cpu = select_local_cpu(d);
+    unsigned int node = domain_to_node(d);
 
     if ( !guest_handle_okay(a->extent_list, a->nr_extents) )
         return;
@@ -116,7 +112,8 @@ static void populate_physmap(struct memo
         if ( unlikely(__copy_from_guest_offset(&gpfn, a->extent_list, i, 1)) )
             goto out;
 
-        page = __alloc_domheap_pages(d, cpu, a->extent_order, a->memflags);
+        page = alloc_domheap_pages (
+            d, a->extent_order, a->memflags | MEMF_node(node));
         if ( unlikely(page == NULL) ) 
         {
             gdprintk(XENLOG_INFO, "Could not allocate order=%d extent: "
@@ -296,7 +293,7 @@ static long memory_exchange(XEN_GUEST_HA
     unsigned long in_chunk_order, out_chunk_order;
     xen_pfn_t     gpfn, gmfn, mfn;
     unsigned long i, j, k;
-    unsigned int  memflags = 0, cpu;
+    unsigned int  memflags = 0;
     long          rc = 0;
     struct domain *d;
     struct page_info *page;
@@ -352,7 +349,7 @@ static long memory_exchange(XEN_GUEST_HA
     memflags |= MEMF_bits(domain_clamp_alloc_bitsize(
         d, exch.out.address_bits ? : (BITS_PER_LONG+PAGE_SHIFT)));
 
-    cpu = select_local_cpu(d);
+    memflags |= MEMF_node (domain_to_node(d));
 
     for ( i = (exch.nr_exchanged >> in_chunk_order);
           i < (exch.in.nr_extents >> in_chunk_order);
@@ -401,8 +398,8 @@ static long memory_exchange(XEN_GUEST_HA
         /* Allocate a chunk's worth of anonymous output pages. */
         for ( j = 0; j < (1UL << out_chunk_order); j++ )
         {
-            page = __alloc_domheap_pages(
-                NULL, cpu, exch.out.extent_order, memflags);
+            page = alloc_domheap_pages(
+                NULL, exch.out.extent_order, memflags);
             if ( unlikely(page == NULL) )
             {
                 rc = -ENOMEM;
diff -r db943e8d1051 xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/common/page_alloc.c   Thu Apr 03 12:25:50 2008 +0200
@@ -36,6 +36,7 @@
 #include <xen/numa.h>
 #include <xen/nodemask.h>
 #include <asm/page.h>
+#include <asm/numa.h>
 #include <asm/flushtlb.h>
 
 /*
@@ -328,14 +329,15 @@ static void init_node_heap(int node)
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi,
-    unsigned int cpu, unsigned int order)
+    unsigned int node, unsigned int order)
 {
     unsigned int i, j, zone;
-    unsigned int node = cpu_to_node(cpu), num_nodes = num_online_nodes();
+    unsigned int num_nodes = num_online_nodes();
     unsigned long request = 1UL << order;
     cpumask_t extra_cpus_mask, mask;
     struct page_info *pg;
 
+    if ( node == NUMA_NO_NODE ) node = cpu_to_node(smp_processor_id());
     ASSERT(node >= 0);
     ASSERT(node < num_nodes);
     ASSERT(zone_lo <= zone_hi);
@@ -670,7 +672,8 @@ void *alloc_xenheap_pages(unsigned int o
 
     ASSERT(!in_irq());
 
-    pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN, smp_processor_id(), order);
+    pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN, 
+        cpu_to_node(smp_processor_id()), order);
     if ( unlikely(pg == NULL) )
         goto no_memory;
 
@@ -778,12 +781,12 @@ int assign_pages(
 }
 
 
-struct page_info *__alloc_domheap_pages(
-    struct domain *d, unsigned int cpu, unsigned int order, 
-    unsigned int memflags)
+struct page_info *alloc_domheap_pages(
+    struct domain *d, unsigned int order, unsigned int memflags)
 {
     struct page_info *pg = NULL;
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
+    unsigned int node = (((memflags >> _MEMF_node)&0xFF) - 1 ) &0xFF;
 
     ASSERT(!in_irq());
 
@@ -797,7 +800,7 @@ struct page_info *__alloc_domheap_pages(
 
     if ( (zone_hi + PAGE_SHIFT) >= dma_bitsize )
     {
-        pg = alloc_heap_pages(dma_bitsize - PAGE_SHIFT, zone_hi, cpu, order);
+        pg = alloc_heap_pages(dma_bitsize - PAGE_SHIFT, zone_hi, node, order);
 
         /* Failure? Then check if we can fall back to the DMA pool. */
         if ( unlikely(pg == NULL) &&
@@ -811,7 +814,7 @@ struct page_info *__alloc_domheap_pages(
 
     if ( (pg == NULL) &&
          ((pg = alloc_heap_pages(MEMZONE_XEN + 1, zone_hi,
-                                 cpu, order)) == NULL) )
+                                 node, order)) == NULL) )
          return NULL;
 
     if ( (d != NULL) && assign_pages(d, pg, order, memflags) )
@@ -821,12 +824,6 @@ struct page_info *__alloc_domheap_pages(
     }
     
     return pg;
-}
-
-struct page_info *alloc_domheap_pages(
-    struct domain *d, unsigned int order, unsigned int flags)
-{
-    return __alloc_domheap_pages(d, smp_processor_id(), order, flags);
 }
 
 void free_domheap_pages(struct page_info *pg, unsigned int order)
diff -r db943e8d1051 xen/drivers/passthrough/vtd/iommu.c
--- a/xen/drivers/passthrough/vtd/iommu.c       Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/drivers/passthrough/vtd/iommu.c       Thu Apr 03 12:25:50 2008 +0200
@@ -24,6 +24,7 @@
 #include <xen/xmalloc.h>
 #include <xen/domain_page.h>
 #include <xen/iommu.h>
+#include <asm/numa.h>
 #include "iommu.h"
 #include "dmar.h"
 #include "../pci-direct.h"
@@ -269,7 +270,8 @@ static struct page_info *addr_to_dma_pag
 
         if ( dma_pte_addr(*pte) == 0 )
         {
-            pg = alloc_domheap_page(NULL);
+            pg = alloc_domheap_page(NULL,
+                MEMF_node(domain_to_node(domain)));
             vaddr = map_domain_page(page_to_mfn(pg));
             if ( !vaddr )
             {
diff -r db943e8d1051 xen/include/asm-x86/numa.h
--- a/xen/include/asm-x86/numa.h        Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/include/asm-x86/numa.h        Thu Apr 03 12:25:50 2008 +0200
@@ -4,11 +4,16 @@
 #include <xen/cpumask.h>
 
 #define NODES_SHIFT 6
+#define NUMA_NO_NODE 0xff
 
 extern unsigned char cpu_to_node[];
 extern cpumask_t     node_to_cpumask[];
 
 #define cpu_to_node(cpu)               (cpu_to_node[cpu])
+#define domain_to_node(domain)  (((domain) != NULL && (domain)->vcpu[0] != NULL) ? \
+                                  cpu_to_node[(domain)->vcpu[0]->processor] : \
+                                  NUMA_NO_NODE)
+#define vcpu_to_node(vcpu)             (cpu_to_node[(vcpu)->processor])
 #define parent_node(node)              (node)
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)    (node_to_cpumask[node])
@@ -73,6 +78,5 @@ static inline __attribute__((pure)) int 
 #define clear_node_cpumask(cpu) do {} while (0)
 #endif
 
-#define NUMA_NO_NODE 0xff
 
 #endif
diff -r db943e8d1051 xen/include/xen/mm.h
--- a/xen/include/xen/mm.h      Tue Apr 01 10:09:33 2008 +0100
+++ b/xen/include/xen/mm.h      Thu Apr 03 12:25:50 2008 +0200
@@ -54,14 +54,11 @@ void init_domheap_pages(paddr_t ps, padd
 void init_domheap_pages(paddr_t ps, paddr_t pe);
 struct page_info *alloc_domheap_pages(
     struct domain *d, unsigned int order, unsigned int memflags);
-struct page_info *__alloc_domheap_pages(
-    struct domain *d, unsigned int cpu, unsigned int order, 
-    unsigned int memflags);
 void free_domheap_pages(struct page_info *pg, unsigned int order);
 unsigned long avail_domheap_pages_region(
     unsigned int node, unsigned int min_width, unsigned int max_width);
 unsigned long avail_domheap_pages(void);
-#define alloc_domheap_page(d) (alloc_domheap_pages(d,0,0))
+#define alloc_domheap_page(d,f) (alloc_domheap_pages(d,0,f))
 #define free_domheap_page(p)  (free_domheap_pages(p,0))
 
 void scrub_heap_pages(void);
@@ -75,6 +72,8 @@ int assign_pages(
 /* memflags: */
 #define _MEMF_no_refcount 0
 #define  MEMF_no_refcount (1U<<_MEMF_no_refcount)
+#define _MEMF_node        8
+#define  MEMF_node(n)     ((((n)+1)&0xff)<<_MEMF_node)
 #define _MEMF_bits        24
 #define  MEMF_bits(n)     ((n)<<_MEMF_bits)
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel