
[Xen-changelog] [xen-unstable] [POWERPC][XEN] Merge with xen-unstable.hg

To: xen-changelog@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-changelog] [xen-unstable] [POWERPC][XEN] Merge with xen-unstable.hg.
From: Xen patchbot-unstable <patchbot-unstable@xxxxxxxxxxxxxxxxxxx>
Date: Thu, 22 Mar 2007 08:00:28 -0700
Delivery-date: Thu, 22 Mar 2007 08:00:12 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-changelog-request@lists.xensource.com?subject=help>
List-id: BK change log <xen-changelog.lists.xensource.com>
List-post: <mailto:xen-changelog@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=unsubscribe>
Reply-to: xen-devel@xxxxxxxxxxxxxxxxxxx
Sender: xen-changelog-bounces@xxxxxxxxxxxxxxxxxxx
# HG changeset patch
# User Hollis Blanchard <hollisb@xxxxxxxxxx>
# Date 1173386392 21600
# Node ID dcec453681bc85f9f6f3aa49431ded9a63aa1c9b
# Parent  8f0b5295bb1bd66c9e5c86368845bdb055b3d86c
# Parent  38513d22d23420a90f94e7e0f70c564100e83851
[POWERPC][XEN] Merge with xen-unstable.hg.
Signed-off-by: Hollis Blanchard <hollisb@xxxxxxxxxx>
---
 linux-2.6-xen-sparse/arch/i386/kernel/alternative-xen.c          |  389 -----
 linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c  |  774 ----------
 linux-2.6-xen-sparse/arch/x86_64/kernel/x8664_ksyms-xen.c        |   59 
 linux-2.6-xen-sparse/mm/Kconfig                                  |  157 --
 tools/ptsname/Makefile                                           |   22 
 tools/ptsname/ptsname.c                                          |   44 
 tools/ptsname/setup.py                                           |   11 
 xen/arch/x86/mm/shadow/page-guest32.h                            |  100 -
 Config.mk                                                        |    7 
 config/StdGNU.mk                                                 |    6 
 config/SunOS.mk                                                  |    6 
 docs/misc/dump-core-format.txt                                   |   14 
 linux-2.6-xen-sparse/arch/i386/Kconfig                           |    6 
 linux-2.6-xen-sparse/arch/i386/Kconfig.cpu                       |    4 
 linux-2.6-xen-sparse/arch/i386/kernel/entry-xen.S                |   59 
 linux-2.6-xen-sparse/arch/i386/kernel/head-xen.S                 |    8 
 linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c                |    3 
 linux-2.6-xen-sparse/arch/i386/kernel/time-xen.c                 |   13 
 linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c                  |    4 
 linux-2.6-xen-sparse/arch/i386/mm/pgtable-xen.c                  |   58 
 linux-2.6-xen-sparse/arch/x86_64/Kconfig                         |    2 
 linux-2.6-xen-sparse/arch/x86_64/kernel/Makefile                 |    2 
 linux-2.6-xen-sparse/arch/x86_64/kernel/entry-xen.S              |   83 -
 linux-2.6-xen-sparse/arch/x86_64/kernel/head-xen.S               |   25 
 linux-2.6-xen-sparse/arch/x86_64/kernel/init_task.c              |    2 
 linux-2.6-xen-sparse/arch/x86_64/kernel/setup-xen.c              |   27 
 linux-2.6-xen-sparse/arch/x86_64/mm/init-xen.c                   |  218 +-
 linux-2.6-xen-sparse/arch/x86_64/mm/pageattr-xen.c               |   11 
 linux-2.6-xen-sparse/drivers/char/mem.c                          |    2 
 linux-2.6-xen-sparse/drivers/char/tpm/tpm_xen.c                  |  192 +-
 linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c                 |    2 
 linux-2.6-xen-sparse/drivers/xen/char/mem.c                      |    2 
 linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c           |   55 
 linux-2.6-xen-sparse/drivers/xen/evtchn/evtchn.c                 |    2 
 linux-2.6-xen-sparse/drivers/xen/netback/common.h                |   15 
 linux-2.6-xen-sparse/drivers/xen/netback/interface.c             |   18 
 linux-2.6-xen-sparse/drivers/xen/netback/netback.c               |   21 
 linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c                |    4 
 linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c             |   64 
 linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c               |   11 
 linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c               |    2 
 linux-2.6-xen-sparse/drivers/xen/tpmback/tpmback.c               |    2 
 linux-2.6-xen-sparse/drivers/xen/xenbus/xenbus_dev.c             |   54 
 linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h        |   17 
 linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/system.h      |    2 
 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/fixmap.h    |    5 
 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/io.h        |    6 
 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/pgtable.h   |   13 
 linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/processor.h |    2 
 linux-2.6-xen-sparse/include/linux/page-flags.h                  |    6 
 linux-2.6-xen-sparse/include/xen/cpu_hotplug.h                   |    2 
 patches/linux-2.6.18/blktap-aio-16_03_06.patch                   |    5 
 tools/Makefile                                                   |    9 
 tools/blktap/lib/Makefile                                        |    4 
 tools/console/Makefile                                           |    4 
 tools/firmware/rombios/rombios.c                                 |  102 -
 tools/guest-headers/Makefile                                     |    2 
 tools/ioemu/Makefile                                             |    8 
 tools/ioemu/hw/ide.c                                             |  115 +
 tools/ioemu/target-i386-dm/qemu-ifup                             |    3 
 tools/ioemu/vl.c                                                 |    8 
 tools/ioemu/vnc.c                                                |    8 
 tools/libxc/xc_core.c                                            |   26 
 tools/libxc/xc_core.h                                            |    4 
 tools/libxc/xc_core_ia64.c                                       |    4 
 tools/libxc/xc_core_ia64.h                                       |    2 
 tools/libxc/xc_core_x86.h                                        |    2 
 tools/libxc/xc_dom_core.c                                        |    3 
 tools/libxc/xc_linux_restore.c                                   |   24 
 tools/libxc/xc_linux_save.c                                      |   16 
 tools/libxc/xc_ptrace_core.c                                     |   24 
 tools/libxen/Makefile                                            |    4 
 tools/pygrub/Makefile                                            |    7 
 tools/python/Makefile                                            |    3 
 tools/python/ptsname/ptsname.c                                   |   44 
 tools/python/setup.py                                            |    9 
 tools/python/xen/xend/XendBootloader.py                          |    3 
 tools/python/xen/xend/XendDomainInfo.py                          |   14 
 tools/python/xen/xend/XendNode.py                                |   41 
 tools/security/Makefile                                          |   35 
 tools/vnet/libxutil/Makefile                                     |    2 
 tools/xenfb/Makefile                                             |    9 
 tools/xenfb/xenfb.c                                              |    5 
 tools/xenstore/Makefile                                          |   12 
 xen/Rules.mk                                                     |    8 
 xen/arch/x86/domain.c                                            |   25 
 xen/arch/x86/domain_build.c                                      |    4 
 xen/arch/x86/domctl.c                                            |    7 
 xen/arch/x86/hvm/hvm.c                                           |    3 
 xen/arch/x86/hvm/svm/emulate.c                                   |   22 
 xen/arch/x86/hvm/svm/svm.c                                       |  388 +++--
 xen/arch/x86/hvm/svm/vmcb.c                                      |   10 
 xen/arch/x86/mm.c                                                |    5 
 xen/arch/x86/mm/Makefile                                         |    1 
 xen/arch/x86/mm/hap/Makefile                                     |    2 
 xen/arch/x86/mm/hap/hap.c                                        |  708 +++++++++
 xen/arch/x86/mm/hap/private.h                                    |  112 +
 xen/arch/x86/mm/hap/support.c                                    |  334 ++++
 xen/arch/x86/mm/page-guest32.h                                   |  100 +
 xen/arch/x86/mm/paging.c                                         |   34 
 xen/arch/x86/mm/shadow/common.c                                  |   21 
 xen/arch/x86/mm/shadow/private.h                                 |    4 
 xen/arch/x86/mm/shadow/types.h                                   |    2 
 xen/common/event_channel.c                                       |    3 
 xen/common/page_alloc.c                                          |  165 --
 xen/common/xmalloc.c                                             |    6 
 xen/drivers/acpi/numa.c                                          |    9 
 xen/drivers/char/console.c                                       |    2 
 xen/include/acm/acm_hooks.h                                      |    4 
 xen/include/asm-x86/domain.h                                     |   16 
 xen/include/asm-x86/hap.h                                        |  122 +
 xen/include/asm-x86/hvm/svm/emulate.h                            |   10 
 xen/include/asm-x86/hvm/svm/svm.h                                |   35 
 xen/include/asm-x86/hvm/svm/vmcb.h                               |   16 
 xen/include/public/arch-x86/xen.h                                |    1 
 xen/include/public/xen.h                                         |   19 
 116 files changed, 2946 insertions(+), 2426 deletions(-)

diff -r 8f0b5295bb1b -r dcec453681bc Config.mk
--- a/Config.mk Mon Mar 05 12:49:12 2007 -0600
+++ b/Config.mk Thu Mar 08 14:39:52 2007 -0600
@@ -73,9 +73,10 @@ ACM_DEFAULT_SECURITY_POLICY ?= ACM_NULL_
 ACM_DEFAULT_SECURITY_POLICY ?= ACM_NULL_POLICY
 
 # Optional components
-XENSTAT_XENTOP ?= y
-VTPM_TOOLS ?= n
+XENSTAT_XENTOP     ?= y
+VTPM_TOOLS         ?= n
 LIBXENAPI_BINDINGS ?= n
-XENFB_TOOLS ?= n
+XENFB_TOOLS        ?= n
+PYTHON_TOOLS       ?= y
 
 -include $(XEN_ROOT)/.config
diff -r 8f0b5295bb1b -r dcec453681bc config/StdGNU.mk
--- a/config/StdGNU.mk  Mon Mar 05 12:49:12 2007 -0600
+++ b/config/StdGNU.mk  Thu Mar 08 14:39:52 2007 -0600
@@ -12,9 +12,9 @@ MSGFMT     = msgfmt
 MSGFMT     = msgfmt
 
 INSTALL      = install
-INSTALL_DIR  = $(INSTALL) -d -m0755
-INSTALL_DATA = $(INSTALL) -m0644
-INSTALL_PROG = $(INSTALL) -m0755
+INSTALL_DIR  = $(INSTALL) -d -m0755 -p
+INSTALL_DATA = $(INSTALL) -m0644 -p
+INSTALL_PROG = $(INSTALL) -m0755 -p
 
 LIB64DIR = lib64
 
diff -r 8f0b5295bb1b -r dcec453681bc config/SunOS.mk
--- a/config/SunOS.mk   Mon Mar 05 12:49:12 2007 -0600
+++ b/config/SunOS.mk   Thu Mar 08 14:39:52 2007 -0600
@@ -14,9 +14,9 @@ SHELL      = bash
 SHELL      = bash
 
 INSTALL      = ginstall
-INSTALL_DIR  = $(INSTALL) -d -m0755
-INSTALL_DATA = $(INSTALL) -m0644
-INSTALL_PROG = $(INSTALL) -m0755
+INSTALL_DIR  = $(INSTALL) -d -m0755 -p
+INSTALL_DATA = $(INSTALL) -m0644 -p
+INSTALL_PROG = $(INSTALL) -m0755 -p
 
 LIB64DIR = lib/amd64
 
diff -r 8f0b5295bb1b -r dcec453681bc docs/misc/dump-core-format.txt
--- a/docs/misc/dump-core-format.txt    Mon Mar 05 12:49:12 2007 -0600
+++ b/docs/misc/dump-core-format.txt    Thu Mar 08 14:39:52 2007 -0600
@@ -26,11 +26,12 @@ Elf header
 Elf header
 ----------
 The elf header members are set as follows
+        e_ident[EI_CLASS] = ELFCLASS64 = 2
         e_ident[EI_OSABI] = ELFOSABI_SYSV = 0
         e_type = ET_CORE = 4
-e_ident[EI_CLASS], e_ident[EI_DATA] and e_flags are set according
-to an architecture which a file is created. Other members are set as usual.
-
+ELFCLASS64 is always used independent of architecture.
+e_ident[EI_DATA] and e_flags are set according to the dumping system's
+architecture. Other members are set as usual.
 
 Sections
 --------
@@ -221,5 +222,10 @@ format_version descriptor
 
 Format version history
 ----------------------
-The currently only (major, minor) = (0, 1) is used.
+Currently only (major, minor) = (0, 1) is used.
 [When the format is changed, it would be described here.]
+
+(0, 1) update
+- EI_CLASS member of elf header was changed to ELFCLASS64 independent of
+  architecture. This is mainly for x86_32pae.
+  The format version isn't bumped because analysis tools can distinguish it.
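
Since the hunk above fixes the header layout (EI_CLASS is always ELFCLASS64,
e_type is ET_CORE), a consumer of dump-core files can sanity-check a file in a
few lines. A minimal user-space sketch, assuming only the standard <elf.h>
definitions; the default file name is a hypothetical example, not part of the
patch:

#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Check that a file looks like a xen dump-core: ELFCLASS64 is used
 * regardless of the dumped guest's architecture, and e_type is ET_CORE. */
int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : "guest.dump-core"; /* hypothetical */
        Elf64_Ehdr ehdr;
        FILE *f = fopen(path, "rb");

        if (f == NULL || fread(&ehdr, sizeof(ehdr), 1, f) != 1) {
                perror(path);
                return 1;
        }
        if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
            ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
            ehdr.e_type != ET_CORE) {
                fprintf(stderr, "%s: not a xen dump-core file\n", path);
                fclose(f);
                return 1;
        }
        /* e_ident[EI_DATA] and e_flags still reflect the dumping system. */
        printf("%s: ELFCLASS64 core, EI_DATA=%u, e_machine=%u\n",
               path, ehdr.e_ident[EI_DATA], (unsigned)ehdr.e_machine);
        fclose(f);
        return 0;
}
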
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/Kconfig
--- a/linux-2.6-xen-sparse/arch/i386/Kconfig    Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/Kconfig    Thu Mar 08 14:39:52 2007 -0600
@@ -255,7 +255,6 @@ config SCHED_SMT
 config SCHED_SMT
        bool "SMT (Hyperthreading) scheduler support"
        depends on X86_HT
-       depends on !X86_XEN
        help
          SMT scheduler support improves the CPU scheduler's decision making
          when dealing with Intel Pentium 4 chips with HyperThreading at a
@@ -311,11 +310,6 @@ config X86_VISWS_APIC
 config X86_VISWS_APIC
        bool
        depends on X86_VISWS
-       default y
-
-config X86_TSC
-       bool
-       depends on (MWINCHIP3D || MWINCHIP2 || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MGEODEGX1) && !X86_NUMAQ && !X86_XEN
        default y
 
 config X86_MCE
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/Kconfig.cpu
--- a/linux-2.6-xen-sparse/arch/i386/Kconfig.cpu        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/Kconfig.cpu        Thu Mar 08 14:39:52 2007 -0600
@@ -311,5 +311,5 @@ config X86_OOSTORE
 
 config X86_TSC
        bool
-       depends on (MWINCHIP3D || MWINCHIP2 || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MGEODEGX1 || MGEODE_LX) && !X86_NUMAQ
-       default y
+       depends on (MWINCHIP3D || MWINCHIP2 || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MGEODEGX1 || MGEODE_LX) && !X86_NUMAQ && !X86_XEN
+       default y
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/alternative-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/kernel/alternative-xen.c   Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,389 +0,0 @@
-#include <linux/module.h>
-#include <linux/spinlock.h>
-#include <linux/list.h>
-#include <asm/alternative.h>
-#include <asm/sections.h>
-
-static int no_replacement    = 0;
-static int smp_alt_once      = 0;
-static int debug_alternative = 0;
-
-static int __init noreplacement_setup(char *s)
-{
-       no_replacement = 1;
-       return 1;
-}
-static int __init bootonly(char *str)
-{
-       smp_alt_once = 1;
-       return 1;
-}
-static int __init debug_alt(char *str)
-{
-       debug_alternative = 1;
-       return 1;
-}
-
-__setup("noreplacement", noreplacement_setup);
-__setup("smp-alt-boot", bootonly);
-__setup("debug-alternative", debug_alt);
-
-#define DPRINTK(fmt, args...) if (debug_alternative) \
-       printk(KERN_DEBUG fmt, args)
-
-#ifdef GENERIC_NOP1
-/* Use inline assembly to define this because the nops are defined
-   as inline assembly strings in the include files and we cannot
-   get them easily into strings. */
-asm("\t.data\nintelnops: "
-       GENERIC_NOP1 GENERIC_NOP2 GENERIC_NOP3 GENERIC_NOP4 GENERIC_NOP5 GENERIC_NOP6
-       GENERIC_NOP7 GENERIC_NOP8);
-extern unsigned char intelnops[];
-static unsigned char *intel_nops[ASM_NOP_MAX+1] = {
-       NULL,
-       intelnops,
-       intelnops + 1,
-       intelnops + 1 + 2,
-       intelnops + 1 + 2 + 3,
-       intelnops + 1 + 2 + 3 + 4,
-       intelnops + 1 + 2 + 3 + 4 + 5,
-       intelnops + 1 + 2 + 3 + 4 + 5 + 6,
-       intelnops + 1 + 2 + 3 + 4 + 5 + 6 + 7,
-};
-#endif
-
-#ifdef K8_NOP1
-asm("\t.data\nk8nops: "
-       K8_NOP1 K8_NOP2 K8_NOP3 K8_NOP4 K8_NOP5 K8_NOP6
-       K8_NOP7 K8_NOP8);
-extern unsigned char k8nops[];
-static unsigned char *k8_nops[ASM_NOP_MAX+1] = {
-       NULL,
-       k8nops,
-       k8nops + 1,
-       k8nops + 1 + 2,
-       k8nops + 1 + 2 + 3,
-       k8nops + 1 + 2 + 3 + 4,
-       k8nops + 1 + 2 + 3 + 4 + 5,
-       k8nops + 1 + 2 + 3 + 4 + 5 + 6,
-       k8nops + 1 + 2 + 3 + 4 + 5 + 6 + 7,
-};
-#endif
-
-#ifdef K7_NOP1
-asm("\t.data\nk7nops: "
-       K7_NOP1 K7_NOP2 K7_NOP3 K7_NOP4 K7_NOP5 K7_NOP6
-       K7_NOP7 K7_NOP8);
-extern unsigned char k7nops[];
-static unsigned char *k7_nops[ASM_NOP_MAX+1] = {
-       NULL,
-       k7nops,
-       k7nops + 1,
-       k7nops + 1 + 2,
-       k7nops + 1 + 2 + 3,
-       k7nops + 1 + 2 + 3 + 4,
-       k7nops + 1 + 2 + 3 + 4 + 5,
-       k7nops + 1 + 2 + 3 + 4 + 5 + 6,
-       k7nops + 1 + 2 + 3 + 4 + 5 + 6 + 7,
-};
-#endif
-
-#ifdef CONFIG_X86_64
-
-extern char __vsyscall_0;
-static inline unsigned char** find_nop_table(void)
-{
-       return k8_nops;
-}
-
-#else /* CONFIG_X86_64 */
-
-static struct nop {
-       int cpuid;
-       unsigned char **noptable;
-} noptypes[] = {
-       { X86_FEATURE_K8, k8_nops },
-       { X86_FEATURE_K7, k7_nops },
-       { -1, NULL }
-};
-
-static unsigned char** find_nop_table(void)
-{
-       unsigned char **noptable = intel_nops;
-       int i;
-
-       for (i = 0; noptypes[i].cpuid >= 0; i++) {
-               if (boot_cpu_has(noptypes[i].cpuid)) {
-                       noptable = noptypes[i].noptable;
-                       break;
-               }
-       }
-       return noptable;
-}
-
-#endif /* CONFIG_X86_64 */
-
-extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
-extern struct alt_instr __smp_alt_instructions[], __smp_alt_instructions_end[];
-extern u8 *__smp_locks[], *__smp_locks_end[];
-
-extern u8 __smp_alt_begin[], __smp_alt_end[];
-
-/* Replace instructions with better alternatives for this CPU type.
-   This runs before SMP is initialized to avoid SMP problems with
-   self modifying code. This implies that assymetric systems where
-   APs have less capabilities than the boot processor are not handled.
-   Tough. Make sure you disable such features by hand. */
-
-void apply_alternatives(struct alt_instr *start, struct alt_instr *end)
-{
-       unsigned char **noptable = find_nop_table();
-       struct alt_instr *a;
-       u8 *instr;
-       int diff, i, k;
-
-       DPRINTK("%s: alt table %p -> %p\n", __FUNCTION__, start, end);
-       for (a = start; a < end; a++) {
-               BUG_ON(a->replacementlen > a->instrlen);
-               if (!boot_cpu_has(a->cpuid))
-                       continue;
-               instr = a->instr;
-#ifdef CONFIG_X86_64
-               /* vsyscall code is not mapped yet. resolve it manually. */
-               if (instr >= (u8 *)VSYSCALL_START && instr < (u8*)VSYSCALL_END) {
-                       instr -= VSYSCALL_START - (unsigned long)&__vsyscall_0;
-                       DPRINTK("%s: vsyscall fixup: %p => %p\n",
-                               __FUNCTION__, a->instr, instr);
-               }
-#endif
-               memcpy(instr, a->replacement, a->replacementlen);
-               diff = a->instrlen - a->replacementlen;
-               /* Pad the rest with nops */
-               for (i = a->replacementlen; diff > 0; diff -= k, i += k) {
-                       k = diff;
-                       if (k > ASM_NOP_MAX)
-                               k = ASM_NOP_MAX;
-                       memcpy(a->instr + i, noptable[k], k);
-               }
-       }
-}
-
-#ifdef CONFIG_SMP
-
-static void alternatives_smp_save(struct alt_instr *start, struct alt_instr *end)
-{
-       struct alt_instr *a;
-
-       DPRINTK("%s: alt table %p-%p\n", __FUNCTION__, start, end);
-       for (a = start; a < end; a++) {
-               memcpy(a->replacement + a->replacementlen,
-                      a->instr,
-                      a->instrlen);
-       }
-}
-
-static void alternatives_smp_apply(struct alt_instr *start, struct alt_instr *end)
-{
-       struct alt_instr *a;
-
-       for (a = start; a < end; a++) {
-               memcpy(a->instr,
-                      a->replacement + a->replacementlen,
-                      a->instrlen);
-       }
-}
-
-static void alternatives_smp_lock(u8 **start, u8 **end, u8 *text, u8 *text_end)
-{
-       u8 **ptr;
-
-       for (ptr = start; ptr < end; ptr++) {
-               if (*ptr < text)
-                       continue;
-               if (*ptr > text_end)
-                       continue;
-               **ptr = 0xf0; /* lock prefix */
-       };
-}
-
-static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end)
-{
-       unsigned char **noptable = find_nop_table();
-       u8 **ptr;
-
-       for (ptr = start; ptr < end; ptr++) {
-               if (*ptr < text)
-                       continue;
-               if (*ptr > text_end)
-                       continue;
-               **ptr = noptable[1][0];
-       };
-}
-
-struct smp_alt_module {
-       /* what is this ??? */
-       struct module   *mod;
-       char            *name;
-
-       /* ptrs to lock prefixes */
-       u8              **locks;
-       u8              **locks_end;
-
-       /* .text segment, needed to avoid patching init code ;) */
-       u8              *text;
-       u8              *text_end;
-
-       struct list_head next;
-};
-static LIST_HEAD(smp_alt_modules);
-static DEFINE_SPINLOCK(smp_alt);
-
-void alternatives_smp_module_add(struct module *mod, char *name,
-                                void *locks, void *locks_end,
-                                void *text,  void *text_end)
-{
-       struct smp_alt_module *smp;
-       unsigned long flags;
-
-       if (no_replacement)
-               return;
-
-       if (smp_alt_once) {
-               if (boot_cpu_has(X86_FEATURE_UP))
-                       alternatives_smp_unlock(locks, locks_end,
-                                               text, text_end);
-               return;
-       }
-
-       smp = kzalloc(sizeof(*smp), GFP_KERNEL);
-       if (NULL == smp)
-               return; /* we'll run the (safe but slow) SMP code then ... */
-
-       smp->mod        = mod;
-       smp->name       = name;
-       smp->locks      = locks;
-       smp->locks_end  = locks_end;
-       smp->text       = text;
-       smp->text_end   = text_end;
-       DPRINTK("%s: locks %p -> %p, text %p -> %p, name %s\n",
-               __FUNCTION__, smp->locks, smp->locks_end,
-               smp->text, smp->text_end, smp->name);
-
-       spin_lock_irqsave(&smp_alt, flags);
-       list_add_tail(&smp->next, &smp_alt_modules);
-       if (boot_cpu_has(X86_FEATURE_UP))
-               alternatives_smp_unlock(smp->locks, smp->locks_end,
-                                       smp->text, smp->text_end);
-       spin_unlock_irqrestore(&smp_alt, flags);
-}
-
-void alternatives_smp_module_del(struct module *mod)
-{
-       struct smp_alt_module *item;
-       unsigned long flags;
-
-       if (no_replacement || smp_alt_once)
-               return;
-
-       spin_lock_irqsave(&smp_alt, flags);
-       list_for_each_entry(item, &smp_alt_modules, next) {
-               if (mod != item->mod)
-                       continue;
-               list_del(&item->next);
-               spin_unlock_irqrestore(&smp_alt, flags);
-               DPRINTK("%s: %s\n", __FUNCTION__, item->name);
-               kfree(item);
-               return;
-       }
-       spin_unlock_irqrestore(&smp_alt, flags);
-}
-
-void alternatives_smp_switch(int smp)
-{
-       struct smp_alt_module *mod;
-       unsigned long flags;
-
-#ifdef CONFIG_LOCKDEP
-       /*
-        * A not yet fixed binutils section handling bug prevents
-        * alternatives-replacement from working reliably, so turn
-        * it off:
-        */
-       printk("lockdep: not fixing up alternatives.\n");
-       return;
-#endif
-
-       if (no_replacement || smp_alt_once)
-               return;
-       BUG_ON(!smp && (num_online_cpus() > 1));
-
-       spin_lock_irqsave(&smp_alt, flags);
-       if (smp) {
-               printk(KERN_INFO "SMP alternatives: switching to SMP code\n");
-               clear_bit(X86_FEATURE_UP, boot_cpu_data.x86_capability);
-               clear_bit(X86_FEATURE_UP, cpu_data[0].x86_capability);
-               alternatives_smp_apply(__smp_alt_instructions,
-                                      __smp_alt_instructions_end);
-               list_for_each_entry(mod, &smp_alt_modules, next)
-                       alternatives_smp_lock(mod->locks, mod->locks_end,
-                                             mod->text, mod->text_end);
-       } else {
-               printk(KERN_INFO "SMP alternatives: switching to UP code\n");
-               set_bit(X86_FEATURE_UP, boot_cpu_data.x86_capability);
-               set_bit(X86_FEATURE_UP, cpu_data[0].x86_capability);
-               apply_alternatives(__smp_alt_instructions,
-                                  __smp_alt_instructions_end);
-               list_for_each_entry(mod, &smp_alt_modules, next)
-                       alternatives_smp_unlock(mod->locks, mod->locks_end,
-                                               mod->text, mod->text_end);
-       }
-       spin_unlock_irqrestore(&smp_alt, flags);
-}
-
-#endif
-
-void __init alternative_instructions(void)
-{
-       if (no_replacement) {
-               printk(KERN_INFO "(SMP-)alternatives turned off\n");
-               free_init_pages("SMP alternatives",
-                               (unsigned long)__smp_alt_begin,
-                               (unsigned long)__smp_alt_end);
-               return;
-       }
-       apply_alternatives(__alt_instructions, __alt_instructions_end);
-
-       /* switch to patch-once-at-boottime-only mode and free the
-        * tables in case we know the number of CPUs will never ever
-        * change */
-#ifdef CONFIG_HOTPLUG_CPU
-       if (num_possible_cpus() < 2)
-               smp_alt_once = 1;
-#else
-       smp_alt_once = 1;
-#endif
-
-#ifdef CONFIG_SMP
-       if (smp_alt_once) {
-               if (1 == num_possible_cpus()) {
-                       printk(KERN_INFO "SMP alternatives: switching to UP code\n");
-                       set_bit(X86_FEATURE_UP, boot_cpu_data.x86_capability);
-                       set_bit(X86_FEATURE_UP, cpu_data[0].x86_capability);
-                       apply_alternatives(__smp_alt_instructions,
-                                          __smp_alt_instructions_end);
-                       alternatives_smp_unlock(__smp_locks, __smp_locks_end,
-                                               _text, _etext);
-               }
-               free_init_pages("SMP alternatives",
-                               (unsigned long)__smp_alt_begin,
-                               (unsigned long)__smp_alt_end);
-       } else {
-               alternatives_smp_save(__smp_alt_instructions,
-                                     __smp_alt_instructions_end);
-               alternatives_smp_module_add(NULL, "core kernel",
-                                           __smp_locks, __smp_locks_end,
-                                           _text, _etext);
-               alternatives_smp_switch(0);
-       }
-#endif
-}
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/kernel/cpu/intel_cacheinfo-xen.c   Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,774 +0,0 @@
-/*
- *      Routines to indentify caches on Intel CPU.
- *
- *      Changes:
- *      Venkatesh Pallipadi    : Adding cache identification through cpuid(4)
- *             Ashok Raj <ashok.raj@xxxxxxxxx>: Work with CPU hotplug infrastructure.
- *     Andi Kleen              : CPUID4 emulation on AMD.
- */
-
-#include <linux/init.h>
-#include <linux/slab.h>
-#include <linux/device.h>
-#include <linux/compiler.h>
-#include <linux/cpu.h>
-#include <linux/sched.h>
-
-#include <asm/processor.h>
-#include <asm/smp.h>
-
-#define LVL_1_INST     1
-#define LVL_1_DATA     2
-#define LVL_2          3
-#define LVL_3          4
-#define LVL_TRACE      5
-
-struct _cache_table
-{
-       unsigned char descriptor;
-       char cache_type;
-       short size;
-};
-
-/* all the cache descriptor types we care about (no TLB or trace cache entries) */
-static struct _cache_table cache_table[] __cpuinitdata =
-{
-       { 0x06, LVL_1_INST, 8 },        /* 4-way set assoc, 32 byte line size */
-       { 0x08, LVL_1_INST, 16 },       /* 4-way set assoc, 32 byte line size */
-       { 0x0a, LVL_1_DATA, 8 },        /* 2 way set assoc, 32 byte line size */
-       { 0x0c, LVL_1_DATA, 16 },       /* 4-way set assoc, 32 byte line size */
-       { 0x22, LVL_3,      512 },      /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x23, LVL_3,      1024 },     /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x25, LVL_3,      2048 },     /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x29, LVL_3,      4096 },     /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x2c, LVL_1_DATA, 32 },       /* 8-way set assoc, 64 byte line size */
-       { 0x30, LVL_1_INST, 32 },       /* 8-way set assoc, 64 byte line size */
-       { 0x39, LVL_2,      128 },      /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x3a, LVL_2,      192 },      /* 6-way set assoc, sectored cache, 64 byte line size */
-       { 0x3b, LVL_2,      128 },      /* 2-way set assoc, sectored cache, 64 byte line size */
-       { 0x3c, LVL_2,      256 },      /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x3d, LVL_2,      384 },      /* 6-way set assoc, sectored cache, 64 byte line size */
-       { 0x3e, LVL_2,      512 },      /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x41, LVL_2,      128 },      /* 4-way set assoc, 32 byte line size */
-       { 0x42, LVL_2,      256 },      /* 4-way set assoc, 32 byte line size */
-       { 0x43, LVL_2,      512 },      /* 4-way set assoc, 32 byte line size */
-       { 0x44, LVL_2,      1024 },     /* 4-way set assoc, 32 byte line size */
-       { 0x45, LVL_2,      2048 },     /* 4-way set assoc, 32 byte line size */
-       { 0x46, LVL_3,      4096 },     /* 4-way set assoc, 64 byte line size */
-       { 0x47, LVL_3,      8192 },     /* 8-way set assoc, 64 byte line size */
-       { 0x49, LVL_3,      4096 },     /* 16-way set assoc, 64 byte line size */
-       { 0x4a, LVL_3,      6144 },     /* 12-way set assoc, 64 byte line size */
-       { 0x4b, LVL_3,      8192 },     /* 16-way set assoc, 64 byte line size */
-       { 0x4c, LVL_3,     12288 },     /* 12-way set assoc, 64 byte line size */
-       { 0x4d, LVL_3,     16384 },     /* 16-way set assoc, 64 byte line size */
-       { 0x60, LVL_1_DATA, 16 },       /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x66, LVL_1_DATA, 8 },        /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x67, LVL_1_DATA, 16 },       /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x68, LVL_1_DATA, 32 },       /* 4-way set assoc, sectored cache, 64 byte line size */
-       { 0x70, LVL_TRACE,  12 },       /* 8-way set assoc */
-       { 0x71, LVL_TRACE,  16 },       /* 8-way set assoc */
-       { 0x72, LVL_TRACE,  32 },       /* 8-way set assoc */
-       { 0x73, LVL_TRACE,  64 },       /* 8-way set assoc */
-       { 0x78, LVL_2,    1024 },       /* 4-way set assoc, 64 byte line size */
-       { 0x79, LVL_2,     128 },       /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x7a, LVL_2,     256 },       /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x7b, LVL_2,     512 },       /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x7c, LVL_2,    1024 },       /* 8-way set assoc, sectored cache, 64 byte line size */
-       { 0x7d, LVL_2,    2048 },       /* 8-way set assoc, 64 byte line size */
-       { 0x7f, LVL_2,     512 },       /* 2-way set assoc, 64 byte line size */
-       { 0x82, LVL_2,     256 },       /* 8-way set assoc, 32 byte line size */
-       { 0x83, LVL_2,     512 },       /* 8-way set assoc, 32 byte line size */
-       { 0x84, LVL_2,    1024 },       /* 8-way set assoc, 32 byte line size */
-       { 0x85, LVL_2,    2048 },       /* 8-way set assoc, 32 byte line size */
-       { 0x86, LVL_2,     512 },       /* 4-way set assoc, 64 byte line size */
-       { 0x87, LVL_2,    1024 },       /* 8-way set assoc, 64 byte line size */
-       { 0x00, 0, 0}
-};
-
-
-enum _cache_type
-{
-       CACHE_TYPE_NULL = 0,
-       CACHE_TYPE_DATA = 1,
-       CACHE_TYPE_INST = 2,
-       CACHE_TYPE_UNIFIED = 3
-};
-
-union _cpuid4_leaf_eax {
-       struct {
-               enum _cache_type        type:5;
-               unsigned int            level:3;
-               unsigned int            is_self_initializing:1;
-               unsigned int            is_fully_associative:1;
-               unsigned int            reserved:4;
-               unsigned int            num_threads_sharing:12;
-               unsigned int            num_cores_on_die:6;
-       } split;
-       u32 full;
-};
-
-union _cpuid4_leaf_ebx {
-       struct {
-               unsigned int            coherency_line_size:12;
-               unsigned int            physical_line_partition:10;
-               unsigned int            ways_of_associativity:10;
-       } split;
-       u32 full;
-};
-
-union _cpuid4_leaf_ecx {
-       struct {
-               unsigned int            number_of_sets:32;
-       } split;
-       u32 full;
-};
-
-struct _cpuid4_info {
-       union _cpuid4_leaf_eax eax;
-       union _cpuid4_leaf_ebx ebx;
-       union _cpuid4_leaf_ecx ecx;
-       unsigned long size;
-       cpumask_t shared_cpu_map;
-};
-
-unsigned short                 num_cache_leaves;
-
-/* AMD doesn't have CPUID4. Emulate it here to report the same
-   information to the user.  This makes some assumptions about the machine:
-   No L3, L2 not shared, no SMT etc. that is currently true on AMD CPUs.
-
-   In theory the TLBs could be reported as fake type (they are in "dummy").
-   Maybe later */
-union l1_cache {
-       struct {
-               unsigned line_size : 8;
-               unsigned lines_per_tag : 8;
-               unsigned assoc : 8;
-               unsigned size_in_kb : 8;
-       };
-       unsigned val;
-};
-
-union l2_cache {
-       struct {
-               unsigned line_size : 8;
-               unsigned lines_per_tag : 4;
-               unsigned assoc : 4;
-               unsigned size_in_kb : 16;
-       };
-       unsigned val;
-};
-
-static const unsigned short assocs[] = {
-       [1] = 1, [2] = 2, [4] = 4, [6] = 8,
-       [8] = 16,
-       [0xf] = 0xffff // ??
-       };
-static const unsigned char levels[] = { 1, 1, 2 };
-static const unsigned char types[] = { 1, 2, 3 };
-
-static void __cpuinit amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
-                      union _cpuid4_leaf_ebx *ebx,
-                      union _cpuid4_leaf_ecx *ecx)
-{
-       unsigned dummy;
-       unsigned line_size, lines_per_tag, assoc, size_in_kb;
-       union l1_cache l1i, l1d;
-       union l2_cache l2;
-
-       eax->full = 0;
-       ebx->full = 0;
-       ecx->full = 0;
-
-       cpuid(0x80000005, &dummy, &dummy, &l1d.val, &l1i.val);
-       cpuid(0x80000006, &dummy, &dummy, &l2.val, &dummy);
-
-       if (leaf > 2 || !l1d.val || !l1i.val || !l2.val)
-               return;
-
-       eax->split.is_self_initializing = 1;
-       eax->split.type = types[leaf];
-       eax->split.level = levels[leaf];
-       eax->split.num_threads_sharing = 0;
-       eax->split.num_cores_on_die = current_cpu_data.x86_max_cores - 1;
-
-       if (leaf <= 1) {
-               union l1_cache *l1 = leaf == 0 ? &l1d : &l1i;
-               assoc = l1->assoc;
-               line_size = l1->line_size;
-               lines_per_tag = l1->lines_per_tag;
-               size_in_kb = l1->size_in_kb;
-       } else {
-               assoc = l2.assoc;
-               line_size = l2.line_size;
-               lines_per_tag = l2.lines_per_tag;
-               /* cpu_data has errata corrections for K7 applied */
-               size_in_kb = current_cpu_data.x86_cache_size;
-       }
-
-       if (assoc == 0xf)
-               eax->split.is_fully_associative = 1;
-       ebx->split.coherency_line_size = line_size - 1;
-       ebx->split.ways_of_associativity = assocs[assoc] - 1;
-       ebx->split.physical_line_partition = lines_per_tag - 1;
-       ecx->split.number_of_sets = (size_in_kb * 1024) / line_size /
-               (ebx->split.ways_of_associativity + 1) - 1;
-}
-
-static int __cpuinit cpuid4_cache_lookup(int index, struct _cpuid4_info *this_leaf)
-{
-       union _cpuid4_leaf_eax  eax;
-       union _cpuid4_leaf_ebx  ebx;
-       union _cpuid4_leaf_ecx  ecx;
-       unsigned                edx;
-
-       if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
-               amd_cpuid4(index, &eax, &ebx, &ecx);
-       else
-               cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full,  &edx);
-       if (eax.split.type == CACHE_TYPE_NULL)
-               return -EIO; /* better error ? */
-
-       this_leaf->eax = eax;
-       this_leaf->ebx = ebx;
-       this_leaf->ecx = ecx;
-       this_leaf->size = (ecx.split.number_of_sets + 1) *
-               (ebx.split.coherency_line_size + 1) *
-               (ebx.split.physical_line_partition + 1) *
-               (ebx.split.ways_of_associativity + 1);
-       return 0;
-}
-
-/* will only be called once; __init is safe here */
-static int __init find_num_cache_leaves(void)
-{
-       unsigned int            eax, ebx, ecx, edx;
-       union _cpuid4_leaf_eax  cache_eax;
-       int                     i = -1;
-
-       do {
-               ++i;
-               /* Do cpuid(4) loop to find out num_cache_leaves */
-               cpuid_count(4, i, &eax, &ebx, &ecx, &edx);
-               cache_eax.full = eax;
-       } while (cache_eax.split.type != CACHE_TYPE_NULL);
-       return i;
-}
-
-unsigned int __cpuinit init_intel_cacheinfo(struct cpuinfo_x86 *c)
-{
-       unsigned int trace = 0, l1i = 0, l1d = 0, l2 = 0, l3 = 0; /* Cache sizes */
-       unsigned int new_l1d = 0, new_l1i = 0; /* Cache sizes from cpuid(4) */
-       unsigned int new_l2 = 0, new_l3 = 0, i; /* Cache sizes from cpuid(4) */
-       unsigned int l2_id = 0, l3_id = 0, num_threads_sharing, index_msb;
-#ifdef CONFIG_X86_HT
-       unsigned int cpu = (c == &boot_cpu_data) ? 0 : (c - cpu_data);
-#endif
-
-       if (c->cpuid_level > 3) {
-               static int is_initialized;
-
-               if (is_initialized == 0) {
-                       /* Init num_cache_leaves from boot CPU */
-                       num_cache_leaves = find_num_cache_leaves();
-                       is_initialized++;
-               }
-
-               /*
-                * Whenever possible use cpuid(4), deterministic cache
-                * parameters cpuid leaf to find the cache details
-                */
-               for (i = 0; i < num_cache_leaves; i++) {
-                       struct _cpuid4_info this_leaf;
-
-                       int retval;
-
-                       retval = cpuid4_cache_lookup(i, &this_leaf);
-                       if (retval >= 0) {
-                               switch(this_leaf.eax.split.level) {
-                                   case 1:
-                                       if (this_leaf.eax.split.type ==
-                                                       CACHE_TYPE_DATA)
-                                               new_l1d = this_leaf.size/1024;
-                                       else if (this_leaf.eax.split.type ==
-                                                       CACHE_TYPE_INST)
-                                               new_l1i = this_leaf.size/1024;
-                                       break;
-                                   case 2:
-                                       new_l2 = this_leaf.size/1024;
-                       num_threads_sharing = 1 + this_leaf.eax.split.num_threads_sharing;
-                       index_msb = get_count_order(num_threads_sharing);
-                                       l2_id = c->apicid >> index_msb;
-                                       break;
-                                   case 3:
-                                       new_l3 = this_leaf.size/1024;
-                       num_threads_sharing = 1 + this_leaf.eax.split.num_threads_sharing;
-                       index_msb = get_count_order(num_threads_sharing);
-                                       l3_id = c->apicid >> index_msb;
-                                       break;
-                                   default:
-                                       break;
-                               }
-                       }
-               }
-       }
-       /*
-        * Don't use cpuid2 if cpuid4 is supported. For P4, we use cpuid2 for
-        * trace cache
-        */
-       if ((num_cache_leaves == 0 || c->x86 == 15) && c->cpuid_level > 1) {
-               /* supports eax=2  call */
-               int i, j, n;
-               int regs[4];
-               unsigned char *dp = (unsigned char *)regs;
-               int only_trace = 0;
-
-               if (num_cache_leaves != 0 && c->x86 == 15)
-                       only_trace = 1;
-
-               /* Number of times to iterate */
-               n = cpuid_eax(2) & 0xFF;
-
-               for ( i = 0 ; i < n ; i++ ) {
-                       cpuid(2, &regs[0], &regs[1], &regs[2], &regs[3]);
-
-                       /* If bit 31 is set, this is an unknown format */
-                       for ( j = 0 ; j < 3 ; j++ ) {
-                               if ( regs[j] < 0 ) regs[j] = 0;
-                       }
-
-                       /* Byte 0 is level count, not a descriptor */
-                       for ( j = 1 ; j < 16 ; j++ ) {
-                               unsigned char des = dp[j];
-                               unsigned char k = 0;
-
-                               /* look up this descriptor in the table */
-                               while (cache_table[k].descriptor != 0)
-                               {
-                                       if (cache_table[k].descriptor == des) {
-                                       if (only_trace && cache_table[k].cache_type != LVL_TRACE)
-                                                       break;
-                                               switch (cache_table[k].cache_type) {
-                                               case LVL_1_INST:
-                                                       l1i += cache_table[k].size;
-                                                       break;
-                                               case LVL_1_DATA:
-                                                       l1d += cache_table[k].size;
-                                                       break;
-                                               case LVL_2:
-                                                       l2 += cache_table[k].size;
-                                                       break;
-                                               case LVL_3:
-                                                       l3 += cache_table[k].size;
-                                                       break;
-                                               case LVL_TRACE:
-                                                       trace += cache_table[k].size;
-                                                       break;
-                                               }
-
-                                               break;
-                                       }
-
-                                       k++;
-                               }
-                       }
-               }
-       }
-
-       if (new_l1d)
-               l1d = new_l1d;
-
-       if (new_l1i)
-               l1i = new_l1i;
-
-       if (new_l2) {
-               l2 = new_l2;
-#ifdef CONFIG_X86_HT
-               cpu_llc_id[cpu] = l2_id;
-#endif
-       }
-
-       if (new_l3) {
-               l3 = new_l3;
-#ifdef CONFIG_X86_HT
-               cpu_llc_id[cpu] = l3_id;
-#endif
-       }
-
-       if (trace)
-               printk (KERN_INFO "CPU: Trace cache: %dK uops", trace);
-       else if ( l1i )
-               printk (KERN_INFO "CPU: L1 I cache: %dK", l1i);
-
-       if (l1d)
-               printk(", L1 D cache: %dK\n", l1d);
-       else
-               printk("\n");
-
-       if (l2)
-               printk(KERN_INFO "CPU: L2 cache: %dK\n", l2);
-
-       if (l3)
-               printk(KERN_INFO "CPU: L3 cache: %dK\n", l3);
-
-       c->x86_cache_size = l3 ? l3 : (l2 ? l2 : (l1i+l1d));
-
-       return l2;
-}
-
-/* pointer to _cpuid4_info array (for each cache leaf) */
-static struct _cpuid4_info *cpuid4_info[NR_CPUS];
-#define CPUID4_INFO_IDX(x,y)    (&((cpuid4_info[x])[y]))
-
-#ifdef CONFIG_SMP
-static void __cpuinit cache_shared_cpu_map_setup(unsigned int cpu, int index)
-{
-       struct _cpuid4_info     *this_leaf, *sibling_leaf;
-       unsigned long num_threads_sharing;
-       int index_msb, i;
-       struct cpuinfo_x86 *c = cpu_data;
-
-       this_leaf = CPUID4_INFO_IDX(cpu, index);
-       num_threads_sharing = 1 + this_leaf->eax.split.num_threads_sharing;
-
-       if (num_threads_sharing == 1)
-               cpu_set(cpu, this_leaf->shared_cpu_map);
-       else {
-               index_msb = get_count_order(num_threads_sharing);
-
-               for_each_online_cpu(i) {
-                       if (c[i].apicid >> index_msb ==
-                           c[cpu].apicid >> index_msb) {
-                               cpu_set(i, this_leaf->shared_cpu_map);
-                               if (i != cpu && cpuid4_info[i])  {
-                                       sibling_leaf = CPUID4_INFO_IDX(i, index);
-                               cpu_set(cpu, sibling_leaf->shared_cpu_map);
-                               }
-                       }
-               }
-       }
-}
-static void __cpuinit cache_remove_shared_cpu_map(unsigned int cpu, int index)
-{
-       struct _cpuid4_info     *this_leaf, *sibling_leaf;
-       int sibling;
-
-       this_leaf = CPUID4_INFO_IDX(cpu, index);
-       for_each_cpu_mask(sibling, this_leaf->shared_cpu_map) {
-               sibling_leaf = CPUID4_INFO_IDX(sibling, index); 
-               cpu_clear(cpu, sibling_leaf->shared_cpu_map);
-       }
-}
-#else
-static void __init cache_shared_cpu_map_setup(unsigned int cpu, int index) {}
-static void __init cache_remove_shared_cpu_map(unsigned int cpu, int index) {}
-#endif
-
-static void free_cache_attributes(unsigned int cpu)
-{
-       kfree(cpuid4_info[cpu]);
-       cpuid4_info[cpu] = NULL;
-}
-
-static int __cpuinit detect_cache_attributes(unsigned int cpu)
-{
-       struct _cpuid4_info     *this_leaf;
-       unsigned long           j;
-       int                     retval;
-       cpumask_t               oldmask;
-
-       if (num_cache_leaves == 0)
-               return -ENOENT;
-
-       cpuid4_info[cpu] = kmalloc(
-           sizeof(struct _cpuid4_info) * num_cache_leaves, GFP_KERNEL);
-       if (unlikely(cpuid4_info[cpu] == NULL))
-               return -ENOMEM;
-       memset(cpuid4_info[cpu], 0,
-           sizeof(struct _cpuid4_info) * num_cache_leaves);
-
-       oldmask = current->cpus_allowed;
-       retval = set_cpus_allowed(current, cpumask_of_cpu(cpu));
-       if (retval)
-               goto out;
-
-       /* Do cpuid and store the results */
-       retval = 0;
-       for (j = 0; j < num_cache_leaves; j++) {
-               this_leaf = CPUID4_INFO_IDX(cpu, j);
-               retval = cpuid4_cache_lookup(j, this_leaf);
-               if (unlikely(retval < 0))
-                       break;
-               cache_shared_cpu_map_setup(cpu, j);
-       }
-       set_cpus_allowed(current, oldmask);
-
-out:
-       if (retval)
-               free_cache_attributes(cpu);
-       return retval;
-}
-
-#ifdef CONFIG_SYSFS
-
-#include <linux/kobject.h>
-#include <linux/sysfs.h>
-
-extern struct sysdev_class cpu_sysdev_class; /* from drivers/base/cpu.c */
-
-/* pointer to kobject for cpuX/cache */
-static struct kobject * cache_kobject[NR_CPUS];
-
-struct _index_kobject {
-       struct kobject kobj;
-       unsigned int cpu;
-       unsigned short index;
-};
-
-/* pointer to array of kobjects for cpuX/cache/indexY */
-static struct _index_kobject *index_kobject[NR_CPUS];
-#define INDEX_KOBJECT_PTR(x,y)    (&((index_kobject[x])[y]))
-
-#define show_one_plus(file_name, object, val)                          \
-static ssize_t show_##file_name                                                \
-                       (struct _cpuid4_info *this_leaf, char *buf)     \
-{                                                                      \
-       return sprintf (buf, "%lu\n", (unsigned long)this_leaf->object + val); \
-}
-
-show_one_plus(level, eax.split.level, 0);
-show_one_plus(coherency_line_size, ebx.split.coherency_line_size, 1);
-show_one_plus(physical_line_partition, ebx.split.physical_line_partition, 1);
-show_one_plus(ways_of_associativity, ebx.split.ways_of_associativity, 1);
-show_one_plus(number_of_sets, ecx.split.number_of_sets, 1);
-
-static ssize_t show_size(struct _cpuid4_info *this_leaf, char *buf)
-{
-       return sprintf (buf, "%luK\n", this_leaf->size / 1024);
-}
-
-static ssize_t show_shared_cpu_map(struct _cpuid4_info *this_leaf, char *buf)
-{
-       char mask_str[NR_CPUS];
-       cpumask_scnprintf(mask_str, NR_CPUS, this_leaf->shared_cpu_map);
-       return sprintf(buf, "%s\n", mask_str);
-}
-
-static ssize_t show_type(struct _cpuid4_info *this_leaf, char *buf) {
-       switch(this_leaf->eax.split.type) {
-           case CACHE_TYPE_DATA:
-               return sprintf(buf, "Data\n");
-               break;
-           case CACHE_TYPE_INST:
-               return sprintf(buf, "Instruction\n");
-               break;
-           case CACHE_TYPE_UNIFIED:
-               return sprintf(buf, "Unified\n");
-               break;
-           default:
-               return sprintf(buf, "Unknown\n");
-               break;
-       }
-}
-
-struct _cache_attr {
-       struct attribute attr;
-       ssize_t (*show)(struct _cpuid4_info *, char *);
-       ssize_t (*store)(struct _cpuid4_info *, const char *, size_t count);
-};
-
-#define define_one_ro(_name) \
-static struct _cache_attr _name = \
-       __ATTR(_name, 0444, show_##_name, NULL)
-
-define_one_ro(level);
-define_one_ro(type);
-define_one_ro(coherency_line_size);
-define_one_ro(physical_line_partition);
-define_one_ro(ways_of_associativity);
-define_one_ro(number_of_sets);
-define_one_ro(size);
-define_one_ro(shared_cpu_map);
-
-static struct attribute * default_attrs[] = {
-       &type.attr,
-       &level.attr,
-       &coherency_line_size.attr,
-       &physical_line_partition.attr,
-       &ways_of_associativity.attr,
-       &number_of_sets.attr,
-       &size.attr,
-       &shared_cpu_map.attr,
-       NULL
-};
-
-#define to_object(k) container_of(k, struct _index_kobject, kobj)
-#define to_attr(a) container_of(a, struct _cache_attr, attr)
-
-static ssize_t show(struct kobject * kobj, struct attribute * attr, char * buf)
-{
-       struct _cache_attr *fattr = to_attr(attr);
-       struct _index_kobject *this_leaf = to_object(kobj);
-       ssize_t ret;
-
-       ret = fattr->show ?
-               fattr->show(CPUID4_INFO_IDX(this_leaf->cpu, this_leaf->index),
-                       buf) :
-               0;
-       return ret;
-}
-
-static ssize_t store(struct kobject * kobj, struct attribute * attr,
-                    const char * buf, size_t count)
-{
-       return 0;
-}
-
-static struct sysfs_ops sysfs_ops = {
-       .show   = show,
-       .store  = store,
-};
-
-static struct kobj_type ktype_cache = {
-       .sysfs_ops      = &sysfs_ops,
-       .default_attrs  = default_attrs,
-};
-
-static struct kobj_type ktype_percpu_entry = {
-       .sysfs_ops      = &sysfs_ops,
-};
-
-static void cpuid4_cache_sysfs_exit(unsigned int cpu)
-{
-       kfree(cache_kobject[cpu]);
-       kfree(index_kobject[cpu]);
-       cache_kobject[cpu] = NULL;
-       index_kobject[cpu] = NULL;
-       free_cache_attributes(cpu);
-}
-
-static int __cpuinit cpuid4_cache_sysfs_init(unsigned int cpu)
-{
-
-       if (num_cache_leaves == 0)
-               return -ENOENT;
-
-       detect_cache_attributes(cpu);
-       if (cpuid4_info[cpu] == NULL)
-               return -ENOENT;
-
-       /* Allocate all required memory */
-       cache_kobject[cpu] = kmalloc(sizeof(struct kobject), GFP_KERNEL);
-       if (unlikely(cache_kobject[cpu] == NULL))
-               goto err_out;
-       memset(cache_kobject[cpu], 0, sizeof(struct kobject));
-
-       index_kobject[cpu] = kmalloc(
-           sizeof(struct _index_kobject ) * num_cache_leaves, GFP_KERNEL);
-       if (unlikely(index_kobject[cpu] == NULL))
-               goto err_out;
-       memset(index_kobject[cpu], 0,
-           sizeof(struct _index_kobject) * num_cache_leaves);
-
-       return 0;
-
-err_out:
-       cpuid4_cache_sysfs_exit(cpu);
-       return -ENOMEM;
-}
-
-/* Add/Remove cache interface for CPU device */
-static int __cpuinit cache_add_dev(struct sys_device * sys_dev)
-{
-       unsigned int cpu = sys_dev->id;
-       unsigned long i, j;
-       struct _index_kobject *this_object;
-       int retval = 0;
-
-       retval = cpuid4_cache_sysfs_init(cpu);
-       if (unlikely(retval < 0))
-               return retval;
-
-       cache_kobject[cpu]->parent = &sys_dev->kobj;
-       kobject_set_name(cache_kobject[cpu], "%s", "cache");
-       cache_kobject[cpu]->ktype = &ktype_percpu_entry;
-       retval = kobject_register(cache_kobject[cpu]);
-
-       for (i = 0; i < num_cache_leaves; i++) {
-               this_object = INDEX_KOBJECT_PTR(cpu,i);
-               this_object->cpu = cpu;
-               this_object->index = i;
-               this_object->kobj.parent = cache_kobject[cpu];
-               kobject_set_name(&(this_object->kobj), "index%1lu", i);
-               this_object->kobj.ktype = &ktype_cache;
-               retval = kobject_register(&(this_object->kobj));
-               if (unlikely(retval)) {
-                       for (j = 0; j < i; j++) {
-                               kobject_unregister(
-                                       &(INDEX_KOBJECT_PTR(cpu,j)->kobj));
-                       }
-                       kobject_unregister(cache_kobject[cpu]);
-                       cpuid4_cache_sysfs_exit(cpu);
-                       break;
-               }
-       }
-       return retval;
-}
-
-static void __cpuexit cache_remove_dev(struct sys_device * sys_dev)
-{
-       unsigned int cpu = sys_dev->id;
-       unsigned long i;
-
-       for (i = 0; i < num_cache_leaves; i++) {
-               cache_remove_shared_cpu_map(cpu, i);
-               kobject_unregister(&(INDEX_KOBJECT_PTR(cpu,i)->kobj));
-       }
-       kobject_unregister(cache_kobject[cpu]);
-       cpuid4_cache_sysfs_exit(cpu);
-       return;
-}
-
-static int __cpuinit cacheinfo_cpu_callback(struct notifier_block *nfb,
-                                       unsigned long action, void *hcpu)
-{
-       unsigned int cpu = (unsigned long)hcpu;
-       struct sys_device *sys_dev;
-
-       sys_dev = get_cpu_sysdev(cpu);
-       switch (action) {
-       case CPU_ONLINE:
-               cache_add_dev(sys_dev);
-               break;
-       case CPU_DEAD:
-               cache_remove_dev(sys_dev);
-               break;
-       }
-       return NOTIFY_OK;
-}
-
-static struct notifier_block __cpuinitdata cacheinfo_cpu_notifier =
-{
-    .notifier_call = cacheinfo_cpu_callback,
-};
-
-static int __cpuinit cache_sysfs_init(void)
-{
-       int i;
-
-       if (num_cache_leaves == 0)
-               return 0;
-
-       register_hotcpu_notifier(&cacheinfo_cpu_notifier);
-
-       for_each_online_cpu(i) {
-               cacheinfo_cpu_callback(&cacheinfo_cpu_notifier, CPU_ONLINE,
-                       (void *)(long)i);
-       }
-
-       return 0;
-}
-
-device_initcall(cache_sysfs_init);
-
-#endif
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/entry-xen.S
--- a/linux-2.6-xen-sparse/arch/i386/kernel/entry-xen.S Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/kernel/entry-xen.S Thu Mar 08 14:39:52 2007 -0600
@@ -368,6 +368,7 @@ sysexit_scrit:      /**** START OF SYSEXIT CR
        movl ESI(%esp), %esi
        sysexit
 14:    __DISABLE_INTERRUPTS
+       TRACE_IRQS_OFF
 sysexit_ecrit: /**** END OF SYSEXIT CRITICAL REGION ****/
        push %esp
        call evtchn_do_upcall
@@ -427,11 +428,13 @@ restore_nocheck:
 restore_nocheck:
        movl EFLAGS(%esp), %eax
        testl $(VM_MASK|NMI_MASK), %eax
+       CFI_REMEMBER_STATE
        jnz hypervisor_iret
        shr $9, %eax                    # EAX[0] == IRET_EFLAGS.IF
        GET_VCPU_INFO
        andb evtchn_upcall_mask(%esi),%al
        andb $1,%al                     # EAX[0] == IRET_EFLAGS.IF & event_mask
+       CFI_REMEMBER_STATE
        jnz restore_all_enable_events   #        != 0 => enable event delivery
 #endif
        TRACE_IRQS_IRET
@@ -455,8 +458,8 @@ iret_exc:
        .long 1b,iret_exc
 .previous
 
+       CFI_RESTORE_STATE
 #ifndef CONFIG_XEN
-       CFI_RESTORE_STATE
 ldt_ss:
        larl OLDSS(%esp), %eax
        jnz restore_nocheck
@@ -485,14 +488,36 @@ 1:        iret
        .align 4
        .long 1b,iret_exc
 .previous
-       CFI_ENDPROC
 #else
+        ALIGN
+restore_all_enable_events:
+       TRACE_IRQS_ON
+       __ENABLE_INTERRUPTS
+scrit: /**** START OF CRITICAL REGION ****/
+       __TEST_PENDING
+       jnz  14f                        # process more events if necessary...
+       RESTORE_REGS
+       addl $4, %esp
+       CFI_ADJUST_CFA_OFFSET -4
+1:     iret
+.section __ex_table,"a"
+       .align 4
+       .long 1b,iret_exc
+.previous
+14:    __DISABLE_INTERRUPTS
+       TRACE_IRQS_OFF
+       jmp  11f
+ecrit:  /**** END OF CRITICAL REGION ****/
+
+       CFI_RESTORE_STATE
 hypervisor_iret:
        andl $~NMI_MASK, EFLAGS(%esp)
        RESTORE_REGS
        addl $4, %esp
+       CFI_ADJUST_CFA_OFFSET -4
        jmp  hypercall_page + (__HYPERVISOR_iret * 32)
 #endif
+       CFI_ENDPROC
 
        # perform work that needs to be done immediately before resumption
        ALIGN
@@ -736,7 +761,9 @@ error_code:
 # critical region we know that the entire frame is present and correct
 # so we can simply throw away the new one.
 ENTRY(hypervisor_callback)
+       RING0_INT_FRAME
        pushl %eax
+       CFI_ADJUST_CFA_OFFSET 4
        SAVE_ALL
        movl EIP(%esp),%eax
        cmpl $scrit,%eax
@@ -749,26 +776,13 @@ ENTRY(hypervisor_callback)
        ja   11f
        addl $OLDESP,%esp               # Remove eflags...ebx from stack frame.
 11:    push %esp
+       CFI_ADJUST_CFA_OFFSET 4
        call evtchn_do_upcall
        add  $4,%esp
+       CFI_ADJUST_CFA_OFFSET -4
        jmp  ret_from_intr
-
-        ALIGN
-restore_all_enable_events:
-       __ENABLE_INTERRUPTS
-scrit: /**** START OF CRITICAL REGION ****/
-       __TEST_PENDING
-       jnz  14f                        # process more events if necessary...
-       RESTORE_REGS
-       addl $4, %esp
-1:     iret
-.section __ex_table,"a"
-       .align 4
-       .long 1b,iret_exc
-.previous
-14:    __DISABLE_INTERRUPTS
-       jmp  11b
-ecrit:  /**** END OF CRITICAL REGION ****/
+       CFI_ENDPROC
+
 # [How we do the fixup]. We want to merge the current stack frame with the
 # just-interrupted frame. How we do this depends on where in the critical
 # region the interrupted handler was executing, and so how many saved
@@ -835,6 +849,7 @@ 4:  mov 16(%esp),%gs
        addl $16,%esp           # EAX != 0 => Category 2 (Bad IRET)
        jmp iret_exc
 5:     addl $16,%esp           # EAX == 0 => Category 1 (Bad segment)
+       RING0_INT_FRAME
        pushl $0
        SAVE_ALL
        jmp ret_from_exception
@@ -860,6 +875,7 @@ 9:  xorl %eax,%eax;         \
        .long 4b,9b;            \
 .previous
 #endif
+       CFI_ENDPROC
 
 ENTRY(coprocessor_error)
        RING0_INT_FRAME
@@ -1187,8 +1203,11 @@ ENDPROC(arch_unwind_init_running)
 #endif
 
 ENTRY(fixup_4gb_segment)
+       RING0_EC_FRAME
        pushl $do_fixup_4gb_segment
-       jmp error_code
+       CFI_ADJUST_CFA_OFFSET 4
+       jmp error_code
+       CFI_ENDPROC
 
 .section .rodata,"a"
 .align 4
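
The critical-region handling above exists because a PV guest never executes
cli/sti: event delivery is masked through a byte in the shared vcpu_info
structure, which is what the __DISABLE_INTERRUPTS, __ENABLE_INTERRUPTS and
__TEST_PENDING macros expand to. A minimal C sketch of that mechanism,
assuming the 3.0-era public interface layout (the field names follow
xen/interface/xen.h; the pv_* helpers are invented for illustration):

    #include <stdint.h>

    /* Cut-down vcpu_info, following xen/interface/xen.h of this era. */
    struct vcpu_info {
            uint8_t evtchn_upcall_pending;  /* event waiting for delivery? */
            uint8_t evtchn_upcall_mask;     /* 1 => delivery masked ("cli") */
    };

    #define barrier() __asm__ __volatile__("" ::: "memory")

    static inline void pv_irq_disable(struct vcpu_info *v)
    {
            v->evtchn_upcall_mask = 1;      /* what __DISABLE_INTERRUPTS does */
            barrier();
    }

    static inline void pv_irq_enable(struct vcpu_info *v)
    {
            barrier();
            v->evtchn_upcall_mask = 0;      /* what __ENABLE_INTERRUPTS does */
    }

    static inline int pv_event_pending(const struct vcpu_info *v)
    {
            return v->evtchn_upcall_pending; /* what __TEST_PENDING reads */
    }
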
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/head-xen.S
--- a/linux-2.6-xen-sparse/arch/i386/kernel/head-xen.S  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/kernel/head-xen.S  Thu Mar 08 14:39:52 2007 -0600
@@ -9,10 +9,9 @@
 #include <asm/cache.h>
 #include <asm/thread_info.h>
 #include <asm/asm-offsets.h>
+#include <asm/dwarf2.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/elfnote.h>
-
-#define _PAGE_PRESENT 0x1
 
 /*
  * References to members of the new_cpu_data structure.
@@ -65,14 +64,13 @@ ENTRY(startup_32)
 
        pushl %eax              # fake return address
        jmp start_kernel
-L6:
-       jmp L6                  # main should never return here, but
-                               # just in case, we know what happens.
 
 #define HYPERCALL_PAGE_OFFSET 0x1000
 .org HYPERCALL_PAGE_OFFSET
 ENTRY(hypercall_page)
+       CFI_STARTPROC
 .skip 0x1000
+       CFI_ENDPROC
 
 /*
  * Real beginning of normal "text" segment
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/kernel/setup-xen.c Thu Mar 08 14:39:52 2007 -0600
@@ -1847,6 +1847,9 @@ static __init int add_pcspkr(void)
        struct platform_device *pd;
        int ret;
 
+       if (!is_initial_xendomain())
+               return 0;
+
        pd = platform_device_alloc("pcspkr", -1);
        if (!pd)
                return -ENOMEM;
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/kernel/time-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/kernel/time-xen.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/kernel/time-xen.c  Thu Mar 08 14:39:52 2007 -0600
@@ -1022,16 +1022,21 @@ void halt(void)
 }
 EXPORT_SYMBOL(halt);
 
-/* No locking required. We are only CPU running, and interrupts are off. */
+/* No locking required. Interrupts are disabled on all CPUs. */
 void time_resume(void)
 {
+       unsigned int cpu;
+
        init_cpu_khz();
 
-       get_time_values_from_xen(0);
+       for_each_online_cpu(cpu) {
+               get_time_values_from_xen(cpu);
+               per_cpu(processed_system_time, cpu) =
+                       per_cpu(shadow_time, 0).system_timestamp;
+               init_missing_ticks_accounting(cpu);
+       }
 
        processed_system_time = per_cpu(shadow_time, 0).system_timestamp;
-       per_cpu(processed_system_time, 0) = processed_system_time;
-       init_missing_ticks_accounting(0);
 
        update_wallclock();
 }
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/mm/ioremap-xen.c   Thu Mar 08 14:39:52 2007 -0600
@@ -374,8 +374,6 @@ void iounmap(volatile void __iomem *addr
 }
 EXPORT_SYMBOL(iounmap);
 
-#ifdef __i386__
-
 void __init *bt_ioremap(unsigned long phys_addr, unsigned long size)
 {
        unsigned long offset, last_addr;
@@ -443,5 +441,3 @@ void __init bt_iounmap(void *addr, unsig
                --nrpages;
        }
 }
-
-#endif /* __i386__ */
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/i386/mm/pgtable-xen.c
--- a/linux-2.6-xen-sparse/arch/i386/mm/pgtable-xen.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/i386/mm/pgtable-xen.c   Thu Mar 08 14:39:52 2007 -0600
@@ -573,64 +573,67 @@ void make_pages_writable(void *va, unsig
        }
 }
 
-static inline int pgd_walk_set_prot(struct page *page, pgprot_t flags)
+static inline void pgd_walk_set_prot(struct page *page, pgprot_t flags)
 {
        unsigned long pfn = page_to_pfn(page);
-
-       if (PageHighMem(page))
-               return pgprot_val(flags) & _PAGE_RW
-                      ? test_and_clear_bit(PG_pinned, &page->flags)
-                      : !test_and_set_bit(PG_pinned, &page->flags);
-
-       BUG_ON(HYPERVISOR_update_va_mapping(
-               (unsigned long)__va(pfn << PAGE_SHIFT),
-               pfn_pte(pfn, flags), 0));
-
-       return 0;
-}
-
-static int pgd_walk(pgd_t *pgd_base, pgprot_t flags)
+       int rc;
+
+       if (PageHighMem(page)) {
+               if (pgprot_val(flags) & _PAGE_RW)
+                       clear_bit(PG_pinned, &page->flags);
+               else
+                       set_bit(PG_pinned, &page->flags);
+       } else {
+               rc = HYPERVISOR_update_va_mapping(
+                       (unsigned long)__va(pfn << PAGE_SHIFT),
+                       pfn_pte(pfn, flags), 0);
+               if (rc)
+                       BUG();
+       }
+}
+
+static void pgd_walk(pgd_t *pgd_base, pgprot_t flags)
 {
        pgd_t *pgd = pgd_base;
        pud_t *pud;
        pmd_t *pmd;
-       int    g, u, m, flush;
+       int    g, u, m, rc;
 
        if (xen_feature(XENFEAT_auto_translated_physmap))
-               return 0;
+               return;
 
-       for (g = 0, flush = 0; g < USER_PTRS_PER_PGD; g++, pgd++) {
+       for (g = 0; g < USER_PTRS_PER_PGD; g++, pgd++) {
                if (pgd_none(*pgd))
                        continue;
                pud = pud_offset(pgd, 0);
                if (PTRS_PER_PUD > 1) /* not folded */
-                       flush |= pgd_walk_set_prot(virt_to_page(pud),flags);
+                       pgd_walk_set_prot(virt_to_page(pud),flags);
                for (u = 0; u < PTRS_PER_PUD; u++, pud++) {
                        if (pud_none(*pud))
                                continue;
                        pmd = pmd_offset(pud, 0);
                        if (PTRS_PER_PMD > 1) /* not folded */
-                               flush |= pgd_walk_set_prot(virt_to_page(pmd),flags);
+                               pgd_walk_set_prot(virt_to_page(pmd),flags);
                        for (m = 0; m < PTRS_PER_PMD; m++, pmd++) {
                                if (pmd_none(*pmd))
                                        continue;
-                               flush |= pgd_walk_set_prot(pmd_page(*pmd),flags);
+                               pgd_walk_set_prot(pmd_page(*pmd),flags);
                        }
                }
        }
 
-       BUG_ON(HYPERVISOR_update_va_mapping(
+       rc = HYPERVISOR_update_va_mapping(
                (unsigned long)pgd_base,
                pfn_pte(virt_to_phys(pgd_base)>>PAGE_SHIFT, flags),
-               UVMF_TLB_FLUSH));
-
-       return flush;
+               UVMF_TLB_FLUSH);
+       if (rc)
+               BUG();
 }
 
 static void __pgd_pin(pgd_t *pgd)
 {
-       if (pgd_walk(pgd, PAGE_KERNEL_RO))
-               kmap_flush_unused();
+       pgd_walk(pgd, PAGE_KERNEL_RO);
+       kmap_flush_unused();
        xen_pgd_pin(__pa(pgd));
        set_bit(PG_pinned, &virt_to_page(pgd)->flags);
 }
@@ -638,8 +641,7 @@ static void __pgd_unpin(pgd_t *pgd)
 static void __pgd_unpin(pgd_t *pgd)
 {
        xen_pgd_unpin(__pa(pgd));
-       if (pgd_walk(pgd, PAGE_KERNEL))
-               kmap_flush_unused();
+       pgd_walk(pgd, PAGE_KERNEL);
        clear_bit(PG_pinned, &virt_to_page(pgd)->flags);
 }
 
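A pattern worth noting in this file (and repeated in pageattr-xen.c below):
BUG_ON(HYPERVISOR_update_va_mapping(...)) becomes an explicit
"rc = HYPERVISOR_update_va_mapping(...); if (rc) BUG();". The patch does not
state the motivation, but a plausible one is that an assertion macro can be
compiled out together with its argument, silently dropping the hypercall. A
standalone illustration of that hazard (CHECK() and important_call() are
invented stand-ins):

    #include <stdio.h>

    /* Stand-in for a call that must always run for its side effect. */
    static int important_call(void)
    {
            puts("hypercall issued");
            return 0;
    }

    /* An assertion macro that a "no debug checks" build might define
     * to throw its argument away -- along with the side effect. */
    #ifdef NDEBUG_STYLE_BUILD
    #define CHECK(x) do { } while (0)              /* argument vanishes */
    #else
    #define CHECK(x) do { if (x) puts("BUG()"); } while (0)
    #endif

    int main(void)
    {
            int rc;

            CHECK(important_call());        /* fragile: call may compile away */

            rc = important_call();          /* robust: call is always made */
            if (rc)
                    puts("BUG()");
            return 0;
    }
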
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/Kconfig
--- a/linux-2.6-xen-sparse/arch/x86_64/Kconfig  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/Kconfig  Thu Mar 08 14:39:52 2007 -0600
@@ -450,7 +450,7 @@ config CALGARY_IOMMU
        bool "IBM Calgary IOMMU support"
        default y
        select SWIOTLB
-       depends on PCI && EXPERIMENTAL
+       depends on PCI && !X86_64_XEN && EXPERIMENTAL
        help
          Support for hardware IOMMUs in IBM's xSeries x366 and x460
          systems. Needed to run systems with more than 3GB of memory
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/Makefile
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/Makefile  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/kernel/Makefile  Thu Mar 08 14:39:52 2007 -0600
@@ -61,9 +61,7 @@ time-y                                += ../../i386/kernel/time-xen.
 time-y                         += ../../i386/kernel/time-xen.o
 pci-dma-y                      += ../../i386/kernel/pci-dma-xen.o
 microcode-$(subst m,y,$(CONFIG_MICROCODE))  := ../../i386/kernel/microcode-xen.o
-intel_cacheinfo-y              := ../../i386/kernel/cpu/intel_cacheinfo-xen.o
 quirks-y                       := ../../i386/kernel/quirks-xen.o
-alternative-y                  := ../../i386/kernel/alternative-xen.o
 
 n-obj-xen := i8259.o reboot.o i8237.o smpboot.o trampoline.o
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/entry-xen.S
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/entry-xen.S       Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/kernel/entry-xen.S       Thu Mar 08 14:39:52 2007 -0600
@@ -116,19 +116,21 @@ NMI_MASK = 0x80000000
        CFI_ADJUST_CFA_OFFSET   -(6*8)
        .endm
 
-       .macro  CFI_DEFAULT_STACK start=1
+       .macro  CFI_DEFAULT_STACK start=1,adj=0
        .if \start
        CFI_STARTPROC   simple
-       CFI_DEF_CFA     rsp,SS+8
+       CFI_DEF_CFA     rsp,SS+8-(\adj*ARGOFFSET)
        .else
-       CFI_DEF_CFA_OFFSET SS+8
+       CFI_DEF_CFA_OFFSET SS+8-(\adj*ARGOFFSET)
        .endif
+       .if \adj == 0
        CFI_REL_OFFSET  r15,R15
        CFI_REL_OFFSET  r14,R14
        CFI_REL_OFFSET  r13,R13
        CFI_REL_OFFSET  r12,R12
        CFI_REL_OFFSET  rbp,RBP
        CFI_REL_OFFSET  rbx,RBX
+       .endif
        CFI_REL_OFFSET  r11,R11
        CFI_REL_OFFSET  r10,R10
        CFI_REL_OFFSET  r9,R9
@@ -363,8 +365,8 @@ ENTRY(int_ret_from_sys_call)
        CFI_REL_OFFSET  r9,R9-ARGOFFSET
        CFI_REL_OFFSET  r10,R10-ARGOFFSET
        CFI_REL_OFFSET  r11,R11-ARGOFFSET
+        XEN_BLOCK_EVENTS(%rsi)
        TRACE_IRQS_OFF
-        XEN_BLOCK_EVENTS(%rsi)
        testb $3,CS-ARGOFFSET(%rsp)
         jnz 1f
         /* Need to set the proper %ss (not NULL) for ring 3 iretq */
@@ -534,7 +536,7 @@ END(stub_rt_sigreturn)
  */ 
 
 retint_check:
-       CFI_DEFAULT_STACK
+       CFI_DEFAULT_STACK adj=1
        movl threadinfo_flags(%rcx),%edx
        andl %edi,%edx
        CFI_REMEMBER_STATE
@@ -838,6 +840,7 @@ ENTRY(error_entry)
        CFI_REL_OFFSET  r15,R15
 #if 0        
        cmpl $__KERNEL_CS,CS(%rsp)
+       CFI_REMEMBER_STATE
        je  error_kernelspace
 #endif        
 error_call_handler:
@@ -864,7 +867,7 @@ error_exit:
        TRACE_IRQS_IRETQ
        jmp   retint_restore_args
 
-error_kernelspace:
+#if 0
          /*
          * We need to re-write the logic here because we don't do iretq to
          * return to user mode. It's still possible that we get trap/fault
@@ -872,7 +875,8 @@ error_kernelspace:
          * for example).
          *
          */           
-#if 0
+       CFI_RESTORE_STATE
+error_kernelspace:
        incl %ebx
        /* There are two places in the kernel that can potentially fault with
           usergs. Handle them here. The exception handlers after
@@ -888,11 +892,13 @@ error_kernelspace:
        cmpq $gs_change,RIP(%rsp)
         je   error_swapgs
        jmp  error_sti
-#endif        
+#endif
+       CFI_ENDPROC
 END(error_entry)
        
 ENTRY(hypervisor_callback)
        zeroentry do_hypervisor_callback
+END(hypervisor_callback)
         
 /*
  * Copied from arch/xen/i386/kernel/entry.S
@@ -909,48 +915,66 @@ ENTRY(hypervisor_callback)
 # existing activation in its critical region -- if so, we pop the current
 # activation and restart the handler using the previous one.
 ENTRY(do_hypervisor_callback)   # do_hypervisor_callback(struct *pt_regs)
+       CFI_STARTPROC
 # Since we don't modify %rdi, evtchn_do_upall(struct *pt_regs) will
 # see the correct pointer to the pt_regs
        movq %rdi, %rsp            # we don't return, adjust the stack frame
-11:    movq %gs:pda_irqstackptr,%rax
-       incl %gs:pda_irqcount
-       cmovzq %rax,%rsp
-       pushq %rdi
+       CFI_ENDPROC
+       CFI_DEFAULT_STACK
+11:    incl %gs:pda_irqcount
+       movq %rsp,%rbp
+       CFI_DEF_CFA_REGISTER rbp
+       cmovzq %gs:pda_irqstackptr,%rsp
+       pushq %rbp                      # backlink for old unwinder
        call evtchn_do_upcall
        popq %rsp
+       CFI_DEF_CFA_REGISTER rsp
        decl %gs:pda_irqcount
        jmp  error_exit
+       CFI_ENDPROC
+END(do_hypervisor_callback)
 
 #ifdef CONFIG_X86_LOCAL_APIC
 KPROBE_ENTRY(nmi)
        zeroentry do_nmi_callback
 ENTRY(do_nmi_callback)
+       CFI_STARTPROC
         addq $8, %rsp
+       CFI_ENDPROC
+       CFI_DEFAULT_STACK
         call do_nmi
         orl  $NMI_MASK,EFLAGS(%rsp)
         RESTORE_REST
         XEN_BLOCK_EVENTS(%rsi)
+       TRACE_IRQS_OFF
         GET_THREAD_INFO(%rcx)
         jmp  retint_restore_args
+       CFI_ENDPROC
        .previous .text
+END(nmi)
 #endif
 
         ALIGN
 restore_all_enable_events:  
+       CFI_DEFAULT_STACK adj=1
+       TRACE_IRQS_ON
        XEN_UNBLOCK_EVENTS(%rsi)        # %rsi is already set up...
 
 scrit: /**** START OF CRITICAL REGION ****/
        XEN_TEST_PENDING(%rsi)
+       CFI_REMEMBER_STATE
        jnz  14f                        # process more events if necessary...
        XEN_PUT_VCPU_INFO(%rsi)
         RESTORE_ARGS 0,8,0
         HYPERVISOR_IRET 0
         
+       CFI_RESTORE_STATE
 14:    XEN_LOCKED_BLOCK_EVENTS(%rsi)
        XEN_PUT_VCPU_INFO(%rsi)
        SAVE_REST
         movq %rsp,%rdi                  # set the argument again
        jmp  11b
+       CFI_ENDPROC
 ecrit:  /**** END OF CRITICAL REGION ****/
# At this point, unlike on x86-32, we don't do the fixup; skipping it keeps
# the code simple, and the stack frame is more complex on x86-64 anyway.
@@ -970,8 +994,12 @@ ecrit:  /**** END OF CRITICAL REGION ***
 # We distinguish between categories by comparing each saved segment register
# with its current contents: any discrepancy means we are in category 1.
 ENTRY(failsafe_callback)
+       _frame (RIP-0x30)
+       CFI_REL_OFFSET rcx, 0
+       CFI_REL_OFFSET r11, 8
        movw %ds,%cx
        cmpw %cx,0x10(%rsp)
+       CFI_REMEMBER_STATE
        jne 1f
        movw %es,%cx
        cmpw %cx,0x18(%rsp)
@@ -984,17 +1012,26 @@ ENTRY(failsafe_callback)
        jne 1f
        /* All segments match their saved values => Category 2 (Bad IRET). */
        movq (%rsp),%rcx
+       CFI_RESTORE rcx
        movq 8(%rsp),%r11
+       CFI_RESTORE r11
        addq $0x30,%rsp
+       CFI_ADJUST_CFA_OFFSET -0x30
        movq $11,%rdi   /* SIGSEGV */
        jmp do_exit                     
+       CFI_RESTORE_STATE
 1:     /* Segment mismatch => Category 1 (Bad segment). Retry the IRET. */
        movq (%rsp),%rcx
+       CFI_RESTORE rcx
        movq 8(%rsp),%r11
+       CFI_RESTORE r11
        addq $0x30,%rsp
+       CFI_ADJUST_CFA_OFFSET -0x30
        pushq $0
+       CFI_ADJUST_CFA_OFFSET 8
        SAVE_ALL
        jmp error_exit
+       CFI_ENDPROC
 #if 0        
         .section __ex_table,"a"
         .align 8
@@ -1117,12 +1154,12 @@ END(device_not_available)
 
        /* runs on exception stack */
 KPROBE_ENTRY(debug)
-       INTR_FRAME
-/*     pushq $0
+/*     INTR_FRAME
+       pushq $0
        CFI_ADJUST_CFA_OFFSET 8 */
        zeroentry do_debug
-/*     paranoid_exit */
-       CFI_ENDPROC
+/*     paranoidexit
+       CFI_ENDPROC */
 END(debug)
        .previous .text
 
@@ -1144,12 +1181,12 @@ END(nmi)
 #endif        
 
 KPROBE_ENTRY(int3)
-       INTR_FRAME
-/*     pushq $0
+/*     INTR_FRAME
+       pushq $0
        CFI_ADJUST_CFA_OFFSET 8 */
        zeroentry do_int3
-/*     jmp paranoid_exit1 */
-       CFI_ENDPROC
+/*     jmp paranoid_exit1
+       CFI_ENDPROC */
 END(int3)
        .previous .text
 
@@ -1193,9 +1230,11 @@ END(segment_not_present)
 
        /* runs on exception stack */
 ENTRY(stack_segment)
-       XCPT_FRAME
+/*     XCPT_FRAME
+       paranoidentry do_stack_segment */
        errorentry do_stack_segment
-       CFI_ENDPROC
+/*     jmp paranoid_exit1
+       CFI_ENDPROC */
 END(stack_segment)
 
 KPROBE_ENTRY(general_protection)
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/head-xen.S
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/head-xen.S        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/kernel/head-xen.S        Thu Mar 08 14:39:52 2007 -0600
@@ -22,10 +22,8 @@
 #include <asm/page.h>
 #include <asm/msr.h>
 #include <asm/cache.h>
-
+#include <asm/dwarf2.h>
 #include <xen/interface/elfnote.h>
-
-#define _PAGE_PRESENT 0x1
 
        .section .bootstrap.text, "ax", @progbits
        .code64
@@ -42,6 +40,7 @@ ENTRY(_start)
        /* rsi is pointer to startup info structure.
           pass it to C */
        movq %rsi,%rdi
+       pushq $0                # fake return address
        jmp x86_64_start_kernel
 
 ENTRY(stext)
@@ -82,7 +81,25 @@ NEXT_PAGE(level2_kernel_pgt)
        .fill   512,8,0
 
 NEXT_PAGE(hypercall_page)
-       .fill   512,8,0
+       CFI_STARTPROC
+       .rept 0x1000 / 0x20
+       .skip 1 /* push %rcx */
+       CFI_ADJUST_CFA_OFFSET   8
+       CFI_REL_OFFSET  rcx,0
+       .skip 2 /* push %r11 */
+       CFI_ADJUST_CFA_OFFSET   8
+       CFI_REL_OFFSET  r11,0
+       .skip 5 /* mov $#,%eax */
+       .skip 2 /* syscall */
+       .skip 2 /* pop %r11 */
+       CFI_ADJUST_CFA_OFFSET -8
+       CFI_RESTORE r11
+       .skip 1 /* pop %rcx */
+       CFI_ADJUST_CFA_OFFSET -8
+       CFI_RESTORE rcx
+       .align 0x20,0 /* ret */
+       .endr
+       CFI_ENDPROC
 
 #undef NEXT_PAGE
 
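The .skip/CFI sequence above mirrors the instruction bytes Xen itself writes
into each 32-byte hypercall stub (push %rcx; push %r11; mov $nr,%eax;
syscall; pop %r11; pop %rcx; ret), so the unwinder can step through an
in-flight hypercall. A hedged sketch of how a caller reaches stub NR; the
stride of 32 matches the ".rept 0x1000 / 0x20" above, while the hypercall2()
wrapper itself is invented for illustration:

    extern char hypercall_page[];   /* provided by head-xen.S */

    /* Illustrative two-argument hypercall via the fixed-stride page.
     * Real kernels use a direct "call hypercall_page + nr*32"; an
     * indirect call is shown so this compiles as plain C inline asm.
     * The stub loads %eax itself; syscall clobbers %rcx and %r11. */
    static inline long hypercall2(unsigned int nr,
                                  unsigned long a1, unsigned long a2)
    {
            long ret;
            register unsigned long _a1 asm("rdi") = a1;
            register unsigned long _a2 asm("rsi") = a2;

            asm volatile("call *%[stub]"
                         : "=a" (ret), "+r" (_a1), "+r" (_a2)
                         : [stub] "r" (hypercall_page + nr * 32)
                         : "rcx", "r11", "memory");
            return ret;
    }
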
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/init_task.c
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/init_task.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/kernel/init_task.c       Thu Mar 08 14:39:52 2007 -0600
@@ -47,11 +47,11 @@ EXPORT_SYMBOL(init_task);
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */ 
 DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_internodealigned_in_smp = INIT_TSS;
-#endif
 
 /* Copies of the original ist values from the tss are only accessed during
  * debugging, no special alignment required.
  */
 DEFINE_PER_CPU(struct orig_ist, orig_ist);
+#endif
 
 #define ALIGN_TO_4K __attribute__((section(".data.init_task")))
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/setup-xen.c
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/setup-xen.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/kernel/setup-xen.c       Thu Mar 08 14:39:52 2007 -0600
@@ -676,7 +676,8 @@ void __init setup_arch(char **cmdline_p)
 
        init_memory_mapping(0, (end_pfn_map << PAGE_SHIFT));
 
-       /* dmi_scan_machine(); */
+       if (is_initial_xendomain())
+               dmi_scan_machine();
 
 #ifdef CONFIG_ACPI_NUMA
        /*
@@ -691,6 +692,14 @@ void __init setup_arch(char **cmdline_p)
        contig_initmem_init(0, end_pfn);
 #endif
 
+#ifdef CONFIG_XEN
+       /*
+        * Reserve kernel, physmap, start info, initial page tables, and
+        * direct mapping.
+        */
+       reserve_bootmem_generic(__pa_symbol(&_text),
+                               (table_end << PAGE_SHIFT) - __pa_symbol(&_text));
+#else
        /* Reserve direct mapping */
        reserve_bootmem_generic(table_start << PAGE_SHIFT, 
                                (table_end - table_start) << PAGE_SHIFT);
@@ -699,12 +708,6 @@ void __init setup_arch(char **cmdline_p)
        reserve_bootmem_generic(__pa_symbol(&_text),
                                __pa_symbol(&_end) - __pa_symbol(&_text));
 
-#ifdef CONFIG_XEN
-       /* reserve physmap, start info and initial page tables */
-       reserve_bootmem_generic(__pa_symbol(&_end),
-                               (table_start << PAGE_SHIFT) -
-                               __pa_symbol(&_end));
-#else
        /*
         * reserve physical page 0 - it's a special BIOS page on many boxes,
         * enabling clean reboots, SMP operation, laptop functions.
@@ -1625,13 +1628,6 @@ struct seq_operations cpuinfo_op = {
        .show = show_cpuinfo,
 };
 
-static int __init run_dmi_scan(void)
-{
-       dmi_scan_machine();
-       return 0;
-}
-core_initcall(run_dmi_scan);
-
 #if defined(CONFIG_INPUT_PCSPKR) || defined(CONFIG_INPUT_PCSPKR_MODULE)
 #include <linux/platform_device.h>
 static __init int add_pcspkr(void)
@@ -1639,6 +1635,9 @@ static __init int add_pcspkr(void)
        struct platform_device *pd;
        int ret;
 
+       if (!is_initial_xendomain())
+               return 0;
+
        pd = platform_device_alloc("pcspkr", -1);
        if (!pd)
                return -ENOMEM;
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/kernel/x8664_ksyms-xen.c
--- a/linux-2.6-xen-sparse/arch/x86_64/kernel/x8664_ksyms-xen.c Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,59 +0,0 @@
-/* Exports for assembly files.
-   All C exports should go in the respective C files. */
-
-#include <linux/module.h>
-#include <linux/smp.h>
-
-#include <asm/semaphore.h>
-#include <asm/processor.h>
-#include <asm/uaccess.h>
-#include <asm/pgtable.h>
-
-EXPORT_SYMBOL(kernel_thread);
-
-EXPORT_SYMBOL(__down_failed);
-EXPORT_SYMBOL(__down_failed_interruptible);
-EXPORT_SYMBOL(__down_failed_trylock);
-EXPORT_SYMBOL(__up_wakeup);
-
-EXPORT_SYMBOL(__get_user_1);
-EXPORT_SYMBOL(__get_user_2);
-EXPORT_SYMBOL(__get_user_4);
-EXPORT_SYMBOL(__get_user_8);
-EXPORT_SYMBOL(__put_user_1);
-EXPORT_SYMBOL(__put_user_2);
-EXPORT_SYMBOL(__put_user_4);
-EXPORT_SYMBOL(__put_user_8);
-
-EXPORT_SYMBOL(copy_user_generic);
-EXPORT_SYMBOL(copy_from_user);
-EXPORT_SYMBOL(copy_to_user);
-
-EXPORT_SYMBOL(copy_page);
-EXPORT_SYMBOL(clear_page);
-
-#ifdef CONFIG_SMP
-extern void FASTCALL( __write_lock_failed(rwlock_t *rw));
-extern void FASTCALL( __read_lock_failed(rwlock_t *rw));
-EXPORT_SYMBOL(__write_lock_failed);
-EXPORT_SYMBOL(__read_lock_failed);
-#endif
-
-/* Export string functions. We normally rely on gcc builtin for most of these,
-   but gcc sometimes decides not to inline them. */    
-#undef memcpy
-#undef memset
-#undef memmove
-
-extern void * memset(void *,int,__kernel_size_t);
-extern void * memcpy(void *,const void *,__kernel_size_t);
-extern void * __memcpy(void *,const void *,__kernel_size_t);
-
-EXPORT_SYMBOL(memset);
-EXPORT_SYMBOL(memcpy);
-EXPORT_SYMBOL(__memcpy);
-
-EXPORT_SYMBOL(empty_zero_page);
-
-EXPORT_SYMBOL(load_gs_index);
-
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/mm/init-xen.c
--- a/linux-2.6-xen-sparse/arch/x86_64/mm/init-xen.c    Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/mm/init-xen.c    Thu Mar 08 14:39:52 2007 -0600
@@ -102,13 +102,10 @@ static void __meminit early_make_page_re
                BUG();
 }
 
-void make_page_readonly(void *va, unsigned int feature)
+static void __make_page_readonly(void *va)
 {
        pgd_t *pgd; pud_t *pud; pmd_t *pmd; pte_t pte, *ptep;
        unsigned long addr = (unsigned long) va;
-
-       if (xen_feature(feature))
-               return;
 
        pgd = pgd_offset_k(addr);
        pud = pud_offset(pgd, addr);
@@ -120,16 +117,13 @@ void make_page_readonly(void *va, unsign
                xen_l1_entry_update(ptep, pte); /* fallback */
 
        if ((addr >= VMALLOC_START) && (addr < VMALLOC_END))
-               make_page_readonly(__va(pte_pfn(pte) << PAGE_SHIFT), feature);
-}
-
-void make_page_writable(void *va, unsigned int feature)
+               __make_page_readonly(__va(pte_pfn(pte) << PAGE_SHIFT));
+}
+
+static void __make_page_writable(void *va)
 {
        pgd_t *pgd; pud_t *pud; pmd_t *pmd; pte_t pte, *ptep;
        unsigned long addr = (unsigned long) va;
-
-       if (xen_feature(feature))
-               return;
 
        pgd = pgd_offset_k(addr);
        pud = pud_offset(pgd, addr);
@@ -141,7 +135,19 @@ void make_page_writable(void *va, unsign
                xen_l1_entry_update(ptep, pte); /* fallback */
 
        if ((addr >= VMALLOC_START) && (addr < VMALLOC_END))
-               make_page_writable(__va(pte_pfn(pte) << PAGE_SHIFT), feature);
+               __make_page_writable(__va(pte_pfn(pte) << PAGE_SHIFT));
+}
+
+void make_page_readonly(void *va, unsigned int feature)
+{
+       if (!xen_feature(feature))
+               __make_page_readonly(va);
+}
+
+void make_page_writable(void *va, unsigned int feature)
+{
+       if (!xen_feature(feature))
+               __make_page_writable(va);
 }
 
 void make_pages_readonly(void *va, unsigned nr, unsigned int feature)
@@ -150,7 +156,7 @@ void make_pages_readonly(void *va, unsig
                return;
 
        while (nr-- != 0) {
-               make_page_readonly(va, feature);
+               __make_page_readonly(va);
                va = (void*)((unsigned long)va + PAGE_SIZE);
        }
 }
@@ -161,7 +167,7 @@ void make_pages_writable(void *va, unsig
                return;
 
        while (nr-- != 0) {
-               make_page_writable(va, feature);
+               __make_page_writable(va);
                va = (void*)((unsigned long)va + PAGE_SIZE);
        }
 }
@@ -208,7 +214,11 @@ static __init void *spp_getpage(void)
        void *ptr;
        if (after_bootmem)
                ptr = (void *) get_zeroed_page(GFP_ATOMIC); 
-       else
+       else if (start_pfn < table_end) {
+               ptr = __va(start_pfn << PAGE_SHIFT);
+               start_pfn++;
+               memset(ptr, 0, PAGE_SIZE);
+       } else
                ptr = alloc_bootmem_pages(PAGE_SIZE);
        if (!ptr || ((unsigned long)ptr & ~PAGE_MASK))
                panic("set_pte_phys: cannot allocate page data %s\n", after_bootmem?"after bootmem":"");
@@ -299,36 +309,26 @@ static __init void set_pte_phys_ma(unsig
 
                pmd = (pmd_t *) spp_getpage(); 
                make_page_readonly(pmd, XENFEAT_writable_page_tables);
-
                set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE | _PAGE_USER));
-
                if (pmd != pmd_offset(pud, 0)) {
                        printk("PAGETABLE BUG #01! %p <-> %p\n", pmd, pmd_offset(pud,0));
                        return;
                }
        }
        pmd = pmd_offset(pud, vaddr);
-
        if (pmd_none(*pmd)) {
                pte = (pte_t *) spp_getpage();
                make_page_readonly(pte, XENFEAT_writable_page_tables);
-
                set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE | _PAGE_USER));
                if (pte != pte_offset_kernel(pmd, 0)) {
                        printk("PAGETABLE BUG #02!\n");
                        return;
                }
        }
-
        new_pte = pfn_pte_ma(phys >> PAGE_SHIFT, prot);
+
        pte = pte_offset_kernel(pmd, vaddr);
-
-       /* 
-        * Note that the pte page is already RO, thus we want to use
-        * xen_l1_entry_update(), not set_pte().
-        */
-       xen_l1_entry_update(pte, 
-                           pfn_pte_ma(phys >> PAGE_SHIFT, prot));
+       set_pte(pte, new_pte);
 
        /*
         * It's enough to flush this one mapping.
@@ -404,6 +404,7 @@ static inline void __set_pte(pte_t *dst,
 
 static inline int make_readonly(unsigned long paddr)
 {
+       extern char __vsyscall_0;
        int readonly = 0;
 
        /* Make new page tables read-only. */
@@ -420,25 +421,45 @@ static inline int make_readonly(unsigned
        /*
         * No need for writable mapping of kernel image. This also ensures that
         * page and descriptor tables embedded inside don't have writable
-        * mappings. 
+        * mappings. Exclude the vsyscall area here, allowing alternative
+        * instruction patching to work.
         */
-       if ((paddr >= __pa_symbol(&_text)) && (paddr < __pa_symbol(&_end)))
+       if ((paddr >= __pa_symbol(&_text)) && (paddr < __pa_symbol(&_end))
+           && !(paddr >= __pa_symbol(&__vsyscall_0)
+                && paddr < __pa_symbol(&__vsyscall_0) + PAGE_SIZE))
                readonly = 1;
 
        return readonly;
 }
 
+#ifndef CONFIG_XEN
 /* Must run before zap_low_mappings */
 __init void *early_ioremap(unsigned long addr, unsigned long size)
 {
-       return ioremap(addr, size);
+       unsigned long map = round_down(addr, LARGE_PAGE_SIZE);
+
+       /* actually usually some more */
+       if (size >= LARGE_PAGE_SIZE) {
+               printk("SMBIOS area too long %lu\n", size);
+               return NULL;
+       }
+       set_pmd(temp_mappings[0].pmd,  __pmd(map | _KERNPG_TABLE | _PAGE_PSE));
+       map += LARGE_PAGE_SIZE;
+       set_pmd(temp_mappings[1].pmd,  __pmd(map | _KERNPG_TABLE | _PAGE_PSE));
+       __flush_tlb();
+       return temp_mappings[0].address + (addr & (LARGE_PAGE_SIZE-1));
 }
 
 /* To avoid virtual aliases later */
 __init void early_iounmap(void *addr, unsigned long size)
 {
-       iounmap(addr);
-}
+       if ((void *)round_down((unsigned long)addr, LARGE_PAGE_SIZE) != temp_mappings[0].address)
+               printk("early_iounmap: bad address %p\n", addr);
+       set_pmd(temp_mappings[0].pmd, __pmd(0));
+       set_pmd(temp_mappings[1].pmd, __pmd(0));
+       __flush_tlb();
+}
+#endif
 
 static void __meminit
 phys_pmd_init(pmd_t *pmd, unsigned long address, unsigned long end)
@@ -638,7 +659,7 @@ static void __init extend_init_mapping(u
 
 static void __init find_early_table_space(unsigned long end)
 {
-       unsigned long puds, pmds, ptes, tables; 
+       unsigned long puds, pmds, ptes, tables;
 
        puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
        pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
@@ -656,6 +677,64 @@ static void __init find_early_table_spac
        early_printk("kernel direct mapping tables up to %lx @ %lx-%lx\n",
                end, table_start << PAGE_SHIFT,
                (table_start << PAGE_SHIFT) + tables);
+}
+
+static void xen_finish_init_mapping(void)
+{
+       unsigned long i, start, end;
+
+       /* Re-vector virtual addresses pointing into the initial
+          mapping to the just-established permanent ones. */
+       xen_start_info = __va(__pa(xen_start_info));
+       xen_start_info->pt_base = (unsigned long)
+               __va(__pa(xen_start_info->pt_base));
+       if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+               phys_to_machine_mapping =
+                       __va(__pa(xen_start_info->mfn_list));
+               xen_start_info->mfn_list = (unsigned long)
+                       phys_to_machine_mapping;
+       }
+       if (xen_start_info->mod_start)
+               xen_start_info->mod_start = (unsigned long)
+                       __va(__pa(xen_start_info->mod_start));
+
+       /* Destroy the Xen-created mappings beyond the kernel image as
+        * well as the temporary mappings created above. Prevents
+        * overlap with modules area (if init mapping is very big).
+        */
+       start = PAGE_ALIGN((unsigned long)_end);
+       end   = __START_KERNEL_map + (table_end << PAGE_SHIFT);
+       for (; start < end; start += PAGE_SIZE)
+               WARN_ON(HYPERVISOR_update_va_mapping(
+                       start, __pte_ma(0), 0));
+
+       /* Allocate pte's for initial fixmaps from 'start_pfn' allocator. */
+       table_end = ~0UL;
+
+       /*
+        * Prefetch pte's for the bt_ioremap() area. It gets used before the
+        * boot-time allocator is online, so allocate-on-demand would fail.
+        */
+       for (i = FIX_BTMAP_END; i <= FIX_BTMAP_BEGIN; i++)
+               __set_fixmap(i, 0, __pgprot(0));
+
+       /* Switch to the real shared_info page, and clear the dummy page. */
+       set_fixmap(FIX_SHARED_INFO, xen_start_info->shared_info);
+       HYPERVISOR_shared_info = (shared_info_t *)fix_to_virt(FIX_SHARED_INFO);
+       memset(empty_zero_page, 0, sizeof(empty_zero_page));
+
+       /* Set up mapping of lowest 1MB of physical memory. */
+       for (i = 0; i < NR_FIX_ISAMAPS; i++)
+               if (is_initial_xendomain())
+                       set_fixmap(FIX_ISAMAP_BEGIN - i, i * PAGE_SIZE);
+               else
+                       __set_fixmap(FIX_ISAMAP_BEGIN - i,
+                                    virt_to_mfn(empty_zero_page)
+                                    << PAGE_SHIFT,
+                                    PAGE_KERNEL_RO);
+
+       /* Disable the 'start_pfn' allocator. */
+       table_end = start_pfn;
 }
 
 /* Setup the direct mapping of the physical memory at PAGE_OFFSET.
@@ -663,7 +742,7 @@ static void __init find_early_table_spac
    physical memory. To access them they are temporarily mapped. */
 void __meminit init_memory_mapping(unsigned long start, unsigned long end)
 { 
-       unsigned long next; 
+       unsigned long next;
 
        Dprintk("init_memory_mapping\n");
 
@@ -702,31 +781,7 @@ void __meminit init_memory_mapping(unsig
 
        if (!after_bootmem) {
                BUG_ON(start_pfn != table_end);
-
-               /* Re-vector virtual addresses pointing into the initial
-                  mapping to the just-established permanent ones. */
-               xen_start_info = __va(__pa(xen_start_info));
-               xen_start_info->pt_base = (unsigned long)
-                       __va(__pa(xen_start_info->pt_base));
-               if (!xen_feature(XENFEAT_auto_translated_physmap)) {
-                       phys_to_machine_mapping =
-                               __va(__pa(xen_start_info->mfn_list));
-                       xen_start_info->mfn_list = (unsigned long)
-                               phys_to_machine_mapping;
-               }
-               if (xen_start_info->mod_start)
-                       xen_start_info->mod_start = (unsigned long)
-                               __va(__pa(xen_start_info->mod_start));
-
-               /* Destroy the Xen-created mappings beyond the kernel image as
-                * well as the temporary mappings created above. Prevents
-                * overlap with modules area (if init mapping is very big).
-                */
-               start = PAGE_ALIGN((unsigned long)_end);
-               end   = __START_KERNEL_map + (table_end << PAGE_SHIFT);
-               for (; start < end; start += PAGE_SIZE)
-                       WARN_ON(HYPERVISOR_update_va_mapping(
-                               start, __pte_ma(0), 0));
+               xen_finish_init_mapping();
        }
 
        __flush_tlb_all();
@@ -805,7 +860,6 @@ void __init paging_init(void)
 void __init paging_init(void)
 {
        unsigned long zones[MAX_NR_ZONES], holes[MAX_NR_ZONES];
-       int i;
 
        memory_present(0, 0, end_pfn);
        sparse_init();
@@ -813,22 +867,7 @@ void __init paging_init(void)
        free_area_init_node(0, NODE_DATA(0), zones,
                            __pa(PAGE_OFFSET) >> PAGE_SHIFT, holes);
 
-       /* Switch to the real shared_info page, and clear the
-        * dummy page. */
-       set_fixmap(FIX_SHARED_INFO, xen_start_info->shared_info);
-       HYPERVISOR_shared_info = (shared_info_t *)fix_to_virt(FIX_SHARED_INFO);
-       memset(empty_zero_page, 0, sizeof(empty_zero_page));
-
        init_mm.context.pinned = 1;
-
-       /* Setup mapping of lower 1st MB */
-       for (i = 0; i < NR_FIX_ISAMAPS; i++)
-               if (is_initial_xendomain())
-                       set_fixmap(FIX_ISAMAP_BEGIN - i, i * PAGE_SIZE);
-               else
-                       __set_fixmap(FIX_ISAMAP_BEGIN - i,
-                                    virt_to_mfn(empty_zero_page) << PAGE_SHIFT,
-                                    PAGE_KERNEL_RO);
 }
 #endif
 
@@ -1028,11 +1067,6 @@ void __init mem_init(void)
 
 void free_init_pages(char *what, unsigned long begin, unsigned long end)
 {
-#ifdef __DO_LATER__
-       /*
-        * Some pages can be pinned, but some are not. Unpinning such pages 
-        * triggers BUG(). 
-        */
        unsigned long addr;
 
        if (begin >= end)
@@ -1044,25 +1078,27 @@ void free_init_pages(char *what, unsigne
                init_page_count(virt_to_page(addr));
                memset((void *)(addr & ~(PAGE_SIZE-1)),
                       POISON_FREE_INITMEM, PAGE_SIZE); 
-               make_page_writable(
-                       __va(__pa(addr)), XENFEAT_writable_page_tables);
-               /*
-                * Make pages from __PAGE_OFFSET address as well
-                */
-               make_page_writable(
-                       (void *)addr, XENFEAT_writable_page_tables);
+               if (addr >= __START_KERNEL_map) {
+                       /* make_readonly() reports all kernel addresses. */
+                       __make_page_writable(__va(__pa(addr)));
+                       if (HYPERVISOR_update_va_mapping(addr, __pte(0), 0)) {
+                               pgd_t *pgd = pgd_offset_k(addr);
+                               pud_t *pud = pud_offset(pgd, addr);
+                               pmd_t *pmd = pmd_offset(pud, addr);
+                               pte_t *pte = pte_offset_kernel(pmd, addr);
+
+                               xen_l1_entry_update(pte, __pte(0)); /* fallback */
+                       }
+               }
                free_page(addr);
                totalram_pages++;
        }
-#endif
 }
 
 void free_initmem(void)
 {
-#ifdef __DO_LATER__
        memset(__initdata_begin, POISON_FREE_INITDATA,
                __initdata_end - __initdata_begin);
-#endif
        free_init_pages("unused kernel memory",
                        (unsigned long)(&__init_begin),
                        (unsigned long)(&__init_end));
@@ -1125,7 +1161,7 @@ int kern_addr_valid(unsigned long addr)
        if (pgd_none(*pgd))
                return 0;
 
-       pud = pud_offset_k(pgd, addr);
+       pud = pud_offset(pgd, addr);
        if (pud_none(*pud))
                return 0; 
 
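Among the changes above, spp_getpage() gains a middle case: before bootmem is
available it hands out pages straight from the boot-time cursor (start_pfn,
bounded by table_end), zeroing each one. A userspace model of that bump
allocator (the array pool and the bump_alloc_page() name are invented for
illustration):

    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 4096UL

    static unsigned char pool[16 * PAGE_SIZE];  /* stands in for low memory */
    static unsigned long start_pfn = 0, table_end = 16;

    /* Hand out the next page below table_end and advance the cursor,
     * mirroring the "else if (start_pfn < table_end)" branch above. */
    static void *bump_alloc_page(void)
    {
            void *ptr;

            if (start_pfn >= table_end)
                    return NULL;                /* cursor exhausted */
            ptr = pool + start_pfn * PAGE_SIZE;
            start_pfn++;
            memset(ptr, 0, PAGE_SIZE);          /* pages come back zeroed */
            return ptr;
    }

    int main(void)
    {
            printf("first page at %p\n", bump_alloc_page());
            return 0;
    }
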
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/arch/x86_64/mm/pageattr-xen.c
--- a/linux-2.6-xen-sparse/arch/x86_64/mm/pageattr-xen.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/arch/x86_64/mm/pageattr-xen.c        Thu Mar 08 14:39:52 2007 -0600
@@ -24,10 +24,13 @@ static inline void mm_walk_set_prot(void
 {
        struct page *page = virt_to_page(pt);
        unsigned long pfn = page_to_pfn(page);
-
-       BUG_ON(HYPERVISOR_update_va_mapping(
-                      (unsigned long)__va(pfn << PAGE_SHIFT),
-                      pfn_pte(pfn, flags), 0));
+       int rc;
+
+       rc = HYPERVISOR_update_va_mapping(
+               (unsigned long)__va(pfn << PAGE_SHIFT),
+               pfn_pte(pfn, flags), 0);
+       if (rc)
+               BUG();
 }
 
 static void mm_walk(struct mm_struct *mm, pgprot_t flags)
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/char/mem.c
--- a/linux-2.6-xen-sparse/drivers/char/mem.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/char/mem.c   Thu Mar 08 14:39:52 2007 -0600
@@ -787,7 +787,7 @@ static const struct file_operations mem_
        .open           = open_mem,
 };
 #else
-extern struct file_operations mem_fops;
+extern const struct file_operations mem_fops;
 #endif
 
 static const struct file_operations kmem_fops = {
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/char/tpm/tpm_xen.c
--- a/linux-2.6-xen-sparse/drivers/char/tpm/tpm_xen.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/char/tpm/tpm_xen.c   Thu Mar 08 14:39:52 2007 -0600
@@ -113,14 +113,13 @@ void __exit tpmif_exit(void);
 
 
 static inline int
-tx_buffer_copy(struct tx_buffer *txb, const u8 * src, int len,
+tx_buffer_copy(struct tx_buffer *txb, const u8 *src, int len,
                int isuserbuffer)
 {
        int copied = len;
 
-       if (len > txb->size) {
+       if (len > txb->size)
                copied = txb->size;
-       }
        if (isuserbuffer) {
                if (copy_from_user(txb->data, src, copied))
                        return -EFAULT;
@@ -133,18 +132,20 @@ tx_buffer_copy(struct tx_buffer *txb, co
 
 static inline struct tx_buffer *tx_buffer_alloc(void)
 {
-       struct tx_buffer *txb = kzalloc(sizeof (struct tx_buffer),
-                                       GFP_KERNEL);
-
-       if (txb) {
-               txb->len = 0;
-               txb->size = PAGE_SIZE;
-               txb->data = (unsigned char *)__get_free_page(GFP_KERNEL);
-               if (txb->data == NULL) {
-                       kfree(txb);
-                       txb = NULL;
-               }
-       }
+       struct tx_buffer *txb;
+
+       txb = kzalloc(sizeof(struct tx_buffer), GFP_KERNEL);
+       if (!txb)
+               return NULL;
+
+       txb->len = 0;
+       txb->size = PAGE_SIZE;
+       txb->data = (unsigned char *)__get_free_page(GFP_KERNEL);
+       if (txb->data == NULL) {
+               kfree(txb);
+               txb = NULL;
+       }
+
        return txb;
 }
 
@@ -160,37 +161,41 @@ static inline void tx_buffer_free(struct
 /**************************************************************
  Utility function for the tpm_private structure
 **************************************************************/
-static inline void tpm_private_init(struct tpm_private *tp)
+static void tpm_private_init(struct tpm_private *tp)
 {
        spin_lock_init(&tp->tx_lock);
        init_waitqueue_head(&tp->wait_q);
        atomic_set(&tp->refcnt, 1);
 }
 
-static inline void tpm_private_put(void)
-{
-       if ( atomic_dec_and_test(&my_priv->refcnt)) {
-               tpmif_free_tx_buffers(my_priv);
-               kfree(my_priv);
-               my_priv = NULL;
-       }
+static void tpm_private_put(void)
+{
+       if (!atomic_dec_and_test(&my_priv->refcnt))
+               return;
+
+       tpmif_free_tx_buffers(my_priv);
+       kfree(my_priv);
+       my_priv = NULL;
 }
 
 static struct tpm_private *tpm_private_get(void)
 {
        int err;
-       if (!my_priv) {
-               my_priv = kzalloc(sizeof(struct tpm_private), GFP_KERNEL);
-               if (my_priv) {
-                       tpm_private_init(my_priv);
-                       err = tpmif_allocate_tx_buffers(my_priv);
-                       if (err < 0) {
-                               tpm_private_put();
-                       }
-               }
-       } else {
+
+       if (my_priv) {
                atomic_inc(&my_priv->refcnt);
-       }
+               return my_priv;
+       }
+
+       my_priv = kzalloc(sizeof(struct tpm_private), GFP_KERNEL);
+       if (!my_priv)
+               return NULL;
+
+       tpm_private_init(my_priv);
+       err = tpmif_allocate_tx_buffers(my_priv);
+       if (err < 0)
+               tpm_private_put();
+
        return my_priv;
 }
 
@@ -379,10 +384,8 @@ static int tpmfront_probe(struct xenbus_
                return -ENOMEM;
 
        tp->chip = init_vtpm(&dev->dev, &tvd, tp);
-
-       if (IS_ERR(tp->chip)) {
+       if (IS_ERR(tp->chip))
                return PTR_ERR(tp->chip);
-       }
 
        err = xenbus_scanf(XBT_NIL, dev->nodename,
                           "handle", "%i", &handle);
@@ -401,6 +404,7 @@ static int tpmfront_probe(struct xenbus_
                tpm_private_put();
                return err;
        }
+
        return 0;
 }
 
@@ -417,30 +421,34 @@ static int tpmfront_suspend(struct xenbu
 {
        struct tpm_private *tp = tpm_private_from_dev(&dev->dev);
        u32 ctr;
-       /* lock, so no app can send */
+
+       /* Take the lock, preventing any application from sending. */
        mutex_lock(&suspend_lock);
        tp->is_suspended = 1;
 
-       for (ctr = 0; atomic_read(&tp->tx_busy) && ctr <= 300; ctr++) {
+       for (ctr = 0; atomic_read(&tp->tx_busy); ctr++) {
                if ((ctr % 10) == 0)
                        printk("TPM-FE [INFO]: Waiting for outstanding "
                               "request.\n");
-               /*
-                * Wait for a request to be responded to.
-                */
+               /* Wait for a request to be responded to. */
                interruptible_sleep_on_timeout(&tp->wait_q, 100);
        }
-       xenbus_switch_state(dev, XenbusStateClosing);
-
-       if (atomic_read(&tp->tx_busy)) {
-               /*
-                * A temporary work-around.
-                */
-               printk("TPM-FE [WARNING]: Resetting busy flag.");
-               atomic_set(&tp->tx_busy, 0);
-       }
-
-       return 0;
+
+       return 0;
+}
+
+static int tpmfront_suspend_finish(struct tpm_private *tp)
+{
+       tp->is_suspended = 0;
+       /* Allow applications to send again. */
+       mutex_unlock(&suspend_lock);
+       return 0;
+}
+
+static int tpmfront_suspend_cancel(struct xenbus_device *dev)
+{
+       struct tpm_private *tp = tpm_private_from_dev(&dev->dev);
+       return tpmfront_suspend_finish(tp);
 }
 
 static int tpmfront_resume(struct xenbus_device *dev)
@@ -484,6 +492,7 @@ static struct xenbus_driver tpmfront = {
        .resume = tpmfront_resume,
        .otherend_changed = backend_changed,
        .suspend = tpmfront_suspend,
+       .suspend_cancel = tpmfront_suspend_cancel,
 };
 
 static void __init init_tpm_xenbus(void)
@@ -514,9 +523,8 @@ static void tpmif_free_tx_buffers(struct
 {
        unsigned int i;
 
-       for (i = 0; i < TPMIF_TX_RING_SIZE; i++) {
+       for (i = 0; i < TPMIF_TX_RING_SIZE; i++)
                tx_buffer_free(tp->tx_buffers[i]);
-       }
 }
 
 static void tpmif_rx_action(unsigned long priv)
@@ -536,9 +544,8 @@ static void tpmif_rx_action(unsigned lon
        received = tx->size;
 
        buffer = kmalloc(received, GFP_ATOMIC);
-       if (NULL == buffer) {
+       if (!buffer)
                goto exit;
-       }
 
        for (i = 0; i < TPMIF_TX_RING_SIZE && offset < received; i++) {
                struct tx_buffer *txb = tp->tx_buffers[i];
@@ -547,9 +554,8 @@ static void tpmif_rx_action(unsigned lon
 
                tx = &tp->tx->ring[i].req;
                tocopy = tx->size;
-               if (tocopy > PAGE_SIZE) {
+               if (tocopy > PAGE_SIZE)
                        tocopy = PAGE_SIZE;
-               }
 
                memcpy(&buffer[offset], txb->data, tocopy);
 
@@ -607,12 +613,13 @@ static int tpm_xmit(struct tpm_private *
                struct tx_buffer *txb = tp->tx_buffers[i];
                int copied;
 
-               if (NULL == txb) {
+               if (!txb) {
                        DPRINTK("txb (i=%d) is NULL. buffers initilized?\n"
                                "Not transmitting anything!\n", i);
                        spin_unlock_irq(&tp->tx_lock);
                        return -EFAULT;
                }
+
                copied = tx_buffer_copy(txb, &buf[offset], count,
                                        isuserbuffer);
                if (copied < 0) {
@@ -624,25 +631,26 @@ static int tpm_xmit(struct tpm_private *
                offset += copied;
 
                tx = &tp->tx->ring[i].req;
-
                tx->addr = virt_to_machine(txb->data);
                tx->size = txb->len;
 
-               DPRINTK("First 4 characters sent by TPM-FE are 0x%02x 0x%02x 0x%02x 0x%02x\n",
+               DPRINTK("First 4 characters sent by TPM-FE are "
+                       "0x%02x 0x%02x 0x%02x 0x%02x\n",
                        txb->data[0],txb->data[1],txb->data[2],txb->data[3]);
 
-               /* get the granttable reference for this page */
+               /* Get the granttable reference for this page. */
                tx->ref = gnttab_claim_grant_reference(&gref_head);
-
-               if (-ENOSPC == tx->ref) {
+               if (tx->ref == -ENOSPC) {
                        spin_unlock_irq(&tp->tx_lock);
-                       DPRINTK(" Grant table claim reference failed in func:%s line:%d file:%s\n", __FUNCTION__, __LINE__, __FILE__);
+                       DPRINTK("Grant table claim reference failed in "
+                               "func:%s line:%d file:%s\n",
+                               __FUNCTION__, __LINE__, __FILE__);
                        return -ENOSPC;
                }
-               gnttab_grant_foreign_access_ref( tx->ref,
-                                                tp->backend_id,
-                                                virt_to_mfn(txb->data),
-                                                0 /*RW*/);
+               gnttab_grant_foreign_access_ref(tx->ref,
+                                               tp->backend_id,
+                                               virt_to_mfn(txb->data),
+                                               0 /*RW*/);
                wmb();
        }
 
@@ -660,15 +668,10 @@ static int tpm_xmit(struct tpm_private *
 
 static void tpmif_notify_upperlayer(struct tpm_private *tp)
 {
-       /*
-        * Notify upper layer about the state of the connection
-        * to the BE.
-        */
-       if (tp->is_connected) {
-               vtpm_vd_status(tp->chip, TPM_VD_STATUS_CONNECTED);
-       } else {
-               vtpm_vd_status(tp->chip, TPM_VD_STATUS_DISCONNECTED);
-       }
+       /* Notify upper layer about the state of the connection to the BE. */
+       vtpm_vd_status(tp->chip, (tp->is_connected
+                                 ? TPM_VD_STATUS_CONNECTED
+                                 : TPM_VD_STATUS_DISCONNECTED));
 }
 
 
@@ -679,20 +682,16 @@ static void tpmif_set_connected_state(st
         * should disconnect - assumption is that we will resume.
         * The mutex keeps apps from sending.
         */
-       if (is_connected == 0 && tp->is_suspended == 1) {
+       if (is_connected == 0 && tp->is_suspended == 1)
                return;
-       }
 
        /*
         * Unlock the mutex if we are connected again
         * after being suspended - now resuming.
         * This also removes the suspend state.
         */
-       if (is_connected == 1 && tp->is_suspended == 1) {
-               tp->is_suspended = 0;
-               /* unlock, so apps can resume sending */
-               mutex_unlock(&suspend_lock);
-       }
+       if (is_connected == 1 && tp->is_suspended == 1)
+               tpmfront_suspend_finish(tp);
 
        if (is_connected != tp->is_connected) {
                tp->is_connected = is_connected;
@@ -710,33 +709,24 @@ static void tpmif_set_connected_state(st
 
 static int __init tpmif_init(void)
 {
-       long rc = 0;
        struct tpm_private *tp;
 
        if (is_initial_xendomain())
                return -EPERM;
 
        tp = tpm_private_get();
-       if (!tp) {
-               rc = -ENOMEM;
-               goto failexit;
-       }
+       if (!tp)
+               return -ENOMEM;
 
        IPRINTK("Initialising the vTPM driver.\n");
-       if ( gnttab_alloc_grant_references ( TPMIF_TX_RING_SIZE,
-                                            &gref_head ) < 0) {
-               rc = -EFAULT;
-               goto gnttab_alloc_failed;
+       if (gnttab_alloc_grant_references(TPMIF_TX_RING_SIZE,
+                                         &gref_head) < 0) {
+               tpm_private_put();
+               return -EFAULT;
        }
 
        init_tpm_xenbus();
        return 0;
-
-gnttab_alloc_failed:
-       tpm_private_put();
-failexit:
-
-       return (int)rc;
 }
 
 
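After this cleanup, tpm_private_get()/tpm_private_put() read as the canonical
refcounted-singleton pattern: the first get allocates and initialises, later
gets only bump the count, and the last put frees and clears the pointer. A
userspace model of the idiom (names invented; the kernel code uses atomic_t
and kzalloc, and relies on external locking):

    #include <stdio.h>
    #include <stdlib.h>

    struct priv { int refcnt; };

    static struct priv *singleton;

    static struct priv *priv_get(void)
    {
            if (singleton) {
                    singleton->refcnt++;        /* later users share it */
                    return singleton;
            }
            singleton = calloc(1, sizeof(*singleton));
            if (!singleton)
                    return NULL;
            singleton->refcnt = 1;              /* first user */
            return singleton;
    }

    static void priv_put(void)
    {
            if (--singleton->refcnt)
                    return;                     /* still in use */
            free(singleton);
            singleton = NULL;
    }

    int main(void)
    {
            priv_get();
            priv_get();
            priv_put();
            printf("after one put: %s\n", singleton ? "alive" : "freed");
            priv_put();
            printf("after two puts: %s\n", singleton ? "alive" : "freed");
            return 0;
    }
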
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c
--- a/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/blktap/blktap.c  Thu Mar 08 14:39:52 2007 -0600
@@ -335,7 +335,7 @@ static int blktap_ioctl(struct inode *in
                         unsigned int cmd, unsigned long arg);
 static unsigned int blktap_poll(struct file *file, poll_table *wait);
 
-static struct file_operations blktap_fops = {
+static const struct file_operations blktap_fops = {
        .owner   = THIS_MODULE,
        .poll    = blktap_poll,
        .ioctl   = blktap_ioctl,
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/char/mem.c
--- a/linux-2.6-xen-sparse/drivers/xen/char/mem.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/char/mem.c       Thu Mar 08 14:39:52 2007 -0600
@@ -194,7 +194,7 @@ static int open_mem(struct inode * inode
        return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
 }
 
-struct file_operations mem_fops = {
+const struct file_operations mem_fops = {
        .llseek         = memory_lseek,
        .read           = read_mem,
        .write          = write_mem,
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c
--- a/linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c    Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/core/machine_reboot.c    Thu Mar 08 14:39:52 2007 -0600
@@ -59,27 +59,11 @@ EXPORT_SYMBOL(machine_halt);
 EXPORT_SYMBOL(machine_halt);
 EXPORT_SYMBOL(machine_power_off);
 
-/* Ensure we run on the idle task page tables so that we will
-   switch page tables before running user space. This is needed
-   on architectures with separate kernel and user page tables
-   because the user page table pointer is not saved/restored. */
-static void switch_idle_mm(void)
-{
-       struct mm_struct *mm = current->active_mm;
-
-       if (mm == &init_mm)
-               return;
-
-       atomic_inc(&init_mm.mm_count);
-       switch_mm(mm, &init_mm, current);
-       current->active_mm = &init_mm;
-       mmdrop(mm);
-}
-
 static void pre_suspend(void)
 {
        HYPERVISOR_shared_info = (shared_info_t *)empty_zero_page;
-       clear_fixmap(FIX_SHARED_INFO);
+       HYPERVISOR_update_va_mapping(fix_to_virt(FIX_SHARED_INFO),
+                                    __pte_ma(0), 0);
 
        xen_start_info->store_mfn = mfn_to_pfn(xen_start_info->store_mfn);
        xen_start_info->console.domU.mfn =
@@ -89,6 +73,7 @@ static void post_suspend(int suspend_can
 static void post_suspend(int suspend_cancelled)
 {
        int i, j, k, fpp;
+       unsigned long shinfo_mfn;
        extern unsigned long max_pfn;
        extern unsigned long *pfn_to_mfn_frame_list_list;
        extern unsigned long *pfn_to_mfn_frame_list[];
@@ -99,11 +84,14 @@ static void post_suspend(int suspend_can
                xen_start_info->console.domU.mfn =
                        pfn_to_mfn(xen_start_info->console.domU.mfn);
        } else {
+#ifdef CONFIG_SMP
                cpu_initialized_map = cpumask_of_cpu(0);
-       }
-       
-       set_fixmap(FIX_SHARED_INFO, xen_start_info->shared_info);
-
+#endif
+       }
+
+       shinfo_mfn = xen_start_info->shared_info >> PAGE_SHIFT;
+       HYPERVISOR_update_va_mapping(fix_to_virt(FIX_SHARED_INFO),
+                                    pfn_pte_ma(shinfo_mfn, PAGE_KERNEL), 0);
        HYPERVISOR_shared_info = (shared_info_t *)fix_to_virt(FIX_SHARED_INFO);
 
        memset(empty_zero_page, 0, PAGE_SIZE);
@@ -172,10 +160,25 @@ static int take_machine_down(void *p_fas
 
        post_suspend(suspend_cancelled);
        gnttab_resume();
-       if (!suspend_cancelled)
+       if (!suspend_cancelled) {
                irq_resume();
+#ifdef __x86_64__
+               /*
+                * Older versions of Xen do not save/restore the user %cr3.
+                * We do it here just in case, but there's no need if we are
+                * in fast-suspend mode as that implies a new enough Xen.
+                */
+               if (!fast_suspend) {
+                       struct mmuext_op op;
+                       op.cmd = MMUEXT_NEW_USER_BASEPTR;
+                       op.arg1.mfn = pfn_to_mfn(__pa(__user_pgd(
+                               current->active_mm->pgd)) >> PAGE_SHIFT);
+                       if (HYPERVISOR_mmuext_op(&op, 1, NULL, DOMID_SELF))
+                               BUG();
+               }
+#endif
+       }
        time_resume();
-       switch_idle_mm();
        local_irq_enable();
 
        if (fast_suspend && !suspend_cancelled) {
@@ -210,6 +213,10 @@ int __xen_suspend(int fast_suspend)
        }
 #endif
 
+       /* If we are definitely UP then 'slow mode' is actually faster. */
+       if (num_possible_cpus() == 1)
+               fast_suspend = 0;
+
        if (fast_suspend) {
                xenbus_suspend();
                err = stop_machine_run(take_machine_down, &fast_suspend, 0);
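The pre_suspend()/post_suspend() hunks above stop going through
set_fixmap()/clear_fixmap() and instead drive the fixmap PTE directly via
the hypervisor. Both operations, sketched using only the calls that appear
in the hunks themselves:

    /* Point the slot at machine frame 'shinfo_mfn', as post_suspend() does.
     * shared_info in xen_start_info is a machine address, so shifting it
     * right by PAGE_SHIFT yields the frame number pfn_pte_ma() expects. */
    HYPERVISOR_update_va_mapping(fix_to_virt(FIX_SHARED_INFO),
                                 pfn_pte_ma(shinfo_mfn, PAGE_KERNEL), 0);

    /* Tear the mapping down again with a zero PTE, as pre_suspend() does. */
    HYPERVISOR_update_va_mapping(fix_to_virt(FIX_SHARED_INFO),
                                 __pte_ma(0), 0);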
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/evtchn/evtchn.c
--- a/linux-2.6-xen-sparse/drivers/xen/evtchn/evtchn.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/evtchn/evtchn.c  Thu Mar 08 14:39:52 2007 -0600
@@ -406,7 +406,7 @@ static int evtchn_release(struct inode *
        return 0;
 }
 
-static struct file_operations evtchn_fops = {
+static const struct file_operations evtchn_fops = {
        .owner   = THIS_MODULE,
        .read    = evtchn_read,
        .write   = evtchn_write,
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/netback/common.h
--- a/linux-2.6-xen-sparse/drivers/xen/netback/common.h Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/common.h Thu Mar 08 14:39:52 2007 -0600
@@ -99,8 +99,20 @@ typedef struct netif_st {
        struct net_device *dev;
        struct net_device_stats stats;
 
+       unsigned int carrier;
+
        wait_queue_head_t waiting_to_free;
 } netif_t;
+
+/*
+ * Implement our own carrier flag: the network stack's version causes delays
+ * when the carrier is re-enabled (in particular, dev_activate() may not
+ * immediately be called, which can cause packet loss; also the etherbridge
+ * can be rather lazy in activating its port).
+ */
+#define netback_carrier_on(netif)      ((netif)->carrier = 1)
+#define netback_carrier_off(netif)     ((netif)->carrier = 0)
+#define netback_carrier_ok(netif)      ((netif)->carrier)
 
 #define NET_TX_RING_SIZE __RING_SIZE((netif_tx_sring_t *)0, PAGE_SIZE)
 #define NET_RX_RING_SIZE __RING_SIZE((netif_rx_sring_t *)0, PAGE_SIZE)
@@ -120,7 +132,8 @@ int netif_map(netif_t *netif, unsigned l
 
 void netif_xenbus_init(void);
 
-#define netif_schedulable(dev) (netif_running(dev) && netif_carrier_ok(dev))
+#define netif_schedulable(netif)                               \
+       (netif_running((netif)->dev) && netback_carrier_ok(netif))
 
 void netif_schedule_work(netif_t *netif);
 void netif_deschedule_work(netif_t *netif);
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/netback/interface.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/interface.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/interface.c      Thu Mar 08 14:39:52 2007 -0600
@@ -66,16 +66,19 @@ static int net_open(struct net_device *d
 static int net_open(struct net_device *dev)
 {
        netif_t *netif = netdev_priv(dev);
-       if (netif_carrier_ok(dev))
+       if (netback_carrier_ok(netif)) {
                __netif_up(netif);
+               netif_start_queue(dev);
+       }
        return 0;
 }
 
 static int net_close(struct net_device *dev)
 {
        netif_t *netif = netdev_priv(dev);
-       if (netif_carrier_ok(dev))
+       if (netback_carrier_ok(netif))
                __netif_down(netif);
+       netif_stop_queue(dev);
        return 0;
 }
 
@@ -138,8 +141,6 @@ netif_t *netif_alloc(domid_t domid, unsi
                return ERR_PTR(-ENOMEM);
        }
 
-       netif_carrier_off(dev);
-
        netif = netdev_priv(dev);
        memset(netif, 0, sizeof(*netif));
        netif->domid  = domid;
@@ -147,6 +148,8 @@ netif_t *netif_alloc(domid_t domid, unsi
        atomic_set(&netif->refcnt, 1);
        init_waitqueue_head(&netif->waiting_to_free);
        netif->dev = dev;
+
+       netback_carrier_off(netif);
 
        netif->credit_bytes = netif->remaining_credit = ~0UL;
        netif->credit_usec  = 0UL;
@@ -285,7 +288,7 @@ int netif_map(netif_t *netif, unsigned l
        netif_get(netif);
 
        rtnl_lock();
-       netif_carrier_on(netif->dev);
+       netback_carrier_on(netif);
        if (netif_running(netif->dev))
                __netif_up(netif);
        rtnl_unlock();
@@ -302,9 +305,10 @@ err_rx:
 
 void netif_disconnect(netif_t *netif)
 {
-       if (netif_carrier_ok(netif->dev)) {
+       if (netback_carrier_ok(netif)) {
                rtnl_lock();
-               netif_carrier_off(netif->dev);
+               netback_carrier_off(netif);
+               netif_carrier_off(netif->dev); /* discard queued packets */
                if (netif_running(netif->dev))
                        __netif_down(netif);
                rtnl_unlock();
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/netback/netback.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/netback.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/netback.c        Thu Mar 08 14:39:52 2007 -0600
@@ -38,7 +38,10 @@
 #include <xen/balloon.h>
 #include <xen/interface/memory.h>
 
-/*#define NETBE_DEBUG_INTERRUPT*/
+/*define NETBE_DEBUG_INTERRUPT*/
+
+/* extra field used in struct page */
+#define netif_page_index(pg) (*(long *)&(pg)->mapping)
 
 struct netbk_rx_meta {
        skb_frag_t frag;
@@ -231,7 +234,7 @@ static void tx_queue_callback(unsigned l
 static void tx_queue_callback(unsigned long data)
 {
        netif_t *netif = (netif_t *)data;
-       if (netif_schedulable(netif->dev))
+       if (netif_schedulable(netif))
                netif_wake_queue(netif->dev);
 }
 
@@ -242,7 +245,7 @@ int netif_be_start_xmit(struct sk_buff *
        BUG_ON(skb->dev != dev);
 
        /* Drop the packet if the target domain has no receive buffers. */
-       if (unlikely(!netif_schedulable(dev) || netbk_queue_full(netif)))
+       if (unlikely(!netif_schedulable(netif) || netbk_queue_full(netif)))
                goto drop;
 
        /*
@@ -352,7 +355,7 @@ static u16 netbk_gop_frag(netif_t *netif
                copy_gop->flags = GNTCOPY_dest_gref;
                if (PageForeign(page)) {
                        struct pending_tx_info *src_pend =
-                               &pending_tx_info[page->index];
+                               &pending_tx_info[netif_page_index(page)];
                        copy_gop->source.domid = src_pend->netif->domid;
                        copy_gop->source.u.ref = src_pend->req.gref;
                        copy_gop->flags |= GNTCOPY_source_gref;
@@ -681,7 +684,7 @@ static void net_rx_action(unsigned long 
                }
 
                if (netif_queue_stopped(netif->dev) &&
-                   netif_schedulable(netif->dev) &&
+                   netif_schedulable(netif) &&
                    !netbk_queue_full(netif))
                        netif_wake_queue(netif->dev);
 
@@ -739,7 +742,7 @@ static void add_to_net_schedule_list_tai
 
        spin_lock_irq(&net_schedule_list_lock);
        if (!__on_net_schedule_list(netif) &&
-           likely(netif_schedulable(netif->dev))) {
+           likely(netif_schedulable(netif))) {
                list_add_tail(&netif->list, &net_schedule_list);
                netif_get(netif);
        }
@@ -1327,7 +1330,7 @@ static void netif_page_release(struct pa
        /* Ready for next use. */
        init_page_count(page);
 
-       netif_idx_release(page->index);
+       netif_idx_release(netif_page_index(page));
 }
 
 irqreturn_t netif_be_int(int irq, void *dev_id, struct pt_regs *regs)
@@ -1337,7 +1340,7 @@ irqreturn_t netif_be_int(int irq, void *
        add_to_net_schedule_list_tail(netif);
        maybe_schedule_tx_action();
 
-       if (netif_schedulable(netif->dev) && !netbk_queue_full(netif))
+       if (netif_schedulable(netif) && !netbk_queue_full(netif))
                netif_wake_queue(netif->dev);
 
        return IRQ_HANDLED;
@@ -1457,7 +1460,7 @@ static int __init netback_init(void)
        for (i = 0; i < MAX_PENDING_REQS; i++) {
                page = mmap_pages[i];
                SetPageForeign(page, netif_page_release);
-               page->index = i;
+               netif_page_index(page) = i;
        }
 
        pending_cons = 0;
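Note the field swap at work here: the new netif_page_index() macro stashes
the pending-ring index in page->mapping, while the page-flags.h hunk later
in this changeset makes the complementary move, relocating the foreign-page
destructor pointer from page->mapping into page->index. In isolation the
aliasing trick looks like this (the accessor functions are illustrative
only; the macro is the one defined above):

    /* netback owns these foreign pages outright, so ->mapping is free to
     * carry a small integer while the page is out on loan to the guest. */
    static void tag_pending_page(struct page *pg, long idx)
    {
            netif_page_index(pg) = idx;     /* set once at init time ... */
    }

    static long pending_index_of(struct page *pg)
    {
            return netif_page_index(pg);    /* ... read back on release */
    }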
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c
--- a/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/netback/xenbus.c Thu Mar 08 14:39:52 2007 -0600
@@ -338,9 +338,7 @@ static void connect(struct backend_info 
 
        xenbus_switch_state(dev, XenbusStateConnected);
 
-       /* May not get a kick from the frontend, so start the tx_queue now. */
-       if (!netbk_can_queue(be->netif->dev))
-               netif_wake_queue(be->netif->dev);
+       netif_wake_queue(be->netif->dev);
 }
 
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c
--- a/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/netfront/netfront.c      Thu Mar 08 14:39:52 2007 -0600
@@ -154,6 +154,7 @@ struct netfront_info {
 
        unsigned int irq;
        unsigned int copying_receiver;
+       unsigned int carrier;
 
        /* Receive-ring batched refills. */
 #define RX_MIN_TARGET 8
@@ -191,6 +192,15 @@ struct netfront_rx_info {
        struct netif_rx_response rx;
        struct netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX - 1];
 };
+
+/*
+ * Implement our own carrier flag: the network stack's version causes delays
+ * when the carrier is re-enabled (in particular, dev_activate() may not
+ * immediately be called, which can cause packet loss).
+ */
+#define netfront_carrier_on(netif)     ((netif)->carrier = 1)
+#define netfront_carrier_off(netif)    ((netif)->carrier = 0)
+#define netfront_carrier_ok(netif)     ((netif)->carrier)
 
 /*
  * Access macros for acquiring freeing slots in tx_skbs[].
@@ -590,6 +600,22 @@ static int send_fake_arp(struct net_devi
        return dev_queue_xmit(skb);
 }
 
+static inline int netfront_tx_slot_available(struct netfront_info *np)
+{
+       return ((np->tx.req_prod_pvt - np->tx.rsp_cons) <
+               (TX_MAX_TARGET - MAX_SKB_FRAGS - 2));
+}
+
+static inline void network_maybe_wake_tx(struct net_device *dev)
+{
+       struct netfront_info *np = netdev_priv(dev);
+
+       if (unlikely(netif_queue_stopped(dev)) &&
+           netfront_tx_slot_available(np) &&
+           likely(netif_running(dev)))
+               netif_wake_queue(dev);
+}
+
 static int network_open(struct net_device *dev)
 {
        struct netfront_info *np = netdev_priv(dev);
@@ -597,7 +623,7 @@ static int network_open(struct net_devic
        memset(&np->stats, 0, sizeof(np->stats));
 
        spin_lock(&np->rx_lock);
-       if (netif_carrier_ok(dev)) {
+       if (netfront_carrier_ok(np)) {
                network_alloc_rx_buffers(dev);
                np->rx.sring->rsp_event = np->rx.rsp_cons + 1;
                if (RING_HAS_UNCONSUMED_RESPONSES(&np->rx))
@@ -605,25 +631,9 @@ static int network_open(struct net_devic
        }
        spin_unlock(&np->rx_lock);
 
-       netif_start_queue(dev);
+       network_maybe_wake_tx(dev);
 
        return 0;
-}
-
-static inline int netfront_tx_slot_available(struct netfront_info *np)
-{
-       return ((np->tx.req_prod_pvt - np->tx.rsp_cons) <
-               (TX_MAX_TARGET - MAX_SKB_FRAGS - 2));
-}
-
-static inline void network_maybe_wake_tx(struct net_device *dev)
-{
-       struct netfront_info *np = netdev_priv(dev);
-
-       if (unlikely(netif_queue_stopped(dev)) &&
-           netfront_tx_slot_available(np) &&
-           likely(netif_running(dev)))
-               netif_wake_queue(dev);
 }
 
 static void network_tx_buf_gc(struct net_device *dev)
@@ -633,7 +643,7 @@ static void network_tx_buf_gc(struct net
        struct netfront_info *np = netdev_priv(dev);
        struct sk_buff *skb;
 
-       BUG_ON(!netif_carrier_ok(dev));
+       BUG_ON(!netfront_carrier_ok(np));
 
        do {
                prod = np->tx.sring->rsp_prod;
@@ -703,7 +713,7 @@ static void network_alloc_rx_buffers(str
        int nr_flips;
        netif_rx_request_t *req;
 
-       if (unlikely(!netif_carrier_ok(dev)))
+       if (unlikely(!netfront_carrier_ok(np)))
                return;
 
        /*
@@ -934,7 +944,7 @@ static int network_start_xmit(struct sk_
 
        spin_lock_irq(&np->tx_lock);
 
-       if (unlikely(!netif_carrier_ok(dev) ||
+       if (unlikely(!netfront_carrier_ok(np) ||
                     (frags > 1 && !xennet_can_sg(dev)) ||
                     netif_needs_gso(dev, skb))) {
                spin_unlock_irq(&np->tx_lock);
@@ -1024,7 +1034,7 @@ static irqreturn_t netif_int(int irq, vo
 
        spin_lock_irqsave(&np->tx_lock, flags);
 
-       if (likely(netif_carrier_ok(dev))) {
+       if (likely(netfront_carrier_ok(np))) {
                network_tx_buf_gc(dev);
                /* Under tx_lock: protects access to rx shared-ring indexes. */
                if (RING_HAS_UNCONSUMED_RESPONSES(&np->rx))
@@ -1299,7 +1309,7 @@ static int netif_poll(struct net_device 
 
        spin_lock(&np->rx_lock);
 
-       if (unlikely(!netif_carrier_ok(dev))) {
+       if (unlikely(!netfront_carrier_ok(np))) {
                spin_unlock(&np->rx_lock);
                return 0;
        }
@@ -1317,7 +1327,7 @@ static int netif_poll(struct net_device 
        work_done = 0;
        while ((i != rp) && (work_done < budget)) {
                memcpy(rx, RING_GET_RESPONSE(&np->rx, i), sizeof(*rx));
-               memset(extras, 0, sizeof(extras));
+               memset(extras, 0, sizeof(rinfo.extras));
 
                err = xennet_get_responses(np, &rinfo, rp, &tmpq,
                                           &pages_flipped);
@@ -1744,7 +1754,7 @@ static int network_connect(struct net_de
         * domain a kick because we've probably just requeued some
         * packets.
         */
-       netif_carrier_on(dev);
+       netfront_carrier_on(np);
        notify_remote_via_irq(np->irq);
        network_tx_buf_gc(dev);
        network_alloc_rx_buffers(dev);
@@ -1989,7 +1999,7 @@ static struct net_device * __devinit cre
 
        np->netdev = netdev;
 
-       netif_carrier_off(netdev);
+       netfront_carrier_off(np);
 
        return netdev;
 
@@ -2023,7 +2033,7 @@ static void netif_disconnect_backend(str
        /* Stop old i/f to prevent errors whilst we rebuild the state. */
        spin_lock_irq(&info->tx_lock);
        spin_lock(&info->rx_lock);
-       netif_carrier_off(info->netdev);
+       netfront_carrier_off(info);
        spin_unlock(&info->rx_lock);
        spin_unlock_irq(&info->tx_lock);
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c
--- a/linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/pcifront/pci_op.c        Thu Mar 08 14:39:52 2007 -0600
@@ -239,17 +239,12 @@ static void free_root_bus_devs(struct pc
 {
        struct pci_dev *dev;
 
-       down_write(&pci_bus_sem);
        while (!list_empty(&bus->devices)) {
-               dev = container_of(bus->devices.next, struct pci_dev, bus_list);
-               up_write(&pci_bus_sem);
-
+               dev = container_of(bus->devices.next, struct pci_dev,
+                                  bus_list);
                dev_dbg(&dev->dev, "removing device\n");
                pci_remove_bus_device(dev);
-
-               down_write(&pci_bus_sem);
-       }
-       up_write(&pci_bus_sem);
+       }
 }
 
 void pcifront_free_roots(struct pcifront_device *pdev)
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c
--- a/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/privcmd/privcmd.c        Thu Mar 08 14:39:52 2007 -0600
@@ -248,7 +248,7 @@ static int privcmd_enforce_singleshot_ma
 }
 #endif
 
-static struct file_operations privcmd_file_ops = {
+static const struct file_operations privcmd_file_ops = {
        .ioctl = privcmd_ioctl,
        .mmap  = privcmd_mmap,
 };
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/tpmback/tpmback.c
--- a/linux-2.6-xen-sparse/drivers/xen/tpmback/tpmback.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/tpmback/tpmback.c        Thu Mar 08 14:39:52 2007 -0600
@@ -629,7 +629,7 @@ static unsigned int vtpm_op_poll(struct 
        return flags;
 }
 
-static struct file_operations vtpm_ops = {
+static const struct file_operations vtpm_ops = {
        .owner = THIS_MODULE,
        .llseek = no_llseek,
        .open = vtpm_op_open,
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/drivers/xen/xenbus/xenbus_dev.c
--- a/linux-2.6-xen-sparse/drivers/xen/xenbus/xenbus_dev.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/drivers/xen/xenbus/xenbus_dev.c      Thu Mar 08 14:39:52 2007 -0600
@@ -173,17 +173,22 @@ static ssize_t xenbus_dev_write(struct f
        void *reply;
        char *path, *token;
        struct watch_adapter *watch, *tmp_watch;
-       int err;
-
-       if ((len + u->len) > sizeof(u->u.buffer))
-               return -EINVAL;
-
-       if (copy_from_user(u->u.buffer + u->len, ubuf, len) != 0)
-               return -EFAULT;
+       int err, rc = len;
+
+       if ((len + u->len) > sizeof(u->u.buffer)) {
+               rc = -EINVAL;
+               goto out;
+       }
+
+       if (copy_from_user(u->u.buffer + u->len, ubuf, len) != 0) {
+               rc = -EFAULT;
+               goto out;
+       }
 
        u->len += len;
-       if (u->len < (sizeof(u->u.msg) + u->u.msg.len))
-               return len;
+       if ((u->len < sizeof(u->u.msg)) ||
+           (u->len < (sizeof(u->u.msg) + u->u.msg.len)))
+               return rc;
 
        msg_type = u->u.msg.type;
 
@@ -201,14 +206,17 @@ static ssize_t xenbus_dev_write(struct f
        case XS_SET_PERMS:
                if (msg_type == XS_TRANSACTION_START) {
                        trans = kmalloc(sizeof(*trans), GFP_KERNEL);
-                       if (!trans)
-                               return -ENOMEM;
+                       if (!trans) {
+                               rc = -ENOMEM;
+                               goto out;
+                       }
                }
 
                reply = xenbus_dev_request_and_reply(&u->u.msg);
                if (IS_ERR(reply)) {
                        kfree(trans);
-                       return PTR_ERR(reply);
+                       rc = PTR_ERR(reply);
+                       goto out;
                }
 
                if (msg_type == XS_TRANSACTION_START) {
@@ -231,8 +239,10 @@ static ssize_t xenbus_dev_write(struct f
        case XS_UNWATCH:
                path = u->u.buffer + sizeof(u->u.msg);
                token = memchr(path, 0, u->u.msg.len);
-               if (token == NULL)
-                       return -EILSEQ;
+               if (token == NULL) {
+                       rc = -EILSEQ;
+                       goto out;
+               }
                token++;
 
                if (msg_type == XS_WATCH) {
@@ -251,7 +261,8 @@ static ssize_t xenbus_dev_write(struct f
                        err = register_xenbus_watch(&watch->watch);
                        if (err) {
                                free_watch_adapter(watch);
-                               return err;
+                               rc = err;
+                               goto out;
                        }
                        
                        list_add(&watch->list, &u->watches);
@@ -265,7 +276,6 @@ static ssize_t xenbus_dev_write(struct f
                                                  &u->watches, list) {
                                if (!strcmp(watch->token, token) &&
                                    !strcmp(watch->watch.node, path))
-                                       break;
                                {
                                        unregister_xenbus_watch(&watch->watch);
                                        list_del(&watch->list);
@@ -278,11 +288,13 @@ static ssize_t xenbus_dev_write(struct f
                break;
 
        default:
-               return -EINVAL;
-       }
-
+               rc = -EINVAL;
+               break;
+       }
+
+ out:
        u->len = 0;
-       return len;
+       return rc;
 }
 
 static int xenbus_dev_open(struct inode *inode, struct file *filp)
@@ -342,7 +354,7 @@ static unsigned int xenbus_dev_poll(stru
        return 0;
 }
 
-static struct file_operations xenbus_dev_file_ops = {
+static const struct file_operations xenbus_dev_file_ops = {
        .read = xenbus_dev_read,
        .write = xenbus_dev_write,
        .open = xenbus_dev_open,
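The xenbus_dev_write() hunks convert every failure return into "rc = ...;
goto out;" so that u->len is reset on exactly the paths that should discard
the partially buffered message. The one early return that survives is
deliberate: an incomplete message must stay buffered for the next write().
The resulting control shape, sketched with hypothetical helpers
(expected_len() and handle_message() stand in for the real dispatch logic):

    static ssize_t buffered_write(struct adapter *u, const char __user *ubuf,
                                  size_t len)
    {
            ssize_t rc = len;            /* success: all bytes consumed */
            int err;

            if (len + u->len > sizeof(u->buffer)) {
                    rc = -EINVAL;
                    goto out;
            }
            if (copy_from_user(u->buffer + u->len, ubuf, len)) {
                    rc = -EFAULT;
                    goto out;
            }

            u->len += len;
            if (u->len < expected_len(u))
                    return rc;           /* partial message: keep the buffer */

            err = handle_message(u);
            if (err)
                    rc = err;
    out:
            u->len = 0;                  /* consumed or rejected: reset */
            return rc;
    }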
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/page.h Thu Mar 08 14:39:52 2007 -0600
@@ -20,6 +20,14 @@
 #define LARGE_PAGE_SIZE (1UL << PMD_SHIFT)
 
 #ifdef __KERNEL__
+
+/*
+ * Need to repeat this here in order to not include pgtable.h (which in turn
+ * depends on definitions made here), but to be able to use the symbolic
+ * below. The preprocessor will warn if the two definitions aren't identical.
+ */
+#define _PAGE_PRESENT  0x001
+
 #ifndef __ASSEMBLY__
 
 #include <linux/string.h>
@@ -28,13 +36,6 @@
 #include <asm/bug.h>
 #include <xen/interface/xen.h>
 #include <xen/features.h>
-
-/*
- * Need to repeat this here in order to not include pgtable.h (which in turn
- * depends on definitions made here), but to be able to use the symbolic
- * below. The preprocessor will warn if the two definitions aren't identical.
- */
-#define _PAGE_PRESENT  0x001
 
 #define arch_free_page(_page,_order)           \
 ({     int foreign = PageForeign(_page);       \
@@ -225,8 +226,6 @@ extern int page_is_ram(unsigned long pag
        ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
                 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
-#define __HAVE_ARCH_GATE_AREA 1
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/page.h>
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/system.h
--- a/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/system.h       Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/system.h       Thu Mar 08 14:39:52 2007 -0600
@@ -137,7 +137,7 @@ __asm__ __volatile__ ("movw %%dx,%1\n\t"
 })
 
 #define write_cr4(x) \
-       __asm__ __volatile__("movl %0,%%cr4": :"r" (x));
+       __asm__ __volatile__("movl %0,%%cr4": :"r" (x))
 
 /*
  * Clear and set 'TS' bit respectively
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/fixmap.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/fixmap.h     Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/fixmap.h     Thu Mar 08 14:39:52 2007 -0600
@@ -53,6 +53,11 @@ enum fixed_addresses {
 #define NR_FIX_ISAMAPS 256
        FIX_ISAMAP_END,
        FIX_ISAMAP_BEGIN = FIX_ISAMAP_END + NR_FIX_ISAMAPS - 1,
+       __end_of_permanent_fixed_addresses,
+       /* temporary boot-time mappings, used before ioremap() is functional */
+#define NR_FIX_BTMAPS  16
+       FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
+       FIX_BTMAP_BEGIN = FIX_BTMAP_END + NR_FIX_BTMAPS - 1,
        __end_of_fixed_addresses
 };
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/io.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/io.h Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/io.h Thu Mar 08 14:39:52 2007 -0600
@@ -150,8 +150,10 @@ static inline void __iomem * ioremap (un
        return __ioremap(offset, size, 0);
 }
 
-extern void *early_ioremap(unsigned long addr, unsigned long size);
-extern void early_iounmap(void *addr, unsigned long size);
+extern void *bt_ioremap(unsigned long addr, unsigned long size);
+extern void bt_iounmap(void *addr, unsigned long size);
+#define early_ioremap bt_ioremap
+#define early_iounmap bt_iounmap
 
 /*
  * This one maps high address device memory and turns off caching for that area.
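The fixmap.h hunk above reserves NR_FIX_BTMAPS slots for temporary boot-time
mappings, and this io.h hunk aliases early_ioremap()/early_iounmap() to the
bt_ioremap()/bt_iounmap() pair that consumes them. A hedged usage sketch
(the function and the physical address are illustrative, not from the tree):

    /* Map a table before the full ioremap() machinery is up, inspect it,
     * then release the boot-time slot again. */
    static void __init peek_boot_table(unsigned long phys)
    {
            void *va = bt_ioremap(phys, PAGE_SIZE);
            if (va) {
                    /* ... read the mapped bytes ... */
                    bt_iounmap(va, PAGE_SIZE);
            }
    }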
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/pgtable.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/pgtable.h    Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/pgtable.h    Thu Mar 08 14:39:52 2007 -0600
@@ -403,19 +403,6 @@ static inline int pmd_large(pmd_t pte) {
 /* to find an entry in a page-table-directory. */
 #define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
 #define pud_offset(pgd, address) ((pud_t *) pgd_page(*(pgd)) + pud_index(address))
-static inline pud_t *__pud_offset_k(pud_t *pud, unsigned long address)
-{ 
-       return pud + pud_index(address);
-} 
-
-/* Find correct pud via the hidden fourth level page level: */
-
-/* This accesses the reference page table of the boot cpu. 
-   Other CPUs get synced lazily via the page fault handler. */
-static inline pud_t *pud_offset_k(pgd_t *pgd, unsigned long address)
-{
-       return pud_offset(pgd_offset_k(address), address);
-}
 
 /* PMD  - Level 2 access */
 #define pmd_page_kernel(pmd) ((unsigned long) __va(pmd_val(pmd) & PTE_MASK))
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/processor.h
--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/processor.h  Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/processor.h  Thu Mar 08 14:39:52 2007 -0600
@@ -246,11 +246,13 @@ DECLARE_PER_CPU(struct tss_struct,init_t
 
 
 extern struct cpuinfo_x86 boot_cpu_data;
+#ifndef CONFIG_X86_NO_TSS
 /* Save the original ist values for checking stack pointers during debugging */
 struct orig_ist {
        unsigned long ist[7];
 };
 DECLARE_PER_CPU(struct orig_ist, orig_ist);
+#endif
 
 #ifdef CONFIG_X86_VSMP
 #define ARCH_MIN_TASKALIGN     (1 << INTERNODE_CACHE_SHIFT)
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/linux/page-flags.h
--- a/linux-2.6-xen-sparse/include/linux/page-flags.h   Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/linux/page-flags.h   Thu Mar 08 14:39:52 2007 -0600
@@ -252,14 +252,14 @@
 #define PageForeign(page)      test_bit(PG_foreign, &(page)->flags)
 #define SetPageForeign(page, dtor) do {                \
        set_bit(PG_foreign, &(page)->flags);    \
-       (page)->mapping = (void *)dtor;         \
+       (page)->index = (long)(dtor);           \
 } while (0)
 #define ClearPageForeign(page) do {            \
        clear_bit(PG_foreign, &(page)->flags);  \
-       (page)->mapping = NULL;                 \
+       (page)->index = 0;                      \
 } while (0)
 #define PageForeignDestructor(page)            \
-       ( (void (*) (struct page *)) (page)->mapping )(page)
+       ( (void (*) (struct page *)) (page)->index )(page)
 
 struct page;   /* forward declaration */
 
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/include/xen/cpu_hotplug.h
--- a/linux-2.6-xen-sparse/include/xen/cpu_hotplug.h    Mon Mar 05 12:49:12 2007 -0600
+++ b/linux-2.6-xen-sparse/include/xen/cpu_hotplug.h    Thu Mar 08 14:39:52 2007 -0600
@@ -4,7 +4,7 @@
 #include <linux/kernel.h>
 #include <linux/cpumask.h>
 
-#if defined(CONFIG_X86)
+#if defined(CONFIG_X86) && defined(CONFIG_SMP)
 extern cpumask_t cpu_initialized_map;
 #define cpu_set_initialized(cpu) cpu_set(cpu, cpu_initialized_map)
 #else
diff -r 8f0b5295bb1b -r dcec453681bc linux-2.6-xen-sparse/mm/Kconfig
--- a/linux-2.6-xen-sparse/mm/Kconfig   Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,157 +0,0 @@
-config SELECT_MEMORY_MODEL
-       def_bool y
-       depends on EXPERIMENTAL || ARCH_SELECT_MEMORY_MODEL
-
-choice
-       prompt "Memory model"
-       depends on SELECT_MEMORY_MODEL
-       default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
-       default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
-       default FLATMEM_MANUAL
-
-config FLATMEM_MANUAL
-       bool "Flat Memory"
-       depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || ARCH_FLATMEM_ENABLE
-       help
-         This option allows you to change some of the ways that
-         Linux manages its memory internally.  Most users will
-         only have one option here: FLATMEM.  This is normal
-         and a correct option.
-
-         Some users of more advanced features like NUMA and
-         memory hotplug may have different options here.
-         DISCONTIGMEM is a more mature, better tested system,
-         but is incompatible with memory hotplug and may suffer
-         decreased performance over SPARSEMEM.  If unsure between
-         "Sparse Memory" and "Discontiguous Memory", choose
-         "Discontiguous Memory".
-
-         If unsure, choose this option (Flat Memory) over any other.
-
-config DISCONTIGMEM_MANUAL
-       bool "Discontiguous Memory"
-       depends on ARCH_DISCONTIGMEM_ENABLE
-       help
-         This option provides enhanced support for discontiguous
-         memory systems, over FLATMEM.  These systems have holes
-         in their physical address spaces, and this option provides
-         more efficient handling of these holes.  However, the vast
-         majority of hardware has quite flat address spaces, and
-         can have degraded performance from extra overhead that
-         this option imposes.
-
-         Many NUMA configurations will have this as the only option.
-
-         If unsure, choose "Flat Memory" over this option.
-
-config SPARSEMEM_MANUAL
-       bool "Sparse Memory"
-       depends on ARCH_SPARSEMEM_ENABLE
-       help
-         This will be the only option for some systems, including
-         memory hotplug systems.  This is normal.
-
-         For many other systems, this will be an alternative to
-         "Discontiguous Memory".  This option provides some potential
-         performance benefits, along with decreased code complexity,
-         but it is newer, and more experimental.
-
-         If unsure, choose "Discontiguous Memory" or "Flat Memory"
-         over this option.
-
-endchoice
-
-config DISCONTIGMEM
-       def_bool y
-       depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
-
-config SPARSEMEM
-       def_bool y
-       depends on SPARSEMEM_MANUAL
-
-config FLATMEM
-       def_bool y
-       depends on (!DISCONTIGMEM && !SPARSEMEM) || FLATMEM_MANUAL
-
-config FLAT_NODE_MEM_MAP
-       def_bool y
-       depends on !SPARSEMEM
-
-#
-# Both the NUMA code and DISCONTIGMEM use arrays of pg_data_t's
-# to represent different areas of memory.  This variable allows
-# those dependencies to exist individually.
-#
-config NEED_MULTIPLE_NODES
-       def_bool y
-       depends on DISCONTIGMEM || NUMA
-
-config HAVE_MEMORY_PRESENT
-       def_bool y
-       depends on ARCH_HAVE_MEMORY_PRESENT || SPARSEMEM
-
-#
-# SPARSEMEM_EXTREME (which is the default) does some bootmem
-# allocations when memory_present() is called.  If this can not
-# be done on your architecture, select this option.  However,
-# statically allocating the mem_section[] array can potentially
-# consume vast quantities of .bss, so be careful.
-#
-# This option will also potentially produce smaller runtime code
-# with gcc 3.4 and later.
-#
-config SPARSEMEM_STATIC
-       def_bool n
-
-#
-# Architecture platforms which require a two level mem_section in SPARSEMEM
-# must select this option. This is usually for architecture platforms with
-# an extremely sparse physical address space.
-#
-config SPARSEMEM_EXTREME
-       def_bool y
-       depends on SPARSEMEM && !SPARSEMEM_STATIC
-
-# eventually, we can have this option just 'select SPARSEMEM'
-config MEMORY_HOTPLUG
-       bool "Allow for memory hot-add"
-       depends on SPARSEMEM && HOTPLUG && !SOFTWARE_SUSPEND && ARCH_ENABLE_MEMORY_HOTPLUG
-       depends on (IA64 || X86 || PPC64)
-
-comment "Memory hotplug is currently incompatible with Software Suspend"
-       depends on SPARSEMEM && HOTPLUG && SOFTWARE_SUSPEND
-
-# Heavily threaded applications may benefit from splitting the mm-wide
-# page_table_lock, so that faults on different parts of the user address
-# space can be handled with less contention: split it at this NR_CPUS.
-# Default to 4 for wider testing, though 8 might be more appropriate.
-# ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
-# PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
-# XEN on x86 architecture uses the mapping field on pagetable pages to store a
-# pointer to the destructor. This conflicts with pte_lock_deinit().
-#
-config SPLIT_PTLOCK_CPUS
-       int
-       default "4096" if ARM && !CPU_CACHE_VIPT
-       default "4096" if PARISC && !PA20
-       default "4096" if X86_XEN || X86_64_XEN
-       default "4"
-
-#
-# support for page migration
-#
-config MIGRATION
-       bool "Page migration"
-       def_bool y
-       depends on NUMA
-       help
-         Allows the migration of the physical location of pages of processes
-         while the virtual addresses are not changed. This is useful for
-         example on NUMA systems to put pages nearer to the processors accessing
-         the page.
-
-config RESOURCES_64BIT
-       bool "64 bit Memory and IO resources (EXPERIMENTAL)" if (!64BIT && EXPERIMENTAL)
-       default 64BIT
-       help
-         This option allows memory and IO resources to be 64 bit.
diff -r 8f0b5295bb1b -r dcec453681bc patches/linux-2.6.18/blktap-aio-16_03_06.patch
--- a/patches/linux-2.6.18/blktap-aio-16_03_06.patch    Mon Mar 05 12:49:12 2007 -0600
+++ b/patches/linux-2.6.18/blktap-aio-16_03_06.patch    Thu Mar 08 14:39:52 2007 -0600
@@ -106,7 +106,7 @@ diff -pruN ../orig-linux-2.6.18/fs/aio.c
 +      return pollflags;
 +}
 +
-+static struct file_operations aioq_fops = {
++static const struct file_operations aioq_fops = {
 +      .release        = aio_queue_fd_close,
 +      .poll           = aio_queue_fd_poll
 +};
@@ -201,7 +201,7 @@ diff -pruN ../orig-linux-2.6.18/fs/event
                   int maxevents, long timeout);
  static int eventpollfs_delete_dentry(struct dentry *dentry);
 -static struct inode *ep_eventpoll_inode(void);
-+static struct inode *ep_eventpoll_inode(struct file_operations *fops);
++static struct inode *ep_eventpoll_inode(const struct file_operations *fops);
  static int eventpollfs_get_sb(struct file_system_type *fs_type,
                              int flags, const char *dev_name,
                              void *data, struct vfsmount *mnt);
@@ -221,7 +221,7 @@ diff -pruN ../orig-linux-2.6.18/fs/event
 -static int ep_getfd(int *efd, struct inode **einode, struct file **efile,
 -                  struct eventpoll *ep)
 +int ep_getfd(int *efd, struct inode **einode, struct file **efile,
-+                  struct eventpoll *ep, struct file_operations *fops)
++                  struct eventpoll *ep, const struct file_operations *fops)
  {
        struct qstr this;
        char name[32];
@@ -248,7 +248,7 @@ diff -pruN ../orig-linux-2.6.18/fs/event
  
  
 -static struct inode *ep_eventpoll_inode(void)
-+static struct inode *ep_eventpoll_inode(struct file_operations *fops)
++static struct inode *ep_eventpoll_inode(const struct file_operations *fops)
  {
        int error = -ENOMEM;
        struct inode *inode = new_inode(eventpoll_mnt->mnt_sb);
@@ -288,7 +288,7 @@ diff -pruN ../orig-linux-2.6.18/include/
 + */
 +struct eventpoll;
 +int ep_getfd(int *efd, struct inode **einode, struct file **efile,
-+             struct eventpoll *ep, struct file_operations *fops);
++             struct eventpoll *ep, const struct file_operations *fops);
  #else
  
  static inline void eventpoll_init_file(struct file *file) {}
diff -r 8f0b5295bb1b -r dcec453681bc tools/Makefile
--- a/tools/Makefile    Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/Makefile    Thu Mar 08 14:39:52 2007 -0600
@@ -24,9 +24,8 @@ SUBDIRS-$(LIBXENAPI_BINDINGS) += libxen
 
 # These don't cross-compile
 ifeq ($(XEN_COMPILE_ARCH),$(XEN_TARGET_ARCH))
-SUBDIRS-y += python
-SUBDIRS-y += pygrub
-SUBDIRS-y += ptsname
+SUBDIRS-$(PYTHON_TOOLS) += python
+SUBDIRS-$(PYTHON_TOOLS) += pygrub
 endif
 
 .PHONY: all
@@ -42,8 +41,8 @@ install: check
                $(MAKE) -C $$subdir $@; \
        done
        $(MAKE) ioemuinstall
-       $(INSTALL_DIR) -p $(DESTDIR)/var/xen/dump
-       $(INSTALL_DIR) -p $(DESTDIR)/var/log/xen
+       $(INSTALL_DIR) $(DESTDIR)/var/xen/dump
+       $(INSTALL_DIR) $(DESTDIR)/var/log/xen
 
 .PHONY: clean
 clean: check_clean
diff -r 8f0b5295bb1b -r dcec453681bc tools/blktap/lib/Makefile
--- a/tools/blktap/lib/Makefile Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/blktap/lib/Makefile Thu Mar 08 14:39:52 2007 -0600
@@ -40,8 +40,8 @@ libblktap: libblktap.a
 libblktap: libblktap.a
 
 install: all
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/$(LIBDIR)
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/include
+       $(INSTALL_DIR) $(DESTDIR)/usr/$(LIBDIR)
+       $(INSTALL_DIR) $(DESTDIR)/usr/include
        $(INSTALL_PROG) $(LIB) $(DESTDIR)/usr/$(LIBDIR)
        ln -sf libblktap.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)/libblktap.so.$(MAJOR)
        ln -sf libblktap.so.$(MAJOR) $(DESTDIR)/usr/$(LIBDIR)/libblktap.so
diff -r 8f0b5295bb1b -r dcec453681bc tools/console/Makefile
--- a/tools/console/Makefile    Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/console/Makefile    Thu Mar 08 14:39:52 2007 -0600
@@ -30,7 +30,7 @@ xenconsole: $(patsubst %.c,%.o,$(wildcar
 
 .PHONY: install
 install: $(BIN)
-       $(INSTALL_DIR) -p $(DESTDIR)/$(DAEMON_INSTALL_DIR)
+       $(INSTALL_DIR) $(DESTDIR)/$(DAEMON_INSTALL_DIR)
        $(INSTALL_PROG) xenconsoled $(DESTDIR)/$(DAEMON_INSTALL_DIR)
-       $(INSTALL_DIR) -p $(DESTDIR)/$(CLIENT_INSTALL_DIR)
+       $(INSTALL_DIR) $(DESTDIR)/$(CLIENT_INSTALL_DIR)
        $(INSTALL_PROG) xenconsole $(DESTDIR)/$(CLIENT_INSTALL_DIR)
diff -r 8f0b5295bb1b -r dcec453681bc tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/firmware/rombios/rombios.c  Thu Mar 08 14:39:52 2007 -0600
@@ -890,7 +890,7 @@ static void           int15_function();
 static void           int15_function();
 static void           int16_function();
 static void           int17_function();
-static void           int19_function();
+static void           int18_function();
 static void           int1a_function();
 static void           int70_function();
 static void           int74_function();
@@ -1837,6 +1837,38 @@ keyboard_panic(status)
 }
 
 //--------------------------------------------------------------------------
+// machine_reset
+//--------------------------------------------------------------------------
+  void
+machine_reset()
+{
+  /* Frob the keyboard reset line to reset the processor */
+  outb(0x64, 0x60); /* Map the flags register at data port (0x60) */
+  outb(0x60, 0x14); /* Set the flags to system|disable */
+  outb(0x64, 0xfe); /* Pulse output 0 (system reset) low */
+  BX_PANIC("Couldn't reset the machine\n");
+}
+
+//--------------------------------------------------------------------------
+// clobber_entry_point
+//    Because PV drivers in HVM guests detach some of the emulated devices, 
+//    it is not safe to do a soft reboot by just dropping to real mode and
+//    jumping at ffff:0000 -- the boot drives might have disappeared!
+//    This rather foul function overwrites(!) the BIOS entry point 
+//    to point at machine-reset, which will cause the Xen tools to
+//    rebuild the whole machine from scratch.
+//--------------------------------------------------------------------------
+  void 
+clobber_entry_point() 
+{
+    /* The instruction at the entry point is one byte (0xea) for the
+     * jump opcode, then two bytes of address, then two of segment. 
+     * Overwrite the address bytes.*/
+    write_word(0xffff, 0x0001, machine_reset); 
+}
+
+
+//--------------------------------------------------------------------------
 // shutdown_status_panic
 //   called when the shutdown status is not implemented, displays the status
 //--------------------------------------------------------------------------
@@ -7626,7 +7658,7 @@ int17_function(regs, ds, iret_addr)
 }
 
 void
-int19_function(seq_nr)
+int18_function(seq_nr)
 Bit16u seq_nr;
 {
   Bit16u ebda_seg=read_word(0x0040,0x000E);
@@ -7702,8 +7734,8 @@ ASM_START
     push cx
     push dx
 
-    mov  dl, _int19_function.bootdrv + 2[bp]
-    mov  ax, _int19_function.bootseg + 2[bp]
+    mov  dl, _int18_function.bootdrv + 2[bp]
+    mov  ax, _int18_function.bootseg + 2[bp]
     mov  es, ax         ;; segment
     mov  bx, #0x0000    ;; offset
     mov  ah, #0x02      ;; function 2, read diskette sector
@@ -7714,7 +7746,7 @@ ASM_START
     int  #0x13          ;; read sector
     jnc  int19_load_done
     mov  ax, #0x0001
-    mov  _int19_function.status + 2[bp], ax
+    mov  _int18_function.status + 2[bp], ax
 
 int19_load_done:
     pop  dx
@@ -7789,13 +7821,13 @@ ASM_START
     ;; Build an iret stack frame that will take us to the boot vector.
     ;; iret pops ip, then cs, then flags, so push them in the opposite order.
     pushf
-    mov  ax, _int19_function.bootseg + 0[bp] 
+    mov  ax, _int18_function.bootseg + 0[bp] 
     push ax
-    mov  ax, _int19_function.bootip + 0[bp] 
+    mov  ax, _int18_function.bootip + 0[bp] 
     push ax
     ;; Set the magic number in ax and the boot drive in dl.
     mov  ax, #0xaa55
-    mov  dl, _int19_function.bootdrv + 0[bp]
+    mov  dl, _int18_function.bootdrv + 0[bp]
     ;; Zero some of the other registers.
     xor  bx, bx
     mov  ds, bx
@@ -8272,6 +8304,8 @@ int18_handler: ;; Boot Failure recovery:
   mov  ss, ax
 
   ;; Get the boot sequence number out of the IPL memory
+  ;; The first time we do this it will have been set to -1 so 
+  ;; we will start from device 0.
   mov  bx, #IPL_SEG 
   mov  ds, bx                     ;; Set segment
   mov  bx, IPL_SEQUENCE_OFFSET    ;; BX is now the sequence number
@@ -8279,43 +8313,33 @@ int18_handler: ;; Boot Failure recovery:
   mov  IPL_SEQUENCE_OFFSET, bx    ;; Write it back
   mov  ds, ax                     ;; and reset the segment to zero. 
 
-  ;; Carry on in the INT 19h handler, using the new sequence number
+  ;; Call the C code for the next boot device
   push bx
-
-  jmp  int19_next_boot
+  call _int18_function
+
+  ;; Boot failed: invoke the boot recovery function...
+  int  #0x18
 
 ;----------
 ;- INT19h -
 ;----------
 int19_relocated: ;; Boot function, relocated
-
-  ;; int19 was beginning to be really complex, so now it
-  ;; just calls a C function that does the work
-
-  push bp
-  mov  bp, sp
-  
-  ;; Reset SS and SP
+  ;;
+  ;; *** Warning: INT 19h resets the whole machine *** 
+  ;;
+  ;; Because PV drivers in HVM guests detach some of the emulated devices, 
+  ;; it is not safe to do a soft reboot by just dropping to real mode and
+  ;; invoking INT 19h -- the boot drives might have disappeared!
+  ;; If the user asks for a soft reboot, the only thing we can do is 
+  ;; reset the whole machine.  When it comes back up, the normal BIOS 
+  ;; boot sequence will start, which is more or less the required behaviour.
+  ;; 
+  ;; Reset SP and SS
   mov  ax, #0xfffe
   mov  sp, ax
   xor  ax, ax
   mov  ss, ax
-
-  ;; Start from the first boot device (0, in AX)
-  mov  bx, #IPL_SEG 
-  mov  ds, bx                     ;; Set segment to write to the IPL memory
-  mov  IPL_SEQUENCE_OFFSET, ax    ;; Save the sequence number 
-  mov  ds, ax                     ;; and reset the segment.
-
-  push ax
-
-int19_next_boot:
-
-  ;; Call the C code for the next boot device
-  call _int19_function
-
-  ;; Boot failed: invoke the boot recovery function
-  int  #0x18
+  call _machine_reset
 
 ;----------
 ;- INT1Ch -
@@ -9609,6 +9633,8 @@ normal_post:
 
   call _log_bios_start
 
+  call _clobber_entry_point
+
   ;; set all interrupts to default handler
   mov  bx, #0x0000    ;; offset index
   mov  cx, #0x0100    ;; counter (256 interrupts)
@@ -9857,8 +9883,10 @@ post_default_ints:
   call _tcpa_calling_int19h          /* specs: 8.2.3 step 1 */
   call _tcpa_add_event_separators    /* specs: 8.2.3 step 2 */
 #endif
-  int  #0x19
-  //JMP_EP(0x0064) ; INT 19h location
+
+  ;; Start the boot sequence.   See the comments in int19_relocated 
+  ;; for why we use INT 18h instead of INT 19h here.
+  int  #0x18
 
 #if BX_TCGBIOS
   call _tcpa_returned_int19h         /* specs: 8.2.3 step 3/7 */
diff -r 8f0b5295bb1b -r dcec453681bc tools/guest-headers/Makefile
--- a/tools/guest-headers/Makefile      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/guest-headers/Makefile      Thu Mar 08 14:39:52 2007 -0600
@@ -13,7 +13,7 @@ check:
 
 install-Linux:
        mkdir -p $(DESTDIR)/usr/include/xen/linux
-       install -m0644 $(linuxsparsetree)/include/xen/public/*.h $(DESTDIR)/usr/include/xen/linux
+       $(INSTALL_DATA) $(linuxsparsetree)/include/xen/public/*.h $(DESTDIR)/usr/include/xen/linux
 
 install-SunOS:
 
diff -r 8f0b5295bb1b -r dcec453681bc tools/ioemu/Makefile
--- a/tools/ioemu/Makefile      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/ioemu/Makefile      Thu Mar 08 14:39:52 2007 -0600
@@ -65,10 +65,10 @@ common  de-ch  es     fo  fr-ca  hu     
 
 install-doc: $(DOCS)
        mkdir -p "$(DESTDIR)$(docdir)"
-       $(INSTALL) -m 644 qemu-doc.html  qemu-tech.html "$(DESTDIR)$(docdir)"
+       $(INSTALL_DATA) qemu-doc.html  qemu-tech.html "$(DESTDIR)$(docdir)"
 ifndef CONFIG_WIN32
        mkdir -p "$(DESTDIR)$(mandir)/man1"
-       $(INSTALL) qemu.1 qemu-img.1 "$(DESTDIR)$(mandir)/man1"
+       $(INSTALL_DATA) qemu.1 qemu-img.1 "$(DESTDIR)$(mandir)/man1"
 endif
 
 install: all $(if $(BUILD_DOCS),install-doc)
@@ -77,12 +77,12 @@ install: all $(if $(BUILD_DOCS),install-
 #      mkdir -p "$(DESTDIR)$(datadir)"
 #      for x in bios.bin vgabios.bin vgabios-cirrus.bin ppc_rom.bin \
 #                      video.x openbios-sparc32 linux_boot.bin; do \
-#              $(INSTALL) -m 644 $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(datadir)"; \
+#              $(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(datadir)"; \
 #      done
 ifndef CONFIG_WIN32
        mkdir -p "$(DESTDIR)$(datadir)/keymaps"
        for x in $(KEYMAPS); do \
-               $(INSTALL) -m 644 $(SRC_PATH)/keymaps/$$x "$(DESTDIR)$(datadir)/keymaps"; \
+               $(INSTALL_DATA) $(SRC_PATH)/keymaps/$$x "$(DESTDIR)$(datadir)/keymaps"; \
        done
 endif
        for d in $(TARGET_DIRS); do \
diff -r 8f0b5295bb1b -r dcec453681bc tools/ioemu/hw/ide.c
--- a/tools/ioemu/hw/ide.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/ioemu/hw/ide.c      Thu Mar 08 14:39:52 2007 -0600
@@ -2602,6 +2602,120 @@ void pci_cmd646_ide_init(PCIBus *bus, Bl
 #endif /* DMA_MULTI_THREAD */
 }
 
+static void pci_ide_save(QEMUFile* f, void *opaque)
+{
+    PCIIDEState *d = opaque;
+    int i;
+
+    for(i = 0; i < 2; i++) {
+        BMDMAState *bm = &d->bmdma[i];
+        qemu_put_8s(f, &bm->cmd);
+        qemu_put_8s(f, &bm->status);
+        qemu_put_be32s(f, &bm->addr);
+        /* XXX: if a transfer is pending, we do not save it yet */
+    }
+
+    /* per IDE interface data */
+    for(i = 0; i < 2; i++) {
+        IDEState *s = &d->ide_if[i * 2];
+        uint8_t drive1_selected;
+        qemu_put_8s(f, &s->cmd);
+        drive1_selected = (s->cur_drive != s);
+        qemu_put_8s(f, &drive1_selected);
+    }
+
+    /* per IDE drive data */
+    for(i = 0; i < 4; i++) {
+        IDEState *s = &d->ide_if[i];
+        qemu_put_be32s(f, &s->mult_sectors);
+        qemu_put_be32s(f, &s->identify_set);
+        if (s->identify_set) {
+            qemu_put_buffer(f, (const uint8_t *)s->identify_data, 512);
+        }
+        qemu_put_8s(f, &s->write_cache);
+        qemu_put_8s(f, &s->feature);
+        qemu_put_8s(f, &s->error);
+        qemu_put_be32s(f, &s->nsector);
+        qemu_put_8s(f, &s->sector);
+        qemu_put_8s(f, &s->lcyl);
+        qemu_put_8s(f, &s->hcyl);
+        qemu_put_8s(f, &s->hob_feature);
+        qemu_put_8s(f, &s->hob_nsector);
+        qemu_put_8s(f, &s->hob_sector);
+        qemu_put_8s(f, &s->hob_lcyl);
+        qemu_put_8s(f, &s->hob_hcyl);
+        qemu_put_8s(f, &s->select);
+        qemu_put_8s(f, &s->status);
+        qemu_put_8s(f, &s->lba48);
+
+        qemu_put_8s(f, &s->sense_key);
+        qemu_put_8s(f, &s->asc);
+        /* XXX: if a transfer is pending, we do not save it yet */
+    }
+}
+
+static int pci_ide_load(QEMUFile* f, void *opaque, int version_id)
+{
+    PCIIDEState *d = opaque;
+    int ret, i;
+
+    if (version_id != 1)
+        return -EINVAL;
+
+    for(i = 0; i < 2; i++) {
+        BMDMAState *bm = &d->bmdma[i];
+        qemu_get_8s(f, &bm->cmd);
+        qemu_get_8s(f, &bm->status);
+        qemu_get_be32s(f, &bm->addr);
+        /* XXX: if a transfer is pending, we do not save it yet */
+    }
+
+    /* per IDE interface data */
+    for(i = 0; i < 2; i++) {
+        IDEState *s = &d->ide_if[i * 2];
+        uint8_t drive1_selected;
+        qemu_get_8s(f, &s->cmd);
+        qemu_get_8s(f, &drive1_selected);
+        s->cur_drive = &d->ide_if[i * 2 + (drive1_selected != 0)];
+    }
+
+    /* per IDE drive data */
+    for(i = 0; i < 4; i++) {
+        IDEState *s = &d->ide_if[i];
+        qemu_get_be32s(f, &s->mult_sectors);
+        qemu_get_be32s(f, &s->identify_set);
+        if (s->identify_set) {
+            qemu_get_buffer(f, (uint8_t *)s->identify_data, 512);
+        }
+        qemu_get_8s(f, &s->write_cache);
+        qemu_get_8s(f, &s->feature);
+        qemu_get_8s(f, &s->error);
+        qemu_get_be32s(f, &s->nsector);
+        qemu_get_8s(f, &s->sector);
+        qemu_get_8s(f, &s->lcyl);
+        qemu_get_8s(f, &s->hcyl);
+        qemu_get_8s(f, &s->hob_feature);
+        qemu_get_8s(f, &s->hob_nsector);
+        qemu_get_8s(f, &s->hob_sector);
+        qemu_get_8s(f, &s->hob_lcyl);
+        qemu_get_8s(f, &s->hob_hcyl);
+        qemu_get_8s(f, &s->select);
+        qemu_get_8s(f, &s->status);
+        qemu_get_8s(f, &s->lba48);
+
+        qemu_get_8s(f, &s->sense_key);
+        qemu_get_8s(f, &s->asc);
+        /* XXX: if a transfer is pending, we do not save it yet */
+        if (s->status & (DRQ_STAT|BUSY_STAT)) {
+            /* Tell the guest that its transfer has gone away */
+            ide_abort_command(s);
+            ide_set_irq(s);
+        }
+    }
+    return 0;
+}
+
+
 /* hd_table must contain 4 block drivers */
 /* NOTE: for the PIIX3, the IRQs and IOports are hardcoded */
 void pci_piix3_ide_init(PCIBus *bus, BlockDriverState **hd_table, int devfn)
@@ -2643,6 +2757,7 @@ void pci_piix3_ide_init(PCIBus *bus, Blo
     buffered_pio_init();
 
     register_savevm("ide_pci", 0, 1, generic_pci_save, generic_pci_load, d);
+    register_savevm("ide", 0, 1, pci_ide_save, pci_ide_load, d);
 
 #ifdef DMA_MULTI_THREAD    
     dma_create_thread();
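pci_ide_save() and pci_ide_load() above form a matched pair: every
qemu_put_*() in the save path has a qemu_get_*() at the same offset in the
load path, and register_savevm() wires both into the named section with
version 1, which the loader checks before touching any state. The same
skeleton, reduced to a single field for a hypothetical device (MyState and
the section name are made up for illustration):

    typedef struct { uint8_t reg; } MyState;    /* made-up device state */

    static void my_save(QEMUFile *f, void *opaque)
    {
            MyState *s = opaque;
            qemu_put_8s(f, &s->reg);            /* put order defines the format */
    }

    static int my_load(QEMUFile *f, void *opaque, int version_id)
    {
            MyState *s = opaque;
            if (version_id != 1)
                    return -EINVAL;             /* refuse unknown layouts */
            qemu_get_8s(f, &s->reg);            /* must mirror my_save() exactly */
            return 0;
    }

    /* Registered as in the hunk above:
     * register_savevm("mydev", 0, 1, my_save, my_load, s); */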
diff -r 8f0b5295bb1b -r dcec453681bc tools/ioemu/target-i386-dm/qemu-ifup
--- a/tools/ioemu/target-i386-dm/qemu-ifup      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/ioemu/target-i386-dm/qemu-ifup      Thu Mar 08 14:39:52 2007 -0600
@@ -3,8 +3,7 @@
 #. /etc/rc.d/init.d/functions
 #ulimit -c unlimited
 
-echo -c 'config qemu network with xen bridge for '
-echo $*
+echo 'config qemu network with xen bridge for ' $*
 
 ifconfig $1 0.0.0.0 up
 brctl addif $2 $1
diff -r 8f0b5295bb1b -r dcec453681bc tools/ioemu/vl.c
--- a/tools/ioemu/vl.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/ioemu/vl.c  Thu Mar 08 14:39:52 2007 -0600
@@ -3250,6 +3250,14 @@ static int net_tap_init(VLANState *vlan,
         pid = fork();
         if (pid >= 0) {
             if (pid == 0) {
+                int open_max = sysconf(_SC_OPEN_MAX), i;
+                for (i = 0; i < open_max; i++)
+                    if (i != STDIN_FILENO &&
+                        i != STDOUT_FILENO &&
+                        i != STDERR_FILENO &&
+                        i != fd)
+                        close(i);
+
                 parg = args;
                 *parg++ = (char *)setup_script;
                 *parg++ = ifname;
diff -r 8f0b5295bb1b -r dcec453681bc tools/ioemu/vnc.c
--- a/tools/ioemu/vnc.c Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/ioemu/vnc.c Thu Mar 08 14:39:52 2007 -0600
@@ -1445,7 +1445,7 @@ int vnc_display_init(DisplayState *ds, i
 
 int vnc_start_viewer(int port)
 {
-    int pid;
+    int pid, i, open_max;
     char s[16];
 
     sprintf(s, ":%d", port);
@@ -1456,6 +1456,12 @@ int vnc_start_viewer(int port)
        exit(1);
 
     case 0:    /* child */
+       open_max = sysconf(_SC_OPEN_MAX);
+       for (i = 0; i < open_max; i++)
+           if (i != STDIN_FILENO &&
+               i != STDOUT_FILENO &&
+               i != STDERR_FILENO)
+               close(i);
        execlp("vncviewer", "vncviewer", s, NULL);
        fprintf(stderr, "vncviewer execlp failed\n");
        exit(1);
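Both the vl.c and vnc.c hunks close every inherited descriptor in the forked
child before exec(), so the guest's tap fd, VNC socket and friends do not
leak into helper processes. The pattern in isolation, runnable on any POSIX
system:

    #include <unistd.h>

    /* Close everything except stdio before handing the process over. */
    static void close_inherited_fds(void)
    {
            int i, open_max = sysconf(_SC_OPEN_MAX);

            for (i = 0; i < open_max; i++)
                    if (i != STDIN_FILENO &&
                        i != STDOUT_FILENO &&
                        i != STDERR_FILENO)
                            close(i);
    }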
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_core.c
--- a/tools/libxc/xc_core.c     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_core.c     Thu Mar 08 14:39:52 2007 -0600
@@ -153,7 +153,7 @@ struct xc_core_section_headers {
     uint16_t    num;
     uint16_t    num_max;
 
-    Elf_Shdr   *shdrs;
+    Elf64_Shdr  *shdrs;
 };
 #define SHDR_INIT       16
 #define SHDR_INC        4
@@ -184,14 +184,14 @@ xc_core_shdr_free(struct xc_core_section
     free(sheaders);
 }
 
-Elf_Shdr*
+Elf64_Shdr*
 xc_core_shdr_get(struct xc_core_section_headers *sheaders)
 {
-    Elf_Shdr *shdr;
+    Elf64_Shdr *shdr;
 
     if ( sheaders->num == sheaders->num_max )
     {
-        Elf_Shdr *shdrs;
+        Elf64_Shdr *shdrs;
         if ( sheaders->num_max + SHDR_INC < sheaders->num_max )
         {
             errno = E2BIG;
@@ -212,7 +212,7 @@ xc_core_shdr_get(struct xc_core_section_
 }
 
 int
-xc_core_shdr_set(Elf_Shdr *shdr,
+xc_core_shdr_set(Elf64_Shdr *shdr,
                  struct xc_core_strtab *strtab,
                  const char *name, uint32_t type,
                  uint64_t offset, uint64_t size,
@@ -317,15 +317,15 @@ xc_domain_dumpcore_via_callback(int xc_h
 
     uint64_t *pfn_array = NULL;
 
-    Elf_Ehdr ehdr;
-    unsigned long filesz;
-    unsigned long offset;
-    unsigned long fixup;
+    Elf64_Ehdr ehdr;
+    uint64_t filesz;
+    uint64_t offset;
+    uint64_t fixup;
 
     struct xc_core_strtab *strtab = NULL;
     uint16_t strtab_idx;
     struct xc_core_section_headers *sheaders = NULL;
-    Elf_Shdr *shdr;
+    Elf64_Shdr *shdr;
 
     /* elf notes */
     struct elfnote elfnote;
@@ -460,7 +460,7 @@ xc_domain_dumpcore_via_callback(int xc_h
     ehdr.e_ident[EI_MAG1] = ELFMAG1;
     ehdr.e_ident[EI_MAG2] = ELFMAG2;
     ehdr.e_ident[EI_MAG3] = ELFMAG3;
-    ehdr.e_ident[EI_CLASS] = ELFCLASS;
+    ehdr.e_ident[EI_CLASS] = ELFCLASS64;
     ehdr.e_ident[EI_DATA] = ELF_ARCH_DATA;
     ehdr.e_ident[EI_VERSION] = EV_CURRENT;
     ehdr.e_ident[EI_OSABI] = ELFOSABI_SYSV;
@@ -474,9 +474,9 @@ xc_domain_dumpcore_via_callback(int xc_h
     ehdr.e_shoff = sizeof(ehdr);
     ehdr.e_flags = ELF_CORE_EFLAGS;
     ehdr.e_ehsize = sizeof(ehdr);
-    ehdr.e_phentsize = sizeof(Elf_Phdr);
+    ehdr.e_phentsize = sizeof(Elf64_Phdr);
     ehdr.e_phnum = 0;
-    ehdr.e_shentsize = sizeof(Elf_Shdr);
+    ehdr.e_shentsize = sizeof(Elf64_Shdr);
     /* ehdr.e_shnum and ehdr.e_shstrndx aren't known here yet. fill it later*/
 
     /* create section header */
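
The xc_core.c hunks above replace the build-arch-dependent Elf_* typedefs with explicit Elf64_* structures, so the dump-core file is an ELF64 core regardless of whether the writing toolstack is 32- or 64-bit. A minimal sketch of initializing such a header with the system <elf.h>; the data-encoding byte here is illustrative (the code above uses ELF_ARCH_DATA):

    #include <elf.h>
    #include <string.h>

    static void init_core_ehdr(Elf64_Ehdr *ehdr)
    {
        memset(ehdr, 0, sizeof(*ehdr));
        ehdr->e_ident[EI_MAG0]    = ELFMAG0;
        ehdr->e_ident[EI_MAG1]    = ELFMAG1;
        ehdr->e_ident[EI_MAG2]    = ELFMAG2;
        ehdr->e_ident[EI_MAG3]    = ELFMAG3;
        ehdr->e_ident[EI_CLASS]   = ELFCLASS64;  /* fixed 64-bit class */
        ehdr->e_ident[EI_DATA]    = ELFDATA2LSB; /* illustrative choice */
        ehdr->e_ident[EI_VERSION] = EV_CURRENT;
        ehdr->e_type      = ET_CORE;
        ehdr->e_version   = EV_CURRENT;
        ehdr->e_ehsize    = sizeof(*ehdr);
        ehdr->e_phentsize = sizeof(Elf64_Phdr);  /* fixed entry sizes,     */
        ehdr->e_shentsize = sizeof(Elf64_Shdr);  /* independent of builder */
    }
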
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_core.h
--- a/tools/libxc/xc_core.h     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_core.h     Thu Mar 08 14:39:52 2007 -0600
@@ -116,10 +116,10 @@ struct xc_core_strtab;
 struct xc_core_strtab;
 struct xc_core_section_headers;
 
-Elf_Shdr*
+Elf64_Shdr*
 xc_core_shdr_get(struct xc_core_section_headers *sheaders);
 int
-xc_core_shdr_set(Elf_Shdr *shdr,
+xc_core_shdr_set(Elf64_Shdr *shdr,
                  struct xc_core_strtab *strtab,
                  const char *name, uint32_t type,
                  uint64_t offset, uint64_t size,
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_core_ia64.c
--- a/tools/libxc/xc_core_ia64.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_core_ia64.c        Thu Mar 08 14:39:52 2007 -0600
@@ -266,10 +266,10 @@ xc_core_arch_context_get_shdr(struct xc_
 xc_core_arch_context_get_shdr(struct xc_core_arch_context *arch_ctxt, 
                               struct xc_core_section_headers *sheaders,
                               struct xc_core_strtab *strtab,
-                              unsigned long *filesz, unsigned long offset)
+                              uint64_t *filesz, uint64_t offset)
 {
     int sts = -1;
-    Elf_Shdr *shdr;
+    Elf64_Shdr *shdr;
 
     /* mmapped priv regs */
     shdr = xc_core_shdr_get(sheaders);
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_core_ia64.h
--- a/tools/libxc/xc_core_ia64.h        Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_core_ia64.h        Thu Mar 08 14:39:52 2007 -0600
@@ -42,7 +42,7 @@ xc_core_arch_context_get_shdr(struct xc_
 xc_core_arch_context_get_shdr(struct xc_core_arch_context* arch_ctxt, 
                               struct xc_core_section_headers *sheaders,
                               struct xc_core_strtab *strtab,
-                              unsigned long *filesz, unsigned long offset);
+                              uint64_t *filesz, uint64_t offset);
 int
 xc_core_arch_context_dump(struct xc_core_arch_context* arch_ctxt,
                           void* args, dumpcore_rtn_t dump_rtn);
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_core_x86.h
--- a/tools/libxc/xc_core_x86.h Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_core_x86.h Thu Mar 08 14:39:52 2007 -0600
@@ -45,7 +45,7 @@ xc_core_arch_context_get_shdr(struct xc_
 xc_core_arch_context_get_shdr(struct xc_core_arch_context *arch_ctxt, 
                               struct xc_core_section_headers *sheaders,
                               struct xc_core_strtab *strtab,
-                              unsigned long *filesz, unsigned long offset)
+                              uint64_t *filesz, uint64_t offset)
 {
     *filesz = 0;
     return 0;
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_dom_core.c
--- a/tools/libxc/xc_dom_core.c Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_dom_core.c Thu Mar 08 14:39:52 2007 -0600
@@ -721,9 +721,6 @@ int xc_dom_build_image(struct xc_dom_ima
     }
     page_size = XC_DOM_PAGE_SIZE(dom);
 
-    /* 4MB align virtual base address */
-    dom->parms.virt_base &= ~(((uint64_t)1<<22)-1);
-
     /* load kernel */
     if ( xc_dom_alloc_segment(dom, &dom->kernel_seg, "kernel",
                               dom->kernel_seg.vstart,
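
This hunk (and the matching one in xen/arch/x86/domain_build.c further down) drops the forced 4MB rounding of the guest's virtual base address. For reference, the removed expression is the standard round-down-to-power-of-two idiom; a small sketch:

    #include <assert.h>
    #include <stdint.h>

    /* Round addr down to a 2^bits boundary by clearing the low bits. */
    static uint64_t align_down(uint64_t addr, unsigned int bits)
    {
        return addr & ~((((uint64_t)1) << bits) - 1);
    }

    int main(void)
    {
        /* bits == 22 gives the 4MB mask the patch removes. */
        assert(align_down(0x00401234, 22) == 0x00400000);
        assert(align_down(0x00400000, 22) == 0x00400000);
        return 0;
    }
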
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_linux_restore.c
--- a/tools/libxc/xc_linux_restore.c    Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_linux_restore.c    Thu Mar 08 14:39:52 2007 -0600
@@ -19,7 +19,7 @@ static unsigned long max_mfn;
 /* virtual starting address of the hypervisor */
 static unsigned long hvirt_start;
 
-/* #levels of page tables used by the currrent guest */
+/* #levels of page tables used by the current guest */
 static unsigned int pt_levels;
 
 /* total number of pages used by the current guest */
@@ -857,6 +857,28 @@ int xc_linux_restore(int xc_handle, int 
 
         ctxt.ctrlreg[3] = xen_pfn_to_cr3(p2m[pfn]);
 
+        /* Guest pagetable (x86/64) stored in otherwise-unused CR1. */
+        if ( (pt_levels == 4) && ctxt.ctrlreg[1] )
+        {
+            pfn = xen_cr3_to_pfn(ctxt.ctrlreg[1]);
+
+            if (pfn >= max_pfn) {
+                ERROR("User PT base is bad: pfn=%lu max_pfn=%lu type=%08lx",
+                      pfn, max_pfn, pfn_type[pfn]);
+                goto out;
+            }
+
+            if ( (pfn_type[pfn] & XEN_DOMCTL_PFINFO_LTABTYPE_MASK) !=
+                 ((unsigned long)pt_levels<<XEN_DOMCTL_PFINFO_LTAB_SHIFT) ) {
+                ERROR("User PT base is bad. pfn=%lu nr=%lu type=%08lx %08lx",
+                      pfn, max_pfn, pfn_type[pfn],
+                      (unsigned long)pt_levels<<XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+                goto out;
+            }
+
+            ctxt.ctrlreg[1] = xen_pfn_to_cr3(p2m[pfn]);
+        }
+
         domctl.cmd = XEN_DOMCTL_setvcpucontext;
         domctl.domain = (domid_t)dom;
         domctl.u.vcpucontext.vcpu = i;
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_linux_save.c
--- a/tools/libxc/xc_linux_save.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_linux_save.c       Thu Mar 08 14:39:52 2007 -0600
@@ -34,7 +34,7 @@ static unsigned long max_mfn;
 /* virtual starting address of the hypervisor */
 static unsigned long hvirt_start;
 
-/* #levels of page tables used by the currrent guest */
+/* #levels of page tables used by the current guest */
 static unsigned int pt_levels;
 
 /* total number of pages used by the current guest */
@@ -491,7 +491,7 @@ static int canonicalize_pagetable(unsign
     ** reserved hypervisor mappings. This depends on the current
     ** page table type as well as the number of paging levels.
     */
-    xen_start = xen_end = pte_last = PAGE_SIZE / ((pt_levels == 2)? 4 : 8);
+    xen_start = xen_end = pte_last = PAGE_SIZE / ((pt_levels == 2) ? 4 : 8);
 
     if (pt_levels == 2 && type == XEN_DOMCTL_PFINFO_L2TAB)
         xen_start = (hvirt_start >> L2_PAGETABLE_SHIFT);
@@ -1279,6 +1279,18 @@ int xc_linux_save(int xc_handle, int io_
         ctxt.ctrlreg[3] = 
             xen_pfn_to_cr3(mfn_to_pfn(xen_cr3_to_pfn(ctxt.ctrlreg[3])));
 
+        /* Guest pagetable (x86/64) stored in otherwise-unused CR1. */
+        if ( (pt_levels == 4) && ctxt.ctrlreg[1] )
+        {
+            if ( !MFN_IS_IN_PSEUDOPHYS_MAP(xen_cr3_to_pfn(ctxt.ctrlreg[1])) ) {
+                ERROR("PT base is not in range of pseudophys map");
+                goto out;
+            }
+            /* Least-significant bit means 'valid PFN'. */
+            ctxt.ctrlreg[1] = 1 |
+                xen_pfn_to_cr3(mfn_to_pfn(xen_cr3_to_pfn(ctxt.ctrlreg[1])));
+        }
+
         if (!write_exact(io_fd, &ctxt, sizeof(ctxt))) {
             ERROR("Error when writing to state file (1) (errno %d)", errno);
             goto out;
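
This save-side hunk and the xc_linux_restore.c hunk above are symmetric: for 4-level x86/64 guests the otherwise-unused CR1 slot carries the user-mode page-table base, canonicalized from MFN to PFN space on save, validated and mapped back on restore, with the least-significant bit marking "valid PFN". A simplified sketch of that tag-and-check convention; the 12-bit shift is illustrative only, the real encoding being whatever xen_pfn_to_cr3()/xen_cr3_to_pfn() implement:

    #include <stdint.h>

    #define CR1_VALID_PFN 0x1ULL  /* LSB: the slot holds a canonical PFN */

    /* Save side: store the PFN, tagged as valid. */
    static uint64_t cr1_pack(uint64_t pfn)
    {
        return (pfn << 12) | CR1_VALID_PFN;
    }

    /* Restore side: consume the slot only if it was tagged. */
    static int cr1_unpack(uint64_t saved, uint64_t *pfn_out)
    {
        if (!(saved & CR1_VALID_PFN))
            return 0;                 /* no user page table recorded */
        *pfn_out = (saved & ~CR1_VALID_PFN) >> 12;
        return 1;
    }
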
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxc/xc_ptrace_core.c
--- a/tools/libxc/xc_ptrace_core.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxc/xc_ptrace_core.c      Thu Mar 08 14:39:52 2007 -0600
@@ -192,7 +192,7 @@ struct elf_core
 struct elf_core
 {
     int         domfd;
-    Elf_Ehdr    ehdr;
+    Elf64_Ehdr  ehdr;
 
     char*       shdr;
     
@@ -241,6 +241,8 @@ elf_core_init(struct elf_core* ecore, in
     
     /* check elf header */
     if (!IS_ELF(ecore->ehdr) || ecore->ehdr.e_type != ET_CORE)
+        goto out;
+    if (ecore->ehdr.e_ident[EI_CLASS] != ELFCLASS64)
         goto out;
     /* check elf header more: EI_DATA, EI_VERSION, e_machine... */
 
@@ -294,7 +296,7 @@ elf_core_search_note(struct elf_core* ec
 }
 
 static int
-elf_core_alloc_read_sec(struct elf_core* ecore, const Elf_Shdr* shdr,
+elf_core_alloc_read_sec(struct elf_core* ecore, const Elf64_Shdr* shdr,
                         char** buf)
 {
     int ret;
@@ -309,19 +311,19 @@ elf_core_alloc_read_sec(struct elf_core*
     return ret;
 }
 
-static Elf_Shdr*
+static Elf64_Shdr*
 elf_core_shdr_by_index(struct elf_core* ecore, uint16_t index)
 {
     if (index >= ecore->ehdr.e_shnum)
         return NULL;
-    return (Elf_Shdr*)(ecore->shdr + ecore->ehdr.e_shentsize * index);
+    return (Elf64_Shdr*)(ecore->shdr + ecore->ehdr.e_shentsize * index);
 }
 
 static int
 elf_core_alloc_read_sec_by_index(struct elf_core* ecore, uint16_t index,
                                  char** buf, uint64_t* size)
 {
-    Elf_Shdr* shdr = elf_core_shdr_by_index(ecore, index);
+    Elf64_Shdr* shdr = elf_core_shdr_by_index(ecore, index);
     if (shdr == NULL)
         return -1;
     if (size != NULL)
@@ -329,14 +331,14 @@ elf_core_alloc_read_sec_by_index(struct 
     return elf_core_alloc_read_sec(ecore, shdr, buf);
 }
 
-static Elf_Shdr*
+static Elf64_Shdr*
 elf_core_shdr_by_name(struct elf_core* ecore, const char* name)
 {
     const char* s;
     for (s = ecore->shdr;
          s < ecore->shdr + ecore->ehdr.e_shentsize * ecore->ehdr.e_shnum;
          s += ecore->ehdr.e_shentsize) {
-        Elf_Shdr* shdr = (Elf_Shdr*)s;
+        Elf64_Shdr* shdr = (Elf64_Shdr*)s;
 
         if (strncmp(ecore->shstrtab + shdr->sh_name, name, strlen(name)) == 0)
             return shdr;
@@ -348,7 +350,7 @@ static int
 static int
 elf_core_read_sec_by_name(struct elf_core* ecore, const char* name, char* buf)
 {
-    Elf_Shdr* shdr = elf_core_shdr_by_name(ecore, name);
+    Elf64_Shdr* shdr = elf_core_shdr_by_name(ecore, name);
     return pread_exact(ecore->domfd, buf, shdr->sh_size, shdr->sh_offset);
     
 }
@@ -357,7 +359,7 @@ elf_core_alloc_read_sec_by_name(struct e
 elf_core_alloc_read_sec_by_name(struct elf_core* ecore, const char* name,
                                 char** buf, uint64_t* size)
 {
-    Elf_Shdr* shdr = elf_core_shdr_by_name(ecore, name);
+    Elf64_Shdr* shdr = elf_core_shdr_by_name(ecore, name);
     if (shdr == NULL)
         return -1;
     if (size != NULL)
@@ -508,8 +510,8 @@ xc_waitdomain_core_elf(
     struct xen_dumpcore_elfnote_xen_version *xen_version;
     struct xen_dumpcore_elfnote_format_version *format_version;
 
-    Elf_Shdr* table_shdr;
-    Elf_Shdr* pages_shdr;
+    Elf64_Shdr* table_shdr;
+    Elf64_Shdr* pages_shdr;
 
     if (elf_core_init(&ecore, domfd) < 0)
         goto out;
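
elf_core_init above gains a second guard: after the generic ELF/ET_CORE check it now also rejects cores whose EI_CLASS is not ELFCLASS64, matching the writer's move to fixed Elf64 structures. A compact sketch of that style of up-front header validation:

    #include <elf.h>
    #include <string.h>

    /* Return 0 iff ehdr looks like an ELF64 core file. */
    static int check_elf64_core(const Elf64_Ehdr *ehdr)
    {
        if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
            return -1;                    /* not ELF at all */
        if (ehdr->e_ident[EI_CLASS] != ELFCLASS64)
            return -1;                    /* wrong class for this format */
        if (ehdr->e_type != ET_CORE)
            return -1;                    /* not a core dump */
        return 0;
    }
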
diff -r 8f0b5295bb1b -r dcec453681bc tools/libxen/Makefile
--- a/tools/libxen/Makefile     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/libxen/Makefile     Thu Mar 08 14:39:52 2007 -0600
@@ -57,8 +57,8 @@ test/test_hvm_bindings: test/test_hvm_bi
 
 .PHONY: install
 install: all
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/include/xen/api
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/$(LIBDIR)
+       $(INSTALL_DIR) $(DESTDIR)/usr/include/xen/api
+       $(INSTALL_DIR) $(DESTDIR)/usr/$(LIBDIR)
        $(INSTALL_PROG) libxenapi.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)
        ln -sf libxenapi.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)/libxenapi.so.$(MAJOR)
        ln -sf libxenapi.so.$(MAJOR) $(DESTDIR)/usr/$(LIBDIR)/libxenapi.so
diff -r 8f0b5295bb1b -r dcec453681bc tools/ptsname/Makefile
--- a/tools/ptsname/Makefile    Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,22 +0,0 @@
-
-XEN_ROOT = ../..
-include $(XEN_ROOT)/tools/Rules.mk
-
-.PHONY: all
-all: build
-.PHONY: build
-build:
-       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py build
-
-.PHONY: install
-ifndef XEN_PYTHON_NATIVE_INSTALL
-install: all
-       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--home="$(DESTDIR)/usr" --prefix=""
-else
-install: all
-       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--root="$(DESTDIR)"
-endif
-
-.PHONY: clean
-clean:
-       rm -rf build tmp *.pyc *.pyo *.o *.a *~ a.out
diff -r 8f0b5295bb1b -r dcec453681bc tools/ptsname/ptsname.c
--- a/tools/ptsname/ptsname.c   Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,44 +0,0 @@
-/******************************************************************************
- * ptsname.c
- * 
- * A python extension to expose the POSIX ptsname() function.
- * 
- * Copyright (C) 2007 XenSource Ltd
- */
-
-#include <Python.h>
-#include <stdlib.h>
-
-/* Needed for Python versions earlier than 2.3. */
-#ifndef PyMODINIT_FUNC
-#define PyMODINIT_FUNC DL_EXPORT(void)
-#endif
-
-static PyObject *do_ptsname(PyObject *self, PyObject *args)
-{
-    int fd;
-    char *path;
-
-    if (!PyArg_ParseTuple(args, "i", &fd))
-        return NULL;
-
-    path = ptsname(fd);
-
-    if (!path)
-    {
-        PyErr_SetFromErrno(PyExc_IOError);
-        return NULL;
-    } 
-
-    return PyString_FromString(path);
-}
-
-static PyMethodDef ptsname_methods[] = { 
-    { "ptsname", do_ptsname, METH_VARARGS }, 
-    { NULL }
-};
-
-PyMODINIT_FUNC initptsname(void)
-{
-    Py_InitModule("ptsname", ptsname_methods);
-}
diff -r 8f0b5295bb1b -r dcec453681bc tools/ptsname/setup.py
--- a/tools/ptsname/setup.py    Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,11 +0,0 @@
-from distutils.core import setup, Extension
-
-extra_compile_args  = [ "-fno-strict-aliasing", "-Werror" ]
-
-setup(name         = 'ptsname',
-      version      = '1.0',
-      description  = 'POSIX ptsname() function',
-      author       = 'Tim Deegan',
-      author_email = 'Tim.Deegan@xxxxxxxxxxxxx',
-      license      = 'GPL',
-      ext_modules  = [ Extension("ptsname", [ "ptsname.c" ]) ])
diff -r 8f0b5295bb1b -r dcec453681bc tools/pygrub/Makefile
--- a/tools/pygrub/Makefile     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/pygrub/Makefile     Thu Mar 08 14:39:52 2007 -0600
@@ -10,13 +10,14 @@ build:
 
 .PHONY: install
 ifndef XEN_PYTHON_NATIVE_INSTALL
+install: LIBPATH=$(shell PYTHONPATH=../python/xen/util python -c "import auxbin; print auxbin.libpath()")
 install: all
-       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--home="$(DESTDIR)/usr" --prefix=""
-       $(INSTALL_DIR) -p $(DESTDIR)/var/run/xend/boot
+       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--home="$(DESTDIR)/usr" --prefix="" --install-lib="$(DESTDIR)$(LIBPATH)/python"
+       $(INSTALL_DIR) $(DESTDIR)/var/run/xend/boot
 else
 install: all
        CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--root="$(DESTDIR)"
-       $(INSTALL_DIR) -p $(DESTDIR)/var/run/xend/boot
+       $(INSTALL_DIR) $(DESTDIR)/var/run/xend/boot
 endif
 
 .PHONY: clean
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/Makefile
--- a/tools/python/Makefile     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/python/Makefile     Thu Mar 08 14:39:52 2007 -0600
@@ -18,8 +18,9 @@ build:
 
 .PHONY: install
 ifndef XEN_PYTHON_NATIVE_INSTALL
+install: LIBPATH=$(shell PYTHONPATH=xen/util python -c "import auxbin; print auxbin.libpath()")
 install: install-messages
-       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--home="$(DESTDIR)/usr" --prefix="" --force
+       CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--home="$(DESTDIR)/usr" --prefix="" --force 
--install-lib="$(DESTDIR)$(LIBPATH)/python"
 else
 install: install-messages
        CC="$(CC)" CFLAGS="$(CFLAGS)" python setup.py install 
--root="$(DESTDIR)" --force
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/ptsname/ptsname.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/python/ptsname/ptsname.c    Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,44 @@
+/******************************************************************************
+ * ptsname.c
+ * 
+ * A python extension to expose the POSIX ptsname() function.
+ * 
+ * Copyright (C) 2007 XenSource Ltd
+ */
+
+#include <Python.h>
+#include <stdlib.h>
+
+/* Needed for Python versions earlier than 2.3. */
+#ifndef PyMODINIT_FUNC
+#define PyMODINIT_FUNC DL_EXPORT(void)
+#endif
+
+static PyObject *do_ptsname(PyObject *self, PyObject *args)
+{
+    int fd;
+    char *path;
+
+    if (!PyArg_ParseTuple(args, "i", &fd))
+        return NULL;
+
+    path = ptsname(fd);
+
+    if (!path)
+    {
+        PyErr_SetFromErrno(PyExc_IOError);
+        return NULL;
+    } 
+
+    return PyString_FromString(path);
+}
+
+static PyMethodDef ptsname_methods[] = { 
+    { "ptsname", do_ptsname, METH_VARARGS }, 
+    { NULL }
+};
+
+PyMODINIT_FUNC initptsname(void)
+{
+    Py_InitModule("ptsname", ptsname_methods);
+}
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/setup.py
--- a/tools/python/setup.py     Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/python/setup.py     Thu Mar 08 14:39:52 2007 -0600
@@ -44,7 +44,14 @@ acm = Extension("acm",
                libraries          = libraries,
                sources            = [ "xen/lowlevel/acm/acm.c" ])
 
-modules = [ xc, xs, acm ]
+ptsname = Extension("ptsname",
+               extra_compile_args = extra_compile_args,
+               include_dirs       = include_dirs + [ "ptsname" ],
+               library_dirs       = library_dirs,
+               libraries          = libraries,
+               sources            = [ "ptsname/ptsname.c" ])
+
+modules = [ xc, xs, acm, ptsname ]
 if os.uname()[0] == 'SunOS':
     modules.append(scf)
 
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/xen/xend/XendBootloader.py
--- a/tools/python/xen/xend/XendBootloader.py   Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/python/xen/xend/XendBootloader.py   Thu Mar 08 14:39:52 2007 -0600
@@ -21,7 +21,8 @@ from XendLogging import log
 from XendLogging import log
 from XendError import VmError
 
-import pty, ptsname, termios, fcntl
+import pty, termios, fcntl
+from xen.lowlevel import ptsname
 
 def bootloader(blexec, disk, dom, quiet = False, blargs = '', kernel = '',
                ramdisk = '', kernel_args = ''):
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py   Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/python/xen/xend/XendDomainInfo.py   Thu Mar 08 14:39:52 2007 -0600
@@ -781,7 +781,6 @@ class XendDomainInfo:
             'name':               self.info['name_label'],
             'console/limit':      str(xoptions.get_console_limit() * 1024),
             'memory/target':      str(self.info['memory_static_min'] * 1024),
-            'control/platform-feature-multiprocessor-suspend': str(1)
             }
 
         def f(n, v):
@@ -795,6 +794,9 @@ class XendDomainInfo:
         f('console/ring-ref', self.console_mfn)
         f('store/port',       self.store_port)
         f('store/ring-ref',   self.store_mfn)
+
+        if arch.type == "x86":
+            f('control/platform-feature-multiprocessor-suspend', True)
 
         # elfnotes
         for n, v in self.info.get_notes().iteritems():
@@ -1503,7 +1505,7 @@ class XendDomainInfo:
             self.info['start_time'] = time.time()
 
             self._stateSet(DOM_STATE_RUNNING)
-        except RuntimeError, exn:
+        except (RuntimeError, VmError), exn:
             log.exception("XendDomainInfo.initDomain: exception occurred")
             self.image.cleanupBootloading()
             raise VmError(str(exn))
@@ -2090,26 +2092,26 @@ class XendDomainInfo:
         return self.info.get('tools_version', {})
     
     def get_on_shutdown(self):
-        after_shutdown = self.info.get('action_after_shutdown')
+        after_shutdown = self.info.get('actions_after_shutdown')
         if not after_shutdown or after_shutdown not in XEN_API_ON_NORMAL_EXIT:
             return XEN_API_ON_NORMAL_EXIT[-1]
         return after_shutdown
 
     def get_on_reboot(self):
-        after_reboot = self.info.get('action_after_reboot')
+        after_reboot = self.info.get('actions_after_reboot')
         if not after_reboot or after_reboot not in XEN_API_ON_NORMAL_EXIT:
             return XEN_API_ON_NORMAL_EXIT[-1]
         return after_reboot
 
     def get_on_suspend(self):
         # TODO: not supported        
-        after_suspend = self.info.get('action_after_suspend') 
+        after_suspend = self.info.get('actions_after_suspend') 
         if not after_suspend or after_suspend not in XEN_API_ON_NORMAL_EXIT:
             return XEN_API_ON_NORMAL_EXIT[-1]
         return after_suspend        
 
     def get_on_crash(self):
-        after_crash = self.info.get('action_after_crash')
+        after_crash = self.info.get('actions_after_crash')
         if not after_crash or after_crash not in XEN_API_ON_CRASH_BEHAVIOUR:
             return XEN_API_ON_CRASH_BEHAVIOUR[0]
         return after_crash
diff -r 8f0b5295bb1b -r dcec453681bc tools/python/xen/xend/XendNode.py
--- a/tools/python/xen/xend/XendNode.py Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/python/xen/xend/XendNode.py Thu Mar 08 14:39:52 2007 -0600
@@ -22,7 +22,7 @@ import xen.lowlevel.xc
 
 from xen.util import Brctl
 
-from xen.xend import uuid
+from xen.xend import uuid, arch
 from xen.xend.XendError import *
 from xen.xend.XendOptions import instance as xendoptions
 from xen.xend.XendQCoWStorageRepo import XendQCoWStorageRepo
@@ -97,17 +97,38 @@ class XendNode:
         for u in self.cpus.keys():
             log.error(self.cpus[u])
             number = self.cpus[u]['number']
+            # We can run off the end of the cpuinfo list if domain0 does not
+            # have #vcpus == #pcpus. In that case we just replicate one that's
+            # in the hash table.
+            if not cpuinfo.has_key(number):
+                number = cpuinfo.keys()[0]
             log.error(number)
             log.error(cpuinfo)
-            self.cpus[u].update(
-                { 'host'     : self.uuid,
-                  'features' : cpu_features,
-                  'speed'    : int(float(cpuinfo[number]['cpu MHz'])),
-                  'vendor'   : cpuinfo[number]['vendor_id'],
-                  'modelname': cpuinfo[number]['model name'],
-                  'stepping' : cpuinfo[number]['stepping'],
-                  'flags'    : cpuinfo[number]['flags'],
-                })
+            if arch.type == "x86":
+                self.cpus[u].update(
+                    { 'host'     : self.uuid,
+                      'features' : cpu_features,
+                      'speed'    : int(float(cpuinfo[number]['cpu MHz'])),
+                      'vendor'   : cpuinfo[number]['vendor_id'],
+                      'modelname': cpuinfo[number]['model name'],
+                      'stepping' : cpuinfo[number]['stepping'],
+                      'flags'    : cpuinfo[number]['flags'],
+                    })
+            elif arch.type == "ia64":
+                self.cpus[u].update(
+                    { 'host'     : self.uuid,
+                      'features' : cpu_features,
+                      'speed'    : int(float(cpuinfo[number]['cpu MHz'])),
+                      'vendor'   : cpuinfo[number]['vendor'],
+                      'modelname': cpuinfo[number]['family'],
+                      'stepping' : cpuinfo[number]['model'],
+                      'flags'    : cpuinfo[number]['features'],
+                    })
+            else:
+                self.cpus[u].update(
+                    { 'host'     : self.uuid,
+                      'features' : cpu_features,
+                    })
 
         self.pifs = {}
         self.pif_metrics = {}
diff -r 8f0b5295bb1b -r dcec453681bc tools/security/Makefile
--- a/tools/security/Makefile   Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/security/Makefile   Thu Mar 08 14:39:52 2007 -0600
@@ -54,26 +54,29 @@ all: build
 all: build
 
 .PHONY: install
+ifndef XEN_PYTHON_NATIVE_INSTALL
+install: LIBPATH=$(shell PYTHONPATH=../python/xen/util python -c "import auxbin; print auxbin.libpath()")
+endif
 install: all $(ACM_CONFIG_FILE)
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/sbin
-       $(INSTALL_PROG) -p $(ACM_INST_TOOLS) $(DESTDIR)/usr/sbin
-       $(INSTALL_PROG) -p $(ACM_EZPOLICY) $(DESTDIR)/usr/sbin
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_CONFIG_DIR)
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_POLICY_DIR)
-       $(INSTALL_DATA) -p policies/$(ACM_SCHEMA) $(DESTDIR)$(ACM_POLICY_DIR)
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_POLICY_DIR)/example
+       $(INSTALL_DIR) $(DESTDIR)/usr/sbin
+       $(INSTALL_PROG) $(ACM_INST_TOOLS) $(DESTDIR)/usr/sbin
+       $(INSTALL_PROG) $(ACM_EZPOLICY) $(DESTDIR)/usr/sbin
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_CONFIG_DIR)
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_POLICY_DIR)
+       $(INSTALL_DATA) policies/$(ACM_SCHEMA) $(DESTDIR)$(ACM_POLICY_DIR)
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_POLICY_DIR)/example
        for i in $(ACM_EXAMPLES); do \
-               $(INSTALL_DIR) -p $(DESTDIR)$(ACM_POLICY_DIR)/example/$$i; \
-               $(INSTALL_DATA) -p policies/example/$$i/client_v1-$(ACM_POLICY_SUFFIX) $(DESTDIR)$(ACM_POLICY_DIR)/example/$$i; \
+               $(INSTALL_DIR) $(DESTDIR)$(ACM_POLICY_DIR)/example/$$i; \
+               $(INSTALL_DATA) policies/example/$$i/client_v1-$(ACM_POLICY_SUFFIX) $(DESTDIR)$(ACM_POLICY_DIR)/example/$$i; \
        done
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_SCRIPT_DIR)
-       $(INSTALL_PROG) -p $(ACM_SCRIPTS) $(DESTDIR)$(ACM_SCRIPT_DIR)
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_SECGEN_HTMLDIR)
-       $(INSTALL_DATA) -p $(ACM_INST_HTML) $(DESTDIR)$(ACM_SECGEN_HTMLDIR)
-       $(INSTALL_DIR) -p $(DESTDIR)$(ACM_SECGEN_CGIDIR)
-       $(INSTALL_PROG) -p $(ACM_INST_CGI) $(DESTDIR)$(ACM_SECGEN_CGIDIR)
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_SCRIPT_DIR)
+       $(INSTALL_PROG) $(ACM_SCRIPTS) $(DESTDIR)$(ACM_SCRIPT_DIR)
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_SECGEN_HTMLDIR)
+       $(INSTALL_DATA) $(ACM_INST_HTML) $(DESTDIR)$(ACM_SECGEN_HTMLDIR)
+       $(INSTALL_DIR) $(DESTDIR)$(ACM_SECGEN_CGIDIR)
+       $(INSTALL_PROG) $(ACM_INST_CGI) $(DESTDIR)$(ACM_SECGEN_CGIDIR)
 ifndef XEN_PYTHON_NATIVE_INSTALL
-       python python/setup.py install --home="$(DESTDIR)/usr"
+       python python/setup.py install --home="$(DESTDIR)/usr" --install-lib="$(DESTDIR)$(LIBPATH)/python"
 else
        python python/setup.py install --root="$(DESTDIR)"
 endif
diff -r 8f0b5295bb1b -r dcec453681bc tools/vnet/libxutil/Makefile
--- a/tools/vnet/libxutil/Makefile      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/vnet/libxutil/Makefile      Thu Mar 08 14:39:52 2007 -0600
@@ -71,7 +71,7 @@ check-for-zlib:
 
 .PHONY: install
 install: build
-       [ -d $(DESTDIR)/usr/$(LIBDIR) ] || $(INSTALL_DIR) -p $(DESTDIR)/usr/$(LIBDIR)
+       [ -d $(DESTDIR)/usr/$(LIBDIR) ] || $(INSTALL_DIR) $(DESTDIR)/usr/$(LIBDIR)
        $(INSTALL_PROG) libxutil.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)
        $(INSTALL_DATA) libxutil.a $(DESTDIR)/usr/$(LIBDIR)
        ln -sf libxutil.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)/libxutil.so.$(MAJOR)
diff -r 8f0b5295bb1b -r dcec453681bc tools/xenfb/Makefile
--- a/tools/xenfb/Makefile      Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/xenfb/Makefile      Thu Mar 08 14:39:52 2007 -0600
@@ -1,12 +1,9 @@ XEN_ROOT=../..
 XEN_ROOT=../..
 include $(XEN_ROOT)/tools/Rules.mk
 
-CFLAGS += -I$(XEN_LIBXC) -I$(XEN_XENSTORE) -I$(XEN_ROOT)/linux-2.6-xen-sparse/include -I$(XEN_ROOT)/tools/ioemu
+CFLAGS  += -I$(XEN_LIBXC) -I$(XEN_XENSTORE)
+CFLAGS  += -I$(XEN_ROOT)/linux-2.6-xen-sparse/include -I$(XEN_ROOT)/tools/ioemu
 LDFLAGS += -L$(XEN_LIBXC) -L$(XEN_XENSTORE)
-
-INSTALL         = install
-INSTALL_PROG    = $(INSTALL) -m0755
-INSTALL_DIR     = $(INSTALL) -d -m0755
 
 .PHONY: all
 all: build
@@ -16,7 +13,7 @@ build:
        $(MAKE) vncfb sdlfb
 
 install: all
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/$(LIBDIR)/xen/bin
+       $(INSTALL_DIR) $(DESTDIR)/usr/$(LIBDIR)/xen/bin
        $(INSTALL_PROG) vncfb $(DESTDIR)/usr/$(LIBDIR)/xen/bin/xen-vncfb
        $(INSTALL_PROG) sdlfb $(DESTDIR)/usr/$(LIBDIR)/xen/bin/xen-sdlfb
 
diff -r 8f0b5295bb1b -r dcec453681bc tools/xenfb/xenfb.c
--- a/tools/xenfb/xenfb.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/xenfb/xenfb.c       Thu Mar 08 14:39:52 2007 -0600
@@ -245,11 +245,10 @@ static int xenfb_wait_for_state(struct x
        unsigned state, dummy;
        char **vec;
 
+       awaited |= 1 << XenbusStateUnknown;
+
        for (;;) {
                state = xenfb_read_state(xsh, dir);
-               if (state < 0)
-                       return -1;
-
                if ((1 << state) & awaited)
                        return state;
 
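
The xenfb.c hunk above fixes a latent bug: state is unsigned, so the removed "state < 0" test could never fire. Instead, the bit for XenbusStateUnknown is OR'd into the awaited mask, so a vanished backend ends the wait like any awaited state. A sketch of the bitmask-wait idea, with hypothetical stand-ins for the XenbusState* values and for the state source:

    /* Hypothetical stand-ins for the XenbusState* values. */
    enum { ST_UNKNOWN = 0, ST_INITIALISED = 1, ST_CONNECTED = 4 };

    /* Spin until the observed state's bit is in `awaited`. Always
     * awaiting ST_UNKNOWN guarantees termination when the peer
     * disappears, which is what the patch above arranges. */
    static int wait_for_state(unsigned int (*read_state)(void),
                              unsigned int awaited)
    {
        unsigned int state;

        awaited |= 1u << ST_UNKNOWN;
        for (;;) {
            state = read_state();
            if ((1u << state) & awaited)
                return (int)state;
            /* block for the next xenstore watch event (omitted) */
        }
    }
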
diff -r 8f0b5295bb1b -r dcec453681bc tools/xenstore/Makefile
--- a/tools/xenstore/Makefile   Mon Mar 05 12:49:12 2007 -0600
+++ b/tools/xenstore/Makefile   Thu Mar 08 14:39:52 2007 -0600
@@ -168,16 +168,16 @@ tarball: clean
 
 .PHONY: install
 install: all
-       $(INSTALL_DIR) -p $(DESTDIR)/var/run/xenstored
-       $(INSTALL_DIR) -p $(DESTDIR)/var/lib/xenstored
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/bin
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/sbin
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/include
+       $(INSTALL_DIR) $(DESTDIR)/var/run/xenstored
+       $(INSTALL_DIR) $(DESTDIR)/var/lib/xenstored
+       $(INSTALL_DIR) $(DESTDIR)/usr/bin
+       $(INSTALL_DIR) $(DESTDIR)/usr/sbin
+       $(INSTALL_DIR) $(DESTDIR)/usr/include
        $(INSTALL_PROG) xenstored $(DESTDIR)/usr/sbin
        $(INSTALL_PROG) $(CLIENTS) $(DESTDIR)/usr/bin
        $(INSTALL_PROG) xenstore-control $(DESTDIR)/usr/bin
        $(INSTALL_PROG) xenstore-ls $(DESTDIR)/usr/bin
-       $(INSTALL_DIR) -p $(DESTDIR)/usr/$(LIBDIR)
+       $(INSTALL_DIR) $(DESTDIR)/usr/$(LIBDIR)
        $(INSTALL_PROG) libxenstore.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)
        ln -sf libxenstore.so.$(MAJOR).$(MINOR) $(DESTDIR)/usr/$(LIBDIR)/libxenstore.so.$(MAJOR)
        ln -sf libxenstore.so.$(MAJOR) $(DESTDIR)/usr/$(LIBDIR)/libxenstore.so
diff -r 8f0b5295bb1b -r dcec453681bc xen/Rules.mk
--- a/xen/Rules.mk      Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/Rules.mk      Thu Mar 08 14:39:52 2007 -0600
@@ -41,8 +41,8 @@ include $(BASEDIR)/arch/$(TARGET_ARCH)/R
 include $(BASEDIR)/arch/$(TARGET_ARCH)/Rules.mk
 
 # Do not depend on auto-generated header files.
-HDRS := $(subst $(BASEDIR)/include/asm-$(TARGET_ARCH)/asm-offsets.h,,$(HDRS))
-HDRS := $(subst $(BASEDIR)/include/xen/compile.h,,$(HDRS))
+AHDRS := $(filter-out %/include/xen/compile.h,$(HDRS))
+HDRS  := $(filter-out %/asm-offsets.h,$(AHDRS))
 
 # Note that link order matters!
 ALL_OBJS-y               += $(BASEDIR)/common/built_in.o
@@ -110,12 +110,12 @@ _clean_%/: FORCE
 %.o: %.c $(HDRS) Makefile
        $(CC) $(CFLAGS) -c $< -o $@
 
-%.o: %.S $(HDRS) Makefile
+%.o: %.S $(AHDRS) Makefile
        $(CC) $(AFLAGS) -c $< -o $@
 
 %.i: %.c $(HDRS) Makefile
        $(CPP) $(CFLAGS) $< -o $@
 
 # -std=gnu{89,99} gets confused by # as an end-of-line comment marker
-%.s: %.S $(HDRS) Makefile
+%.s: %.S $(AHDRS) Makefile
        $(CPP) $(AFLAGS) $< -o $@
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/domain.c     Thu Mar 08 14:39:52 2007 -0600
@@ -641,6 +641,31 @@ int arch_set_info_guest(
             }
 
             v->arch.guest_table = pagetable_from_pfn(cr3_pfn);
+
+#ifdef __x86_64__
+            if ( c.nat->ctrlreg[1] )
+            {
+                cr3_pfn = gmfn_to_mfn(d, xen_cr3_to_pfn(c.nat->ctrlreg[1]));
+
+                if ( !mfn_valid(cr3_pfn) ||
+                     (paging_mode_refcounts(d)
+                      ? !get_page(mfn_to_page(cr3_pfn), d)
+                      : !get_page_and_type(mfn_to_page(cr3_pfn), d,
+                                           PGT_base_page_table)) )
+                {
+                    cr3_pfn = pagetable_get_pfn(v->arch.guest_table);
+                    v->arch.guest_table = pagetable_null();
+                    if ( paging_mode_refcounts(d) )
+                        put_page(mfn_to_page(cr3_pfn));
+                    else
+                        put_page_and_type(mfn_to_page(cr3_pfn));
+                    destroy_gdt(v);
+                    return -EINVAL;
+                }
+
+                v->arch.guest_table_user = pagetable_from_pfn(cr3_pfn);
+            }
+#endif
         }
 #ifdef CONFIG_COMPAT
         else
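
The new CR1 branch in arch_set_info_guest mirrors the CR3 path's acquire-or-unwind discipline: take a (possibly typed) reference on the user page table, and if that fails, drop the reference already held on the kernel table, tear down the GDT, and return -EINVAL so the vcpu is left unchanged. The skeleton of that pattern, reduced to two hypothetical helpers:

    /* Acquire references on two resources; if the second acquisition
     * fails, release the first so the caller observes no net change.
     * get_ref()/put_ref() stand in for get_page{,_and_type}() and
     * put_page{,_and_type}() in the hunk above. */
    static int acquire_both(int (*get_ref)(unsigned long),
                            void (*put_ref)(unsigned long),
                            unsigned long kernel_pt, unsigned long user_pt)
    {
        if (!get_ref(kernel_pt))
            return -1;              /* nothing to unwind yet */
        if (user_pt && !get_ref(user_pt)) {
            put_ref(kernel_pt);     /* roll back before erroring out */
            return -1;
        }
        return 0;
    }
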
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/domain_build.c
--- a/xen/arch/x86/domain_build.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/domain_build.c       Thu Mar 08 14:39:52 2007 -0600
@@ -374,9 +374,6 @@ int construct_dom0(struct domain *d,
     if ( parms.f_required[0] /* Huh? -- kraxel */ )
             panic("Domain 0 requires an unsupported hypervisor feature.\n");
 
-    /* Align load address to 4MB boundary. */
-    v_start = parms.virt_base & ~((1UL<<22)-1);
-
     /*
      * Why do we need this? The number of page-table frames depends on the 
      * size of the bootstrap address space. But the size of the address space 
@@ -384,6 +381,7 @@ int construct_dom0(struct domain *d,
      * read-only). We have a pair of simultaneous equations in two unknowns, 
      * which we solve by exhaustive search.
      */
+    v_start          = parms.virt_base;
     vkern_start      = parms.virt_kstart;
     vkern_end        = parms.virt_kend;
     vinitrd_start    = round_pgup(vkern_end);
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/domctl.c
--- a/xen/arch/x86/domctl.c     Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/domctl.c     Thu Mar 08 14:39:52 2007 -0600
@@ -470,8 +470,15 @@ void arch_get_info_guest(struct vcpu *v,
         c(user_regs.eflags |= v->arch.iopl << 12);
 
         if ( !IS_COMPAT(v->domain) )
+        {
             c.nat->ctrlreg[3] = xen_pfn_to_cr3(
                 pagetable_get_pfn(v->arch.guest_table));
+#ifdef __x86_64__
+            if ( !pagetable_is_null(v->arch.guest_table_user) )
+                c.nat->ctrlreg[1] = xen_pfn_to_cr3(
+                    pagetable_get_pfn(v->arch.guest_table_user));
+#endif
+        }
 #ifdef CONFIG_COMPAT
         else
         {
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c    Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/hvm/hvm.c    Thu Mar 08 14:39:52 2007 -0600
@@ -161,7 +161,8 @@ int hvm_domain_initialise(struct domain 
     spin_lock_init(&d->arch.hvm_domain.buffered_io_lock);
     spin_lock_init(&d->arch.hvm_domain.irq_lock);
 
-    rc = paging_enable(d, PG_SH_enable|PG_refcounts|PG_translate|PG_external);
+    /* paging support will be determined inside paging.c */
+    rc = paging_enable(d, PG_refcounts|PG_translate|PG_external);
     if ( rc != 0 )
         return rc;
 
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/hvm/svm/emulate.c
--- a/xen/arch/x86/hvm/svm/emulate.c    Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/hvm/svm/emulate.c    Thu Mar 08 14:39:52 2007 -0600
@@ -24,8 +24,10 @@
 #include <asm/msr.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
+#include <asm/hvm/svm/svm.h>
 #include <asm/hvm/svm/vmcb.h>
 #include <asm/hvm/svm/emulate.h>
+
 
 extern int inst_copy_from_guest(unsigned char *buf, unsigned long guest_eip,
         int inst_len);
@@ -133,13 +135,15 @@ static inline unsigned long DECODE_GPR_V
 #define sib operand [1]
 
 
-unsigned long get_effective_addr_modrm64(struct vmcb_struct *vmcb, 
-        struct cpu_user_regs *regs, const u8 prefix, int inst_len,
-        const u8 *operand, u8 *size)
+unsigned long get_effective_addr_modrm64(struct cpu_user_regs *regs, 
+                                         const u8 prefix, int inst_len,
+                                         const u8 *operand, u8 *size)
 {
     unsigned long effective_addr = (unsigned long) -1;
     u8 length, modrm_mod, modrm_rm;
     u32 disp = 0;
+    struct vcpu *v = current;
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
 
     HVM_DBG_LOG(DBG_LEVEL_1, "get_effective_addr_modrm64(): prefix = %x, "
             "length = %d, operand[0,1] = %x %x.\n", prefix, *size, operand [0],
@@ -198,7 +202,7 @@ unsigned long get_effective_addr_modrm64
 
 #if __x86_64__
         /* 64-bit mode */
-        if (vmcb->cs.attr.fields.l && (vmcb->efer & EFER_LMA))
+        if (vmcb->cs.attr.fields.l && svm_long_mode_enabled(v))
             return vmcb->rip + inst_len + *size + disp;
 #endif
         return disp;
@@ -310,7 +314,7 @@ unsigned int decode_src_reg(u8 prefix, u
 }
 
 
-unsigned long svm_rip2pointer(struct vmcb_struct *vmcb)
+unsigned long svm_rip2pointer(struct vcpu *v)
 {
     /*
      * The following is subtle. Intuitively this code would be something like:
@@ -322,8 +326,9 @@ unsigned long svm_rip2pointer(struct vmc
      * %cs is updated, but fortunately, base contains the valid base address
      * no matter what kind of addressing is used.
      */
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     unsigned long p = vmcb->cs.base + vmcb->rip;
-    if (!(vmcb->cs.attr.fields.l && vmcb->efer & EFER_LMA))
+    if (!(vmcb->cs.attr.fields.l && svm_long_mode_enabled(v)))
         return (u32)p; /* mask to 32 bits */
     /* NB. Should mask to 16 bits if in real mode or 16-bit protected mode. */
     return p;
@@ -410,10 +415,11 @@ static const u8 *opc_bytes[INSTR_MAX_COU
  * The caller can either pass a NULL pointer to the guest_eip_buf, or a pointer
  * to enough bytes to satisfy the instruction including prefix bytes.
  */
-int __get_instruction_length_from_list(struct vmcb_struct *vmcb,
+int __get_instruction_length_from_list(struct vcpu *v,
         enum instruction_index *list, unsigned int list_count, 
         u8 *guest_eip_buf, enum instruction_index *match)
 {
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     unsigned int inst_len = 0;
     unsigned int i;
     unsigned int j;
@@ -429,7 +435,7 @@ int __get_instruction_length_from_list(s
     }
     else
     {
-        inst_copy_from_guest(buffer, svm_rip2pointer(vmcb), MAX_INST_LEN);
+        inst_copy_from_guest(buffer, svm_rip2pointer(v), MAX_INST_LEN);
         buf = buffer;
     }
 
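
Throughout emulate.c the changeset swaps "struct vmcb_struct *vmcb" parameters for "struct vcpu *v", re-deriving the VMCB inside each function. The point is that helpers such as svm_long_mode_enabled() need vcpu-level shadow state (the new cpu_shadow_efer), which the raw VMCB no longer exposes once the VMCB's EFER always carries EFER_SVME and deferred LME/LMA bits. The shape of the refactor, with deliberately reduced, hypothetical struct layouts:

    /* Reduced, hypothetical layouts for illustration only. */
    struct vmcb { unsigned long rip, cs_base; };
    struct vcpu { struct vmcb *vmcb; int lma_shadow; };

    /* Takes the vcpu rather than the vmcb, so it can consult both the
     * architectural state and the vcpu's shadowed EFER.LMA bit. */
    static unsigned long rip2pointer(struct vcpu *v)
    {
        struct vmcb *vmcb = v->vmcb;
        unsigned long p = vmcb->cs_base + vmcb->rip;

        /* Outside long mode the pointer is truncated to 32 bits. */
        return v->lma_shadow ? p : (unsigned long)(unsigned int)p;
    }
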
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/hvm/svm/svm.c
--- a/xen/arch/x86/hvm/svm/svm.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/hvm/svm/svm.c        Thu Mar 08 14:39:52 2007 -0600
@@ -49,6 +49,7 @@
 #include <public/sched.h>
 #include <asm/hvm/vpt.h>
 #include <asm/hvm/trace.h>
+#include <asm/hap.h>
 
 #define SVM_EXTRA_DEBUG
 
@@ -75,6 +76,10 @@ static void *root_vmcb[NR_CPUS] __read_m
 
 /* physical address of above for host VMSAVE/VMLOAD */
 u64 root_vmcb_pa[NR_CPUS] __read_mostly;
+
+/* hardware assisted paging bits */
+extern int opt_hap_enabled;
+extern int hap_capable_system;
 
 static inline void svm_inject_exception(struct vcpu *v, int trap, 
                                         int ev, int error_code)
@@ -148,31 +153,6 @@ static void svm_store_cpu_guest_regs(
     }
 }
 
-static int svm_paging_enabled(struct vcpu *v)
-{
-    unsigned long cr0;
-
-    cr0 = v->arch.hvm_svm.cpu_shadow_cr0;
-
-    return (cr0 & X86_CR0_PE) && (cr0 & X86_CR0_PG);
-}
-
-static int svm_pae_enabled(struct vcpu *v)
-{
-    unsigned long cr4;
-
-    if(!svm_paging_enabled(v))
-        return 0;
-
-    cr4 = v->arch.hvm_svm.cpu_shadow_cr4;
-
-    return (cr4 & X86_CR4_PAE);
-}
-
-static int svm_long_mode_enabled(struct vcpu *v)
-{
-    return test_bit(SVM_CPU_STATE_LMA_ENABLED, &v->arch.hvm_svm.cpu_state);
-}
 
 static inline int long_mode_do_msr_read(struct cpu_user_regs *regs)
 {
@@ -183,8 +163,7 @@ static inline int long_mode_do_msr_read(
     switch ((u32)regs->ecx)
     {
     case MSR_EFER:
-        msr_content = vmcb->efer;
-        msr_content &= ~EFER_SVME;
+        msr_content = v->arch.hvm_svm.cpu_shadow_efer;
         break;
 
 #ifdef __x86_64__
@@ -255,30 +234,54 @@ static inline int long_mode_do_msr_write
             goto gp_fault;
         }
 
+        /* 
+         * update the VMCB's EFER with the intended value along with
+         * that crucial EFER.SVME bit =)
+         */
+        vmcb->efer = msr_content | EFER_SVME;
+
 #ifdef __x86_64__
-        /* LME: 0 -> 1 */
-        if ( msr_content & EFER_LME &&
-             !test_bit(SVM_CPU_STATE_LME_ENABLED, &v->arch.hvm_svm.cpu_state))
+
+        /*
+         * Check for EFER.LME transitions from 0->1 or 1->0.  Do the
+         * sanity checks and then make sure that both EFER.LME and
+         * EFER.LMA are cleared. (EFER.LME can't be set in the vmcb
+         * until the guest also sets CR0.PG, since even if the guest has
+         * paging "disabled", the vmcb's CR0 always has PG set.)
+         */
+        if ( (msr_content & EFER_LME) && !svm_lme_is_set(v) )
         {
+            /* EFER.LME transition from 0 to 1 */
+            
             if ( svm_paging_enabled(v) ||
-                 !test_bit(SVM_CPU_STATE_PAE_ENABLED,
-                           &v->arch.hvm_svm.cpu_state) )
+                 !svm_cr4_pae_is_set(v) )
             {
                 gdprintk(XENLOG_WARNING, "Trying to set LME bit when "
                          "in paging mode or PAE bit is not set\n");
                 goto gp_fault;
             }
-            set_bit(SVM_CPU_STATE_LME_ENABLED, &v->arch.hvm_svm.cpu_state);
-        }
-
-        /* We have already recorded that we want LME, so it will be set 
-         * next time CR0 gets updated. So we clear that bit and continue.
-         */
-        if ((msr_content ^ vmcb->efer) & EFER_LME)
-            msr_content &= ~EFER_LME;  
-        /* No update for LME/LMA since it have no effect */
-#endif
-        vmcb->efer = msr_content | EFER_SVME;
+
+            vmcb->efer &= ~(EFER_LME | EFER_LMA);
+        }
+        else if ( !(msr_content & EFER_LME) && svm_lme_is_set(v) )
+        {
+            /* EFER.LME transition from 1 to 0 */
+            
+            if ( svm_paging_enabled(v) )
+            {
+                gdprintk(XENLOG_WARNING, 
+                         "Trying to clear EFER.LME while paging enabled\n");
+                goto gp_fault;
+            }
+
+            vmcb->efer &= ~(EFER_LME | EFER_LMA);
+        }
+
+#endif /* __x86_64__ */
+
+        /* update the guest EFER's shadow with the intended value */
+        v->arch.hvm_svm.cpu_shadow_efer = msr_content;
+
         break;
 
 #ifdef __x86_64__
@@ -468,22 +471,25 @@ int svm_vmcb_restore(struct vcpu *v, str
             c->cr4);
 #endif
 
-    if (!svm_paging_enabled(v)) {
+    if ( !svm_paging_enabled(v) ) 
+    {
         printk("%s: paging not enabled.", __func__);
         goto skip_cr3;
     }
 
-    if (c->cr3 == v->arch.hvm_svm.cpu_cr3) {
+    if ( c->cr3 == v->arch.hvm_svm.cpu_cr3 ) 
+    {
         /*
          * This is a simple TLB flush, implying the guest has
          * removed some translation or changed page attributes.
          * We simply invalidate the shadow.
          */
         mfn = gmfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT);
-        if (mfn != pagetable_get_pfn(v->arch.guest_table)) {
+        if ( mfn != pagetable_get_pfn(v->arch.guest_table) ) 
             goto bad_cr3;
-        }
-    } else {
+    } 
+    else 
+    {
         /*
          * If different, make a shadow. Check if the PDBR is valid
          * first.
@@ -491,9 +497,9 @@ int svm_vmcb_restore(struct vcpu *v, str
         HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 c->cr3 = %"PRIx64"", c->cr3);
         /* current!=vcpu as not called by arch_vmx_do_launch */
         mfn = gmfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT);
-        if( !mfn_valid(mfn) || !get_page(mfn_to_page(mfn), v->domain)) {
+        if( !mfn_valid(mfn) || !get_page(mfn_to_page(mfn), v->domain) ) 
             goto bad_cr3;
-        }
+
         old_base_mfn = pagetable_get_pfn(v->arch.guest_table);
         v->arch.guest_table = pagetable_from_pfn(mfn);
         if (old_base_mfn)
@@ -631,7 +637,7 @@ static int svm_guest_x86_mode(struct vcp
 {
     struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
 
-    if ( (vmcb->efer & EFER_LMA) && vmcb->cs.attr.fields.l )
+    if ( svm_long_mode_enabled(v) && vmcb->cs.attr.fields.l )
         return 8;
 
     if ( svm_realmode(v) )
@@ -681,7 +687,7 @@ static unsigned long svm_get_segment_bas
     int long_mode = 0;
 
 #ifdef __x86_64__
-    long_mode = vmcb->cs.attr.fields.l && (vmcb->efer & EFER_LMA);
+    long_mode = vmcb->cs.attr.fields.l && svm_long_mode_enabled(v);
 #endif
     switch ( seg )
     {
@@ -905,6 +911,10 @@ static void arch_svm_do_launch(struct vc
 {
     svm_do_launch(v);
 
+    if ( paging_mode_hap(v->domain) ) {
+        v->arch.hvm_svm.vmcb->h_cr3 = pagetable_get_paddr(v->domain->arch.phys_table);
+    }
+
     if ( v->vcpu_id != 0 )
     {
         cpu_user_regs_t *regs = &current->arch.guest_context.user_regs;
@@ -1011,6 +1021,21 @@ static struct hvm_function_table svm_fun
     .event_injection_faulted = svm_event_injection_faulted
 };
 
+void svm_npt_detect(void)
+{
+    u32 eax, ebx, ecx, edx;
+
+    /* check CPUID for nested paging support */
+    cpuid(0x8000000A, &eax, &ebx, &ecx, &edx);
+    if ( edx & 0x01 ) { /* nested paging */
+        hap_capable_system = 1;
+    }
+    else if ( opt_hap_enabled ) {
+        printk(" nested paging is not supported by this CPU.\n");
+        hap_capable_system = 0; /* no nested paging, we disable flag. */
+    }
+}
+
 int start_svm(void)
 {
     u32 eax, ecx, edx;
@@ -1041,6 +1066,8 @@ int start_svm(void)
     wrmsr(MSR_EFER, eax, edx);
     printk("AMD SVM Extension is enabled for cpu %d.\n", cpu );
 
+    svm_npt_detect();
+
     /* Initialize the HSA for this core */
     phys_hsa = (u64) virt_to_maddr(hsa[cpu]);
     phys_hsa_lo = (u32) phys_hsa;
@@ -1077,6 +1104,18 @@ void arch_svm_do_resume(struct vcpu *v)
     }
 }
 
+static int svm_do_nested_pgfault(paddr_t gpa, struct cpu_user_regs *regs)
+{
+    if (mmio_space(gpa)) {
+        handle_mmio(gpa);
+        return 1;
+    }
+
+    /* We should not reach here. Otherwise, the P2M table is not correct. */
+    return 0;
+}
+
+
 static int svm_do_page_fault(unsigned long va, struct cpu_user_regs *regs) 
 {
     HVM_DBG_LOG(DBG_LEVEL_VMMU, 
@@ -1114,7 +1153,7 @@ static void svm_do_general_protection_fa
         printk("Huh? We got a GP Fault with an invalid IDTR!\n");
         svm_dump_vmcb(__func__, vmcb);
         svm_dump_regs(__func__, regs);
-        svm_dump_inst(svm_rip2pointer(vmcb));
+        svm_dump_inst(svm_rip2pointer(v));
         domain_crash(v->domain);
         return;
     }
@@ -1209,7 +1248,7 @@ static void svm_vmexit_do_cpuid(struct v
     HVMTRACE_3D(CPUID, v, input,
                 ((uint64_t)eax << 32) | ebx, ((uint64_t)ecx << 32) | edx);
 
-    inst_len = __get_instruction_length(vmcb, INSTR_CPUID, NULL);
+    inst_len = __get_instruction_length(v, INSTR_CPUID, NULL);
     ASSERT(inst_len > 0);
     __update_guest_eip(vmcb, inst_len);
 }
@@ -1312,15 +1351,16 @@ static void svm_dr_access(struct vcpu *v
 }
 
 
-static void svm_get_prefix_info(
-    struct vmcb_struct *vmcb, 
-    unsigned int dir, svm_segment_register_t **seg, unsigned int *asize)
-{
+static void svm_get_prefix_info(struct vcpu *v, unsigned int dir, 
+                                svm_segment_register_t **seg, 
+                                unsigned int *asize)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     unsigned char inst[MAX_INST_LEN];
     int i;
 
     memset(inst, 0, MAX_INST_LEN);
-    if (inst_copy_from_guest(inst, svm_rip2pointer(vmcb), sizeof(inst)) 
+    if (inst_copy_from_guest(inst, svm_rip2pointer(v), sizeof(inst)) 
         != MAX_INST_LEN) 
     {
         gdprintk(XENLOG_ERR, "get guest instruction failed\n");
@@ -1400,7 +1440,7 @@ static inline int svm_get_io_address(
 
 #ifdef __x86_64__
     /* If we're in long mode, we shouldn't check the segment presence & limit 
*/
-    long_mode = vmcb->cs.attr.fields.l && vmcb->efer & EFER_LMA;
+    long_mode = vmcb->cs.attr.fields.l && svm_long_mode_enabled(v);
 #endif
 
     /* d field of cs.attr is 1 for 32-bit, 0 for 16 or 64 bit. 
@@ -1419,7 +1459,7 @@ static inline int svm_get_io_address(
         isize --;
 
     if (isize > 1) 
-        svm_get_prefix_info(vmcb, info.fields.type, &seg, &asize);
+        svm_get_prefix_info(v, info.fields.type, &seg, &asize);
 
     if (info.fields.type == IOREQ_WRITE)
     {
@@ -1702,6 +1742,52 @@ static void svm_io_instruction(struct vc
     }
 }
 
+static int npt_set_cr0(unsigned long value) 
+{
+    struct vcpu *v = current;
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+  
+    ASSERT(vmcb);
+
+    /* ET is reserved and should always be 1 */
+    value |= X86_CR0_ET;
+
+    /* Check whether the guest is about to turn on long mode. 
+     * If it is, set EFER.LME and EFER.LMA.  Update the shadow EFER.LMA
+     * bit too, so svm_long_mode_enabled() will work.
+     */
+    if ( (value & X86_CR0_PG) && svm_lme_is_set(v) &&
+         (vmcb->cr4 & X86_CR4_PAE) && (vmcb->cr0 & X86_CR0_PE) )
+    {
+        v->arch.hvm_svm.cpu_shadow_efer |= EFER_LMA;
+        vmcb->efer |= EFER_LMA | EFER_LME;
+    }
+
+    /* Whenever CR0.PG is cleared under long mode, LMA will be cleared 
+     * immediately. We emulate this process for svm_long_mode_enabled().
+     */
+    if ( (value & (X86_CR0_PE | X86_CR0_PG)) == X86_CR0_PE )
+    {
+        if ( svm_long_mode_enabled(v) )
+        {
+            v->arch.hvm_svm.cpu_shadow_efer &= ~EFER_LMA;
+        }
+    }
+    
+    vmcb->cr0 = value | X86_CR0_WP;
+    v->arch.hvm_svm.cpu_shadow_cr0 = value;
+
+    /* TS cleared? Then initialise FPU now. */
+    if ( !(value & X86_CR0_TS) ) {
+        setup_fpu(v);
+        vmcb->exception_intercepts &= ~EXCEPTION_BITMAP_NM;
+    }
+    
+    paging_update_paging_modes(v);
+    
+    return 1;
+}
+
 static int svm_set_cr0(unsigned long value)
 {
     struct vcpu *v = current;
@@ -1727,7 +1813,8 @@ static int svm_set_cr0(unsigned long val
 
     HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR0 value = %lx\n", value);
 
-    if ((value & X86_CR0_PE) && (value & X86_CR0_PG) && !paging_enabled) 
+    if ( ((value & (X86_CR0_PE | X86_CR0_PG)) == (X86_CR0_PE | X86_CR0_PG))
+         && !paging_enabled ) 
     {
         /* The guest CR3 must be pointing to the guest physical. */
         mfn = get_mfn_from_gpfn(v->arch.hvm_svm.cpu_cr3 >> PAGE_SHIFT);
@@ -1740,18 +1827,16 @@ static int svm_set_cr0(unsigned long val
         }
 
 #if defined(__x86_64__)
-        if (test_bit(SVM_CPU_STATE_LME_ENABLED, &v->arch.hvm_svm.cpu_state) 
-            && !test_bit(SVM_CPU_STATE_PAE_ENABLED, 
-                         &v->arch.hvm_svm.cpu_state))
+        if ( svm_lme_is_set(v) && !svm_cr4_pae_is_set(v) )
         {
             HVM_DBG_LOG(DBG_LEVEL_1, "Enable paging before PAE enable\n");
             svm_inject_exception(v, TRAP_gp_fault, 1, 0);
         }
 
-        if (test_bit(SVM_CPU_STATE_LME_ENABLED, &v->arch.hvm_svm.cpu_state))
+        if ( svm_lme_is_set(v) )
         {
             HVM_DBG_LOG(DBG_LEVEL_1, "Enable the Long mode\n");
-            set_bit(SVM_CPU_STATE_LMA_ENABLED, &v->arch.hvm_svm.cpu_state);
+            v->arch.hvm_svm.cpu_shadow_efer |= EFER_LMA;
             vmcb->efer |= EFER_LMA | EFER_LME;
         }
 #endif  /* __x86_64__ */
@@ -1790,14 +1875,93 @@ static int svm_set_cr0(unsigned long val
     {
         if ( svm_long_mode_enabled(v) )
         {
-            vmcb->efer &= ~EFER_LMA;
-            clear_bit(SVM_CPU_STATE_LMA_ENABLED, &v->arch.hvm_svm.cpu_state);
+            vmcb->efer &= ~(EFER_LME | EFER_LMA);
+            v->arch.hvm_svm.cpu_shadow_efer &= ~EFER_LMA;
         }
         /* we should take care of this kind of situation */
         paging_update_paging_modes(v);
     }
 
     return 1;
+}
+
+//
+// nested paging functions
+//
+
+static int npt_mov_to_cr(int gpreg, int cr, struct cpu_user_regs *regs)
+{  
+    unsigned long value;
+    struct vcpu *v = current;
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
+    struct vlapic *vlapic = vcpu_vlapic(v);
+
+    ASSERT(vmcb);
+
+    value = get_reg(gpreg, regs, vmcb);
+
+    switch (cr) {
+    case 0:
+        return npt_set_cr0(value);
+
+    case 3:
+        vmcb->cr3 = value;
+        v->arch.hvm_svm.cpu_cr3 = value;
+        break;
+
+    case 4: /* CR4 */
+        vmcb->cr4 = value;
+        v->arch.hvm_svm.cpu_shadow_cr4 = value;
+        paging_update_paging_modes(v);
+        break;
+
+    case 8:
+        vlapic_set_reg(vlapic, APIC_TASKPRI, ((value & 0x0F) << 4));
+        vmcb->vintr.fields.tpr = value & 0x0F;
+        break;
+
+    default:
+        gdprintk(XENLOG_ERR, "invalid cr: %d\n", cr);
+        domain_crash(v->domain);
+        return 0;
+    }
+    
+    return 1;
+}
+
+static void npt_mov_from_cr(int cr, int gp, struct cpu_user_regs *regs)
+{
+    unsigned long value = 0;
+    struct vcpu *v = current;
+    struct vmcb_struct *vmcb;
+    struct vlapic *vlapic = vcpu_vlapic(v);
+
+    vmcb = v->arch.hvm_svm.vmcb;
+    ASSERT(vmcb);
+
+    switch(cr) {
+    case 0:
+        value = (unsigned long) v->arch.hvm_svm.cpu_shadow_cr0;
+        break;
+    case 2:
+        value = vmcb->cr2;
+        break;
+    case 3:
+        value = (unsigned long) v->arch.hvm_svm.cpu_cr3;
+        break;
+    case 4:
+        value = (unsigned long) v->arch.hvm_svm.cpu_shadow_cr4;
+       break;
+    case 8:
+        value = (unsigned long)vlapic_get_reg(vlapic, APIC_TASKPRI);
+        value = (value & 0xF0) >> 4;
+        break;
+    default:
+        domain_crash(v->domain);
+        return;
+    }
+    
+    set_reg(gp, value, regs, vmcb);
 }
 
 /*
@@ -1848,12 +2012,6 @@ static void mov_from_cr(int cr, int gp, 
     set_reg(gp, value, regs, vmcb);
 
     HVM_DBG_LOG(DBG_LEVEL_VMMU, "mov_from_cr: CR%d, value = %lx,", cr, value);
-}
-
-
-static inline int svm_pgbit_test(struct vcpu *v)
-{
-    return v->arch.hvm_svm.cpu_shadow_cr0 & X86_CR0_PG;
 }
 
 
@@ -1933,7 +2091,6 @@ static int mov_to_cr(int gpreg, int cr, 
         old_cr = v->arch.hvm_svm.cpu_shadow_cr4;
         if ( value & X86_CR4_PAE && !(old_cr & X86_CR4_PAE) )
         {
-            set_bit(SVM_CPU_STATE_PAE_ENABLED, &v->arch.hvm_svm.cpu_state);
             if ( svm_pgbit_test(v) )
             {
                 /* The guest is a 32-bit PAE guest. */
@@ -1962,15 +2119,13 @@ static int mov_to_cr(int gpreg, int cr, 
                             v->arch.hvm_svm.cpu_cr3, mfn);
 #endif
             }
-        }
-        else if (value & X86_CR4_PAE) {
-            set_bit(SVM_CPU_STATE_PAE_ENABLED, &v->arch.hvm_svm.cpu_state);
-        } else {
-            if (test_bit(SVM_CPU_STATE_LMA_ENABLED,
-                         &v->arch.hvm_svm.cpu_state)) {
+        } 
+        else if ( !(value & X86_CR4_PAE) )
+        {
+            if ( svm_long_mode_enabled(v) )
+            {
                 svm_inject_exception(v, TRAP_gp_fault, 1, 0);
             }
-            clear_bit(SVM_CPU_STATE_PAE_ENABLED, &v->arch.hvm_svm.cpu_state);
         }
 
         v->arch.hvm_svm.cpu_shadow_cr4 = value;
@@ -2024,7 +2179,7 @@ static int svm_cr_access(struct vcpu *v,
 
     ASSERT(vmcb);
 
-    inst_copy_from_guest(buffer, svm_rip2pointer(vmcb), sizeof(buffer));
+    inst_copy_from_guest(buffer, svm_rip2pointer(v), sizeof(buffer));
 
     /* get index to first actual instruction byte - as we will need to know 
        where the prefix lives later on */
@@ -2033,12 +2188,12 @@ static int svm_cr_access(struct vcpu *v,
     if ( type == TYPE_MOV_TO_CR )
     {
         inst_len = __get_instruction_length_from_list(
-            vmcb, list_a, ARR_SIZE(list_a), &buffer[index], &match);
+            v, list_a, ARR_SIZE(list_a), &buffer[index], &match);
     }
     else /* type == TYPE_MOV_FROM_CR */
     {
         inst_len = __get_instruction_length_from_list(
-            vmcb, list_b, ARR_SIZE(list_b), &buffer[index], &match);
+            v, list_b, ARR_SIZE(list_b), &buffer[index], &match);
     }
 
     ASSERT(inst_len > 0);
@@ -2055,12 +2210,18 @@ static int svm_cr_access(struct vcpu *v,
     {
     case INSTR_MOV2CR:
         gpreg = decode_src_reg(prefix, buffer[index+2]);
-        result = mov_to_cr(gpreg, cr, regs);
+        if ( paging_mode_hap(v->domain) )
+            result = npt_mov_to_cr(gpreg, cr, regs);
+        else
+            result = mov_to_cr(gpreg, cr, regs);
         break;
 
     case INSTR_MOVCR2:
         gpreg = decode_src_reg(prefix, buffer[index+2]);
-        mov_from_cr(cr, gpreg, regs);
+        if ( paging_mode_hap(v->domain) )
+            npt_mov_from_cr(cr, gpreg, regs);
+        else
+            mov_from_cr(cr, gpreg, regs);
         break;
 
     case INSTR_CLTS:
@@ -2073,7 +2234,7 @@ static int svm_cr_access(struct vcpu *v,
 
     case INSTR_LMSW:
         if (svm_dbg_on)
-            svm_dump_inst(svm_rip2pointer(vmcb));
+            svm_dump_inst(svm_rip2pointer(v));
         
         gpreg = decode_src_reg(prefix, buffer[index+2]);
         value = get_reg(gpreg, regs, vmcb) & 0xF;
@@ -2087,12 +2248,15 @@ static int svm_cr_access(struct vcpu *v,
         if (svm_dbg_on)
             printk("CR0-LMSW CR0 - New value=%lx\n", value);
 
-        result = svm_set_cr0(value);
+        if ( paging_mode_hap(v->domain) )
+            result = npt_set_cr0(value);
+        else
+            result = svm_set_cr0(value);
         break;
 
     case INSTR_SMSW:
         if (svm_dbg_on)
-            svm_dump_inst(svm_rip2pointer(vmcb));
+            svm_dump_inst(svm_rip2pointer(v));
         value = v->arch.hvm_svm.cpu_shadow_cr0;
         gpreg = decode_src_reg(prefix, buffer[index+2]);
         set_reg(gpreg, value, regs, vmcb);
@@ -2168,7 +2332,7 @@ static inline void svm_do_msr_access(
         HVM_DBG_LOG(DBG_LEVEL_1, "returns: ecx=%x, eax=%lx, edx=%lx",
                     ecx, (unsigned long)regs->eax, (unsigned long)regs->edx);
 
-        inst_len = __get_instruction_length(vmcb, INSTR_RDMSR, NULL);
+        inst_len = __get_instruction_length(v, INSTR_RDMSR, NULL);
     }
     else
     {
@@ -2200,7 +2364,7 @@ static inline void svm_do_msr_access(
             break;
         }
 
-        inst_len = __get_instruction_length(vmcb, INSTR_WRMSR, NULL);
+        inst_len = __get_instruction_length(v, INSTR_WRMSR, NULL);
     }
 
     __update_guest_eip(vmcb, inst_len);
@@ -2223,8 +2387,9 @@ static inline void svm_vmexit_do_hlt(str
 }
 
 
-static void svm_vmexit_do_invd(struct vmcb_struct *vmcb)
-{
+static void svm_vmexit_do_invd(struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
     int  inst_len;
     
     /* Invalidate the cache - we can't really do that safely - maybe we should 
@@ -2237,7 +2402,7 @@ static void svm_vmexit_do_invd(struct vm
      */
     printk("INVD instruction intercepted - ignored\n");
     
-    inst_len = __get_instruction_length(vmcb, INSTR_INVD, NULL);
+    inst_len = __get_instruction_length(v, INSTR_INVD, NULL);
     __update_guest_eip(vmcb, inst_len);
 }    
         
@@ -2289,7 +2454,7 @@ void svm_handle_invlpg(const short invlp
      * Unknown how many bytes the invlpg instruction will take.  Use the
      * maximum instruction length here
      */
-    if (inst_copy_from_guest(opcode, svm_rip2pointer(vmcb), length) < length)
+    if (inst_copy_from_guest(opcode, svm_rip2pointer(v), length) < length)
     {
         gdprintk(XENLOG_ERR, "Error reading memory %d bytes\n", length);
         domain_crash(v->domain);
@@ -2298,7 +2463,7 @@ void svm_handle_invlpg(const short invlp
 
     if (invlpga)
     {
-        inst_len = __get_instruction_length(vmcb, INSTR_INVLPGA, opcode);
+        inst_len = __get_instruction_length(v, INSTR_INVLPGA, opcode);
         ASSERT(inst_len > 0);
         __update_guest_eip(vmcb, inst_len);
 
@@ -2312,7 +2477,7 @@ void svm_handle_invlpg(const short invlp
     {
         /* What about multiple prefix codes? */
         prefix = (is_prefix(opcode[0])?opcode[0]:0);
-        inst_len = __get_instruction_length(vmcb, INSTR_INVLPG, opcode);
+        inst_len = __get_instruction_length(v, INSTR_INVLPG, opcode);
         ASSERT(inst_len > 0);
 
         inst_len--;
@@ -2323,7 +2488,7 @@ void svm_handle_invlpg(const short invlp
          * displacement to get effective address and length in bytes.  Assume
          * the system in either 32- or 64-bit mode.
          */
-        g_vaddr = get_effective_addr_modrm64(vmcb, regs, prefix, inst_len,
+        g_vaddr = get_effective_addr_modrm64(regs, prefix, inst_len,
                                              &opcode[inst_len], &length);
 
         inst_len += length;
@@ -2369,7 +2534,11 @@ static int svm_do_vmmcall_reset_to_realm
 
     vmcb->cr4 = SVM_CR4_HOST_MASK;
     v->arch.hvm_svm.cpu_shadow_cr4 = 0;
-    clear_bit(SVM_CPU_STATE_PAE_ENABLED, &v->arch.hvm_svm.cpu_state);
+
+    if ( paging_mode_hap(v->domain) ) {
+        vmcb->cr0 = v->arch.hvm_svm.cpu_shadow_cr0;
+        vmcb->cr4 = v->arch.hvm_svm.cpu_shadow_cr4;
+    }
 
     /* This will jump to ROMBIOS */
     vmcb->rip = 0xFFF0;
@@ -2445,7 +2614,7 @@ static int svm_do_vmmcall(struct vcpu *v
     ASSERT(vmcb);
     ASSERT(regs);
 
-    inst_len = __get_instruction_length(vmcb, INSTR_VMCALL, NULL);
+    inst_len = __get_instruction_length(v, INSTR_VMCALL, NULL);
     ASSERT(inst_len > 0);
 
     HVMTRACE_1D(VMMCALL, v, regs->eax);
@@ -2855,7 +3024,7 @@ asmlinkage void svm_vmexit_handler(struc
 
             svm_dump_vmcb(__func__, vmcb);
             svm_dump_regs(__func__, regs);
-            svm_dump_inst(svm_rip2pointer(vmcb));
+            svm_dump_inst(svm_rip2pointer(v));
         }
 
 #if defined(__i386__)
@@ -2957,7 +3126,7 @@ asmlinkage void svm_vmexit_handler(struc
         /* Debug info to hopefully help debug WHY the guest double-faulted. */
         svm_dump_vmcb(__func__, vmcb);
         svm_dump_regs(__func__, regs);
-        svm_dump_inst(svm_rip2pointer(vmcb));
+        svm_dump_inst(svm_rip2pointer(v));
         svm_inject_exception(v, TRAP_double_fault, 1, 0);
         break;
 
@@ -2967,7 +3136,7 @@ asmlinkage void svm_vmexit_handler(struc
         break;
 
     case VMEXIT_INVD:
-        svm_vmexit_do_invd(vmcb);
+        svm_vmexit_do_invd(v);
         break;
 
     case VMEXIT_GDTR_WRITE:
@@ -3053,6 +3222,15 @@ asmlinkage void svm_vmexit_handler(struc
     case VMEXIT_SHUTDOWN:
         hvm_triple_fault();
         break;
+
+    case VMEXIT_NPF:
+    {
+        regs->error_code = vmcb->exitinfo1;
+        if ( !svm_do_nested_pgfault(vmcb->exitinfo2, regs) ) {
+            domain_crash(v->domain);
+        }
+        break;
+    }
 
     default:
     exit_and_crash:
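/* Aside: for VMEXIT_NPF the hardware reports a page-fault-style error
 * code in exitinfo1 and the faulting guest-physical address in
 * exitinfo2, which is what the new case above forwards to
 * svm_do_nested_pgfault().  The sketch below only illustrates that
 * plumbing; handle_p2m_miss() is a hypothetical stand-in for the real
 * p2m fixup, not an API from this patch. */

#include <stdio.h>

static int handle_p2m_miss(unsigned long gfn, unsigned long ec)
{
    printf("nested fault at gfn %#lx, error code %#lx\n", gfn, ec);
    return 1;                       /* pretend the mapping was repaired */
}

static int nested_fault_sketch(unsigned long gpa, unsigned long error_code)
{
    unsigned long gfn = gpa >> 12;  /* PAGE_SHIFT assumed to be 12 */
    /* Returning 0 would make the caller crash the domain, as above. */
    return handle_p2m_miss(gfn, error_code);
}

int main(void)
{
    return !nested_fault_sketch(0x12345678UL, 0x2UL);
}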
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/hvm/svm/vmcb.c
--- a/xen/arch/x86/hvm/svm/vmcb.c       Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/hvm/svm/vmcb.c       Thu Mar 08 14:39:52 2007 -0600
@@ -200,6 +200,13 @@ static int construct_vmcb(struct vcpu *v
     vmcb->cr3 = v->arch.hvm_vcpu.hw_cr3; 
 
     arch_svm->vmcb->exception_intercepts = MONITOR_DEFAULT_EXCEPTION_BITMAP;
+
+    if ( paging_mode_hap(v->domain) ) {
+        vmcb->cr0 = arch_svm->cpu_shadow_cr0;
+        vmcb->np_enable = 1; /* enable nested paging */
+        vmcb->g_pat = 0x0007040600070406ULL; /* guest PAT */
+        vmcb->exception_intercepts &= ~EXCEPTION_BITMAP_PG;
+    }
 
     return 0;
 }
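/* Aside: the g_pat value programmed above appears to match the
 * architectural power-on default of the PAT MSR (PA0=WB, PA1=WT,
 * PA2=UC-, PA3=UC, mirrored in PA4-PA7) -- an observation, not something
 * the patch itself states.  A small standalone decoder to check the
 * encoding: */

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t pat = 0x0007040600070406ULL;           /* value used above */
    static const char *type[8] =
        { "UC", "WC", "rsvd", "rsvd", "WT", "WP", "WB", "UC-" };

    for (int i = 0; i < 8; i++)                     /* one byte per entry */
        printf("PA%d = %s\n", i, type[(pat >> (i * 8)) & 0x7]);
    return 0;
}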
@@ -310,7 +317,8 @@ void svm_dump_vmcb(const char *from, str
     printk("KernGSBase = 0x%016llx PAT = 0x%016llx \n", 
            (unsigned long long) vmcb->kerngsbase,
            (unsigned long long) vmcb->g_pat);
-    
+    printk("H_CR3 = 0x%016llx\n", (unsigned long long)vmcb->h_cr3);
+
     /* print out all the selectors */
     svm_dump_sel("CS", &vmcb->cs);
     svm_dump_sel("DS", &vmcb->ds);
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm.c
--- a/xen/arch/x86/mm.c Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm.c Thu Mar 08 14:39:52 2007 -0600
@@ -424,7 +424,10 @@ void invalidate_shadow_ldt(struct vcpu *
     }
 
     /* Dispose of the (now possibly invalid) mappings from the TLB.  */
-    queue_deferred_ops(v->domain, DOP_FLUSH_TLB | DOP_RELOAD_LDT);
+    if ( v == current )
+        queue_deferred_ops(v->domain, DOP_FLUSH_TLB | DOP_RELOAD_LDT);
+    else
+        flush_tlb_mask(v->domain->domain_dirty_cpumask);
 }
 
 
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/Makefile
--- a/xen/arch/x86/mm/Makefile  Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm/Makefile  Thu Mar 08 14:39:52 2007 -0600
@@ -1,4 +1,5 @@ subdir-y += shadow
 subdir-y += shadow
+subdir-y += hap
 
 obj-y += paging.o
 obj-y += p2m.o
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/hap/Makefile
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/mm/hap/Makefile      Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,2 @@
+obj-y += hap.o
+obj-y += support.o
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/hap/hap.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/mm/hap/hap.c Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,708 @@
+/******************************************************************************
+ * arch/x86/mm/hap/hap.c
+ *
+ * hardware assisted paging
+ * Copyright (c) 2007 Advanced Micro Devices (Wei Huang)
+ * Parts of this code are Copyright (c) 2007 by XenSource Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <xen/config.h>
+#include <xen/types.h>
+#include <xen/mm.h>
+#include <xen/trace.h>
+#include <xen/sched.h>
+#include <xen/perfc.h>
+#include <xen/irq.h>
+#include <xen/domain_page.h>
+#include <xen/guest_access.h>
+#include <xen/keyhandler.h>
+#include <asm/event.h>
+#include <asm/page.h>
+#include <asm/current.h>
+#include <asm/flushtlb.h>
+#include <asm/shared.h>
+#include <asm/hap.h>
+#include <asm/paging.h>
+#include <asm/domain.h>
+
+#include "private.h"
+
+/* Override macros from asm/page.h to make them work with mfn_t */
+#undef mfn_to_page
+#define mfn_to_page(_m) (frame_table + mfn_x(_m))
+#undef mfn_valid
+#define mfn_valid(_mfn) (mfn_x(_mfn) < max_page)
+#undef page_to_mfn
+#define page_to_mfn(_pg) (_mfn((_pg) - frame_table))
+
+/************************************************/
+/*             HAP SUPPORT FUNCTIONS            */
+/************************************************/
+mfn_t hap_alloc(struct domain *d, unsigned long backpointer)
+{
+    struct page_info *sp = NULL;
+    void *p;
+
+    ASSERT(hap_locked_by_me(d));
+
+    sp = list_entry(d->arch.paging.hap.freelists.next, struct page_info, list);
+    list_del(&sp->list);
+    d->arch.paging.hap.free_pages -= 1;
+
+    /* Now safe to clear the page for reuse */
+    p = hap_map_domain_page(page_to_mfn(sp));
+    ASSERT(p != NULL);
+    clear_page(p);
+    hap_unmap_domain_page(p);
+
+    return page_to_mfn(sp);
+}
+
+void hap_free(struct domain *d, mfn_t smfn)
+{
+    struct page_info *sp = mfn_to_page(smfn); 
+
+    ASSERT(hap_locked_by_me(d));
+
+    d->arch.paging.hap.free_pages += 1;
+    list_add_tail(&sp->list, &d->arch.paging.hap.freelists);
+}
+
+static int hap_alloc_p2m_pages(struct domain *d)
+{
+    struct page_info *pg;
+
+    ASSERT(hap_locked_by_me(d));
+
+    pg = mfn_to_page(hap_alloc(d, 0));
+    d->arch.paging.hap.p2m_pages += 1;
+    d->arch.paging.hap.total_pages -= 1;
+    
+    page_set_owner(pg, d);
+    pg->count_info = 1;
+    list_add_tail(&pg->list, &d->arch.paging.hap.p2m_freelist);
+
+    return 1;
+}
+
+struct page_info * hap_alloc_p2m_page(struct domain *d)
+{
+    struct list_head *entry;
+    struct page_info *pg;
+    mfn_t mfn;
+    void *p;
+
+    hap_lock(d);
+    
+    if ( list_empty(&d->arch.paging.hap.p2m_freelist) && 
+         !hap_alloc_p2m_pages(d) ) {
+        hap_unlock(d);
+        return NULL;
+    }
+    entry = d->arch.paging.hap.p2m_freelist.next;
+    list_del(entry);
+    
+    hap_unlock(d);
+
+    pg = list_entry(entry, struct page_info, list);
+    mfn = page_to_mfn(pg);
+    p = hap_map_domain_page(mfn);
+    clear_page(p);
+    hap_unmap_domain_page(p);
+
+    return pg;
+}
+
+void hap_free_p2m_page(struct domain *d, struct page_info *pg)
+{
+    ASSERT(page_get_owner(pg) == d);
+    /* Should have just the one ref we gave it in alloc_p2m_page() */
+    if ( (pg->count_info & PGC_count_mask) != 1 ) {
+        HAP_ERROR("Odd p2m page count c=%#x t=%"PRtype_info"\n",
+                  pg->count_info, pg->u.inuse.type_info);
+    }
+    /* Free should not decrement domain's total allocation, since 
+     * these pages were allocated without an owner. */
+    page_set_owner(pg, NULL); 
+    free_domheap_pages(pg, 0);
+    d->arch.paging.hap.p2m_pages--;
+}
+
+/* Return the size of the pool, rounded up to the nearest MB */
+static unsigned int
+hap_get_allocation(struct domain *d)
+{
+    unsigned int pg = d->arch.paging.hap.total_pages;
+
+    HERE_I_AM;
+    return ((pg >> (20 - PAGE_SHIFT))
+            + ((pg & ((1 << (20 - PAGE_SHIFT)) - 1)) ? 1 : 0));
+}
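/* Aside: with 4 KiB pages (PAGE_SHIFT == 12), 20 - PAGE_SHIFT == 8, so a
 * MiB is 256 pages; the expression above divides by 256 and rounds any
 * remainder up.  A standalone check of the same arithmetic, with
 * PAGE_SHIFT hard-coded to 12 here by assumption: */

#include <assert.h>

static unsigned int pages_to_mb(unsigned int pg)
{
    const int shift = 20 - 12;
    return (pg >> shift) + ((pg & ((1u << shift) - 1)) ? 1 : 0);
}

int main(void)
{
    assert(pages_to_mb(0)   == 0);
    assert(pages_to_mb(256) == 1);      /* exactly one MiB */
    assert(pages_to_mb(257) == 2);      /* partial MiB rounds up */
    return 0;
}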
+
+/* Set the pool of pages to the required number of pages.
+ * Returns 0 for success, non-zero for failure. */
+static unsigned int
+hap_set_allocation(struct domain *d, unsigned int pages, int *preempted)
+{
+    struct page_info *sp;
+
+    ASSERT(hap_locked_by_me(d));
+
+    while ( d->arch.paging.hap.total_pages != pages ) {
+        if ( d->arch.paging.hap.total_pages < pages ) {
+            /* Need to allocate more memory from domheap */
+            sp = alloc_domheap_pages(NULL, 0, 0);
+            if ( sp == NULL ) {
+                HAP_PRINTK("failed to allocate hap pages.\n");
+                return -ENOMEM;
+            }
+            d->arch.paging.hap.free_pages += 1;
+            d->arch.paging.hap.total_pages += 1;
+            list_add_tail(&sp->list, &d->arch.paging.hap.freelists);
+        }
+        else if ( d->arch.paging.hap.total_pages > pages ) {
+            /* Need to return memory to domheap */
+            ASSERT(!list_empty(&d->arch.paging.hap.freelists));
+            sp = list_entry(d->arch.paging.hap.freelists.next,
+                            struct page_info, list);
+            list_del(&sp->list);
+            d->arch.paging.hap.free_pages -= 1;
+            d->arch.paging.hap.total_pages -= 1;
+            free_domheap_pages(sp, 0);
+        }
+        
+        /* Check to see if we need to yield and try again */
+        if ( preempted && hypercall_preempt_check() ) {
+            *preempted = 1;
+            return 0;
+        }
+    }
+
+    return 0;
+}
+
+#if CONFIG_PAGING_LEVELS == 4
+void hap_install_xen_entries_in_l4(struct vcpu *v, mfn_t gl4mfn, mfn_t sl4mfn)
+{
+    struct domain *d = v->domain;
+    l4_pgentry_t *sl4e;
+
+    sl4e = hap_map_domain_page(sl4mfn);
+    ASSERT(sl4e != NULL);
+
+    /* Copy the common Xen mappings from the idle domain */
+    memcpy(&sl4e[ROOT_PAGETABLE_FIRST_XEN_SLOT],
+           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
+           ROOT_PAGETABLE_XEN_SLOTS * sizeof(l4_pgentry_t));
+
+    /* Install the per-domain mappings for this domain */
+    sl4e[l4_table_offset(PERDOMAIN_VIRT_START)] =
+        l4e_from_pfn(mfn_x(page_to_mfn(virt_to_page(d->arch.mm_perdomain_l3))),
+                     __PAGE_HYPERVISOR);
+
+    sl4e[l4_table_offset(LINEAR_PT_VIRT_START)] =
+        l4e_from_pfn(mfn_x(gl4mfn), __PAGE_HYPERVISOR);
+
+    /* install domain-specific P2M table */
+    sl4e[l4_table_offset(RO_MPT_VIRT_START)] =
+        l4e_from_pfn(mfn_x(pagetable_get_mfn(d->arch.phys_table)),
+                     __PAGE_HYPERVISOR);
+
+    hap_unmap_domain_page(sl4e);
+}
+#endif /* CONFIG_PAGING_LEVELS == 4 */
+
+#if CONFIG_PAGING_LEVELS == 3
+void hap_install_xen_entries_in_l2h(struct vcpu *v, mfn_t sl2hmfn)
+{
+    struct domain *d = v->domain;
+    l2_pgentry_t *sl2e;
+
+    int i;
+
+    sl2e = hap_map_domain_page(sl2hmfn);
+    ASSERT(sl2e != NULL);
+    
+    /* Copy the common Xen mappings from the idle domain */
+    memcpy(&sl2e[L2_PAGETABLE_FIRST_XEN_SLOT & (L2_PAGETABLE_ENTRIES-1)],
+           &idle_pg_table_l2[L2_PAGETABLE_FIRST_XEN_SLOT],
+           L2_PAGETABLE_XEN_SLOTS * sizeof(l2_pgentry_t));
+
+    /* Install the per-domain mappings for this domain */
+    for ( i = 0; i < PDPT_L2_ENTRIES; i++ )
+        sl2e[l2_table_offset(PERDOMAIN_VIRT_START) + i] =
+            l2e_from_pfn(
+                         mfn_x(page_to_mfn(virt_to_page(d->arch.mm_perdomain_pt) + i)),
+                         __PAGE_HYPERVISOR);
+    
+    for ( i = 0; i < HAP_L3_PAGETABLE_ENTRIES; i++ )
+        sl2e[l2_table_offset(LINEAR_PT_VIRT_START) + i] =
+            l2e_empty();
+
+    if ( paging_mode_translate(d) )
+    {
+        /* Install the domain-specific p2m table */
+        l3_pgentry_t *p2m;
+        ASSERT(pagetable_get_pfn(d->arch.phys_table) != 0);
+        p2m = hap_map_domain_page(pagetable_get_mfn(d->arch.phys_table));
+        for ( i = 0; i < MACHPHYS_MBYTES>>1; i++ )
+        {
+            sl2e[l2_table_offset(RO_MPT_VIRT_START) + i] =
+                (l3e_get_flags(p2m[i]) & _PAGE_PRESENT)
+                ? l2e_from_pfn(mfn_x(_mfn(l3e_get_pfn(p2m[i]))),
+                                      __PAGE_HYPERVISOR)
+                : l2e_empty();
+        }
+        hap_unmap_domain_page(p2m);
+    }
+
+    hap_unmap_domain_page(sl2e);
+}
+#endif
+
+#if CONFIG_PAGING_LEVELS == 2
+void hap_install_xen_entries_in_l2(struct vcpu *v, mfn_t gl2mfn, mfn_t sl2mfn)
+{
+    struct domain *d = v->domain;
+    l2_pgentry_t *sl2e;
+    int i;
+
+    sl2e = hap_map_domain_page(sl2mfn);
+    ASSERT(sl2e != NULL);
+    
+    /* Copy the common Xen mappings from the idle domain */
+    memcpy(&sl2e[L2_PAGETABLE_FIRST_XEN_SLOT],
+           &idle_pg_table[L2_PAGETABLE_FIRST_XEN_SLOT],
+           L2_PAGETABLE_XEN_SLOTS * sizeof(l2_pgentry_t));
+
+    /* Install the per-domain mappings for this domain */
+    for ( i = 0; i < PDPT_L2_ENTRIES; i++ )
+        sl2e[l2_table_offset(PERDOMAIN_VIRT_START) + i] =
+            l2e_from_pfn(
+                mfn_x(page_to_mfn(virt_to_page(d->arch.mm_perdomain_pt) + i)),
+                __PAGE_HYPERVISOR);
+
+
+    sl2e[l2_table_offset(LINEAR_PT_VIRT_START)] =
+        l2e_from_pfn(mfn_x(gl2mfn), __PAGE_HYPERVISOR);
+
+    /* install domain-specific P2M table */
+    sl2e[l2_table_offset(RO_MPT_VIRT_START)] =
+        l2e_from_pfn(mfn_x(pagetable_get_mfn(d->arch.phys_table)),
+                            __PAGE_HYPERVISOR);
+
+    hap_unmap_domain_page(sl2e);
+}
+#endif
+
+mfn_t hap_make_monitor_table(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+
+    ASSERT(pagetable_get_pfn(v->arch.monitor_table) == 0);
+
+#if CONFIG_PAGING_LEVELS == 4
+    {
+        mfn_t m4mfn;
+        m4mfn = hap_alloc(d, 0);
+        hap_install_xen_entries_in_l4(v, m4mfn, m4mfn);
+        return m4mfn;
+    }
+#elif CONFIG_PAGING_LEVELS == 3
+    {
+        mfn_t m3mfn, m2mfn; 
+        l3_pgentry_t *l3e;
+        l2_pgentry_t *l2e;
+        int i;
+
+        m3mfn = hap_alloc(d, 0);
+
+        /* Install a monitor l2 table in slot 3 of the l3 table.
+         * This is used for all Xen entries, including linear maps
+         */
+        m2mfn = hap_alloc(d, 0);
+        l3e = hap_map_domain_page(m3mfn);
+        l3e[3] = l3e_from_pfn(mfn_x(m2mfn), _PAGE_PRESENT);
+        hap_install_xen_entries_in_l2h(v, m2mfn);
+        /* Install the monitor's own linear map */
+        l2e = hap_map_domain_page(m2mfn);
+        for ( i = 0; i < L3_PAGETABLE_ENTRIES; i++ )
+            l2e[l2_table_offset(LINEAR_PT_VIRT_START) + i] =
+                (l3e_get_flags(l3e[i]) & _PAGE_PRESENT) 
+                ? l2e_from_pfn(l3e_get_pfn(l3e[i]), __PAGE_HYPERVISOR) 
+                : l2e_empty();
+        hap_unmap_domain_page(l2e);
+        hap_unmap_domain_page(l3e);
+
+        HAP_PRINTK("new monitor table: %#lx\n", mfn_x(m3mfn));
+        return m3mfn;
+    }
+#else
+    {
+        mfn_t m2mfn;
+        
+        m2mfn = hap_alloc(d, 0);
+        hap_install_xen_entries_in_l2(v, m2mfn, m2mfn);
+    
+        return m2mfn;
+    }
+#endif
+}
+
+void hap_destroy_monitor_table(struct vcpu* v, mfn_t mmfn)
+{
+    struct domain *d = v->domain;
+
+#if CONFIG_PAGING_LEVELS == 4
+    /* Need to destroy the l3 monitor page in slot 0 too */
+    {
+        mfn_t m3mfn;
+        l4_pgentry_t *l4e = hap_map_domain_page(mmfn);
+        ASSERT(l4e_get_flags(l4e[0]) & _PAGE_PRESENT);
+        m3mfn = _mfn(l4e_get_pfn(l4e[0]));
+        hap_free(d, m3mfn);
+        hap_unmap_domain_page(l4e);
+    }
+#elif CONFIG_PAGING_LEVELS == 3
+    /* Need to destroy the l2 monitor page in slot 4 too */
+    {
+        l3_pgentry_t *l3e = hap_map_domain_page(mmfn);
+        ASSERT(l3e_get_flags(l3e[3]) & _PAGE_PRESENT);
+        hap_free(d, _mfn(l3e_get_pfn(l3e[3])));
+        hap_unmap_domain_page(l3e);
+    }
+#endif
+
+    /* Put the memory back in the pool */
+    hap_free(d, mmfn);
+}
+
+/************************************************/
+/*          HAP DOMAIN LEVEL FUNCTIONS          */
+/************************************************/
+void hap_domain_init(struct domain *d)
+{
+    hap_lock_init(d);
+    INIT_LIST_HEAD(&d->arch.paging.hap.freelists);
+    INIT_LIST_HEAD(&d->arch.paging.hap.p2m_freelist);
+}
+
+/* return 0 for success, -errno for failure */
+int hap_enable(struct domain *d, u32 mode)
+{
+    unsigned int old_pages;
+    int rv = 0;
+
+    HERE_I_AM;
+
+    domain_pause(d);
+    /* error check */
+    if ( d == current->domain ) {
+        rv = -EINVAL;
+        goto out;
+    }
+
+    old_pages = d->arch.paging.hap.total_pages;
+    if ( old_pages == 0 ) {
+        unsigned int r;
+        hap_lock(d);
+        r = hap_set_allocation(d, 256, NULL);
+        hap_unlock(d);
+        if ( r != 0 ) {
+            hap_set_allocation(d, 0, NULL);
+            rv = -ENOMEM;
+            goto out;
+        }
+    }
+
+    /* allocate P2m table */
+    if ( mode & PG_translate ) {
+        rv = p2m_alloc_table(d, hap_alloc_p2m_page, hap_free_p2m_page);
+        if ( rv != 0 )
+            goto out;
+    }
+
+    d->arch.paging.mode = mode | PG_SH_enable;
+
+ out:
+    domain_unpause(d);
+    return rv;
+}
+
+void hap_final_teardown(struct domain *d)
+{
+    HERE_I_AM;
+
+    if ( d->arch.paging.hap.total_pages != 0 )
+        hap_teardown(d);
+
+    p2m_teardown(d);
+}
+
+void hap_teardown(struct domain *d)
+{
+    struct vcpu *v;
+    mfn_t mfn;
+    HERE_I_AM;
+
+    ASSERT(test_bit(_DOMF_dying, &d->domain_flags));
+    ASSERT(d != current->domain);
+
+    if ( !hap_locked_by_me(d) )
+        hap_lock(d); /* Keep various asserts happy */
+
+    if ( paging_mode_enabled(d) ) {
+        /* release the monitor table held by each vcpu */
+        for_each_vcpu(d, v) {
+            if ( v->arch.paging.mode && paging_mode_external(d) ) {
+                mfn = pagetable_get_mfn(v->arch.monitor_table);
+                if ( mfn_valid(mfn) && (mfn_x(mfn) != 0) )
+                    hap_destroy_monitor_table(v, mfn);
+                v->arch.monitor_table = pagetable_null();
+            }
+        }
+    }
+
+    if ( d->arch.paging.hap.total_pages != 0 ) {
+        HAP_PRINTK("teardown of domain %u starts."
+                      "  pages total = %u, free = %u, p2m=%u\n",
+                      d->domain_id,
+                      d->arch.paging.hap.total_pages,
+                      d->arch.paging.hap.free_pages,
+                      d->arch.paging.hap.p2m_pages);
+        hap_set_allocation(d, 0, NULL);
+        HAP_PRINTK("teardown done."
+                      "  pages total = %u, free = %u, p2m=%u\n",
+                      d->arch.paging.hap.total_pages,
+                      d->arch.paging.hap.free_pages,
+                      d->arch.paging.hap.p2m_pages);
+        ASSERT(d->arch.paging.hap.total_pages == 0);
+    }
+    
+    d->arch.paging.mode &= ~PG_log_dirty;
+
+    hap_unlock(d);
+}
+
+int hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
+               XEN_GUEST_HANDLE(void) u_domctl)
+{
+    int rc, preempted = 0;
+
+    HERE_I_AM;
+
+    if ( unlikely(d == current->domain) ) {
+        gdprintk(XENLOG_INFO, "Don't try to do a hap op on yourself!\n");
+        return -EINVAL;
+    }
+    
+    switch ( sc->op ) {
+    case XEN_DOMCTL_SHADOW_OP_OFF:
+    case XEN_DOMCTL_SHADOW_OP_ENABLE_TEST:
+    case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
+    case XEN_DOMCTL_SHADOW_OP_ENABLE_TRANSLATE:
+    case XEN_DOMCTL_SHADOW_OP_CLEAN:
+    case XEN_DOMCTL_SHADOW_OP_PEEK:
+    case XEN_DOMCTL_SHADOW_OP_ENABLE:
+        HAP_ERROR("Bad hap domctl op %u\n", sc->op);
+        domain_crash(d);
+        return -EINVAL;
+    case XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION:
+        hap_lock(d);
+        rc = hap_set_allocation(d, sc->mb << (20 - PAGE_SHIFT), &preempted);
+        hap_unlock(d);
+        if ( preempted )
+            /* Not finished.  Set up to re-run the call. */
+            rc = hypercall_create_continuation(__HYPERVISOR_domctl, "h", 
+                                               u_domctl);
+        else
+            /* Finished.  Return the new allocation */
+            sc->mb = hap_get_allocation(d);
+        return rc;
+    case XEN_DOMCTL_SHADOW_OP_GET_ALLOCATION:
+        sc->mb = hap_get_allocation(d);
+        return 0;
+    default:
+        HAP_ERROR("Bad hap domctl op %u\n", sc->op);
+        return -EINVAL;
+    }
+}
+
+void hap_vcpu_init(struct vcpu *v)
+{
+    v->arch.paging.mode = &hap_paging_real_mode;
+}
+/************************************************/
+/*          HAP PAGING MODE FUNCTIONS           */
+/************************************************/
+/* In theory, HAP should not intercept guest page faults. This function
+ * can be recycled to handle host/nested page faults, if needed.
+ */
+int hap_page_fault(struct vcpu *v, unsigned long va, 
+                   struct cpu_user_regs *regs)
+{
+    HERE_I_AM;
+    domain_crash(v->domain);
+    return 0;
+}
+
+/* Called when the guest issues an invlpg request.
+ * Return 1 if a page invalidation needs to be issued on the CPU;
+ * return 0 if it does not.
+ */
+int hap_invlpg(struct vcpu *v, unsigned long va)
+{
+    HERE_I_AM;
+    return 0;
+}
+
+void hap_update_cr3(struct vcpu *v, int do_locking)
+{
+    struct domain *d = v->domain;
+    mfn_t gmfn;
+
+    HERE_I_AM;
+    /* Don't do anything on an uninitialised vcpu */
+    if ( !is_hvm_domain(d) && !test_bit(_VCPUF_initialised, &v->vcpu_flags) ) {
+        ASSERT(v->arch.cr3 == 0);
+        return;
+    }
+
+    if ( do_locking )
+        hap_lock(v->domain);
+    
+    ASSERT(hap_locked_by_me(v->domain));
+    ASSERT(v->arch.paging.mode);
+    
+    gmfn = pagetable_get_mfn(v->arch.guest_table);
+
+    make_cr3(v, pagetable_get_pfn(v->arch.monitor_table));
+    
+    hvm_update_guest_cr3(v, pagetable_get_paddr(v->arch.monitor_table));
+
+    HAP_PRINTK("d=%u v=%u guest_table=%05lx, monitor_table = %05lx\n", 
+               d->domain_id, v->vcpu_id, 
+               (unsigned long)pagetable_get_pfn(v->arch.guest_table),
+               (unsigned long)pagetable_get_pfn(v->arch.monitor_table));
+
+    flush_tlb_mask(d->domain_dirty_cpumask);
+
+    if ( do_locking )
+        hap_unlock(v->domain);
+}
+
+void hap_update_paging_modes(struct vcpu *v)
+{
+    struct domain *d;
+
+    HERE_I_AM;
+
+    d = v->domain;
+    hap_lock(d);
+
+    /* Update the guest paging mode. Note that we rely on hvm functions to
+     * detect the guest's paging mode, so make sure the shadow registers
+     * (CR0, CR4, EFER) reflect the guest's state correctly.
+     */
+    if ( hvm_paging_enabled(v) ) {
+        if ( hvm_long_mode_enabled(v) )
+            v->arch.paging.mode = &hap_paging_long_mode;
+        else if ( hvm_pae_enabled(v) )
+            v->arch.paging.mode = &hap_paging_pae_mode;
+        else
+            v->arch.paging.mode = &hap_paging_protected_mode;
+    }
+    else {
+        v->arch.paging.mode = &hap_paging_real_mode;
+    }
+
+    v->arch.paging.translate_enabled = !!hvm_paging_enabled(v);    
+
+    /* use p2m map */
+    v->arch.guest_table =
+        pagetable_from_pfn(pagetable_get_pfn(d->arch.phys_table));
+
+    if ( pagetable_is_null(v->arch.monitor_table) ) {
+        mfn_t mmfn = hap_make_monitor_table(v);
+        v->arch.monitor_table = pagetable_from_mfn(mmfn);
+        make_cr3(v, mfn_x(mmfn));
+    }
+
+    flush_tlb_mask(d->domain_dirty_cpumask);
+    hap_unlock(d);
+}
+
+void 
+hap_write_p2m_entry(struct vcpu *v, unsigned long gfn, l1_pgentry_t *p,
+                    l1_pgentry_t new, unsigned int level)
+{
+    hap_lock(v->domain);
+    safe_write_pte(p, new);
+    hap_unlock(v->domain);
+}
+
+/* Entry points into this mode of the hap code. */
+struct paging_mode hap_paging_real_mode = {
+    .page_fault             = hap_page_fault, 
+    .invlpg                 = hap_invlpg,
+    .gva_to_gfn             = hap_gva_to_gfn_real_mode,
+    .update_cr3             = hap_update_cr3,
+    .update_paging_modes    = hap_update_paging_modes,
+    .write_p2m_entry        = hap_write_p2m_entry,
+    .guest_levels           = 1
+};
+
+struct paging_mode hap_paging_protected_mode = {
+    .page_fault             = hap_page_fault, 
+    .invlpg                 = hap_invlpg,
+    .gva_to_gfn             = hap_gva_to_gfn_protected_mode,
+    .update_cr3             = hap_update_cr3,
+    .update_paging_modes    = hap_update_paging_modes,
+    .write_p2m_entry        = hap_write_p2m_entry,
+    .guest_levels           = 2
+};
+
+struct paging_mode hap_paging_pae_mode = {
+    .page_fault             = hap_page_fault, 
+    .invlpg                 = hap_invlpg,
+    .gva_to_gfn             = hap_gva_to_gfn_pae_mode,
+    .update_cr3             = hap_update_cr3,
+    .update_paging_modes    = hap_update_paging_modes,
+    .write_p2m_entry        = hap_write_p2m_entry,
+    .guest_levels           = 3
+};
+
+struct paging_mode hap_paging_long_mode = {
+    .page_fault             = hap_page_fault, 
+    .invlpg                 = hap_invlpg,
+    .gva_to_gfn             = hap_gva_to_gfn_long_mode,
+    .update_cr3             = hap_update_cr3,
+    .update_paging_modes    = hap_update_paging_modes,
+    .write_p2m_entry        = hap_write_p2m_entry,
+    .guest_levels           = 4
+};
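/* Aside: callers never pick one of these four tables directly; they go
 * through v->arch.paging.mode, so a guest-mode switch in
 * hap_update_paging_modes() retargets every later call.  A miniature
 * model of that dispatch (structure and names abbreviated and
 * hypothetical): */

#include <stddef.h>
#include <stdio.h>

struct vcpu;                            /* opaque in this sketch */

struct paging_mode_sketch {
    unsigned long (*gva_to_gfn)(struct vcpu *v, unsigned long gva);
    int guest_levels;
};

static unsigned long real_mode_xlate(struct vcpu *v, unsigned long gva)
{
    (void)v;
    return gva >> 12;                   /* as in hap_gva_to_gfn_real_mode */
}

static const struct paging_mode_sketch mode_table = {
    .gva_to_gfn   = real_mode_xlate,
    .guest_levels = 1,
};

int main(void)
{
    printf("gfn = %#lx\n", mode_table.gva_to_gfn(NULL, 0x1234000UL));
    return 0;
}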
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
+
+
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/hap/private.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/mm/hap/private.h     Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,112 @@
+/*
+ * arch/x86/mm/hap/private.h
+ *
+ * Copyright (c) 2007, AMD Corporation (Wei Huang)
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+#ifndef __HAP_PRIVATE_H__
+#define __HAP_PRIVATE_H__
+
+#include <asm/flushtlb.h>
+#include <asm/hvm/support.h>
+
+/********************************************/
+/*          GUEST TRANSLATION FUNCS         */
+/********************************************/
+unsigned long hap_gva_to_gfn_real_mode(struct vcpu *v, unsigned long gva);
+unsigned long hap_gva_to_gfn_protected_mode(struct vcpu *v, unsigned long gva);
+unsigned long hap_gva_to_gfn_pae_mode(struct vcpu *v, unsigned long gva);
+unsigned long hap_gva_to_gfn_long_mode(struct vcpu *v, unsigned long gva);
+/********************************************/
+/*            MISC DEFINITIONS              */
+/********************************************/
+
+/* PT_SHIFT describes the amount by which a virtual address is shifted right
+ * to right-justify the portion to be used for indexing into a page
+ * table, given the guest memory model (i.e. number of levels) and the level
+ * of the page table being accessed. The idea is from Virtual Iron's code.
+ */
+static const int PT_SHIFT[][5] =
+  {   /*     ------  level ------           nr_levels  */
+    /*         1     2     3     4                   */
+    {    0,    0,    0,    0,    0},   /* 0 not used */
+    {    0,    0,    0,    0,    0},   /* 1 not used */
+    {    0,   12,   22,    0,    0},   /* 2  */
+    {    0,   12,   21,   30,    0},   /* 3  */
+    {    0,   12,   21,   30,   39}    /* 4  */
+  };
+
+/* PT_ENTRIES describes the number of entries in a page table, given the
+ * memory model (i.e. number of levels) and the level of the page table
+ * being considered. This idea is from Virtual Iron's shadow code. */
+static const int PT_ENTRIES[][5] =
+  {   /*     ------  level ------           nr_levels  */
+    /*         1     2     3     4                   */
+    {    0,    0,    0,    0,    0},   /* 0 not used */
+    {    0,    0,    0,    0,    0},   /* 1 not used */
+    {    0, 1024, 1024,    0,    0},   /* 2  */
+    {    0,  512,  512,    4,    0},   /* 3  */
+    {    0,  512,  512,  512,  512}    /* 4  */
+  };
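/* Aside: the walkers in support.c index each level's page table with
 *   index = (gva >> PT_SHIFT[mode][lev]) & (PT_ENTRIES[mode][lev] - 1);
 * Below is a standalone copy of the two tables with a quick check of
 * that computation (the main() driver is illustrative only): */

#include <assert.h>

static const int pt_shift[5][5] = {
    {0}, {0}, {0, 12, 22, 0, 0}, {0, 12, 21, 30, 0}, {0, 12, 21, 30, 39}
};
static const int pt_entries[5][5] = {
    {0}, {0}, {0, 1024, 1024, 0, 0}, {0, 512, 512, 4, 0},
    {0, 512, 512, 512, 512}
};

static int pt_index(int mode, int lev, unsigned long gva)
{
    return (gva >> pt_shift[mode][lev]) & (pt_entries[mode][lev] - 1);
}

int main(void)
{
    /* Two-level guest, gva 0xC0100000: L2 slot 0x300, L1 slot 0x100. */
    assert(pt_index(2, 2, 0xC0100000UL) == 0x300);
    assert(pt_index(2, 1, 0xC0100000UL) == 0x100);
    return 0;
}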
+
+/********************************************/
+/*       PAGING DEFINITION FOR GUEST        */
+/********************************************/
+#define PHYSICAL_PAGE_4K_SIZE (1UL << 12)
+#define PHYSICAL_PAGE_2M_SIZE (1UL << 21)
+#define PHYSICAL_PAGE_4M_SIZE (1UL << 22)
+#define PHYSICAL_PAGE_4K_MASK ( ~(PHYSICAL_PAGE_4K_SIZE - 1) )
+#define PHYSICAL_PAGE_2M_MASK ( ~(PHYSICAL_PAGE_2M_SIZE - 1) )
+#define PHYSICAL_PAGE_4M_MASK ( ~(PHYSICAL_PAGE_4M_SIZE - 1) )
+
+/* long mode physical address mask */
+#define PHYSICAL_ADDR_BITS_LM    52
+#define PHYSICAL_ADDR_MASK_LM    ((1UL << PHYSICAL_ADDR_BITS_LM)-1)
+#define PHYSICAL_ADDR_2M_MASK_LM (PHYSICAL_PAGE_2M_MASK & PHYSICAL_ADDR_MASK_LM)
+#define PHYSICAL_ADDR_4K_MASK_LM (PHYSICAL_PAGE_4K_MASK & PHYSICAL_ADDR_MASK_LM)
+
+#define PAGE_NX_BIT      (1ULL << 63)
+/************************************************/
+/*        PAGETABLE RELATED VARIABLES           */
+/************************************************/
+#if CONFIG_PAGING_LEVELS == 2
+#define HAP_L1_PAGETABLE_ENTRIES    1024
+#define HAP_L2_PAGETABLE_ENTRIES    1024
+#define HAP_L1_PAGETABLE_SHIFT        12
+#define HAP_L2_PAGETABLE_SHIFT        22
+#endif
+
+#if CONFIG_PAGING_LEVELS == 3
+#define HAP_L1_PAGETABLE_ENTRIES     512
+#define HAP_L2_PAGETABLE_ENTRIES     512
+#define HAP_L3_PAGETABLE_ENTRIES       4
+#define HAP_L1_PAGETABLE_SHIFT        12
+#define HAP_L2_PAGETABLE_SHIFT        21
+#define HAP_L3_PAGETABLE_SHIFT        30
+#endif
+
+#if CONFIG_PAGING_LEVELS == 4
+#define HAP_L1_PAGETABLE_ENTRIES     512
+#define HAP_L2_PAGETABLE_ENTRIES     512
+#define HAP_L3_PAGETABLE_ENTRIES     512
+#define HAP_L4_PAGETABLE_ENTRIES     512
+#define HAP_L1_PAGETABLE_SHIFT        12
+#define HAP_L2_PAGETABLE_SHIFT        21
+#define HAP_L3_PAGETABLE_SHIFT        30
+#define HAP_L4_PAGETABLE_SHIFT        39
+#endif
+
+#endif /* __HAP_PRIVATE_H__ */
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/hap/support.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/mm/hap/support.c     Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,334 @@
+/*
+ * arch/x86/mm/hap/support.c
+ * 
+ * guest page table walker
+ * Copyright (c) 2007, AMD Corporation (Wei Huang)
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <xen/config.h>
+#include <xen/types.h>
+#include <xen/mm.h>
+#include <xen/domain_page.h>
+#include <asm/page.h>
+#include <xen/event.h>
+#include <xen/sched.h>
+#include <asm/hvm/svm/vmcb.h>
+#include <asm/domain.h>
+#include <asm/shadow.h>
+#include <asm/hap.h>
+
+#include "private.h"
+#include "../page-guest32.h"
+
+/*******************************************/
+/*      Platform Specific Functions        */
+/*******************************************/
+
+/* Translate a guest virtual address to a guest physical address.
+ * Specifically for real-mode guests.
+ */
+unsigned long hap_gva_to_gfn_real_mode(struct vcpu *v, unsigned long gva)
+{
+    HERE_I_AM;
+    return ((paddr_t)gva >> PAGE_SHIFT);
+}
+
+/* Translate a guest virtual address to a guest physical address.
+ * Specifically for protected-mode guests.
+ */
+unsigned long hap_gva_to_gfn_protected_mode(struct vcpu *v, unsigned long gva)
+{
+    unsigned long gcr3 = hvm_get_guest_ctrl_reg(v, 3);
+    int mode = 2; /* two-level guest */
+    int lev, index;
+    paddr_t gpa = 0;
+    unsigned long gpfn, mfn;
+    int success = 1;
+    l2_pgentry_32_t *l2e; /* guest page entry size is 32-bit */
+    l1_pgentry_32_t *l1e;
+
+    HERE_I_AM;
+
+    gpfn = (gcr3 >> PAGE_SHIFT);
+    for ( lev = mode; lev >= 1; lev-- ) {
+        mfn = get_mfn_from_gpfn( gpfn );
+        if ( mfn == INVALID_MFN ) {
+            HAP_PRINTK("bad pfn=0x%lx from gva=0x%lx at lev%d\n", gpfn, gva, 
+                       lev);
+            success = 0;
+            break;
+        }
+        index = (gva >> PT_SHIFT[mode][lev]) & (PT_ENTRIES[mode][lev]-1);
+
+        if ( lev == 2 ) {
+            l2e = map_domain_page( mfn );
+            HAP_PRINTK("l2 page table entry is %ulx at index = %d\n", 
+                       l2e[index].l2, index);
+            if ( !(l2e_get_flags_32(l2e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 2 entry not present at index = %d\n", index);
+                success = 0;
+            }
+
+            if ( l2e_get_flags_32(l2e[index]) & _PAGE_PSE ) { /* handle PSE */
+                HAP_PRINTK("guest page table is PSE\n");
+                if ( l2e_get_intpte(l2e[index]) & 0x001FE000UL ) { /*[13:20] */
+                    printk("guest physical memory size is too large!\n");
+                    domain_crash(v->domain);
+                }
+                gpa = (l2e_get_intpte(l2e[index]) & PHYSICAL_PAGE_4M_MASK) + 
+                    (gva & ~PHYSICAL_PAGE_4M_MASK);
+                unmap_domain_page(l2e);
+                break; /* last level page table, return from here */
+            }
+            else {
+                gpfn = l2e_get_pfn( l2e[index] );
+            }
+            unmap_domain_page(l2e);
+        }
+
+        if ( lev == 1 ) {
+            l1e = map_domain_page( mfn );
+            HAP_PRINTK("l1 page table entry is %ulx at index = %d\n", 
+                       l1e[index].l1, index);
+            if ( !(l1e_get_flags_32(l1e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 1 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l1e_get_pfn( l1e[index] );
+            gpa = (l1e_get_intpte(l1e[index]) & PHYSICAL_PAGE_4K_MASK) + 
+                (gva & ~PHYSICAL_PAGE_4K_MASK);            
+            unmap_domain_page(l1e);
+        }
+
+        if ( !success ) /* error happened, jump out */
+            break;
+    }
+
+    HAP_PRINTK("success = %d, gva = %lx, gpa = %lx\n", success, gva, gpa);
+
+    if ( !success ) /* error happened */
+        return INVALID_GFN;
+    else
+        return ((paddr_t)gpa >> PAGE_SHIFT);
+}
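/* Aside: in the PSE branch above, a present 4MB superpage ends the walk
 * early and the guest physical address is composed from the PDE's 4MB
 * frame plus the low 22 bits of the virtual address.  A standalone check
 * of that composition; the mask value mirrors PHYSICAL_PAGE_4M_MASK and
 * the PDE value is made up: */

#include <assert.h>
#include <stdint.h>

#define PSE_4M_MASK 0xFFC00000u

static uint32_t pse_4m_gpa(uint32_t pde, uint32_t gva)
{
    return (pde & PSE_4M_MASK) + (gva & ~PSE_4M_MASK);
}

int main(void)
{
    /* PDE mapping the 4MB frame at 0x00800000, flag bits in the low word */
    assert(pse_4m_gpa(0x00800083u, 0xC0123456u) == 0x00923456u);
    return 0;
}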
+
+
+
+/* Translate a guest virtual address to a guest physical address.
+ * Specifically for PAE-mode guests.
+ */
+unsigned long hap_gva_to_gfn_pae_mode(struct vcpu *v, unsigned long gva)
+{
+#if CONFIG_PAGING_LEVELS >= 3
+    unsigned long gcr3 = hvm_get_guest_ctrl_reg(v, 3);
+    int mode = 3; /* three-level guest */
+    int lev, index;
+    paddr_t gpa = 0;
+    unsigned long gpfn, mfn;
+    int success = 1;
+    l1_pgentry_t *l1e;
+    l2_pgentry_t *l2e;
+    l3_pgentry_t *l3e;
+    
+    HERE_I_AM;
+
+    gpfn = (gcr3 >> PAGE_SHIFT);
+    for ( lev = mode; lev >= 1; lev-- ) {
+        mfn = get_mfn_from_gpfn( gpfn );
+        if ( mfn == INVALID_MFN ) {
+            HAP_PRINTK("bad pfn=0x%lx from gva=0x%lx at lev%d\n", gpfn, gva, 
+                       lev);
+            success = 0;
+            break;
+        }
+        index = (gva >> PT_SHIFT[mode][lev]) & (PT_ENTRIES[mode][lev]-1);
+
+        if ( lev == 3 ) {
+            l3e = map_domain_page( mfn );
+            index += ( ((gcr3 >> 5 ) & 127 ) * 4 );
+            if ( !(l3e_get_flags(l3e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 3 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l3e_get_pfn( l3e[index] );
+            unmap_domain_page(l3e);
+        }
+
+        if ( lev == 2 ) {
+            l2e = map_domain_page( mfn );
+            if ( !(l2e_get_flags(l2e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 2 entry not present at index = %d\n", index);
+                success = 0;
+            }
+
+            if ( l2e_get_flags(l2e[index]) & _PAGE_PSE ) { /* handle PSE */
+                HAP_PRINTK("guest page table is PSE\n");
+                gpa = (l2e_get_intpte(l2e[index]) & PHYSICAL_PAGE_2M_MASK) + 
+                    (gva & ~PHYSICAL_PAGE_2M_MASK);
+                unmap_domain_page(l2e);
+                break; /* last level page table, jump out from here */
+            }
+            else { 
+                gpfn = l2e_get_pfn(l2e[index]);
+            }
+            unmap_domain_page(l2e);
+        }
+
+        if ( lev == 1 ) {
+            l1e = map_domain_page( mfn );
+            if ( !(l1e_get_flags(l1e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 1 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l1e_get_pfn( l1e[index] );
+            gpa = (l1e_get_intpte(l1e[index]) & PHYSICAL_PAGE_4K_MASK) + 
+                (gva & ~PHYSICAL_PAGE_4K_MASK);
+            unmap_domain_page(l1e);
+        }
+
+        if ( success != 1 ) /* error happened, jump out */
+            break;
+    }
+
+    gpa &= ~PAGE_NX_BIT; /* clear NX bit of guest physical address */
+    HAP_PRINTK("success = %d, gva = %lx, gpa = %lx\n", success, gva, gpa);
+
+    if ( !success )
+        return INVALID_GFN;
+    else
+        return ((paddr_t)gpa >> PAGE_SHIFT);
+#else
+    HERE_I_AM;
+    printk("guest paging level (3) is greater than host paging level!\n");
+    domain_crash(v->domain);
+    return INVALID_GFN;
+#endif
+}
+
+
+
+/* Translate a guest virtual address to a guest physical address.
+ * Specifically for long-mode guests.
+ */
+unsigned long hap_gva_to_gfn_long_mode(struct vcpu *v, unsigned long gva)
+{
+#if CONFIG_PAGING_LEVELS == 4
+    unsigned long gcr3 = hvm_get_guest_ctrl_reg(v, 3);
+    int mode = 4; /* four-level guest */
+    int lev, index;
+    paddr_t gpa = 0;
+    unsigned long gpfn, mfn;
+    int success = 1;
+    l4_pgentry_t *l4e;
+    l3_pgentry_t *l3e;
+    l2_pgentry_t *l2e;
+    l1_pgentry_t *l1e;
+
+    HERE_I_AM;
+
+    gpfn = (gcr3 >> PAGE_SHIFT);
+    for ( lev = mode; lev >= 1; lev-- ) {
+        mfn = get_mfn_from_gpfn( gpfn );
+        if ( mfn == INVALID_MFN ) {
+            HAP_PRINTK("bad pfn=0x%lx from gva=0x%lx at lev%d\n", gpfn, gva, 
+                       lev);
+            success = 0;
+            break;
+        }
+        index = (gva >> PT_SHIFT[mode][lev]) & (PT_ENTRIES[mode][lev]-1);
+
+        if ( lev == 4 ) {
+            l4e = map_domain_page( mfn );
+            if ( !(l4e_get_flags(l4e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 4 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l4e_get_pfn( l4e[index] );
+            unmap_domain_page(l4e);
+        }
+
+        if ( lev == 3 ) {
+            l3e = map_domain_page( mfn );
+            if ( !(l3e_get_flags(l3e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 3 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l3e_get_pfn( l3e[index] );
+            unmap_domain_page(l3e);
+        }
+
+        if ( lev == 2 ) {
+            l2e = map_domain_page( mfn );
+            if ( !(l2e_get_flags(l2e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 2 entry not present at index = %d\n", index);
+                success = 0;
+            }
+
+            if ( l2e_get_flags(l2e[index]) & _PAGE_PSE ) { /* handle PSE */
+                HAP_PRINTK("guest page table is PSE\n");
+                gpa = (l2e_get_intpte(l2e[index]) & PHYSICAL_ADDR_2M_MASK_LM) 
+                    + (gva & ~PHYSICAL_PAGE_2M_MASK);
+                unmap_domain_page(l2e);
+                break; /* last level page table, jump out from here */
+            }
+            else { 
+                gpfn = l2e_get_pfn(l2e[index]);
+            }
+            unmap_domain_page(l2e);
+        }
+
+        if ( lev == 1 ) {
+            l1e = map_domain_page( mfn );
+            if ( !(l1e_get_flags(l1e[index]) & _PAGE_PRESENT) ) {
+                HAP_PRINTK("Level 1 entry not present at index = %d\n", index);
+                success = 0;
+            }
+            gpfn = l1e_get_pfn( l1e[index] );
+            gpa = (l1e_get_intpte(l1e[index]) & PHYSICAL_ADDR_4K_MASK_LM) + 
+                (gva & ~PHYSICAL_PAGE_4K_MASK);
+            unmap_domain_page(l1e);
+        }
+
+        if ( success != 1 ) /* error happened, jump out */
+            break;
+    }
+
+    gpa &= ~PAGE_NX_BIT; /* clear NX bit of guest physical address */
+    HAP_PRINTK("success = %d, gva = %lx, gpa = %lx\n", success, gva, gpa);
+
+    if ( !success )
+        return INVALID_GFN;
+    else
+        return ((paddr_t)gpa >> PAGE_SHIFT);
+#else
+    HERE_I_AM;
+    printk("guest paging level (4) is greater than host paging level!\n");
+    domain_crash(v->domain);
+    return INVALID_GFN;
+#endif
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
+
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/page-guest32.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/arch/x86/mm/page-guest32.h    Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,100 @@
+
+#ifndef __X86_PAGE_GUEST_H__
+#define __X86_PAGE_GUEST_H__
+
+#ifndef __ASSEMBLY__
+# include <asm/types.h>
+#endif
+
+#define PAGETABLE_ORDER_32         10
+#define L1_PAGETABLE_ENTRIES_32    (1<<PAGETABLE_ORDER_32)
+#define L2_PAGETABLE_ENTRIES_32    (1<<PAGETABLE_ORDER_32)
+#define ROOT_PAGETABLE_ENTRIES_32  L2_PAGETABLE_ENTRIES_32
+
+
+#define L1_PAGETABLE_SHIFT_32 12
+#define L2_PAGETABLE_SHIFT_32 22
+
+/* Extract flags into 12-bit integer, or turn 12-bit flags into a pte mask. */
+
+#ifndef __ASSEMBLY__
+
+typedef u32 intpte_32_t;
+
+typedef struct { intpte_32_t l1; } l1_pgentry_32_t;
+typedef struct { intpte_32_t l2; } l2_pgentry_32_t;
+typedef l2_pgentry_t root_pgentry_32_t;
+#endif
+
+#define get_pte_flags_32(x) ((u32)(x) & 0xFFF)
+#define put_pte_flags_32(x) ((intpte_32_t)(x))
+
+/* Get pte access flags (unsigned int). */
+#define l1e_get_flags_32(x)           (get_pte_flags_32((x).l1))
+#define l2e_get_flags_32(x)           (get_pte_flags_32((x).l2))
+
+#define l1e_get_paddr_32(x)           \
+    ((paddr_t)(((x).l1 & (PADDR_MASK&PAGE_MASK))))
+#define l2e_get_paddr_32(x)           \
+    ((paddr_t)(((x).l2 & (PADDR_MASK&PAGE_MASK))))
+
+/* Construct an empty pte. */
+#define l1e_empty_32()                ((l1_pgentry_32_t) { 0 })
+#define l2e_empty_32()                ((l2_pgentry_32_t) { 0 })
+
+/* Construct a pte from a pfn and access flags. */
+#define l1e_from_pfn_32(pfn, flags)   \
+    ((l1_pgentry_32_t) { ((intpte_32_t)(pfn) << PAGE_SHIFT) | put_pte_flags_32(flags) })
+#define l2e_from_pfn_32(pfn, flags)   \
+    ((l2_pgentry_32_t) { ((intpte_32_t)(pfn) << PAGE_SHIFT) | put_pte_flags_32(flags) })
+
+/* Construct a pte from a physical address and access flags. */
+#ifndef __ASSEMBLY__
+static inline l1_pgentry_32_t l1e_from_paddr_32(paddr_t pa, unsigned int flags)
+{
+    ASSERT((pa & ~(PADDR_MASK & PAGE_MASK)) == 0);
+    return (l1_pgentry_32_t) { pa | put_pte_flags_32(flags) };
+}
+static inline l2_pgentry_32_t l2e_from_paddr_32(paddr_t pa, unsigned int flags)
+{
+    ASSERT((pa & ~(PADDR_MASK & PAGE_MASK)) == 0);
+    return (l2_pgentry_32_t) { pa | put_pte_flags_32(flags) };
+}
+#endif /* !__ASSEMBLY__ */
+
+
+/* Construct a pte from a page pointer and access flags. */
+#define l1e_from_page_32(page, flags) (l1e_from_pfn_32(page_to_mfn(page),(flags)))
+#define l2e_from_page_32(page, flags) (l2e_from_pfn_32(page_to_mfn(page),(flags)))
+
+/* Add extra flags to an existing pte. */
+#define l1e_add_flags_32(x, flags)    ((x).l1 |= put_pte_flags_32(flags))
+#define l2e_add_flags_32(x, flags)    ((x).l2 |= put_pte_flags_32(flags))
+
+/* Remove flags from an existing pte. */
+#define l1e_remove_flags_32(x, flags) ((x).l1 &= ~put_pte_flags_32(flags))
+#define l2e_remove_flags_32(x, flags) ((x).l2 &= ~put_pte_flags_32(flags))
+
+/* Check if a pte's page mapping or significant access flags have changed. */
+#define l1e_has_changed_32(x,y,flags) \
+    ( !!(((x).l1 ^ (y).l1) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags_32(flags))) )
+#define l2e_has_changed_32(x,y,flags) \
+    ( !!(((x).l2 ^ (y).l2) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags_32(flags))) )
+
+/* Given a virtual address, get an entry offset into a page table. */
+#define l1_table_offset_32(a)         \
+    (((a) >> L1_PAGETABLE_SHIFT_32) & (L1_PAGETABLE_ENTRIES_32 - 1))
+#define l2_table_offset_32(a)         \
+    (((a) >> L2_PAGETABLE_SHIFT_32) & (L2_PAGETABLE_ENTRIES_32 - 1))
+
+#endif /* __X86_PAGE_GUEST_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
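/* Aside: these _32 helpers let 64-bit (or PAE) Xen manipulate a 32-bit
 * guest's ptes without going through the host's own pte types.  A
 * self-contained miniature of the pfn+flags round trip, with PAGE_SHIFT
 * fixed at 12 by assumption and the types reduced to the essentials: */

#include <assert.h>
#include <stdint.h>

typedef uint32_t intpte_32_t;
typedef struct { intpte_32_t l1; } l1_pgentry_32_t;

#define DEMO_PAGE_SHIFT 12
#define put_pte_flags_32(x)  ((intpte_32_t)(x))
#define get_pte_flags_32(x)  ((uint32_t)(x) & 0xFFF)
#define l1e_from_pfn_32(pfn, flags) \
    ((l1_pgentry_32_t) { ((intpte_32_t)(pfn) << DEMO_PAGE_SHIFT) | \
                         put_pte_flags_32(flags) })
#define l1e_get_flags_32(x)  (get_pte_flags_32((x).l1))

int main(void)
{
    l1_pgentry_32_t e = l1e_from_pfn_32(0x1a2b, 0x063);
    assert(l1e_get_flags_32(e) == 0x063);            /* flags survive */
    assert((e.l1 >> DEMO_PAGE_SHIFT) == 0x1a2b);     /* so does the pfn */
    return 0;
}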
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/paging.c
--- a/xen/arch/x86/mm/paging.c  Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm/paging.c  Thu Mar 08 14:39:52 2007 -0600
@@ -24,10 +24,12 @@
 #include <asm/paging.h>
 #include <asm/shadow.h>
 #include <asm/p2m.h>
+#include <asm/hap.h>
 
 /* Xen command-line option to enable hardware-assisted paging */
 int opt_hap_enabled = 0; 
 boolean_param("hap", opt_hap_enabled);
+int hap_capable_system = 0;
 
 /* Printouts */
 #define PAGING_PRINTK(_f, _a...)                                     \
@@ -46,12 +48,18 @@ void paging_domain_init(struct domain *d
 {
     p2m_init(d);
     shadow_domain_init(d);
+
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_domain(d) )
+        hap_domain_init(d);
 }
 
 /* vcpu paging struct initialization goes here */
 void paging_vcpu_init(struct vcpu *v)
 {
-    shadow_vcpu_init(v);
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_vcpu(v) )
+        hap_vcpu_init(v);
+    else
+        shadow_vcpu_init(v);
 }
 
 
@@ -59,32 +67,38 @@ int paging_domctl(struct domain *d, xen_
                   XEN_GUEST_HANDLE(void) u_domctl)
 {
     /* Here, dispatch domctl to the appropriate paging code */
-    return shadow_domctl(d, sc, u_domctl);
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_domain(d) )
+        return hap_domctl(d, sc, u_domctl);
+    else
+        return shadow_domctl(d, sc, u_domctl);
 }
 
 /* Call when destroying a domain */
 void paging_teardown(struct domain *d)
 {
-    shadow_teardown(d);
-    /* Call other modes' teardown code here */    
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_domain(d) )
+        hap_teardown(d);
+    else
+        shadow_teardown(d);
 }
 
 /* Call once all of the references to the domain have gone away */
 void paging_final_teardown(struct domain *d)
 {
-    shadow_teardown(d);
-    /* Call other modes' final teardown code here */
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_domain(d) )
+        hap_final_teardown(d);
+    else
+        shadow_final_teardown(d);
 }
 
 /* Enable an arbitrary paging-assistance mode.  Call once at domain
  * creation. */
 int paging_enable(struct domain *d, u32 mode)
 {
-    if ( mode & PG_SH_enable ) 
-        return shadow_enable(d, mode);
+    if ( opt_hap_enabled && hap_capable_system && is_hvm_domain(d) )
+        return hap_enable(d, mode | PG_HAP_enable);
     else
-        /* No other modes supported yet */
-        return -EINVAL; 
+        return shadow_enable(d, mode | PG_SH_enable);
 }
 
 /* Print paging-assistance info to the console */
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/shadow/common.c
--- a/xen/arch/x86/mm/shadow/common.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm/shadow/common.c   Thu Mar 08 14:39:52 2007 -0600
@@ -2912,7 +2912,16 @@ void sh_mark_dirty(struct domain *d, mfn
      * can be called from __hvm_copy during emulation).
      * If the lock isn't held, take it for the duration of the call. */
     do_locking = !shadow_locked_by_me(d);
-    if ( do_locking ) shadow_lock(d);
+    if ( do_locking ) 
+    { 
+        shadow_lock(d);
+        /* Check the mode again with the lock held */ 
+        if ( unlikely(!shadow_mode_log_dirty(d)) )
+        {
+            shadow_unlock(d);
+            return;
+        }
+    }
 
     ASSERT(d->arch.paging.shadow.dirty_bitmap != NULL);
 
@@ -2968,8 +2977,16 @@ int shadow_domctl(struct domain *d,
 
     if ( unlikely(d == current->domain) )
     {
-        gdprintk(XENLOG_INFO, "Don't try to do a shadow op on yourself!\n");
+        gdprintk(XENLOG_INFO, "Dom %u tried to do a shadow op on itself.\n",
+                 d->domain_id);
         return -EINVAL;
+    }
+
+    if ( unlikely(test_bit(_DOMF_dying, &d->domain_flags)) )
+    {
+        gdprintk(XENLOG_INFO, "Ignoring shadow op on dying domain %u\n",
+                 d->domain_id);
+        return 0;
     }
 
     switch ( sc->op )
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/shadow/page-guest32.h
--- a/xen/arch/x86/mm/shadow/page-guest32.h     Mon Mar 05 12:49:12 2007 -0600
+++ /dev/null   Thu Jan 01 00:00:00 1970 +0000
@@ -1,100 +0,0 @@
-
-#ifndef __X86_PAGE_GUEST_H__
-#define __X86_PAGE_GUEST_H__
-
-#ifndef __ASSEMBLY__
-# include <asm/types.h>
-#endif
-
-#define PAGETABLE_ORDER_32         10
-#define L1_PAGETABLE_ENTRIES_32    (1<<PAGETABLE_ORDER_32)
-#define L2_PAGETABLE_ENTRIES_32    (1<<PAGETABLE_ORDER_32)
-#define ROOT_PAGETABLE_ENTRIES_32  L2_PAGETABLE_ENTRIES_32
-
-
-#define L1_PAGETABLE_SHIFT_32 12
-#define L2_PAGETABLE_SHIFT_32 22
-
-/* Extract flags into 12-bit integer, or turn 12-bit flags into a pte mask. */
-
-#ifndef __ASSEMBLY__
-
-typedef u32 intpte_32_t;
-
-typedef struct { intpte_32_t l1; } l1_pgentry_32_t;
-typedef struct { intpte_32_t l2; } l2_pgentry_32_t;
-typedef l2_pgentry_t root_pgentry_32_t;
-#endif
-
-#define get_pte_flags_32(x) ((u32)(x) & 0xFFF)
-#define put_pte_flags_32(x) ((intpte_32_t)(x))
-
-/* Get pte access flags (unsigned int). */
-#define l1e_get_flags_32(x)           (get_pte_flags_32((x).l1))
-#define l2e_get_flags_32(x)           (get_pte_flags_32((x).l2))
-
-#define l1e_get_paddr_32(x)           \
-    ((paddr_t)(((x).l1 & (PADDR_MASK&PAGE_MASK))))
-#define l2e_get_paddr_32(x)           \
-    ((paddr_t)(((x).l2 & (PADDR_MASK&PAGE_MASK))))
-
-/* Construct an empty pte. */
-#define l1e_empty_32()                ((l1_pgentry_32_t) { 0 })
-#define l2e_empty_32()                ((l2_pgentry_32_t) { 0 })
-
-/* Construct a pte from a pfn and access flags. */
-#define l1e_from_pfn_32(pfn, flags)   \
-    ((l1_pgentry_32_t) { ((intpte_32_t)(pfn) << PAGE_SHIFT) | put_pte_flags_32(flags) })
-#define l2e_from_pfn_32(pfn, flags)   \
-    ((l2_pgentry_32_t) { ((intpte_32_t)(pfn) << PAGE_SHIFT) | put_pte_flags_32(flags) })
-
-/* Construct a pte from a physical address and access flags. */
-#ifndef __ASSEMBLY__
-static inline l1_pgentry_32_t l1e_from_paddr_32(paddr_t pa, unsigned int flags)
-{
-    ASSERT((pa & ~(PADDR_MASK & PAGE_MASK)) == 0);
-    return (l1_pgentry_32_t) { pa | put_pte_flags_32(flags) };
-}
-static inline l2_pgentry_32_t l2e_from_paddr_32(paddr_t pa, unsigned int flags)
-{
-    ASSERT((pa & ~(PADDR_MASK & PAGE_MASK)) == 0);
-    return (l2_pgentry_32_t) { pa | put_pte_flags_32(flags) };
-}
-#endif /* !__ASSEMBLY__ */
-
-
-/* Construct a pte from a page pointer and access flags. */
-#define l1e_from_page_32(page, flags) (l1e_from_pfn_32(page_to_mfn(page),(flags)))
-#define l2e_from_page_32(page, flags) (l2e_from_pfn_32(page_to_mfn(page),(flags)))
-
-/* Add extra flags to an existing pte. */
-#define l1e_add_flags_32(x, flags)    ((x).l1 |= put_pte_flags_32(flags))
-#define l2e_add_flags_32(x, flags)    ((x).l2 |= put_pte_flags_32(flags))
-
-/* Remove flags from an existing pte. */
-#define l1e_remove_flags_32(x, flags) ((x).l1 &= ~put_pte_flags_32(flags))
-#define l2e_remove_flags_32(x, flags) ((x).l2 &= ~put_pte_flags_32(flags))
-
-/* Check if a pte's page mapping or significant access flags have changed. */
-#define l1e_has_changed_32(x,y,flags) \
-    ( !!(((x).l1 ^ (y).l1) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags_32(flags))) )
-#define l2e_has_changed_32(x,y,flags) \
-    ( !!(((x).l2 ^ (y).l2) & ((PADDR_MASK&PAGE_MASK)|put_pte_flags_32(flags))) )
-
-/* Given a virtual address, get an entry offset into a page table. */
-#define l1_table_offset_32(a)         \
-    (((a) >> L1_PAGETABLE_SHIFT_32) & (L1_PAGETABLE_ENTRIES_32 - 1))
-#define l2_table_offset_32(a)         \
-    (((a) >> L2_PAGETABLE_SHIFT_32) & (L2_PAGETABLE_ENTRIES_32 - 1))
-
-#endif /* __X86_PAGE_GUEST_H__ */
-
-/*
- * Local variables:
- * mode: C
- * c-set-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/shadow/private.h
--- a/xen/arch/x86/mm/shadow/private.h  Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm/shadow/private.h  Thu Mar 08 14:39:52 2007 -0600
@@ -539,7 +539,7 @@ static inline int sh_get_ref(struct vcpu
 
     /* We remember the first shadow entry that points to each shadow. */
     if ( entry_pa != 0 
-         && sh_type_is_pinnable(v, sp->type) 
+         && !sh_type_is_pinnable(v, sp->type) 
          && sp->up == 0 ) 
         sp->up = entry_pa;
     
@@ -559,7 +559,7 @@ static inline void sh_put_ref(struct vcp
 
     /* If this is the entry in the up-pointer, remove it */
     if ( entry_pa != 0 
-         && sh_type_is_pinnable(v, sp->type) 
+         && !sh_type_is_pinnable(v, sp->type) 
          && sp->up == entry_pa ) 
         sp->up = 0;
 
diff -r 8f0b5295bb1b -r dcec453681bc xen/arch/x86/mm/shadow/types.h
--- a/xen/arch/x86/mm/shadow/types.h    Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/arch/x86/mm/shadow/types.h    Thu Mar 08 14:39:52 2007 -0600
@@ -235,7 +235,7 @@ static inline shadow_l4e_t shadow_l4e_fr
 
 #if GUEST_PAGING_LEVELS == 2
 
-#include "page-guest32.h"
+#include "../page-guest32.h"
 
 #define GUEST_L1_PAGETABLE_ENTRIES     1024
 #define GUEST_L2_PAGETABLE_ENTRIES     1024
diff -r 8f0b5295bb1b -r dcec453681bc xen/common/event_channel.c
--- a/xen/common/event_channel.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/common/event_channel.c        Thu Mar 08 14:39:52 2007 -0600
@@ -560,6 +560,9 @@ void send_guest_global_virq(struct domai
 
     ASSERT(virq_is_global(virq));
 
+    if ( unlikely(d == NULL) )
+        return;
+
     v = d->vcpu[0];
     if ( unlikely(v == NULL) )
         return;
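
The NULL-domain guard added above lets callers raise a global virq
without first checking that the target domain exists; the console.c
hunk below relies on this, since __putstr() now raises VIRQ_CON_RING on
every console write, including writes that can occur before dom0 has
been constructed (dom0 is presumably still NULL then, hence the guard).
A minimal standalone analogue of guarding once at the entry point
rather than at every call site:

    #include <stdio.h>

    struct domain { int has_vcpu0; };

    static void send_virq(struct domain *d)
    {
        if ( d == NULL )        /* caller may run before dom0 exists */
            return;
        if ( !d->has_vcpu0 )    /* mirrors the existing vcpu[0] check */
            return;
        puts("virq delivered");
    }

    int main(void)
    {
        struct domain d = { 1 };
        send_virq(NULL);        /* early boot: silently ignored */
        send_virq(&d);          /* normal case: delivered */
        return 0;
    }
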
diff -r 8f0b5295bb1b -r dcec453681bc xen/common/page_alloc.c
--- a/xen/common/page_alloc.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/common/page_alloc.c   Thu Mar 08 14:39:52 2007 -0600
@@ -49,7 +49,7 @@ string_param("badpage", opt_badpage);
  * Bit width of the DMA heap.
  */
 static unsigned int  dma_bitsize = CONFIG_DMA_BITSIZE;
-static unsigned long max_dma_mfn = (1UL << (CONFIG_DMA_BITSIZE - PAGE_SHIFT)) - 1;
+static unsigned long max_dma_mfn = (1UL<<(CONFIG_DMA_BITSIZE-PAGE_SHIFT))-1;
 static void parse_dma_bits(char *s)
 {
     unsigned int v = simple_strtol(s, NULL, 0);
@@ -339,11 +339,13 @@ static void init_heap_block(heap_by_zone
 
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
-    unsigned int zone_lo, unsigned zone_hi,
+    unsigned int zone_lo, unsigned int zone_hi,
     unsigned int cpu, unsigned int order)
 {
-    unsigned int i, j, node = cpu_to_node(cpu), num_nodes = num_online_nodes();
-    unsigned int zone, request = (1UL << order);
+    unsigned int i, j, zone;
+    unsigned int node = cpu_to_node(cpu), num_nodes = num_online_nodes();
+    unsigned long request = 1UL << order;
+    cpumask_t extra_cpus_mask, mask;
     struct page_info *pg;
 
     ASSERT(node >= 0);
@@ -356,25 +358,24 @@ static struct page_info *alloc_heap_page
 
     spin_lock(&heap_lock);
 
-    /* start with requested node, but exhaust all node memory
-     * in requested zone before failing, only calc new node
-     * value if we fail to find memory in target node, this avoids
-     * needless computation on fast-path */
+    /*
+     * Start with requested node, but exhaust all node memory in requested 
+     * zone before failing, only calc new node value if we fail to find memory 
+     * in target node, this avoids needless computation on fast-path.
+     */
     for ( i = 0; i < num_nodes; i++ )
     {
-        for ( zone = zone_hi; zone >= zone_lo; --zone )
-        {
-            /* check if target node can support the allocation */
-            if ( avail[node] && (avail[node][zone] >= request) )
-            {
-                /* Find smallest order which can satisfy the request. */
-                for ( j = order; j <= MAX_ORDER; j++ )
-                {
-                    if ( !list_empty(&heap(node, zone, j)) )
-                        goto found;
-                }
-            }
-        }
+        zone = zone_hi;
+        do {
+            /* Check if target node can support the allocation. */
+            if ( !avail[node] || (avail[node][zone] < request) )
+                continue;
+
+            /* Find smallest order which can satisfy the request. */
+            for ( j = order; j <= MAX_ORDER; j++ )
+                if ( !list_empty(&heap(node, zone, j)) )
+                    goto found;
+        } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
 
         /* Pick next node, wrapping around if needed. */
         if ( ++node == num_nodes )
@@ -403,6 +404,29 @@ static struct page_info *alloc_heap_page
 
     spin_unlock(&heap_lock);
 
+    cpus_clear(mask);
+
+    for ( i = 0; i < (1 << order); i++ )
+    {
+        /* Reference count must continuously be zero for free pages. */
+        BUG_ON(pg[i].count_info != 0);
+
+        /* Add in any extra CPUs that need flushing because of this page. */
+        cpus_andnot(extra_cpus_mask, pg[i].u.free.cpumask, mask);
+        tlbflush_filter(extra_cpus_mask, pg[i].tlbflush_timestamp);
+        cpus_or(mask, mask, extra_cpus_mask);
+
+        /* Initialise fields which have other uses for free pages. */
+        pg[i].u.inuse.type_info = 0;
+        page_set_owner(&pg[i], NULL);
+    }
+
+    if ( unlikely(!cpus_empty(mask)) )
+    {
+        perfc_incrc(need_flush_tlb_flush);
+        flush_tlb_mask(mask);
+    }
+
     return pg;
 }
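
The rewritten zone loop carries the subtlety flagged by its trailing
comment: zone is unsigned, so the old-style "for ( zone = zone_hi;
zone >= zone_lo; --zone )" never terminates when zone_lo is 0, because
the decrement past zero wraps to UINT_MAX instead of going negative.
The do/while form compares before decrementing and so visits zone_lo
itself safely.  A standalone demonstration:

    #include <stdio.h>

    int main(void)
    {
        unsigned int zone_lo = 0, zone_hi = 3, zone;

        /* Visits 3, 2, 1, 0 and then stops: the comparison happens
         * while zone is still in range; only then is it decremented. */
        zone = zone_hi;
        do {
            printf("visiting zone %u\n", zone);
        } while ( zone-- > zone_lo );

        return 0;
    }
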
 
@@ -411,13 +435,28 @@ static void free_heap_pages(
     unsigned int zone, struct page_info *pg, unsigned int order)
 {
     unsigned long mask;
-    unsigned int node = phys_to_nid(page_to_maddr(pg));
+    unsigned int i, node = phys_to_nid(page_to_maddr(pg));
+    struct domain *d;
 
     ASSERT(zone < NR_ZONES);
     ASSERT(order <= MAX_ORDER);
     ASSERT(node >= 0);
     ASSERT(node < num_online_nodes());
 
+    for ( i = 0; i < (1 << order); i++ )
+    {
+        BUG_ON(pg[i].count_info != 0);
+        if ( (d = page_get_owner(&pg[i])) != NULL )
+        {
+            pg[i].tlbflush_timestamp = tlbflush_current_time();
+            pg[i].u.free.cpumask     = d->domain_dirty_cpumask;
+        }
+        else
+        {
+            cpus_clear(pg[i].u.free.cpumask);
+        }
+    }
+
     spin_lock(&heap_lock);
 
     map_free(page_to_mfn(pg), 1 << order);
@@ -426,7 +465,7 @@ static void free_heap_pages(
     /* Merge chunks as far as possible. */
     while ( order < MAX_ORDER )
     {
-        mask = 1 << order;
+        mask = 1UL << order;
 
         if ( (page_to_mfn(pg) & mask) )
         {
@@ -554,7 +593,7 @@ void end_boot_allocator(void)
 /*
  * Scrub all unallocated pages in all heap zones. This function is more
  * convoluted than appears necessary because we do not want to continuously
- * hold the lock or disable interrupts while scrubbing very large memory areas.
+ * hold the lock while scrubbing very large memory areas.
  */
 void scrub_heap_pages(void)
 {
@@ -575,7 +614,7 @@ void scrub_heap_pages(void)
         if ( (mfn % ((100*1024*1024)/PAGE_SIZE)) == 0 )
             printk(".");
 
-        spin_lock_irq(&heap_lock);
+        spin_lock(&heap_lock);
 
         /* Re-check page status with lock held. */
         if ( !allocated_in_map(mfn) )
@@ -595,7 +634,7 @@ void scrub_heap_pages(void)
             }
         }
 
-        spin_unlock_irq(&heap_lock);
+        spin_unlock(&heap_lock);
     }
 
     printk("done.\n");
@@ -609,8 +648,6 @@ void scrub_heap_pages(void)
 
 void init_xenheap_pages(paddr_t ps, paddr_t pe)
 {
-    unsigned long flags;
-
     ps = round_pgup(ps);
     pe = round_pgdown(pe);
     if ( pe <= ps )
@@ -625,33 +662,21 @@ void init_xenheap_pages(paddr_t ps, padd
     if ( !IS_XEN_HEAP_FRAME(maddr_to_page(pe)) )
         pe -= PAGE_SIZE;
 
-    local_irq_save(flags);
     init_heap_pages(MEMZONE_XEN, maddr_to_page(ps), (pe - ps) >> PAGE_SHIFT);
-    local_irq_restore(flags);
 }
 
 
 void *alloc_xenheap_pages(unsigned int order)
 {
-    unsigned long flags;
     struct page_info *pg;
-    int i;
-
-    local_irq_save(flags);
+
+    ASSERT(!in_irq());
+
     pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN, smp_processor_id(), order);
-    local_irq_restore(flags);
-
     if ( unlikely(pg == NULL) )
         goto no_memory;
 
     memguard_unguard_range(page_to_virt(pg), 1 << (order + PAGE_SHIFT));
-
-    for ( i = 0; i < (1 << order); i++ )
-    {
-        pg[i].count_info        = 0;
-        pg[i].u.inuse._domain   = 0;
-        pg[i].u.inuse.type_info = 0;
-    }
 
     return page_to_virt(pg);
 
@@ -663,16 +688,14 @@ void *alloc_xenheap_pages(unsigned int o
 
 void free_xenheap_pages(void *v, unsigned int order)
 {
-    unsigned long flags;
+    ASSERT(!in_irq());
 
     if ( v == NULL )
         return;
 
-    memguard_guard_range(v, 1 << (order + PAGE_SHIFT));    
-
-    local_irq_save(flags);
+    memguard_guard_range(v, 1 << (order + PAGE_SHIFT));
+
     free_heap_pages(MEMZONE_XEN, virt_to_page(v), order);
-    local_irq_restore(flags);
 }
 
 
@@ -762,8 +785,6 @@ struct page_info *__alloc_domheap_pages(
     unsigned int memflags)
 {
     struct page_info *pg = NULL;
-    cpumask_t mask;
-    unsigned long i;
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
 
     ASSERT(!in_irq());
@@ -792,38 +813,10 @@ struct page_info *__alloc_domheap_pages(
             return NULL;
     }
 
-    if ( pg == NULL )
-        if ( (pg = alloc_heap_pages(MEMZONE_XEN + 1,
-                                    zone_hi,
-                                    cpu, order)) == NULL )
-            return NULL;
-
-    mask = pg->u.free.cpumask;
-    tlbflush_filter(mask, pg->tlbflush_timestamp);
-
-    pg->count_info        = 0;
-    pg->u.inuse._domain   = 0;
-    pg->u.inuse.type_info = 0;
-
-    for ( i = 1; i < (1 << order); i++ )
-    {
-        /* Add in any extra CPUs that need flushing because of this page. */
-        cpumask_t extra_cpus_mask;
-        cpus_andnot(extra_cpus_mask, pg[i].u.free.cpumask, mask);
-        tlbflush_filter(extra_cpus_mask, pg[i].tlbflush_timestamp);
-        cpus_or(mask, mask, extra_cpus_mask);
-
-        pg[i].count_info        = 0;
-        pg[i].u.inuse._domain   = 0;
-        pg[i].u.inuse.type_info = 0;
-        page_set_owner(&pg[i], NULL);
-    }
-
-    if ( unlikely(!cpus_empty(mask)) )
-    {
-        perfc_incrc(need_flush_tlb_flush);
-        flush_tlb_mask(mask);
-    }
+    if ( (pg == NULL) &&
+         ((pg = alloc_heap_pages(MEMZONE_XEN + 1, zone_hi,
+                                 cpu, order)) == NULL) )
+         return NULL;
 
     if ( (d != NULL) && assign_pages(d, pg, order, memflags) )
     {
@@ -867,10 +860,7 @@ void free_domheap_pages(struct page_info
 
         for ( i = 0; i < (1 << order); i++ )
         {
-            shadow_drop_references(d, &pg[i]);
-            ASSERT((pg[i].u.inuse.type_info & PGT_count_mask) == 0);
-            pg[i].tlbflush_timestamp  = tlbflush_current_time();
-            pg[i].u.free.cpumask      = d->domain_dirty_cpumask;
+            BUG_ON((pg[i].u.inuse.type_info & PGT_count_mask) != 0);
             list_del(&pg[i].list);
         }
 
@@ -892,6 +882,7 @@ void free_domheap_pages(struct page_info
              */
             for ( i = 0; i < (1 << order); i++ )
             {
+                page_set_owner(&pg[i], NULL);
                 spin_lock(&page_scrub_lock);
                 list_add(&pg[i].list, &page_scrub_list);
                 scrub_pages++;
@@ -902,8 +893,6 @@ void free_domheap_pages(struct page_info
     else
     {
         /* Freeing anonymous domain-heap pages. */
-        for ( i = 0; i < (1 << order); i++ )
-            cpus_clear(pg[i].u.free.cpumask);
         free_heap_pages(pfn_dom_zone_type(page_to_mfn(pg)), pg, order);
         drop_dom_ref = 0;
     }
diff -r 8f0b5295bb1b -r dcec453681bc xen/common/xmalloc.c
--- a/xen/common/xmalloc.c      Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/common/xmalloc.c      Thu Mar 08 14:39:52 2007 -0600
@@ -33,6 +33,8 @@
 #include <xen/timer.h>
 #include <xen/cache.h>
 #include <xen/prefetch.h>
+#include <xen/irq.h>
+#include <xen/smp.h>
 
 /*
  * XMALLOC_DEBUG:
@@ -175,6 +177,8 @@ void *_xmalloc(size_t size, size_t align
     struct xmalloc_hdr *i;
     unsigned long flags;
 
+    ASSERT(!in_irq());
+
     /* We currently always return cacheline aligned. */
     BUG_ON(align > SMP_CACHE_BYTES);
 
@@ -212,6 +216,8 @@ void xfree(void *p)
 {
     unsigned long flags;
     struct xmalloc_hdr *i, *tmp, *hdr;
+
+    ASSERT(!in_irq());
 
     if ( p == NULL )
         return;
diff -r 8f0b5295bb1b -r dcec453681bc xen/drivers/acpi/numa.c
--- a/xen/drivers/acpi/numa.c   Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/drivers/acpi/numa.c   Thu Mar 08 14:39:52 2007 -0600
@@ -22,10 +22,6 @@
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  */
-#if 0
-#include <linux/module.h>
-#include <linux/kernel.h>
-#endif
 #include <xen/config.h>
 #include <xen/init.h>
 #include <xen/types.h>
@@ -34,7 +30,6 @@
 #include <xen/numa.h>
 #include <acpi/acpi_bus.h>
 #include <acpi/acmacros.h>
-#include <asm/page.h> /* __va() */
 
 #define ACPI_NUMA      0x80000000
 #define _COMPONENT     ACPI_NUMA
@@ -106,7 +101,7 @@ static int __init acpi_parse_slit(unsign
        if (!phys_addr || !size)
                return -EINVAL;
 
-       slit = (struct acpi_table_slit *)__va(phys_addr);
+       slit = (struct acpi_table_slit *)__acpi_map_table(phys_addr, size);
 
        /* downcast just for %llu vs %lu for i386/ia64  */
        localities = (u32) slit->localities;
@@ -159,7 +154,7 @@ static int __init acpi_parse_srat(unsign
        if (!phys_addr || !size)
                return -EINVAL;
 
-       srat = (struct acpi_table_srat *)__va(phys_addr);
+       srat = (struct acpi_table_srat *)__acpi_map_table(phys_addr, size);
 
        return 0;
 }
diff -r 8f0b5295bb1b -r dcec453681bc xen/drivers/char/console.c
--- a/xen/drivers/char/console.c        Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/drivers/char/console.c        Thu Mar 08 14:39:52 2007 -0600
@@ -399,6 +399,8 @@ static void __putstr(const char *str)
         vga_putchar(c);
         putchar_console_ring(c);
     }
+
+    send_guest_global_virq(dom0, VIRQ_CON_RING);
 }
 
 static int printk_prefix_check(char *p, char **pp)
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/acm/acm_hooks.h
--- a/xen/include/acm/acm_hooks.h       Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/acm/acm_hooks.h       Thu Mar 08 14:39:52 2007 -0600
@@ -247,12 +247,12 @@ static inline int acm_pre_domctl(struct 
             if (*ssid == NULL) {
                 printk("%s: Warning. Destroying domain without ssid 
pointer.\n", 
                        __func__);
-                domain_rcu_lock(d);
+                rcu_unlock_domain(d);
                 return -EACCES;
             }
             d->ssid = NULL; /* make sure it's not used any more */
              /* no policy-specific hook */
-            domain_rcu_lock(d);
+            rcu_unlock_domain(d);
             ret = 0;
         }
         break;
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h      Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/asm-x86/domain.h      Thu Mar 08 14:39:52 2007 -0600
@@ -104,6 +104,21 @@ struct shadow_vcpu {
 };
 
 /************************************************/
+/*            hardware assisted paging          */
+/************************************************/
+struct hap_domain {
+    spinlock_t        lock;
+    int               locker;
+    const char       *locker_function;
+    
+    struct list_head  freelists;
+    struct list_head  p2m_freelist;
+    unsigned int      total_pages;  /* number of pages allocated */
+    unsigned int      free_pages;   /* number of pages on freelists */
+    unsigned int      p2m_pages;    /* number of pages allocated to p2m */
+};
+
+/************************************************/
 /*       p2m handling                           */
 /************************************************/
 
@@ -135,6 +150,7 @@ struct paging_domain {
     struct shadow_domain shadow;
 
     /* Other paging assistance code will have structs here */
+    struct hap_domain    hap;
 };
 
 struct paging_vcpu {
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/asm-x86/hap.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/include/asm-x86/hap.h Thu Mar 08 14:39:52 2007 -0600
@@ -0,0 +1,122 @@
+/******************************************************************************
+ * include/asm-x86/hap.h
+ *
+ * hardware-assisted paging
+ * Copyright (c) 2007 Advanced Micro Devices (Wei Huang)
+ * 
+ * Parts of this code are Copyright (c) 2006 by XenSource Inc.
+ * Parts of this code are Copyright (c) 2006 by Michael A Fetterman
+ * Parts based on earlier work by Michael A Fetterman, Ian Pratt et al.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef _XEN_HAP_H
+#define _XEN_HAP_H
+
+#define HERE_I_AM                                                     \
+    debugtrace_printk("HERE I AM: %s %s %d\n", __func__, __FILE__, __LINE__)
+#define HAP_PRINTK(_f, _a...)                                         \
+    debugtrace_printk("hap: %s(): " _f, __func__, ##_a)
+#define HAP_ERROR(_f, _a...)                                          \
+    printk("hap error: %s(): " _f, __func__, ##_a)
+
+/************************************************/
+/*          hap domain page mapping             */
+/************************************************/
+static inline void *
+hap_map_domain_page(mfn_t mfn)
+{
+    return map_domain_page(mfn_x(mfn));
+}
+
+static inline void
+hap_unmap_domain_page(void *p)
+{
+    unmap_domain_page(p);
+}
+
+static inline void *
+hap_map_domain_page_global(mfn_t mfn)
+{
+    return map_domain_page_global(mfn_x(mfn));
+}
+
+static inline void 
+hap_unmap_domain_page_global(void *p) 
+{
+    unmap_domain_page_global(p);
+}
+
+/************************************************/
+/*           locking for hap code               */
+/************************************************/
+#define hap_lock_init(_d)                                   \
+    do {                                                    \
+        spin_lock_init(&(_d)->arch.paging.hap.lock);        \
+        (_d)->arch.paging.hap.locker = -1;                  \
+        (_d)->arch.paging.hap.locker_function = "nobody";   \
+    } while (0)
+
+#define hap_locked_by_me(_d)                     \
+    (current->processor == (_d)->arch.paging.hap.locker)
+
+#define hap_lock(_d)                                                       \
+    do {                                                                   \
+        if ( unlikely((_d)->arch.paging.hap.locker == current->processor) )\
+        {                                                                  \
+            printk("Error: hap lock held by %s\n",                         \
+                   (_d)->arch.paging.hap.locker_function);                 \
+            BUG();                                                         \
+        }                                                                  \
+        spin_lock(&(_d)->arch.paging.hap.lock);                            \
+        ASSERT((_d)->arch.paging.hap.locker == -1);                        \
+        (_d)->arch.paging.hap.locker = current->processor;                 \
+        (_d)->arch.paging.hap.locker_function = __func__;                  \
+    } while (0)
+
+#define hap_unlock(_d)                                              \
+    do {                                                            \
+        ASSERT((_d)->arch.paging.hap.locker == current->processor); \
+        (_d)->arch.paging.hap.locker = -1;                          \
+        (_d)->arch.paging.hap.locker_function = "nobody";           \
+        spin_unlock(&(_d)->arch.paging.hap.lock);                   \
+    } while (0)
+
+/************************************************/
+/*        hap domain level functions            */
+/************************************************/
+void  hap_domain_init(struct domain *d);
+int   hap_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
+                 XEN_GUEST_HANDLE(void) u_domctl);
+int   hap_enable(struct domain *d, u32 mode);
+void  hap_final_teardown(struct domain *d);
+void  hap_teardown(struct domain *d);
+void  hap_vcpu_init(struct vcpu *v);
+
+extern struct paging_mode hap_paging_real_mode;
+extern struct paging_mode hap_paging_protected_mode;
+extern struct paging_mode hap_paging_pae_mode;
+extern struct paging_mode hap_paging_long_mode;
+#endif /* XEN_HAP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
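
A hypothetical caller, sketched only to show the discipline the locking
macros above enforce (kernel-context pseudocode, not part of the
patch): hap_lock() records the owning CPU and function name and BUG()s
on recursive acquisition; hap_unlock() resets both.

    /* Hypothetical illustration; hap_update_something() is invented. */
    static void hap_update_something(struct domain *d)
    {
        hap_lock(d);                /* BUG()s if this CPU already holds it */
        ASSERT(hap_locked_by_me(d));
        /* ... manipulate d->arch.paging.hap state here ... */
        hap_unlock(d);
    }
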
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/asm-x86/hvm/svm/emulate.h
--- a/xen/include/asm-x86/hvm/svm/emulate.h     Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/asm-x86/hvm/svm/emulate.h     Thu Mar 08 14:39:52 2007 -0600
@@ -76,7 +76,7 @@ enum instruction_index {
 };
 
 
-extern unsigned long get_effective_addr_modrm64(struct vmcb_struct *vmcb, 
+extern unsigned long get_effective_addr_modrm64(
         struct cpu_user_regs *regs, const u8 prefix, int inst_len,
         const u8 *operand, u8 *size);
 extern unsigned long get_effective_addr_sib(struct vmcb_struct *vmcb, 
@@ -85,17 +85,17 @@ extern OPERATING_MODE get_operating_mode
 extern OPERATING_MODE get_operating_mode (struct vmcb_struct *vmcb);
 extern unsigned int decode_dest_reg(u8 prefix, u8 modrm);
 extern unsigned int decode_src_reg(u8 prefix, u8 modrm);
-extern unsigned long svm_rip2pointer(struct vmcb_struct *vmcb);
-extern int __get_instruction_length_from_list(struct vmcb_struct *vmcb,
+extern unsigned long svm_rip2pointer(struct vcpu *v);
+extern int __get_instruction_length_from_list(struct vcpu *v,
         enum instruction_index *list, unsigned int list_count, 
         u8 *guest_eip_buf, enum instruction_index *match);
 
 
-static inline int __get_instruction_length(struct vmcb_struct *vmcb, 
+static inline int __get_instruction_length(struct vcpu *v, 
         enum instruction_index instr, u8 *guest_eip_buf)
 {
     return __get_instruction_length_from_list(
-        vmcb, &instr, 1, guest_eip_buf, NULL);
+        v, &instr, 1, guest_eip_buf, NULL);
 }
 
 
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/asm-x86/hvm/svm/svm.h
--- a/xen/include/asm-x86/hvm/svm/svm.h Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/asm-x86/hvm/svm/svm.h Thu Mar 08 14:39:52 2007 -0600
@@ -34,6 +34,41 @@ extern void arch_svm_do_resume(struct vc
 
 extern u64 root_vmcb_pa[NR_CPUS];
 
+static inline int svm_long_mode_enabled(struct vcpu *v)
+{
+    u64 guest_efer = v->arch.hvm_svm.cpu_shadow_efer;
+    return guest_efer & EFER_LMA;
+}
+
+static inline int svm_lme_is_set(struct vcpu *v)
+{
+    u64 guest_efer = v->arch.hvm_svm.cpu_shadow_efer;
+    return guest_efer & EFER_LME;
+}
+
+static inline int svm_cr4_pae_is_set(struct vcpu *v)
+{
+    unsigned long guest_cr4 = v->arch.hvm_svm.cpu_shadow_cr4;
+    return guest_cr4 & X86_CR4_PAE;
+}
+
+static inline int svm_paging_enabled(struct vcpu *v)
+{
+    unsigned long guest_cr0 = v->arch.hvm_svm.cpu_shadow_cr0;
+    return (guest_cr0 & X86_CR0_PE) && (guest_cr0 & X86_CR0_PG);
+}
+
+static inline int svm_pae_enabled(struct vcpu *v)
+{
+    unsigned long guest_cr4 = v->arch.hvm_svm.cpu_shadow_cr4;
+    return svm_paging_enabled(v) && (guest_cr4 & X86_CR4_PAE);
+}
+
+static inline int svm_pgbit_test(struct vcpu *v)
+{
+    return v->arch.hvm_svm.cpu_shadow_cr0 & X86_CR0_PG;
+}
+
 #define SVM_REG_EAX (0) 
 #define SVM_REG_ECX (1) 
 #define SVM_REG_EDX (2) 
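
The inline helpers added above replace the SVM_CPU_STATE_* software
flags removed from vmcb.h below: the guest's operating mode is now
derived directly from the shadowed EFER/CR0/CR4 values instead of being
tracked separately.  A standalone illustration of the same bit tests,
with the architectural bit positions redefined locally for the demo:

    #include <stdio.h>

    #define EFER_LME    (1UL << 8)   /* long mode enable */
    #define EFER_LMA    (1UL << 10)  /* long mode active */
    #define X86_CR0_PE  (1UL << 0)   /* protected mode   */
    #define X86_CR0_PG  (1UL << 31)  /* paging           */
    #define X86_CR4_PAE (1UL << 5)   /* PAE paging       */

    int main(void)
    {
        unsigned long efer = EFER_LME | EFER_LMA;
        unsigned long cr0  = X86_CR0_PE | X86_CR0_PG;
        unsigned long cr4  = X86_CR4_PAE;

        printf("long mode active: %d\n", !!(efer & EFER_LMA));
        printf("paging enabled:   %d\n",
               !!((cr0 & X86_CR0_PE) && (cr0 & X86_CR0_PG)));
        printf("PAE enabled:      %d\n", !!(cr4 & X86_CR4_PAE));
        return 0;
    }
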
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/asm-x86/hvm/svm/vmcb.h
--- a/xen/include/asm-x86/hvm/svm/vmcb.h        Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h        Thu Mar 08 14:39:52 2007 -0600
@@ -302,14 +302,6 @@ enum VMEXIT_EXITCODE
     VMEXIT_NPF              = 1024, /* nested paging fault */
     VMEXIT_INVALID          =  -1
 };
-
-enum {
-    SVM_CPU_STATE_PG_ENABLED=0,
-    SVM_CPU_STATE_PAE_ENABLED,
-    SVM_CPU_STATE_LME_ENABLED,      
-    SVM_CPU_STATE_LMA_ENABLED,
-    SVM_CPU_STATE_ASSIST_ENABLED,
-};  
 
 /* Definitions of segment state are borrowed by the generic HVM code. */
 typedef segment_attributes_t svm_segment_attributes_t;
@@ -457,12 +449,12 @@ struct arch_svm_struct {
     int                 saved_irq_vector;
     u32                 launch_core;
     
-    unsigned long       flags;      /* VMCB flags */
-    unsigned long       cpu_shadow_cr0; /* Guest value for CR0 */
-    unsigned long       cpu_shadow_cr4; /* Guest value for CR4 */
+    unsigned long       flags;            /* VMCB flags */
+    unsigned long       cpu_shadow_cr0;   /* Guest value for CR0 */
+    unsigned long       cpu_shadow_cr4;   /* Guest value for CR4 */
+    unsigned long       cpu_shadow_efer;  /* Guest value for EFER */
     unsigned long       cpu_cr2;
     unsigned long       cpu_cr3;
-    unsigned long       cpu_state;
 };
 
 struct vmcb_struct *alloc_vmcb(void);
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/public/arch-x86/xen.h
--- a/xen/include/public/arch-x86/xen.h Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/public/arch-x86/xen.h Thu Mar 08 14:39:52 2007 -0600
@@ -132,6 +132,7 @@ struct vcpu_guest_context {
     unsigned long ldt_base, ldt_ents;       /* LDT (linear address, # ents) */
     unsigned long gdt_frames[16], gdt_ents; /* GDT (machine frames, # ents) */
     unsigned long kernel_ss, kernel_sp;     /* Virtual TSS (only SS1/SP1)   */
+    /* NB. User pagetable on x86/64 is placed in ctrlreg[1]. */
     unsigned long ctrlreg[8];               /* CR0-CR7 (control registers)  */
     unsigned long debugreg[8];              /* DB0-DB7 (debug registers)    */
 #ifdef __i386__
diff -r 8f0b5295bb1b -r dcec453681bc xen/include/public/xen.h
--- a/xen/include/public/xen.h  Mon Mar 05 12:49:12 2007 -0600
+++ b/xen/include/public/xen.h  Thu Mar 08 14:39:52 2007 -0600
@@ -131,6 +131,7 @@
 #define VIRQ_TBUF       4  /* G. (DOM0) Trace buffer has records available.  */
 #define VIRQ_DEBUGGER   6  /* G. (DOM0) A domain has paused for debugging.   */
 #define VIRQ_XENOPROF   7  /* V. XenOprofile interrupt: new sample available */
+#define VIRQ_CON_RING   8  /* G. (DOM0) Bytes received on console            */
 
 /* Architecture-specific VIRQ definitions. */
 #define VIRQ_ARCH_0    16
@@ -473,26 +474,24 @@ typedef struct shared_info shared_info_t
 #endif
 
 /*
- * Start-of-day memory layout for the initial domain (DOM0):
+ * Start-of-day memory layout:
  *  1. The domain is started within contiguous virtual-memory region.
- *  2. The contiguous region begins and ends on an aligned 4MB boundary.
- *  3. The region start corresponds to the load address of the OS image.
- *     If the load address is not 4MB aligned then the address is rounded down.
- *  4. This the order of bootstrap elements in the initial virtual region:
+ *  2. The contiguous region ends on an aligned 4MB boundary.
+ *  3. This is the order of bootstrap elements in the initial virtual region:
  *      a. relocated kernel image
  *      b. initial ram disk              [mod_start, mod_len]
  *      c. list of allocated page frames [mfn_list, nr_pages]
  *      d. start_info_t structure        [register ESI (x86)]
  *      e. bootstrap page tables         [pt_base, CR3 (x86)]
  *      f. bootstrap stack               [register ESP (x86)]
- *  5. Bootstrap elements are packed together, but each is 4kB-aligned.
- *  6. The initial ram disk may be omitted.
- *  7. The list of page frames forms a contiguous 'pseudo-physical' memory
+ *  4. Bootstrap elements are packed together, but each is 4kB-aligned.
+ *  5. The initial ram disk may be omitted.
+ *  6. The list of page frames forms a contiguous 'pseudo-physical' memory
  *     layout for the domain. In particular, the bootstrap virtual-memory
  *     region is a 1:1 mapping to the first section of the pseudo-physical map.
- *  8. All bootstrap elements are mapped read-writable for the guest OS. The
+ *  7. All bootstrap elements are mapped read-writable for the guest OS. The
  *     only exception is the bootstrap page table, which is mapped read-only.
- *  9. There is guaranteed to be at least 512kB padding after the final
+ *  8. There is guaranteed to be at least 512kB padding after the final
  *     bootstrap element. If necessary, the bootstrap virtual region is
  *     extended by an extra 4MB to ensure this.
  */
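
The new VIRQ_CON_RING defined above is what the console.c hunk raises
on every console write.  A hypothetical dom0-side binding, sketched
against the public event-channel interface (struct evtchn_bind_virq and
EVTCHNOP_bind_virq are real; the surrounding code, and the assumption
that a HYPERVISOR_event_channel_op() wrapper is available, are
illustrative only):

    /* Hypothetical sketch: bind the global virq on VCPU0.  The event
     * handler and port unmasking are elided. */
    struct evtchn_bind_virq bind = {
        .virq = VIRQ_CON_RING,
        .vcpu = 0,
    };

    if ( HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq, &bind) == 0 )
        /* bind.port now names the event channel to watch for ring data */;
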

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog

<Prev in Thread] Current Thread [Next in Thread>
  • [Xen-changelog] [xen-unstable] [POWERPC][XEN] Merge with xen-unstable.hg., Xen patchbot-unstable <=