Christoph Egger wrote:
> Hi!
>
> This patch series brings Nested Virtualization to Xen.
> This is the fifth patch series. Improvements to the
> previous patch submission:
>
It is a step forward. At least the obviously SVM-specific stuff has been moved
back to the SVM tree, although there are still some things left that are not
good, though maybe not that critical (like
- hvm_intblk_nmi_iret /* NMI blocked until IRET */
+ hvm_intblk_nmi_iret, /* NMI blocked until IRET */
+ hvm_intblk_svm_gif, /* GIF cleared */
).
However, the majority of my comments are still not addressed
(http://www.mailinglistarchive.com/html/xen-devel@xxxxxxxxxxxxxxxxxxx/2010-09/msg01078.html).
I will repeat them here:
+struct nestedhvm {
+ bool_t nh_guestmode; /* vcpu in guestmode? */
+ void *nh_vmcx; /* l1 guest virtual VMCB/VMCS */
+
+ /* address of l1 guest virtual VMCB/VMCS, needed for VMEXIT */
+ uint64_t nh_vmaddr;
+
+ void *nh_arch; /* SVM/VMX specific data */
+ size_t nh_arch_size;
I hate the pointer+size style; I would rather see a union for the SVM/VMX data
structures. Once that is done, the APIs
nestedhvm_vcpu_initialise/nestedhvm_vcpu_reset/nestedhvm_vcpu_destroy can go
back to being wrappers only, if they are not removed completely.
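A minimal sketch of the union layout I mean (struct nestedsvm/nestedvmx are
illustrative member types, not names from the patch):

    struct nestedhvm {
        bool_t   nh_guestmode;  /* vcpu in guest mode? */
        void    *nh_vmcx;       /* l1 guest virtual VMCB/VMCS */
        uint64_t nh_vmaddr;     /* address of l1 guest virtual VMCB/VMCS */

        /* arch-specific state embedded directly, no pointer+size: */
        union {
            struct nestedsvm svm;   /* hypothetical SVM-specific fields */
            struct nestedvmx vmx;   /* hypothetical VMX-specific fields */
        } u;
    };

With the union embedded, initialise/reset/destroy no longer have to allocate
and free nh_arch separately, which is why those APIs can shrink to wrappers.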
+ /* Cache guest cr3/host cr3 the guest sets up for the nested guest.
+ * Used by Shadow-on-Shadow and Nested-on-Nested.
+ * nh_vm_guestcr3: in l2 guest physical address space and points to
+ * the l2 guest page table
+ * nh_vm_hostcr3: in l1 guest physical address space and points to
+ * the l1 guest nested page table
+ */
+ uint64_t nh_vm_guestcr3, nh_vm_hostcr3;
+ uint32_t nh_guest_asid;
Duplicating the data (caching) is really not a good solution to me. I prefer a
wrapper API for that, as in the proposal I sent you in private mail. Caching
really doesn't bring additional value here.
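A minimal sketch of the wrapper approach, assuming a per-arch hook in
hvm_funcs (the nhvm_vm_hostcr3 hook name is hypothetical):

    uint64_t nestedhvm_vm_hostcr3(struct vcpu *v)
    {
        /* each arch reads the value straight out of its own VMCB/VMCS,
         * so there is no second copy that can go stale */
        return hvm_funcs.nhvm_vm_hostcr3(v);
    }

The same pattern covers the guest cr3 and the guest ASID.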
+ /* Only meaningful when forcevmexit flag is set */
+ struct {
+ uint64_t exitcode; /* generic exitcode */
+ uint64_t exitinfo1; /* additional information to the exitcode */
+ uint64_t exitinfo2; /* additional information to the exitcode */
+ } nh_forcevmexit;
+ union {
+ uint32_t bytes;
+ struct {
+ uint32_t forcevmexit : 1;
+ uint32_t use_native_exitcode : 1;
+ uint32_t vmentry : 1; /* true during vmentry/vmexit emulation */
+ uint32_t reserved : 29;
+ } fields;
+ } nh_hostflags;
New namespace for VM exit info and entry control.
I am not convinced of the value of this new namespace compared with a solution
that demuxes in each arch and calls first-level helper APIs from common code.
The only difference is that we pay with one API, nestedhvm_vcpu_vmexit, but we
gain by removing the conversion to/from the new and old namespaces. I strongly
prefer the demux + first-level helper solution.
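A rough sketch of the demux + helper shape I mean (the helper name
nestedhvm_hap_nested_page_fault is hypothetical; the VMCB access follows the
existing SVM code):

    static void svm_nested_vmexit(struct vcpu *v, struct cpu_user_regs *regs)
    {
        struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;

        switch ( vmcb->exitcode )
        {
        case VMEXIT_NPF:
            /* arch-independent logic lives in the common helper */
            nestedhvm_hap_nested_page_fault(v, vmcb->exitinfo2);
            break;
        /* ... other native exitcodes demuxed here ... */
        }
    }

The exitcode never leaves its native namespace, so no translation layer is
needed.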
+int
+nestedhvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs,
+ uint64_t vmaddr, unsigned int inst_len)
+{
+ int ret;
+ struct nestedhvm *hvm = &vcpu_nestedhvm(v);
+
+ hvm->nh_hostflags.fields.vmentry = 1;
+
+ ret = nestedhvm_vcpu_state_validate(v, vmaddr);
+ if (ret) {
+ gdprintk(XENLOG_ERR,
+ "nestedhvm_vcpu_state_validate failed, injecting
0x%x\n",
+ ret);
+ hvm_inject_exception(ret, HVM_DELIVER_NO_ERROR_CODE, 0);
+ return ret;
+ }
+
+ /* Save vmaddr. Needed for VMEXIT */
+ hvm->nh_vmaddr = vmaddr;
+
+ /* get nested vm */
+ ASSERT(hvm->nh_vmcx == NULL);
+ hvm->nh_vmcx = hvm_map_guest_frame_ro(vmaddr >> PAGE_SHIFT);
+ if (hvm->nh_vmcx == NULL) {
+ gdprintk(XENLOG_ERR,
+ "hvm_map_guest_frame_ro failed, injecting #GP\n");
+ hvm_inject_exception(TRAP_gp_fault,
+ HVM_DELIVER_NO_ERROR_CODE, 0);
+ hvm->nh_hostflags.fields.vmentry = 0;
+ return TRAP_gp_fault;
+ }
+
+ /* save host state */
+ ret = nhvm_vcpu_vmentry(v, regs, inst_len);
+ if (ret) {
+ gdprintk(XENLOG_ERR,
+ "nhvm_vcpu_vmentry failed, injecting #UD\n");
+ hvm_inject_exception(TRAP_invalid_op,
+ HVM_DELIVER_NO_ERROR_CODE, 0);
+ hvm->nh_hostflags.fields.vmentry = 0;
+ hvm_unmap_guest_frame(hvm->nh_vmcx);
+ hvm->nh_vmcx = NULL;
+ return ret;
+ }
+
+ hvm_unmap_guest_frame(hvm->nh_vmcx);
+ hvm->nh_vmcx = NULL;
+
+ /* Switch vcpu to guest mode.
+ */
+ nestedhvm_vcpu_enter_guestmode(v);
+
+ hvm->nh_hostflags.fields.vmentry = 0;
+ return 0;
+}
You must not have read the VMX code yet, or did not understand why VMX can't
do this. Apparently this common code does not apply to VMX and has to be moved
to arch-specific code.
VMX can only switch the context (L1 or L2 guest) at a certain point (now
deferred to the last step before resume), because only one VMCS can be
accessed in one context.
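A sketch of the constraint, with illustrative names (nested.vvmcs and
nestedhvm_vcpu_in_guestmode are assumptions here; __vmptrld is the existing
VMCS-load helper):

    static void vmx_do_resume_nested(struct vcpu *v)
    {
        /* the CPU can only read/write the VMCS loaded via VMPTRLD, so
         * the L1<->L2 switch must be one deferred step before resume */
        if ( nestedhvm_vcpu_in_guestmode(v) )
            __vmptrld(virt_to_maddr(v->arch.hvm_vmx.nested.vvmcs));
        else
            __vmptrld(virt_to_maddr(v->arch.hvm_vmx.vmcs));
    }

Any __vmread()/__vmwrite() before that point would touch the wrong context,
which is why the common vmentry path above cannot work for VMX.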
Thx, Eddie
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel