Re: [PATCH] x86/hvm: Add Kconfig option to disable nested virtualization
On 13.02.2026 23:02, Stefano Stabellini wrote:
> Introduce CONFIG_NESTED_VIRT (default n) to allow nested virtualization
> support to be disabled at build time. This is useful for embedded or
> safety-focused deployments where nested virtualization is not needed,
> reducing code size and attack surface.
>
> When CONFIG_NESTED_VIRT=n, the following source files are excluded:
> - arch/x86/hvm/nestedhvm.c
> - arch/x86/hvm/svm/nestedsvm.c
> - arch/x86/hvm/vmx/vvmx.c
> - arch/x86/mm/nested.c
> - arch/x86/mm/hap/nested_hap.c
> - arch/x86/mm/hap/nested_ept.c
>
> Add inline stubs where needed in headers. Guard assembly code paths
> for nested virt with #ifdef CONFIG_NESTED_VIRT. Move exception
> injection for VMX/SVM instructions to the callers in vmx.c/svm.c to
> avoid header dependency issues in the stubs.
>
> No functional change when CONFIG_NESTED_VIRT=y.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxx>
>
> ---
> Changes in v3:
> - Kconfig: Change "depends on AMD_SVM || INTEL_VMX" to "depends on HVM"
> - Kconfig: Remove redundant "default n" line
> - Kconfig: Remove "If unsure, say N." from help text
> - mm/hap/Makefile: Simplify using intermediate nested-y variable:
> nested-y := nested_hap.o
> nested-$(CONFIG_INTEL_VMX) += nested_ept.o
> obj-$(CONFIG_NESTED_VIRT) += $(nested-y)
> - svm/nestedhvm.h: Remove #ifdef CONFIG_NESTED_VIRT stubs, keep only
> function declarations (the functions are only called from code that
> is already compiled out when nested virt is disabled)
> - svm/nestedhvm.h: Add CONFIG_NESTED_VIRT guard to nsvm_efer_svm_enabled
> macro to return false when nested virt is disabled
> - svm/svm.c: Move #UD injection for STGI/CLGI to the caller instead of
> stub functions, checking nestedhvm_enabled()/nsvm_efer_svm_enabled()
> - svm/svm.c: Mark svm_vmexit_do_vmrun/vmload/vmsave as __maybe_unused
> - svm/svm.c: Remove empty nsvm_vcpu_switch stub (now guarded in asm)
> - svm/entry.S: Add #ifdef CONFIG_NESTED_VIRT guards around nested virt
> specific code paths
> - vmx/vmx.c: Remove empty nvmx_switch_guest stub (now guarded in asm)
> - vmx/vmx.c: Move nvmx_enqueue_n2_exceptions and nvmx_vmexit_event to
> vvmx.c where they belong
> - vmx/vvmx.h: Add declarations for nvmx_vmexit_event and
> nvmx_enqueue_n2_exceptions
> - vmx/vvmx.h: Fix nvmx_msr_read_intercept stub comment
> - vmx/vvmx.h: nvmx_handle_vmx_insn stub returns X86EMUL_EXCEPTION with
> ASSERT_UNREACHABLE (caller handles injection)
> - vmx/vvmx.h: Convert get_vvmcs macro to inline function in stubs
> - vmx/entry.S: Add #ifdef CONFIG_NESTED_VIRT guard around nvmx_switch_guest
> - nestedhvm.h: Convert macro stubs to proper inline functions
Oh, wow, that's an almost complete re-write?
> --- a/xen/arch/x86/hvm/Kconfig
> +++ b/xen/arch/x86/hvm/Kconfig
> @@ -92,4 +92,11 @@ config MEM_SHARING
> bool "Xen memory sharing support (UNSUPPORTED)" if UNSUPPORTED
> depends on INTEL_VMX
>
> +config NESTED_VIRT
> + bool "Nested virtualization support"
> + depends on HVM
> + help
> + Enable nested virtualization, allowing guests to run their own
> + hypervisors. This requires hardware support.
What's the last sentence about? HVM itself already requires hardware
support, and that's about it, especially for VMX (where HAP is the only
other requirement), isn't it? If this is about those advanced features,
perhaps the help text would then want to be more specific?
> --- a/xen/arch/x86/hvm/svm/nestedhvm.h
> +++ b/xen/arch/x86/hvm/svm/nestedhvm.h
> @@ -24,7 +24,7 @@
>
> /* True when l1 guest enabled SVM in EFER */
> #define nsvm_efer_svm_enabled(v) \
> - (!!((v)->arch.hvm.guest_efer & EFER_SVME))
> + (IS_ENABLED(CONFIG_NESTED_VIRT) && ((v)->arch.hvm.guest_efer & EFER_SVME))
Constructs like these are on the edge: Yes, passing in an expression with a side
effect isn't very likely here. Yet still, this being a widely visible macro, I
wonder if it wouldn't be better to guarantee that v is evaluated exactly once.
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2165,7 +2165,7 @@ static void svm_vmexit_do_pause(struct cpu_user_regs *regs)
> vcpu_yield();
> }
>
> -static void
> +static void __maybe_unused
> svm_vmexit_do_vmrun(struct cpu_user_regs *regs,
> struct vcpu *v, uint64_t vmcbaddr)
> {
> @@ -2211,7 +2211,7 @@ nsvm_get_nvmcb_page(struct vcpu *v, uint64_t vmcbaddr)
> return page;
> }
>
> -static void
> +static void __maybe_unused
> svm_vmexit_do_vmload(struct vmcb_struct *vmcb,
> struct cpu_user_regs *regs,
> struct vcpu *v, uint64_t vmcbaddr)
> @@ -2246,7 +2246,7 @@ svm_vmexit_do_vmload(struct vmcb_struct *vmcb,
> __update_guest_eip(regs, inst_len);
> }
>
> -static void
> +static void __maybe_unused
> svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
> struct cpu_user_regs *regs,
> struct vcpu *v, uint64_t vmcbaddr)
Why are these needed? The call sites don't go away afaics.
If these are nevertheless needed, the question is whether a suitable single
#ifdef might not be tidier.
> @@ -3011,10 +3013,16 @@ void asmlinkage svm_vmexit_handler(void)
> svm_vmexit_do_vmsave(vmcb, regs, v, regs->rax);
> break;
> case VMEXIT_STGI:
> - svm_vmexit_do_stgi(regs, v);
> + if ( !nestedhvm_enabled(v->domain) )
> + hvm_inject_hw_exception(X86_EXC_UD, X86_EVENT_NO_EC);
> + else
> + svm_vmexit_do_stgi(regs, v);
> break;
> case VMEXIT_CLGI:
> - svm_vmexit_do_clgi(regs, v);
> + if ( !nsvm_efer_svm_enabled(v) )
> + hvm_inject_hw_exception(X86_EXC_UD, X86_EVENT_NO_EC);
> + else
> + svm_vmexit_do_clgi(regs, v);
> break;
These render the respective checks in the functions themselves dead, which in
particular means the bodies of those if()s become unreachable (a violation of
a Misra rule we did accept).
> @@ -2942,8 +2916,9 @@ static struct hvm_function_table __initdata_cf_clobber vmx_function_table = {
> .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
> .nhvm_intr_blocked = nvmx_intr_blocked,
> .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
> - .update_vlapic_mode = vmx_vlapic_msr_changed,
I realize the = wasn't properly padded here, but ...
> .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
> +#endif
> + .update_vlapic_mode = vmx_vlapic_msr_changed,
... can you please do so while moving the line?
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -2821,6 +2821,32 @@ void nvmx_set_cr_read_shadow(struct vcpu *v, unsigned int cr)
> __vmwrite(read_shadow_field, v->arch.hvm.nvcpu.guest_cr[cr]);
> }
>
> +void nvmx_enqueue_n2_exceptions(struct vcpu *v,
> + unsigned long intr_fields, int error_code, uint8_t source)
While moving, can obvious style issues please be addressed? Bad indentation
here, ...
> +{
> + struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> +
> + if ( !(nvmx->intr.intr_info & INTR_INFO_VALID_MASK) ) {
... misplaced brace here, and ...
> + /* enqueue the exception till the VMCS switch back to L1 */
... malformed comment here.
> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> @@ -711,7 +711,7 @@ static inline bool hvm_altp2m_supported(void)
> /* Returns true if we have the minimum hardware requirements for nested virt */
> static inline bool hvm_nested_virt_supported(void)
> {
> - return hvm_funcs.caps.nested_virt;
> + return IS_ENABLED(CONFIG_NESTED_VIRT) && hvm_funcs.caps.nested_virt;
> }
Should the field itself perhaps become conditional?
> --- a/xen/arch/x86/include/asm/hvm/nestedhvm.h
> +++ b/xen/arch/x86/include/asm/hvm/nestedhvm.h
> @@ -25,9 +25,21 @@ enum nestedhvm_vmexits {
> /* Nested HVM on/off per domain */
> static inline bool nestedhvm_enabled(const struct domain *d)
> {
> - return IS_ENABLED(CONFIG_HVM) && (d->options & XEN_DOMCTL_CDF_nested_virt);
> + return IS_ENABLED(CONFIG_NESTED_VIRT) &&
> + (d->options & XEN_DOMCTL_CDF_nested_virt);
> }
>
> +/* Nested paging */
> +#define NESTEDHVM_PAGEFAULT_DONE 0
> +#define NESTEDHVM_PAGEFAULT_INJECT 1
> +#define NESTEDHVM_PAGEFAULT_L1_ERROR 2
> +#define NESTEDHVM_PAGEFAULT_L0_ERROR 3
> +#define NESTEDHVM_PAGEFAULT_MMIO 4
> +#define NESTEDHVM_PAGEFAULT_RETRY 5
> +#define NESTEDHVM_PAGEFAULT_DIRECT_MMIO 6
> +
> +#ifdef CONFIG_NESTED_VIRT
In a reply to my comment on v1 (or was it v2? This submission isn't tagged),
you referred me to the stub nestedhvm_hap_nested_page_fault() using the
constant. However, why would that stub be needed when the sole call site of
the function lives in a conditional using nestedhvm_enabled() (which is
compile-time false when NESTED_VIRT=n)? All you need to make sure is that
the declaration remains available. I then wonder for how many of the other
stubs the same might be the case.
> @@ -199,5 +201,77 @@ int nept_translate_l2ga(struct vcpu *v, paddr_t l2ga,
> uint64_t *exit_qual, uint32_t *exit_reason);
> int nvmx_cpu_up_prepare(unsigned int cpu);
> void nvmx_cpu_dead(unsigned int cpu);
> +int cf_check nvmx_vmexit_event(struct vcpu *v, const struct x86_event *event);
> +void nvmx_enqueue_n2_exceptions(struct vcpu *v,
> + unsigned long intr_fields, int error_code, uint8_t source);
Nit: Bad indentation even copied here.
Jan