Re: [Xen-devel] Fwd: [PATCH 0/18] Nested Virtualization: Overview
On Thursday 15 April 2010 16:57:40 Keir Fraser wrote:
> On 15/04/2010 14:20, "Christoph Egger" <Christoph.Egger@xxxxxxx> wrote:
> > patch 03: change local_event_delivery_* to take vcpu argument.
> > This prevents spurious xen crashes on guest
> > shutdown/destroy with nestedhvm enabled.
>
> Can you give an example of how this bug manifests? I don't really see how
> nestedhvm would interact so unexpectedly with this rather pv-oriented
> subsystem.
On guest shutdown/destroy, 'current' does not always point to the expected
virtual cpu. When current != v, Xen crashes like this:
(XEN) ----[ Xen-4.0.0-rc6 x86_64 debug=y Tainted: C ]----
(XEN) CPU: 2
(XEN) RIP: e008:[<ffff82c4801a5257>] nestedsvm_vcpu_stgi+0xa0/0xaf
(XEN) RFLAGS: 0000000000010202 CONTEXT: hypervisor
(XEN) rax: 0000000000000001 rbx: ffff83008faf0000 rcx: 0000000000000001
(XEN) rdx: 0000000000000000 rsi: ffff82c480270b20 rdi: ffff8301495c0000
(XEN) rbp: ffff830167e27dd0 rsp: ffff830167e27dc0 r8: 0000000000000001
(XEN) r9: ffff82c4803937e0 r10: 0000ffff0000ffff r11: 00ff00ff00ff00ff
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: ffff82c48026cb00
(XEN) r15: ffffffffffffffff cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 0000000160e28000 cr2: 0000000000000001
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff830167e27dc0:
(XEN) ffff830144fd08d0 ffff83008faf0000 ffff830167e27df0 ffff82c4801a5583
(XEN) ffff83008faf0000 ffff83008faf0000 ffff830167e27e10 ffff82c48019cf13
(XEN) ffff82c48011e68c ffff83008faf0000 ffff830167e27e30 ffff82c48014d17d
(XEN) ffff830167e27e40 ffff8301495c0000 ffff830167e27e60 ffff82c480105d8c
(XEN) ffff830167e27e90 ffff82c4802701a0 0000000000000000 0000000000000000
(XEN) ffff830167e27e90 ffff82c480124113 ffff82c48038a980 ffff82c48038aa80
(XEN) ffff82c48038a980 ffff830167e27f28 ffff830167e27ed0 ffff82c48011e1f2
(XEN) ffff83008faf80f8 ffff830167e27f28 ffff82c480243040 ffff830167e27f28
(XEN) ffff82c48026cb00 ffff82c480243ab8 ffff830167e27ee0 ffff82c48011e211
(XEN) ffff830167e27f20 ffff82c48014d363 ffffffff8057c580 ffff83008fe68000
(XEN) ffff83008faf8000 0000000000004000 ffff82c480270080 0000000000000002
(XEN) ffff830167e27dc8 0000000000000000 ffffffff8057d1c0 ffffffff8057c580
(XEN) ffffffffffffffff 0000000000631918 0000000000000000 0000000000000246
(XEN) 0000000000000000 00000397cc124096 0000000000000000 0000000000000000
(XEN) ffffffff802083aa 00000000deadbeef 00000000deadbeef 00000000deadbeef
(XEN) 0000010000000000 ffffffff802083aa 000000000000e033 0000000000000246
(XEN) ffffffff80553f10 000000000000e02b 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000002 ffff83008fe68000
(XEN) Xen call trace:
(XEN) [<ffff82c4801a5257>] nestedsvm_vcpu_stgi+0xa0/0xaf
(XEN) [<ffff82c4801a5583>] nestedsvm_vcpu_destroy+0x48/0x6d
(XEN) [<ffff82c48019cf13>] hvm_vcpu_destroy+0x11/0x67
(XEN) [<ffff82c48014d17d>] vcpu_destroy+0x33/0x3a
(XEN) [<ffff82c480105d8c>] complete_domain_destroy+0x39/0x103
(XEN) [<ffff82c480124113>] rcu_process_callbacks+0x17e/0x1dc
(XEN) [<ffff82c48011e1f2>] __do_softirq+0x74/0x85
(XEN) [<ffff82c48011e211>] do_softirq+0xe/0x10
(XEN) [<ffff82c48014d363>] idle_loop+0x8d/0x8f
(XEN)
(XEN) Pagetable walk from 0000000000000001:
(XEN) L4[0x000] = 00000001619e5067 000000000001e849
(XEN) L3[0x000] = 0000000160cbc067 000000000001dd72
(XEN) L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000001
(XEN) ****************************************
nestedsvm_vcpu_stgi() enables events for PV drivers.
>
> > patch 04: obsolete gfn_to_mfn_current and remove it.
> > gfn_to_mfn_current is redundant to
> > gfn_to_mfn(current->domain, ...)
> > This patch reduces the size of patch 17.
>
> This one (at least -- there may be others) needs an ack from Tim.
I would imagine all patches touching the p2m need an ack from Tim.
>
> > patch 05: hvm_set_cr0: Allow guest to switch into paged real mode.
> > This makes hvmloader boot when we use xen in xen.
>
> What if we are not running a nestedhvm guest, or otherwise on a system not
> supporting paged real mode?
The hvmloader itself switches to real mode right before invoking the BIOS
via a trampoline. Under virtualization, the cpu must use paged real mode;
otherwise it cannot fetch instructions at all.
> Is it wise to remove the check in that case?
No. Otherwise the mov-to-cr0 instruction fails with a #GP. The guest does
not expect that #GP and simply retries the same instruction, which escalates
into a double fault and finally a triple fault.
> Even where we *do* support nestedhvm, should all guest writes to CR0 be
> allowed to bypass that check (Isn't paged real mode architecturally only
> allowed to be entered via VMRUN)?
Are you talking about the VMRUN instruction or the VMRUN emulation?
Note the difference: with nestedhvm, the guest believes it can execute the
VMRUN instruction itself. In reality, VMRUN is intercepted and emulated by
the host. The papers (pdf documents) describe how the VMRUN emulation works.
> More generally, I will allow these patches to sit for a week or two to give
> time for potential reviewers to digest them.
Ack.
Christoph
>
> Thanks,
> Keir
--
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel