xen-devel
Re: [Xen-devel] Need help with fixing the Xen waitqueue feature
On 11/11/2011 22:56, "Olaf Hering" <olaf@xxxxxxxxx> wrote:
> Keir,
>
> just to dump my findings to the list:
>
> On Tue, Nov 08, Keir Fraser wrote:
>
>> Tbh I wonder anyway whether stale hypercall context would be likely to cause
>> a silent machine reboot. Booting with maxcpus=1 would eliminate moving
>> between CPUs as a cause of inconsistencies, or pin the guest under test.
>> Another problem could be sleeping with locks held, but we do test for that
>> (in debug builds at least) and I'd expect crash/hang rather than silent
>> reboot. Another problem could be if the vcpu has its own state in an
>> inconsistent/invalid state temporarily (e.g., its pagetable base pointers)
>> which then is attempted to be restored during a waitqueue wakeup. That could
>> certainly cause a reboot, but I don't know of an example where this might
>> happen.
>
> The crashes also happen with maxcpus=1 and a single guest cpu.
> Today I added wait_event to ept_get_entry and this works.
>
> But at some point the codepath below is executed; after that wake_up the
> host hangs hard. I will trace it further next week; maybe the backtrace
> gives a clue what the cause could be.
So you run with a single CPU, and with wait_event() in one location, and
that works for a while (actually doing full waitqueue work: executing wait()
and wake_up()), but then hangs? That's weird, but pretty interesting if I've
understood correctly.
> Also, the 3K stack size is still too small; this path uses 3096 bytes.
I'll allocate a whole page for the stack then.
-- Keir
> (XEN) prep 127a 30 0
> (XEN) wake 127a 30
> (XEN) prep 1cf71 30 0
> (XEN) wake 1cf71 30
> (XEN) prep 1cf72 30 0
> (XEN) wake 1cf72 30
> (XEN) prep 1cee9 30 0
> (XEN) wake 1cee9 30
> (XEN) prep 121a 30 0
> (XEN) wake 121a 30
>
> (This means 'gfn, (p2m_unshare << 4), in_atomic'.)
>
> (XEN) prep 1ee61 20 0
> (XEN) max stacksize c18
> (XEN) Xen WARN at wait.c:126
> (XEN) ----[ Xen-4.2.24114-20111111.221356 x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: ffff830201f76000 rcx: 0000000000000000
> (XEN) rdx: ffff82c4802b7f18 rsi: 000000000000000a rdi: ffff82c4802673f0
> (XEN) rbp: ffff82c4802b73a8 rsp: ffff82c4802b7378 r8: 0000000000000000
> (XEN) r9: ffff82c480221da0 r10: 00000000fffffffa r11: 0000000000000003
> (XEN) r12: ffff82c4802b7f18 r13: ffff830201f76000 r14: ffff83003ea5c000
> (XEN) r15: 000000000001ee61 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 000000020336d000 cr2: 00007fa88ac42000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802b7378:
> (XEN) 0000000000000020 000000000001ee61 0000000000000002 ffff830201aa9e90
> (XEN) ffff830201aa9f60 0000000000000020 ffff82c4802b7428 ffff82c4801e02f9
> (XEN) ffff830000000002 0000000000000000 ffff82c4802b73f8 ffff82c4802b73f4
> (XEN) 0000000000000000 ffff82c4802b74e0 ffff82c4802b74e4 0000000101aa9e90
> (XEN) 000000ffffffffff ffff830201aa9e90 000000000001ee61 ffff82c4802b74e4
> (XEN) 0000000000000002 0000000000000000 ffff82c4802b7468 ffff82c4801d810f
> (XEN) ffff82c4802b74e0 000000000001ee61 ffff830201aa9e90 ffff82c4802b75bc
> (XEN) 00000000002167f5 ffff88001ee61900 ffff82c4802b7518 ffff82c480211b80
> (XEN) ffff8302167f5000 ffff82c4801c168c 0000000000000000 ffff83003ea5c000
> (XEN) ffff88001ee61900 0000000001805063 0000000001809063 000000001ee001e3
> (XEN) 000000001ee61067 00000000002167f5 000000000022ee70 000000000022ed10
> (XEN) ffffffffffffffff 0000000a00000007 0000000000000004 ffff82c48025db80
> (XEN) ffff83003ea5c000 ffff82c4802b75bc ffff88001ee61900 ffff830201aa9e90
> (XEN) ffff82c4802b7528 ffff82c480211cb1 ffff82c4802b7568 ffff82c4801da97f
> (XEN) ffff82c4801be053 0000000000000008 ffff82c4802b7b58 ffff88001ee61900
> (XEN) 0000000000000000 ffff82c4802b78b0 ffff82c4802b75f8 ffff82c4801aaec8
> (XEN) 0000000000000003 ffff88001ee61900 ffff82c4802b78b0 ffff82c4802b7640
> (XEN) ffff83003ea5c000 00000000000000a0 0000000000000900 0000000000000008
> (XEN) 00000003802b7650 0000000000000004 00000003802b7668 0000000000000000
> (XEN) ffff82c4802b7b58 0000000000000001 0000000000000003 ffff82c4802b78b0
> (XEN) Xen call trace:
> (XEN) [<ffff82c48012b85e>] prepare_to_wait+0x178/0x1b2
> (XEN) [<ffff82c4801e02f9>] ept_get_entry+0x81/0xd8
> (XEN) [<ffff82c4801d810f>] gfn_to_mfn_type_p2m+0x55/0x114
> (XEN) [<ffff82c480211b80>] hap_p2m_ga_to_gfn_4_levels+0x1c4/0x2d6
> (XEN) [<ffff82c480211cb1>] hap_gva_to_gfn_4_levels+0x1f/0x2e
> (XEN) [<ffff82c4801da97f>] paging_gva_to_gfn+0xae/0xc4
> (XEN) [<ffff82c4801aaec8>] hvmemul_linear_to_phys+0xf1/0x25c
> (XEN) [<ffff82c4801ab762>] hvmemul_rep_movs+0xe8/0x31a
> (XEN) [<ffff82c48018de07>] x86_emulate+0x4e01/0x10fde
> (XEN) [<ffff82c4801aab3c>] hvm_emulate_one+0x12d/0x1c5
> (XEN) [<ffff82c4801b68a9>] handle_mmio+0x4e/0x1d8
> (XEN) [<ffff82c4801b3a1e>] hvm_hap_nested_page_fault+0x1e7/0x302
> (XEN) [<ffff82c4801d1ff6>] vmx_vmexit_handler+0x12cf/0x1594
> (XEN)
> (XEN) wake 1ee61 20
>
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel