xen-devel
Re: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_en
On 12/08/2010 12:48 AM, Jan Beulich wrote:
>>>> On 08.12.10 at 01:54, Chuck Anderson <chuck.anderson@xxxxxxxxxx> wrote:
>> I'm posting this because I am writing a patch to fix a 2.6.32 based PV
>> Xen domU panic due to a nested call to arch/x86/include/asm/paravirt.h
>> arch_enter_lazy_mmu_mode() (see details below). The following BUG_ON()
>> was triggered:
>>
>> arch/x86/kernel/paravirt.c
>>
>> static inline void enter_lazy(enum paravirt_lazy_mode mode)
>> {
>>         BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);
>>
>>         percpu_write(paravirt_lazy_mode, mode);
>> }
>>
>> because enter_lazy() was called twice, once through mm/memory.c
>> copy_pte_range() and a second time through an interrupt path.
>>
>> The easy fix is to disable interrupts in copy_pte_range() before calling
>> arch_enter_lazy_mmu_mode() and re-enable them after the call to
>> arch_leave_lazy_mmu_mode() but I'm asking if there is a better way to
>> handle this. If disabling interrupts is best, there are other calls to
>> arch_enter_lazy_mmu_mode() that appear to have the same interruption
>> issue. It may be best then to disable interrupts in
>> arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu().
> I don't think this is an option, as the period of time for which you
> would disable interrupts could be pretty much unbounded.
>
> Instead (being a performance optimization only anyway)
> the BUG_ON() could be removed (accepting that the
> interrupted sequence would not batch any further
> hypercalls, and provided all of this stuff can actually be
> used in a nested way), the flag could be converted to a
> counter (again provided nesting is okay here in the first
> place), or a filter could be applied when actually checking
> whether to batch (which is what we do in our non-pvops
> kernels: in IRQ context, no batching happens).

That's what happens in pvops kernels too: batching is disabled in
interrupt context so that (for example) vmalloc pagefault pte updates
aren't deferred.

Looks like enter/leave lazy should just be a no-op in interrupt context too.

Though I'm surprised it has taken so long for this to appear.

J
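
For illustration only, here is a minimal userspace sketch of the "no-op in
interrupt context" idea discussed above. It is not the actual kernel patch.
The names `lazy_mode`, `in_irq` and `get_lazy_mode()` are hypothetical
stand-ins: in the kernel the state is per-CPU, and the context check would
presumably be in_interrupt(). BUG_ON() is modelled with assert().

```c
#include <assert.h>
#include <stdbool.h>

enum paravirt_lazy_mode {
        PARAVIRT_LAZY_NONE,
        PARAVIRT_LAZY_MMU,
        PARAVIRT_LAZY_CPU,
};

/* Hypothetical stand-ins for the per-CPU lazy mode and for in_interrupt(). */
static enum paravirt_lazy_mode lazy_mode = PARAVIRT_LAZY_NONE;
static bool in_irq = false;

static void enter_lazy(enum paravirt_lazy_mode mode)
{
        /* Assumed fix: in interrupt context, do nothing at all. */
        if (in_irq)
                return;

        /* Models the original BUG_ON(): task-level nesting is still a bug. */
        assert(lazy_mode == PARAVIRT_LAZY_NONE);
        lazy_mode = mode;
}

static void leave_lazy(enum paravirt_lazy_mode mode)
{
        /* Must mirror enter_lazy(), or the interrupted state gets cleared. */
        if (in_irq)
                return;

        assert(lazy_mode == mode);
        lazy_mode = PARAVIRT_LAZY_NONE;
}

/* Batching decision: never batch while in interrupt context. */
static enum paravirt_lazy_mode get_lazy_mode(void)
{
        if (in_irq)
                return PARAVIRT_LAZY_NONE;
        return lazy_mode;
}
```

With this shape, an interrupt that arrives while copy_pte_range() is in lazy
MMU mode neither trips the check nor batches its own updates, and the
interrupted sequence resumes batching once the interrupt returns.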
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel