xen-devel
Re: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Subject: Re: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Thu, 28 Apr 2011 16:29:09 -0700
Cc: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "giamteckchoon@xxxxxxxxx" <giamteckchoon@xxxxxxxxx>, "konrad.wilk@xxxxxxxxxx" <konrad.wilk@xxxxxxxxxx>
On 04/25/2011 10:52 PM, Tian, Kevin wrote:
>> From: MaoXiaoyun
>> Sent: Monday, April 25, 2011 11:15 AM
>>> Date: Fri, 15 Apr 2011 14:22:29 -0700
>>> From: jeremy@xxxxxxxx
>>> To: tinnycloud@xxxxxxxxxxx
>>> CC: giamteckchoon@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx;
>>> konrad.wilk@xxxxxxxxxx
>>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>>>
>>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
>>>> Hi:
>>>>
>>>> Could the crash be related to this patch?
>>>> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
>>>>
>>>> The TLB state change to TLBSTATE_OK (mmu_context.h:40) now happens
>>>> before cpumask_clear_cpu (line 49).
>>>> Is it possible that, right after executing line 40 of mmu_context.h,
>>>> the CPU receives an IPI from another CPU to flush the mm and, in the
>>>> interrupt handler, finds that the TLB state happens to be
>>>> TLBSTATE_OK, which conflicts?
>>> Does reverting it help?
>>>
>>> J
>>
>> Hi Jeremy:
>>
>> The latest test result shows that reverting didn't help.
>> The kernel panics at exactly the same place in tlb.c.
>>
>> I have a question about the TLB state. From the stack:
>> xen_do_hypervisor_callback -> xen_evtchn_do_upcall -> ...
>> -> drop_other_mm_ref
>>
>> What should cpu_tlbstate.state be? Are both TLBSTATE_OK and
>> TLBSTATE_LAZY possible?
>> That is, after a hypercall from userspace the state will be TLBSTATE_OK,
>> and from kernel space it will be TLBSTATE_LAZY?
>>
>> thanks.
> It looks like a bug in the drop_other_mm_ref implementation: the current
> TLB state should be checked before invoking leave_mm(). There's a window
> in the code below:
>
> <xen_drop_mm_ref>
>     /* Get the "official" set of cpus referring to our pagetable. */
>     if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
>         for_each_online_cpu(cpu) {
>             if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
>                 && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
>                 continue;
>             smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
>         }
>         return;
>     }
>
> There's a chance that, by the time smp_call_function_single is invoked,
> the actual TLB state has already been updated on the other cpu. The
> upstream kernel patch you referred to earlier just exposes this bug more
> easily. Even without that patch you may still hit the issue, which is why
> reverting the patch doesn't help.
>
> Could you try adding a check in drop_other_mm_ref?
>
>     if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
>         leave_mm(smp_processor_id());
>
> If the interrupted context has TLBSTATE_OK, that implies it will handle
> the TLB flush itself later, so there is no need to call leave_mm from the
> interrupt handler. leave_mm itself assumes the state is not TLBSTATE_OK.
That seems reasonable. MaoXiaoyun, does it fix the bug for you?
Kevin, could you submit this as a proper patch?
Thanks,
J
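
For readers following the thread, the proposed check can be sketched as a
small user-space model. This is only an illustration under assumptions, not
kernel code: struct cpu_model, struct mm_model and every model_* function
are hypothetical stand-ins for cpu_tlbstate, mm_struct, leave_mm and
drop_other_mm_ref, and the BUG() in leave_mm is modelled by setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the values and layout are illustrative only. */
enum tlb_state { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct mm_model { int id; };

struct cpu_model {
    enum tlb_state state;        /* models cpu_tlbstate.state */
    struct mm_model *active_mm;  /* models cpu_tlbstate.active_mm */
    bool hit_bug;                /* set where the kernel would BUG() */
};

/* Models leave_mm(): only valid while the CPU is not in TLBSTATE_OK. */
void model_leave_mm(struct cpu_model *cpu)
{
    if (cpu->state == TLBSTATE_OK) {
        cpu->hit_bug = true;     /* the real kernel BUG()s at this point */
        return;
    }
    cpu->active_mm = NULL;       /* drop the lazy reference to the mm */
}

/* Handler without the state check, as in the code quoted in the thread. */
void model_drop_ref_unguarded(struct cpu_model *cpu, struct mm_model *mm)
{
    if (cpu->active_mm == mm)
        model_leave_mm(cpu);
}

/* Handler with the check suggested in the thread. */
void model_drop_ref_guarded(struct cpu_model *cpu, struct mm_model *mm)
{
    if (cpu->active_mm == mm && cpu->state != TLBSTATE_OK)
        model_leave_mm(cpu);
}

/* Walks through the three cases; returns 0 if they behave as described. */
int model_demo(void)
{
    struct mm_model mm = { 1 };

    /* The target CPU switched to TLBSTATE_OK before the IPI arrived. */
    struct cpu_model raced = { TLBSTATE_OK, &mm, false };
    model_drop_ref_unguarded(&raced, &mm);
    assert(raced.hit_bug);

    /* Same race with the guard: leave_mm is skipped, no BUG. */
    struct cpu_model guarded = { TLBSTATE_OK, &mm, false };
    model_drop_ref_guarded(&guarded, &mm);
    assert(!guarded.hit_bug && guarded.active_mm == &mm);

    /* A lazy CPU still drops its reference as before. */
    struct cpu_model lazy = { TLBSTATE_LAZY, &mm, false };
    model_drop_ref_guarded(&lazy, &mm);
    assert(!lazy.hit_bug && lazy.active_mm == NULL);

    return 0;
}
```

The model leaves out the cr3 reload and the real per-cpu accessors; it only
shows why the extra state test avoids the BUG path when the IPI arrives
after the target CPU has already moved to TLBSTATE_OK.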