xen-devel

[Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] 2.6.32 PV Xen domU guest panic on nested call to arch_enter_lazy_mmu_mode()
From: Chuck Anderson <chuck.anderson@xxxxxxxxxx>
Date: Tue, 07 Dec 2010 16:54:37 -0800

I'm posting this because I am writing a patch to fix a panic in a 2.6.32-based PV Xen domU. The panic is caused by a nested call to arch_enter_lazy_mmu_mode() (arch/x86/include/asm/paravirt.h); details are below. The following BUG_ON() was triggered:

   arch/x86/kernel/paravirt.c

   static inline void enter_lazy(enum paravirt_lazy_mode mode)
   {
           BUG_ON(percpu_read(paravirt_lazy_mode) != PARAVIRT_LAZY_NONE);

           percpu_write(paravirt_lazy_mode, mode);
   }

because enter_lazy() was called twice, once through mm/memory.c copy_pte_range() and a second time through an interrupt path.
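
To make the failure concrete, here is a minimal user-space model of that check. It is a sketch, not kernel code: the per-CPU variable is replaced by a plain static (one CPU is modelled), and BUG_ON() is replaced by a hypothetical `bug_hit` flag so the model can report the violation instead of halting. The names `model_enter_lazy`, `model_leave_lazy`, `replay_nested_enter` and `replay_paired_enter` are invented for this illustration.

```c
#include <stdbool.h>

/* Mirrors the kernel enum names; the values matter only inside this model. */
enum paravirt_lazy_mode {
    PARAVIRT_LAZY_NONE,
    PARAVIRT_LAZY_MMU,
    PARAVIRT_LAZY_CPU,
};

/* Stand-in for the per-CPU paravirt_lazy_mode variable. */
static enum paravirt_lazy_mode lazy_mode = PARAVIRT_LAZY_NONE;

/* Hypothetical flag: set where the real kernel would hit BUG_ON(). */
static bool bug_hit = false;

static void model_enter_lazy(enum paravirt_lazy_mode mode)
{
    if (lazy_mode != PARAVIRT_LAZY_NONE)
        bug_hit = true;             /* kernel: BUG_ON() fires here */
    lazy_mode = mode;
}

static void model_leave_lazy(void)
{
    lazy_mode = PARAVIRT_LAZY_NONE;
}

/* Enter from the fork path, then enter again from the interrupt path
 * before any leave. Returns true if the check tripped. */
static bool replay_nested_enter(void)
{
    lazy_mode = PARAVIRT_LAZY_NONE;
    bug_hit = false;
    model_enter_lazy(PARAVIRT_LAZY_MMU);   /* via copy_pte_range() */
    model_enter_lazy(PARAVIRT_LAZY_MMU);   /* via the interrupt path */
    return bug_hit;
}

/* Control case: each enter is matched by a leave before the next enter. */
static bool replay_paired_enter(void)
{
    lazy_mode = PARAVIRT_LAZY_NONE;
    bug_hit = false;
    model_enter_lazy(PARAVIRT_LAZY_MMU);
    model_leave_lazy();
    model_enter_lazy(PARAVIRT_LAZY_MMU);
    model_leave_lazy();
    return bug_hit;
}
```

In this model only the nested sequence trips the check; the paired sequence does not, which matches the behaviour described above.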

The easy fix is to disable interrupts in copy_pte_range() before calling arch_enter_lazy_mmu_mode() and re-enable them after the call to arch_leave_lazy_mmu_mode(), but I'm asking whether there is a better way to handle this. If disabling interrupts is the best approach, the other callers of arch_enter_lazy_mmu_mode() appear to be exposed to the same interruption, so it may be better to disable interrupts in arch_enter_lazy_mmu_mode() or paravirt_enter_lazy_mmu() itself.
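
The effect of the interrupt-disabling approach can be sketched with another small user-space model. This is an illustration under stated assumptions, not the patch itself: interrupt delivery is modelled by a boolean and a pending flag, and every `model_*` name, along with `bug_hit` and `handler_runs`, is hypothetical. In a real kernel the corresponding primitives would presumably be local_irq_save() and local_irq_restore().

```c
#include <stdbool.h>

static bool lazy_mmu_active = false;  /* stand-in for lazy mode != NONE */
static bool irqs_enabled = true;      /* modelled interrupt flag */
static bool irq_pending = false;      /* interrupt raised while disabled */
static bool bug_hit = false;          /* set where BUG_ON() would fire */
static int handler_runs = 0;

static void model_enter_lazy_mmu(void)
{
    if (lazy_mmu_active)
        bug_hit = true;
    lazy_mmu_active = true;
}

static void model_leave_lazy_mmu(void)
{
    lazy_mmu_active = false;
}

/* Models the interrupt path that ends in apply_to_pte_range(). */
static void model_irq_handler(void)
{
    handler_runs++;
    model_enter_lazy_mmu();
    model_leave_lazy_mmu();
}

/* Deliver immediately if enabled, otherwise remember it for later. */
static void model_raise_irq(void)
{
    if (irqs_enabled)
        model_irq_handler();
    else
        irq_pending = true;
}

static void model_irq_disable(void)
{
    irqs_enabled = false;
}

/* Re-enable interrupts and run anything that was held back. */
static void model_irq_enable(void)
{
    irqs_enabled = true;
    if (irq_pending) {
        irq_pending = false;
        model_irq_handler();
    }
}

/* Models copy_pte_range() with an interrupt arriving mid-section.
 * Returns true if the nested-enter check tripped. */
static bool model_copy_pte_range(bool disable_irqs)
{
    lazy_mmu_active = false;
    irqs_enabled = true;
    irq_pending = false;
    bug_hit = false;
    handler_runs = 0;

    if (disable_irqs)
        model_irq_disable();
    model_enter_lazy_mmu();
    model_raise_irq();              /* block interrupt arrives here */
    model_leave_lazy_mmu();
    if (disable_irqs)
        model_irq_enable();         /* deferred handler runs now */
    return bug_hit;
}
```

With `disable_irqs` false the handler runs inside the lazy section and the check trips; with it true the handler still runs exactly once, but only after the section has been left. The cost is that interrupts stay off for the whole section, which is one reason to ask whether a better approach exists.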

Here is how the nested call to arch_enter_lazy_mmu_mode() was made. The first call path is:

   do_fork()
     copy_process()
       dup_mm()
         dup_mmap()
           copy_page_range()
             copy_pud_range()
               copy_pmd_range()
                 copy_pte_range()
                   arch_enter_lazy_mmu_mode()
                     paravirt_enter_lazy_mmu()
                       enter_lazy()

We then return up the stack to copy_pte_range() in mm/memory.c, and the guest is interrupted while it is still in that function. Here is the edited interrupt call stack that brings us to arch_enter_lazy_mmu_mode() a second time without an intervening arch_leave_lazy_mmu_mode(), triggering the BUG_ON() in enter_lazy():

   xen_evtchn_do_upcall()
     handle_irq()
      blkif_interrupt()
        do_blkif_request()
          blkif_queue_request()
            gnttab_alloc_grant_references()
              get_free_entries()
                gnttab_expand()
                  gnttab_map()
                    arch_gnttab_map_shared()
                      apply_to_page_range(... map_pte_fn ...)

We get to enter_lazy() downstream from apply_to_page_range():

   apply_to_page_range(... map_pte_fn ...)
     apply_to_pud_range(... map_pte_fn ...)
       apply_to_pmd_range(... map_pte_fn ...)
         apply_to_pte_range(... map_pte_fn ...)
           arch_enter_lazy_mmu_mode()
             paravirt_enter_lazy_mmu()
               enter_lazy()

The spinlocks taken indirectly through copy_pte_range() in mm/memory.c are acquired with spin_lock() and spin_acquire(), which I believe do not disable interrupts, so nothing on the first path prevents the interrupt from arriving inside the lazy MMU section.

Thanks,
Chuck
