WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state whe

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date: Wed, 15 Jun 2011 10:40:01 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 15 Jun 2011 02:40:46 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DF896CC0200007800047532@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4DF896CC0200007800047532@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110424 Lightning/1.0b2 Thunderbird/3.1.10

On 15/06/11 10:26, Jan Beulich wrote:
>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> Introduce the boolean variable 'kexecing' which indicates to functions
>> whether we are on the kexec path or not.  This is used by
>> disable_local_APIC() to try and revert the APIC mode back to how it
>> was found on boot.
>>
>> We also need some fudging of the x2apic_enabled variable.  It is used
>> in multiple places over the codebase to mean multiple things,
>> including:
>>     What did the user specifify on the command line?
>>     Did the BIOS boot me in x2apic mode?
>>     Is the BSP Local APIC in x2apic mode?
>>     What mode is my Local APIC in?
> I don't really follow the need for this, and a properly explaining
> comment certainly also belongs at the place where the hack is.

It was more for reference when some unlucky person really has to go and
refactor the apic code.  This and the paragraph following it were meant
to convey the why we are choosing to re-evaluate the x2apic_enabled
variable.

>> Therefore, set it up to prevent a protection fault when disabling the
>> IOAPICs.  (In this case, it is used in the "What mode is my Local APIC
>> in?" case, so the processor doesnt suffer a protection fault because
>> of trying to use x2apic MSRs when it should be using xapic MMIO)
>>
>> Finally, make sure that interrupts are disabled when jumping into the
>> purgatory code.  It would be bad to service interrupts in the Xen
>> context when the next kernel is booting.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c
>> --- a/xen/arch/x86/apic.c    Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/apic.c    Mon Jun 13 17:45:43 2011 +0100
>> @@ -37,6 +37,7 @@
>>  #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */
>>  #include <mach_apic.h>
>>  #include <io_ports.h>
>> +#include <xen/kexec.h>
>>  
>>  static bool_t tdt_enabled __read_mostly;
>>  static bool_t tdt_enable __initdata = 1;
>> @@ -345,6 +346,33 @@ void disable_local_APIC(void)
>>          wrmsrl(MSR_IA32_APICBASE, msr_content &
>>                 ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD));
>>      }
>> +
>> +    if ( kexecing )
>> +    {
>> +        uint64_t msr_content;
>> +        rdmsrl(MSR_IA32_APICBASE, msr_content);
>> +        msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> +        wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +
>> +        switch ( apic_boot_mode )
> Did you verify this gets executed only for the single remaining CPU?

It most definitely runs on all CPUs.  Because of the difference between
x2apic and xapic interrupts, it is stark-raving mad to try and run a
system with different lapics in different modes as the default operating
state.

>> +        {
>> +        case APIC_MODE_DISABLED:
>> +            break; /* Nothing to do - we did this above */
>> +        case APIC_MODE_XAPIC:
>> +            msr_content |= MSR_IA32_APICBASE_ENABLE;
>> +            wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +            break;
>> +        case APIC_MODE_X2APIC:
>> +            msr_content |= 
>> (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD);
>> +            wrmsrl(MSR_IA32_APICBASE, msr_content);
>> +            break;
>> +        default:
>> +            printk("Default case when reverting #%d lapic to boot state\n",
>> +                   smp_processor_id());
>> +            break;
>> +        }
>> +    }
>> +
>>  }
>>  
>>  /*
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c
>> --- a/xen/arch/x86/crash.c   Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/crash.c   Mon Jun 13 17:45:43 2011 +0100
>> @@ -27,6 +27,7 @@
>>  #include <asm/hvm/support.h>
>>  #include <asm/apic.h>
>>  #include <asm/io_apic.h>
>> +#include <xen/iommu.h>
>>  
>>  static atomic_t waiting_for_crash_ipi;
>>  static unsigned int crashing_cpu;
>> @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void)
>>      iommu_crash_shutdown();
>>  
>>      __stop_this_cpu();
>> +
>> +    /* This is a bit of a hack due to the problems with the x2apic_enabled
>> +     * variable, but we can't do any better without a significant 
>> refactoring
>> +     * of the APIC code */
> Ugly, but I can't exclude it may indeed be necessary. But no matter
> what, I think this belongs into apic.c.

I tried but couldn't get it to work in any sane way.  This is, as far as
I can tell, the only simple solution to the problem

>> +    if ( current_local_apic_mode() == APIC_MODE_X2APIC )
>> +        x2apic_enabled = 1;
>> +    else
>> +        x2apic_enabled = 0;
> Do you really need to force x2apic_enabled *both* ways to avoid
> described protection fault? And really I still don't follow why the
> variable at the end of the life of the system all of the sudden needs
> tweaking, when the system lived happily with its "normal" value.
>
> Jan

You do need to force it both ways.  disable_IO_APIC() which is the
following call runs the risk of causing a protection fault, when setting
virtual wire mode back up.  However, in the alternate case where the
local APIC is in x2apic mode, and x2apic_enabled is false, all the APIC
code will attempt to use MMIO and get confused when twiddling it does
nothing.  (This is one of the problems the linux kdump kernel had until
very recently)

~Andrew

>> +
>>      disable_IO_APIC();
>>  
>>      local_irq_restore(flags);
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c
>> --- a/xen/arch/x86/machine_kexec.c   Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/arch/x86/machine_kexec.c   Mon Jun 13 17:45:43 2011 +0100
>> @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im
>>      if ( hpet_broadcast_is_available() )
>>          hpet_disable_legacy_broadcast();
>>  
>> +    /* We are about to permenantly jump out of the Xen context into the 
>> kexec
>> +     * purgatory code.  We really dont want to be still servicing 
>> interupts.
>> +     */
>> +    local_irq_disable();
>> +
>>      /*
>>       * compat_machine_kexec() returns to idle pagetables, which requires us
>>       * to be running on a static GDT mapping (idle pagetables have no GDT
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c
>> --- a/xen/common/kexec.c     Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/common/kexec.c     Mon Jun 13 17:45:43 2011 +0100
>> @@ -29,6 +29,8 @@
>>  #include <compat/kexec.h>
>>  #endif
>>  
>> +bool_t kexecing = FALSE;
>> +
>>  static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes);
>>  
>>  static Elf_Note *xen_crash_note;
>> @@ -220,6 +222,8 @@ void kexec_crash(void)
>>      if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) )
>>          return;
>>  
>> +    kexecing = TRUE;
>> +
>>      kexec_common_shutdown();
>>      kexec_crash_save_cpu();
>>      machine_crash_shutdown();
>> @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image)
>>  {
>>      xen_kexec_image_t *image = _image;
>>  
>> +    kexecing = TRUE;
>> +
>>      kexec_common_shutdown();
>>      machine_reboot_kexec(image);
>>  
>> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h
>> --- a/xen/include/xen/kexec.h        Mon Jun 13 17:45:43 2011 +0100
>> +++ b/xen/include/xen/kexec.h        Mon Jun 13 17:45:43 2011 +0100
>> @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve {
>>  
>>  extern xen_kexec_reserve_t kexec_crash_area;
>>  
>> +extern bool_t kexecing;
>> +
>>  void set_kexec_crash_area_size(u64 system_ram);
>>  
>>  /* We have space for 4 images to support atomic update
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx 
>> http://lists.xensource.com/xen-devel 
>

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel