WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] Unified shutdown code

To: "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Unified shutdown code
From: "Cihula, Joseph" <joseph.cihula@xxxxxxxxx>
Date: Mon, 10 Sep 2007 14:22:27 -0700
Cc: "Wang, Shane" <shane.wang@xxxxxxxxx>, "Wei, Gang" <gang.wei@xxxxxxxxx>
Delivery-date: Mon, 10 Sep 2007 14:23:04 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <C30B3A1B.155B6%Keir.Fraser@xxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <D936D925018D154694D8A362EEB08920025E4BF7@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C30B3A1B.155B6%Keir.Fraser@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcfxsZEip0g/k3RSRdWVCnLdnRf+DgCHIfGIAAaqfEA=
Thread-topic: [Xen-devel] [PATCH] Unified shutdown code
Keir Fraser <mailto:Keir.Fraser@xxxxxxxxxxxx> scribbled on Monday,
September 10, 2007 10:15 AM:
> I'm not enormously keen on most of this patch. Early startup failures
> are rare enough that trying to cleanly shutdown is quite a pain. Also
at
> that point if we reboot then the user cannot see what the error was,
to
> correct it. And implementing a reliable software delay or
keypress-check at
> that point is quite a pain. Changing machine_halt() seems rather
pointless
> -- if we're going to halt, why bother to clean up any state?

It was my mistake to change the EARLY_FAIL from halt to reboot.

But the purpose of centralizing it was so that the hook into sboot's
shutdown wouldn't need to be in multiple place.  And the reason to hook
into sboot's shutdown (which also supports the halt action) even though
the system is being halt'ed is so that we don't leave some path that
allows the system to be subverted or misused while still having
privileged access to the TPM, etc.

That said, I'm not aware of any exploitable conditions/paths/environment
when Xen is placed in a halt loop (at least none that JTAG users
wouldn't already have without waiting for the system to halt), so I
suppose that this extra bit of caution is not really necessary.  But if
the EARLY_FAIL behavior gets changed back to halt, is there any harm?

> 
> So all I've taken is the code to wait for other CPUs to confirm their
> offline status before actually restarting the machine. But I moved
> that code to smp_send_stop() and changed the timeout to 10ms, as 50us
is pretty
> short. Staging changeset 15848.

This is fine.

> 
> If there's a good reason for the other patch pieces then the
> motivation needs to be more clearly stated.
> 
>  -- Keir
> 
> On 8/9/07 01:45, "Cihula, Joseph" <joseph.cihula@xxxxxxxxx> wrote:
> 
>> Attached and below are a patch that unifies the shutdown code paths
>> in Xen, including those from EARLY_FAIL.  This will facilitate the
>> use of Intel(R) TXT for measured launch by ensuring that all
>> shutdowns will call the necessary hook to tear down the measured
>> environment.  It also centralizes any future shutdown-related
>> changes. 
>> 
>> The patch also postpones clearing the online status APs in
>> stop_this_cpu() until just before putting them into the hlt loop. 
>> This ensures that when machine_teardown() checks that all APs have
>> finished their shutdown, that they have really have completed all
>> interesting tasks. 
>> 
>> This applies to the staging tree as of today.
>> 
>> Joe
>> 
>> # HG changeset patch
>> # User Joseph Cihula <joseph.cihula@xxxxxxxxx>
>> # Date 1189211821 25200
>> # Node ID a85c4081738a77a4691640c823a74fb6159ca722
>> # Parent  a53aaea4c69813a7143daa677b9e65d1d2f15b6b
>> Unify all shutdown paths to go through machine_teardown().  This will
>> facilitate shutdown after an Intel(R) TXT measured launch, as well as
>> centralizing any other shutdown-related changes.
>> 
>> Postpone clearing the online status APs in stop_this_cpu() until
>> just before putting them into the hlt loop.  This ensures that when
>> machine_teardown() checks that all APs have finished their shutdown,
>> that they have really have completed all interesting tasks.
>> 
>> Signed-off-by:  Joseph Cihula <joseph.cihula@xxxxxxxxx>
>> 
>> diff -r a53aaea4c698 -r a85c4081738a xen/arch/ia64/xen/domain.c
>> --- a/xen/arch/ia64/xen/domain.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/arch/ia64/xen/domain.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -1486,7 +1486,7 @@ int __init construct_dom0(struct domain
>> return 0;
>>  }
>> 
>> -void machine_restart(char * __unused)
>> +void machine_restart(int __unused)
>>  {
>> console_start_sync();
>> if (running_on_sim)
>> diff -r a53aaea4c698 -r a85c4081738a xen/arch/powerpc/domain.c
>> --- a/xen/arch/powerpc/domain.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/arch/powerpc/domain.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -119,7 +119,7 @@ void machine_halt(void)
>>      machine_fail(__func__);
>>  }
>> 
>> -void machine_restart(char * __unused)
>> +void machine_restart(int __unused)
>>  {
>>      console_start_sync();
>>      printk("%s called\n", __func__);
>> diff -r a53aaea4c698 -r a85c4081738a xen/arch/x86/setup.c
>> --- a/xen/arch/x86/setup.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/arch/x86/setup.c Fri Sep 07 17:37:01 2007 -0700 @@ -19,6
>>  +19,7 @@ #include <xen/numa.h>
>>  #include <xen/rcupdate.h>
>>  #include <xen/vga.h>
>> +#include <xen/shutdown.h>
>>  #include <public/version.h>
>>  #ifdef CONFIG_COMPAT
>>  #include <compat/platform.h>
>> @@ -168,7 +169,7 @@ static void __init do_initcalls(void)
>> 
>>  #define EARLY_FAIL(f, a...) do {                \
>>      printk( f , ## a );                         \
>> -    for ( ; ; ) __asm__ __volatile__ ( "hlt" ); \
>> +    machine_restart(1);                         \
>>  } while (0)
>> 
>>  static unsigned long __initdata initial_images_start,
>> initial_images_end; diff -r a53aaea4c698 -r a85c4081738a
>> xen/arch/x86/shutdown.c --- a/xen/arch/x86/shutdown.c Fri Sep 07
>> 19:54:29 2007 +0100 +++ b/xen/arch/x86/shutdown.c Fri Sep 07
>> 17:37:01 2007 -0700 @@ -29,6 +29,10 @@ static long no_idt[2];
>>  static long no_idt[2];
>>  static int reboot_mode;
>> 
>> +#define TEARDOWN_TYPE_REBOOT    0
>> +#define TEARDOWN_TYPE_HALT      1
>> +#define TEARDOWN_TYPE_EARLY     2
>> +
>>  static inline void kb_wait(void)
>>  {
>>      int i;
>> @@ -36,20 +40,6 @@ static inline void kb_wait(void)
>>      for ( i = 0; i < 0x10000; i++ )
>>          if ( (inb_p(0x64) & 0x02) == 0 )
>>              break;
>> -}
>> -
>> -static void  __attribute__((noreturn)) __machine_halt(void *unused)
>> -{ 
>> -    for ( ; ; )
>> -        __asm__ __volatile__ ( "hlt" );
>> -}
>> -
>> -void machine_halt(void)
>> -{
>> -    watchdog_disable();
>> -    console_start_sync();
>> -    smp_call_function(__machine_halt, NULL, 1, 0);
>> -    __machine_halt(NULL);
>>  }
>> 
>>  #ifdef __i386__
>> @@ -197,32 +187,71 @@ static void machine_real_restart(const u
>> 
>>  #endif
>> 
>> -void machine_restart(char *cmd)
>> +/*
>> + * generic teardown (i.e. kill watchdog, disable hvm, disable IO
>> APIC, etc.) + *
>> + * will teardown APs and leave in hlt loop
>> + * will teardown BSP and reboot or halt
>> + *
>> + * TEARDOWN_TYPE_EARLY is used by __start_xen (in EARLY_FAIL())
>> before much + * of the system has been initialized (e.g. before
>> fixmap) + */ +static void machine_teardown(int type)
>>  {
>>      int i;
>> +    int timeout = 10;
>> +
>> +    if ( type != TEARDOWN_TYPE_EARLY )
>> +    {
>> +        /* Ensure we are the boot CPU. */
>> +        if ( GET_APIC_ID(apic_read(APIC_ID)) !=
>> boot_cpu_physical_apicid )
>> + {
>> +     printk("machine_teardown() not called on BSP\n");
>> +     /* Send IPI to the boot CPU (logical cpu 0). */
>> +     on_selected_cpus(cpumask_of_cpu(0), (void
>> *)machine_teardown,
>> +        (void *)(unsigned long)type, 1, 0);
>> +     /* park us until smp_send_stop() sends us to shutdown
>> handler */
>> +     for ( ; ; )
>> +         safe_halt();
>> + }
>> +    }
>> 
>>      watchdog_disable();
>> +
>>      console_start_sync();
>> 
>> -    local_irq_enable();
>> -
>> -    /* Ensure we are the boot CPU. */
>> -    if ( GET_APIC_ID(apic_read(APIC_ID)) !=
>> boot_cpu_physical_apicid ) 
>> -    {
>> -        /* Send IPI to the boot CPU (logical cpu 0). */
>> -        on_selected_cpus(cpumask_of_cpu(0), (void *)machine_restart,
>> -                         NULL, 1, 0);
>> -        for ( ; ; )
>> -            safe_halt();
>> -    }
>> +    if ( type != TEARDOWN_TYPE_EARLY )
>> +        local_irq_enable();
>> 
>>      /*
>>       * Stop all CPUs and turn off local APICs and the IO-APIC, so
>>       * other OSs see a clean IRQ state.
>>       */
>> -    smp_send_stop();
>> -    disable_IO_APIC();
>> +    printk("num_online_cpus=%d\n", num_online_cpus());
>> +    if ( num_online_cpus() > 1 )
>> +        smp_send_stop();
>> +
>> +    if ( type != TEARDOWN_TYPE_EARLY )
>> +        disable_IO_APIC();
>> +
>>      hvm_cpu_down();
>> +
>> +    /* BSP needs to wait until all APs have gone through shutdown
>> (disabled */ +    /* VMX, etc.) before TXT teardown else will hang;
>> but also a good */ +    /* practice even for non-TXT shutdown */
>> +    while ( num_online_cpus() > 1 && timeout > 0 ) {
>> +        /* TBD: determine why this printk is necessary to prevent
>> timeout */ +        printk("loop: num_online_cpus = %d\n",
>> num_online_cpus()); +        /* if not all APs have shutdown, wait a
>> little */ +        udelay(5); +        timeout--;
>> +    }
>> +
>> +    /* just shutdown, not reboot */
>> +    if ( type == TEARDOWN_TYPE_HALT ) {
>> +        for ( ; ; )
>> +            __asm__ __volatile__ ( "hlt" );
>> +    }
>> 
>>      /* Rebooting needs to touch the page at absolute address 0. */
>>      *((unsigned short *)__va(0x472)) = reboot_mode;
>> @@ -248,6 +277,16 @@ void machine_restart(char *cmd)
>>      machine_real_restart(jump_to_bios, sizeof(jump_to_bios));  }
>> 
>> +void machine_halt(void)
>> +{
>> +    machine_teardown(TEARDOWN_TYPE_HALT);
>> +}
>> +
>> +void machine_restart(int early)
>> +{
>> +  machine_teardown( early ? TEARDOWN_TYPE_EARLY :
>> TEARDOWN_TYPE_REBOOT ); +}
>> +
>>  #ifndef reboot_thru_bios
>>  static int __init set_bios_reboot(struct dmi_system_id *d)  {
>> diff -r a53aaea4c698 -r a85c4081738a xen/arch/x86/smp.c
>> --- a/xen/arch/x86/smp.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/arch/x86/smp.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -306,12 +306,13 @@ int on_selected_cpus(
>> 
>>  static void stop_this_cpu (void *dummy)
>>  {
>> -    cpu_clear(smp_processor_id(), cpu_online_map); -
>>      local_irq_disable();
>>      disable_local_APIC();
>>      hvm_cpu_down();
>> 
>> +    /* not really offline until just before we halt */
>> +    cpu_clear(smp_processor_id(), cpu_online_map); +
>>      for ( ; ; )
>>          __asm__ __volatile__ ( "hlt" );
>>  }
>> diff -r a53aaea4c698 -r a85c4081738a xen/common/keyhandler.c
>> --- a/xen/common/keyhandler.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/common/keyhandler.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -123,7 +123,7 @@ static void halt_machine(unsigned char k
>>  static void halt_machine(unsigned char key, struct cpu_user_regs
>>      *regs)  { printk("'%c' pressed -> rebooting machine\n", key); -
>> machine_restart(NULL); +    machine_restart(0);
>>  }
>> 
>>  static void cpuset_print(char *set, int size, cpumask_t mask)
>> diff -r a53aaea4c698 -r a85c4081738a xen/common/shutdown.c
>> --- a/xen/common/shutdown.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/common/shutdown.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -24,7 +24,7 @@ static void maybe_reboot(void)
>>          printk("rebooting machine in 5 seconds.\n");         
>>          watchdog_disable(); mdelay(5000);
>> -        machine_restart(NULL);
>> +        machine_restart(0);
>>      }
>>  }
>> 
>> @@ -50,7 +50,7 @@ void dom0_shutdown(u8 reason)
>>      case SHUTDOWN_reboot:
>>      {
>>          printk("Domain 0 shutdown: rebooting machine.\n");
>> -        machine_restart(NULL);
>> +        machine_restart(0);
>>          break; /* not reached */
>>      }
>> 
>> diff -r a53aaea4c698 -r a85c4081738a xen/drivers/char/console.c
>> --- a/xen/drivers/char/console.c Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/drivers/char/console.c Fri Sep 07 17:37:01 2007 -0700
>> @@ -895,7 +895,7 @@ void panic(const char *fmt, ...)      {
>>          watchdog_disable();
>>          mdelay(5000);
>> -        machine_restart(NULL);
>> +        machine_restart(0);
>>      }
>>  }
>> 
>> diff -r a53aaea4c698 -r a85c4081738a xen/include/xen/shutdown.h
>> --- a/xen/include/xen/shutdown.h Fri Sep 07 19:54:29 2007 +0100
>> +++ b/xen/include/xen/shutdown.h Fri Sep 07 17:37:01 2007 -0700
>> @@ -6,7 +6,7 @@ extern int opt_noreboot;
>> 
>>  void dom0_shutdown(u8 reason);
>> 
>> -void machine_restart(char *cmd);
>> +void machine_restart(int early);
>>  void machine_halt(void);
>>  void machine_power_off(void);
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>