[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 13/13] xen/arm: Add host system suspend backend


  • To: Oleksandr Tyshchenko <olekstysh@xxxxxxxxx>
  • From: Mykola Kvach <xakep.amatop@xxxxxxxxx>
  • Date: Mon, 18 May 2026 19:49:33 +0300
  • Arc-authentication-results: i=1; mx.google.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=2Apv+qMKbA7fz7/8la2yYzKpT8LQKv7OimyMDcCXInk=; fh=SFtdlMni3BnFH+oTSZbVgVF1hLyFxrwlVmTI+BqD6+c=; b=Y053u7+ksSq2USO2mZpqe1qQaqWbMm62lIzalwSMmljQkSe6j1ipUKotuFPvA8ypYg LmxxBeRKlR8ixJiF/CmCZLjhIjJ+BcqLXfRo53pfMT8amKNr5aZqctmV33Udp2TZ6GaC jNWYLll0intSYunqxw7hgWLCbWEwPmPKeNgKRaofto3+ZccHHsU1hDDTTLEW2rLiuzdL 01wfzhRuaVJN9bSg41rRs+S16FWExhonlvKbsUWSsSnibvPqXOE+jcGQIiagYph3hUG0 LtNU5tcSVeD4f6v0rKknb/GbiJFzAURCbtgh8iJW1rOrSprtJRW6r4lHbFIAxvfGeEpr 9rww==; darn=lists.xenproject.org
  • Arc-seal: i=1; a=rsa-sha256; t=1779122985; cv=none; d=google.com; s=arc-20240605; b=Q7ahKdAi/ytb+adxfgwau/iRKzgThadUCXpjzxDW8HvKVtUXX+Dqt6IZNdIHqxq9tq p1fOB1rfXPTJ0lLBMBEf1+SwZiGaS+8QdhQ9o0ntQt2YaGtxVRJN9iTF2OK2X2oo4apV /WmMa4erZdYTqptQGSgiIujuydTNzAvVcofFvsqsRBaf7+VCp78JWVhVUHB2QRmcz18t /aeYEkdOBfy8GWsl37r7lNttnrNyi3grqBsITlIArz5QSD/zIRdJcdXfLpHoex6vMVEX NKDDN7pdvA+wxIoouLvTuaNgCDMrr7NqEk4Sx838txwlCuzPIyHIVCupAXzefBxCZgBd zt9g==
  • Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=20251104 header.d=gmail.com header.i="@gmail.com" header.h="Content-Transfer-Encoding:Cc:To:Subject:Message-ID:Date:From:In-Reply-To:References:MIME-Version"
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Mykola Kvach <mykola_kvach@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Mon, 18 May 2026 16:49:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Oleksandr,

Thank you for the review.

On Sun, May 17, 2026 at 4:00 PM Oleksandr Tyshchenko
<olekstysh@xxxxxxxxx> wrote:
>
>
>
> On 5/12/26 20:07, Mykola Kvach wrote:
>
> Hello Mykola
>
> > From: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
> >
> > Add the Xen-wide suspend/resume backend used after a control-domain
> > vPSCI SYSTEM_SUSPEND request has been accepted. The vPSCI policy,
> > runtime driver blockers and control-domain sequencing checks are handled
> > by the preceding commit; this change adds the code that actually drives
> > the host suspend attempt.
> >
> > The backend runs from a tasklet scheduled on pCPU0, because non-boot CPUs
> > are disabled during suspend. It freezes domains, disables the scheduler
> > and then disables non-boot CPUs.
> >
> > Host-side suspend participants are handled in phases. IOMMU and console
> > state are suspended first. Local IRQs are then disabled before suspending
> > timer and GIC state. On resume or failure, the completed suspend phases
> > are unwound in reverse: GIC and timer state are restored while IRQs are
> > still disabled, local IRQs are restored, and then console and IOMMU state
> > are restored.
> >
> > On boot, init_ttbr is normally initialized during secondary CPU hotplug.
> > On uniprocessor systems this can leave init_ttbr uninitialized, so set it
> > from the boot CPU before entering suspend.
> >
> > Note: the code is behind CONFIG_HAS_SYSTEM_SUSPEND, which is currently
> > only selected when UNSUPPORTED is set and MPU is not set.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xxxxxxxxxx>
> > Signed-off-by: Mykyta Poturai <mykyta_poturai@xxxxxxxx>
> > Signed-off-by: Mykola Kvach <mykola_kvach@xxxxxxxx>
>
>
> Patch looks good to me, just one question to clarify ...
>
>
> > ---
> > Changes in V9:
> > - Split vPSCI availability policy, runtime host-suspend blockers and the
> >    domain-readiness precheck into the preceding commit.
> > - Trigger the host suspend backend from the control-domain SYSTEM_SUSPEND
> >    path.
> > - Reorder the host suspend/resume phases so the timer is suspended with
> >    local IRQs disabled and local IRQs are restored after the GIC and timer
> >    resume paths, before the console and IOMMU resume paths.
> > - Move HAS_HWDOM_SYSTEM_SUSPEND and related logic to policy patch.
> >
> > Changes in V8:
> > - Add a pre-suspend check in system_suspend() after scheduler_disable() to
> >    require all domains to be in the shut down state with SHUTDOWN_suspend
> >    before proceeding with the global suspend flow.
> > - Drop the common-level depends on !ARM_64 || !SYSTEM_SUSPEND from
> >    CONFIG_HAS_HWDOM_SHUTDOWN_ON_SUSPEND and model the ARM64 suspend case
> >    with an arch-selected capability instead.
> > - Rename CONFIG_HAS_HWDOM_SHUTDOWN_ON_SUSPEND to
> >    CONFIG_HAS_HWDOM_SYSTEM_SUSPEND.
> > - Rename need_hwdom_shutdown() to want_hwdom_shutdown().
> >
> > Changes in V7:
> > - Control domain is responsible for host suspend.
> > - Add an empty inline host_system_suspend() function when SYSTEM_SUSPEND
> >    config is disabled.
> > - Use IS_ENABLED() for config checking instead of #ifdef.
> > - Replace #ifdef checks in domain_shutdown() with IS_ENABLED() to simplify
> >    control flow.
> > - Factor hardware domain shutdown condition into a helper
> >    (need_hwdom_shutdown()) to avoid preprocessor directives inside the
> >    function.
> > - Squash with iommu suspend/resume commit.
> > ---
> >   xen/arch/arm/Kconfig               |   1 +
> >   xen/arch/arm/include/asm/mm.h      |   2 +
> >   xen/arch/arm/include/asm/suspend.h |   2 +
> >   xen/arch/arm/mmu/smpboot.c         |   2 +-
> >   xen/arch/arm/suspend.c             | 140 +++++++++++++++++++++++++++++
> >   xen/arch/arm/vpsci.c               |  10 ++-
> >   6 files changed, 154 insertions(+), 3 deletions(-)
> >
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index 54a5bfb9ae..119bc00674 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -9,6 +9,7 @@ config ARM_64
> >       select 64BIT
> >       select HAS_DOMAIN_TYPE
> >       select HAS_FAST_MULTIPLY
> > +     select HAS_SYSTEM_SUSPEND if !MPU && UNSUPPORTED
> >       select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
> >
> >   config ARM
> > diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> > index 2eb8465aa9..de119cad3a 100644
> > --- a/xen/arch/arm/include/asm/mm.h
> > +++ b/xen/arch/arm/include/asm/mm.h
> > @@ -360,6 +360,8 @@ static inline void page_set_xenheap_gfn(struct 
> > page_info *p, gfn_t gfn)
> >       } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
> >   }
> >
> > +void set_init_ttbr(lpae_t *root);
> > +
> >   #endif /*  __ARCH_ARM_MM__ */
> >   /*
> >    * Local variables:
> > diff --git a/xen/arch/arm/include/asm/suspend.h 
> > b/xen/arch/arm/include/asm/suspend.h
> > index 87db12eac3..a194dbb21a 100644
> > --- a/xen/arch/arm/include/asm/suspend.h
> > +++ b/xen/arch/arm/include/asm/suspend.h
> > @@ -40,11 +40,13 @@ int prepare_resume_ctx(void);
> >   void hyp_resume(void);
> >   bool host_system_suspend_allowed(void);
> >   void host_system_suspend_disable(const char *reason);
> > +void host_system_suspend(struct domain *d);
> >
> >   #else /* !CONFIG_SYSTEM_SUSPEND */
> >
> >   static inline bool host_system_suspend_allowed(void) { return false; }
> >   static inline void host_system_suspend_disable(const char *reason) {}
> > +static inline void host_system_suspend(struct domain *d) {}
> >
> >   #endif
> >
> > diff --git a/xen/arch/arm/mmu/smpboot.c b/xen/arch/arm/mmu/smpboot.c
> > index 37e91d72b7..ff508ecf40 100644
> > --- a/xen/arch/arm/mmu/smpboot.c
> > +++ b/xen/arch/arm/mmu/smpboot.c
> > @@ -72,7 +72,7 @@ static void clear_boot_pagetables(void)
> >       clear_table(boot_third);
> >   }
> >
> > -static void set_init_ttbr(lpae_t *root)
> > +void set_init_ttbr(lpae_t *root)
> >   {
> >       /*
> >        * init_ttbr is part of the identity mapping which is read-only. So
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > index a571035d2c..b1cc67fbdb 100644
> > --- a/xen/arch/arm/suspend.c
> > +++ b/xen/arch/arm/suspend.c
> > @@ -1,10 +1,16 @@
> >   /* SPDX-License-Identifier: GPL-2.0-only */
> >
> > +#include <asm/gic.h>
> >   #include <asm/psci.h>
> >   #include <asm/suspend.h>
> >
> > +#include <xen/console.h>
> > +#include <xen/cpu.h>
> > +#include <xen/iommu.h>
> >   #include <xen/lib.h>
> > +#include <xen/sched.h>
> >   #include <xen/serial.h>
> > +#include <xen/tasklet.h>
> >
> >   struct resume_cpu_context resume_cpu_context;
> >
> > @@ -44,6 +50,140 @@ void host_system_suspend_disable(const char *reason)
> >              reason ? reason : "unsupported suspend/resume path");
> >   }
> >
> > +/* Xen suspend. data identifies the domain that initiated suspend. */
> > +static void system_suspend(void *data)
> > +{
> > +    int status;
> > +    unsigned long flags;
> > +    struct domain *d = (struct domain *)data;
> > +
> > +    BUG_ON(system_state != SYS_STATE_active);
> > +
> > +    system_state = SYS_STATE_suspend;
> > +
> > +    printk("Xen suspending...\n");
> > +
> > +    freeze_domains();
> > +    scheduler_disable();
> > +
> > +    /*
> > +     * Non-boot CPUs have to be disabled on suspend and enabled on resume
> > +     * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
> > +     * CPU_OFF to be called by each non-boot CPU. Depending on the 
> > underlying
> > +     * platform capabilities, this may lead to the physical powering down 
> > of
> > +     * CPUs.
> > +     */
> > +    status = disable_nonboot_cpus();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_nonboot_cpus;
> > +    }
> > +
> > +    console_start_sync();
> > +    status = iommu_suspend();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_end_sync;
> > +    }
> > +
> > +    status = console_suspend();
> > +    if ( status )
> > +    {
> > +        dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", 
> > status);
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_iommu;
> > +    }
> > +
> > +    local_irq_save(flags);
> > +
> > +    time_suspend();
> > +
> > +    status = gic_suspend();
> > +    if ( status )
> > +    {
> > +        system_state = SYS_STATE_resume;
> > +        goto resume_time;
> > +    }
> > +
> > +    set_init_ttbr(xen_pgtable);
> > +
> > +    /*
> > +     * Enable identity mapping before entering suspend to simplify
> > +     * the resume path
> > +     */
> > +    update_boot_mapping(true);
> > +
> > +    if ( prepare_resume_ctx() )
> > +    {
> > +        status = call_psci_system_suspend();
> > +        /*
> > +         * If suspend is finalized properly by above system suspend PSCI 
> > call,
> > +         * the code below in this 'if' branch will never execute. Execution
> > +         * will continue from hyp_resume which is the hypervisor's resume 
> > point.
> > +         * In hyp_resume CPU context will be restored and since 
> > link-register is
> > +         * restored as well, it will appear to return from 
> > prepare_resume_ctx.
> > +         * The difference in returning from prepare_resume_ctx on system 
> > suspend
> > +         * versus resume is in function's return value: on suspend, the 
> > return
> > +         * value is a non-zero value, on resume it is zero. That is why the
> > +         * control flow will not re-enter this 'if' branch on resume.
> > +         */
> > +        if ( status )
> > +            dprintk(XENLOG_WARNING, "PSCI system suspend failed, err=%d\n",
> > +                    status);
> > +    }
>
>
>
> ... unless I am mistaken, the boot CPU appears to bypass errata checks
> on resume.
>
> Non-boot (secondary) CPUs: before suspending, Xen calls
> disable_nonboot_cpus(). Upon resume, it calls enable_nonboot_cpus(),
> which utilizes the standard secondary CPU bring-up path. Secondary CPUs
> boot through init_secondary() -> start_secondary(), where Xen explicitly
> calls functions such as check_local_cpu_errata(), etc. So, secondary
> CPUs are fine.
>
> Boot CPU: when CPU0 wakes up from SYSTEM_SUSPEND, it enters
> hyp_resume(), calls cpu_init(), and branches directly back into the
> middle of system_suspend(), where it resumes the GIC, timer, IOMMU, etc.
> At no point in hyp_resume() or system_suspend() does CPU0 call
> check_local_cpu_errata().
>
> Could you, please, clarify why this is OK?

Good catch, thanks.

I agree there is a gap here. Secondary CPUs go through the normal
secondary bring-up path after resume, so they run the local CPU
errata/workaround handling again. The boot CPU, however, resumes through
hyp_resume() and returns directly to the suspend path.

I will rework this so that the boot CPU also re-applies the relevant
local CPU errata/workaround handling after SYSTEM_SUSPEND, before
continuing with the rest of the resume sequence.

Best regards,
Mykola



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.