[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests


  • To: Julien Grall <julien@xxxxxxx>, Mykola Kvach <xakep.amatop@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • From: Jürgen Groß <jgross@xxxxxxxx>
  • Date: Mon, 21 Jul 2025 10:08:09 +0200
  • Autocrypt: addr=jgross@xxxxxxxx; keydata= xsBNBFOMcBYBCACgGjqjoGvbEouQZw/ToiBg9W98AlM2QHV+iNHsEs7kxWhKMjrioyspZKOB ycWxw3ie3j9uvg9EOB3aN4xiTv4qbnGiTr3oJhkB1gsb6ToJQZ8uxGq2kaV2KL9650I1SJve dYm8Of8Zd621lSmoKOwlNClALZNew72NjJLEzTalU1OdT7/i1TXkH09XSSI8mEQ/ouNcMvIJ NwQpd369y9bfIhWUiVXEK7MlRgUG6MvIj6Y3Am/BBLUVbDa4+gmzDC9ezlZkTZG2t14zWPvx XP3FAp2pkW0xqG7/377qptDmrk42GlSKN4z76ELnLxussxc7I2hx18NUcbP8+uty4bMxABEB AAHNH0p1ZXJnZW4gR3Jvc3MgPGpncm9zc0BzdXNlLmNvbT7CwHkEEwECACMFAlOMcK8CGwMH CwkIBwMCAQYVCAIJCgsEFgIDAQIeAQIXgAAKCRCw3p3WKL8TL8eZB/9G0juS/kDY9LhEXseh mE9U+iA1VsLhgDqVbsOtZ/S14LRFHczNd/Lqkn7souCSoyWsBs3/wO+OjPvxf7m+Ef+sMtr0 G5lCWEWa9wa0IXx5HRPW/ScL+e4AVUbL7rurYMfwCzco+7TfjhMEOkC+va5gzi1KrErgNRHH kg3PhlnRY0Udyqx++UYkAsN4TQuEhNN32MvN0Np3WlBJOgKcuXpIElmMM5f1BBzJSKBkW0Jc Wy3h2Wy912vHKpPV/Xv7ZwVJ27v7KcuZcErtptDevAljxJtE7aJG6WiBzm+v9EswyWxwMCIO RoVBYuiocc51872tRGywc03xaQydB+9R7BHPzsBNBFOMcBYBCADLMfoA44MwGOB9YT1V4KCy vAfd7E0BTfaAurbG+Olacciz3yd09QOmejFZC6AnoykydyvTFLAWYcSCdISMr88COmmCbJzn sHAogjexXiif6ANUUlHpjxlHCCcELmZUzomNDnEOTxZFeWMTFF9Rf2k2F0Tl4E5kmsNGgtSa aMO0rNZoOEiD/7UfPP3dfh8JCQ1VtUUsQtT1sxos8Eb/HmriJhnaTZ7Hp3jtgTVkV0ybpgFg w6WMaRkrBh17mV0z2ajjmabB7SJxcouSkR0hcpNl4oM74d2/VqoW4BxxxOD1FcNCObCELfIS auZx+XT6s+CE7Qi/c44ibBMR7hyjdzWbABEBAAHCwF8EGAECAAkFAlOMcBYCGwwACgkQsN6d 1ii/Ey9D+Af/WFr3q+bg/8v5tCknCtn92d5lyYTBNt7xgWzDZX8G6/pngzKyWfedArllp0Pn fgIXtMNV+3t8Li1Tg843EXkP7+2+CQ98MB8XvvPLYAfW8nNDV85TyVgWlldNcgdv7nn1Sq8g HwB2BHdIAkYce3hEoDQXt/mKlgEGsLpzJcnLKimtPXQQy9TxUaLBe9PInPd+Ohix0XOlY+Uk QFEx50Ki3rSDl2Zt2tnkNYKUCvTJq7jvOlaPd6d/W0tZqpyy7KVay+K4aMobDsodB3dvEAs6 ScCnh03dDAFgIq5nsB11j3KPKdVoPlfucX2c7kGNH+LUMbzqV6beIENfNexkOfxHfw==
  • Cc: Mykola Kvach <mykola_kvach@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Mon, 21 Jul 2025 08:08:26 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 28.06.25 20:17, Julien Grall wrote:
Hi Mykola,

On 27/06/2025 11:51, Mykola Kvach wrote:
diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/ perfc_defn.h
index effd25b69e..8dfcac7e3b 100644
--- a/xen/arch/arm/include/asm/perfc_defn.h
+++ b/xen/arch/arm/include/asm/perfc_defn.h
@@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset,        "vpsci: system_reset")
  PERFCOUNTER(vpsci_cpu_suspend,         "vpsci: cpu_suspend")
  PERFCOUNTER(vpsci_cpu_affinity_info,   "vpsci: cpu_affinity_info")
  PERFCOUNTER(vpsci_features,            "vpsci: features")
+PERFCOUNTER(vpsci_system_suspend,      "vpsci: system_suspend")
  PERFCOUNTER(vcpu_kick,                 "vcpu: notify other vcpu")
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 4780972621..48a93e6b79 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -47,10 +47,12 @@ void call_psci_system_reset(void);
  #define PSCI_0_2_FN32_SYSTEM_OFF          PSCI_0_2_FN32(8)
  #define PSCI_0_2_FN32_SYSTEM_RESET        PSCI_0_2_FN32(9)
  #define PSCI_1_0_FN32_PSCI_FEATURES       PSCI_0_2_FN32(10)
+#define PSCI_1_0_FN32_SYSTEM_SUSPEND      PSCI_0_2_FN32(14)
  #define PSCI_0_2_FN64_CPU_SUSPEND         PSCI_0_2_FN64(1)
  #define PSCI_0_2_FN64_CPU_ON              PSCI_0_2_FN64(3)
  #define PSCI_0_2_FN64_AFFINITY_INFO       PSCI_0_2_FN64(4)
+#define PSCI_1_0_FN64_SYSTEM_SUSPEND      PSCI_0_2_FN64(14)
  /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
  #define PSCI_0_2_AFFINITY_LEVEL_ON      0
diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
index 0cca5e6830..69d40f9d7f 100644
--- a/xen/arch/arm/include/asm/vpsci.h
+++ b/xen/arch/arm/include/asm/vpsci.h
@@ -23,7 +23,7 @@
  #include <asm/psci.h>
  /* Number of function implemented by virtual PSCI (only 0.2 or later) */
-#define VPSCI_NR_FUNCS  12
+#define VPSCI_NR_FUNCS  14
  /* Functions handle PSCI calls from the guests */
  bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 67296dabb5..f9c09a49e2 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -6,6 +6,8 @@
  #include <xen/sched.h>
  #include <xen/softirq.h>
+#include <public/sched.h>
+
  #include <asm/alternative.h>
  #include <asm/event.h>
  #include <asm/flushtlb.h>
@@ -198,7 +200,9 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr)
   */
  void p2m_save_state(struct vcpu *p)
  {
-    p->arch.sctlr = READ_SYSREG(SCTLR_EL1);
+    if ( !(p->domain->is_shutting_down &&
+           p->domain->shutdown_code == SHUTDOWN_suspend) )

This is definitely not obvious why you need to change p2m_save_state(). AFAIU, you are doing this because when suspending the system, you will overwrite p- >arch.sctlr. However, this is super fragile. For instance, you still seem to overwrite TTBR{0,1} and TTBCR.

TTBR{0,1} are technically unknown at boot. So it is fine to ignore them.  But for TTBCR, I am not 100% whether this is supposed to be unknown.

Anyway, adding more "if (...)" is not the right solution because in the future we may decide to reset more registers.

It would be better if we stash the value sand then update the registers. Another possibility would be to entirely skip the save path for CPUs that are turned off (after all this is a bit useless work...).

+static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
+                            register_t context_id)
+{
+    int rc;
+    struct vcpu *v;
+    struct domain *d = current->domain;
+    register_t vcpuid;
+
+    vcpuid = vaffinity_to_vcpuid(target_cpu);
+
+    if ( (v = domain_vcpu(d, vcpuid)) == NULL )
+        return PSCI_INVALID_PARAMETERS;
+
+    if ( !test_bit(_VPF_down, &v->pause_flags) )
+        return PSCI_ALREADY_ON;
+
+    rc = do_setup_vcpu_ctx(v, entry_point, context_id);
+    if ( rc == PSCI_SUCCESS )
+        vcpu_wake(v);
+
+    return rc;
+}
+
  static int32_t do_psci_cpu_on(uint32_t vcpuid, register_t entry_point)
  {
      int32_t ret;
@@ -197,6 +208,52 @@ static void do_psci_0_2_system_reset(void)
      domain_shutdown(d,SHUTDOWN_reboot);
  }
+static void do_resume_on_error(struct domain *d)

This looks like an open-coding version of domain_resume() without the domain_{,un}pause(). What worries me is there is a comment on top of domain_pause() explaining why it is called. I understand, we can't call domain_pause() here (it doesn't work on the current domain). However, it feels to me there is an explanation necessary why this is fine to diverge.

I am not a scheduler expert, so I am CCing Juergen in the hope he could provide some inputs.

I don't think the reason for domain_[un]pause() is directly scheduling
related.

It seems to be x86 specific for shadow page table handling.

In any case I'd suggest to split domain_resume() into 2 functions, one
of them (e.g. domain_resume_nopause()) replacing do_resume_on_error()
and domain_resume() just pausing the domain, calling the new function,
and then unpausing the domain again.

Another option might be to move the suspend action into a tasklet, enabling
to call domain_resume() directly if needed. I didn't check whether other
constraints even allow this idea, though.


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.