Re: [PATCH 1/2] xenbus: Use .freeze/.thaw to handle xenbus devices
On Mon, Dec 01, 2025 at 01:20:40PM -0500, Jason Andryuk wrote:
> On 2025-11-29 21:03, Marek Marczykowski-Górecki wrote:
> > On Wed, Nov 19, 2025 at 05:47:29PM -0500, Jason Andryuk wrote:
> > > The goal is to fix s2idle and S3 for Xen PV devices.
> >
> > Can you give a little more context of this? We do have working S3 in
> > qubes with no need for such change. We trigger it via the toolstack
> > (libxl_domain_suspend_only()).
> > Are you talking about guest-initiated suspend here?
>
> This is intended to help domU s2idle/S3 and resume. I guess that is what
> you mean by guest-initiated? The domU can use 'echo mem > /sys/power/state'
> to enter s2idle/S3. We also have the domU react to the ACPI sleep button
> from `xl trigger $dom sleep`.

Ok, so this is indeed a different path than we use in Qubes OS.

> AIUI, libxl_domain_suspend_only() triggers xenstore writes which Linux
> drivers/xen/manage.c:do_suspend() acts on. `xl save/suspend/migrate` all
> use this path.
>
> The terminology gets confusing. Xen uses "suspend" for
> save/suspend/migrate, but the Linux power management code uses
> freeze/thaw/restore. AIUI, Linux's PMSG_SUSPEND/.suspend is for runtime
> power management.

Indeed it gets confusing...

> When you call libxl_domain_suspend_only()/libxl_domain_resume(), you pass
> suspend_cancel==1.
>  * 1. (fast=1) Resume the guest without resetting the domain environment.
>  *    The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will
>  *    return 1.
>
> That ends up in Linux do_suspend() as si.cancelled = 1, which calls
> PMSG_THAW -> .thaw -> xenbus_dev_cancel() which is a no-op. So it does not
> change the PV devices.
>
> We needed guest user space to perform actions before entering s2idle.
> libxl_domain_suspend_only() triggers the Linux kernel path which does not
> notify user space. The ACPI power buttons let user space perform actions
> (lock and blank the screen) before entering the idle state.

I see.
In our case, we have our own userspace hook that gets called before
(if relevant - in most cases it isn't).

> > We also have kinda working (host) s2idle. You may want to take a look at
> > this work (some/most of it was posted upstream, but not all got
> > committed/reviewed):
> > https://github.com/QubesOS/qubes-issues/issues/6411#issuecomment-1538089344
> > https://github.com/QubesOS/qubes-linux-kernel/pull/910 (some patches
> > changed since that PR, see the current main too).
>
> This would not affect host s2idle - it changes PV frontend devices.
>
> Do you libxl_domain_suspend_only() all domUs and then put dom0 into s0ix?

Yes, exactly.

> > > A domain resuming
> > > from s3 or s2idle disconnects its PV devices during resume. The
> > > backends are not expecting this and do not reconnect.
> > >
> > > b3e96c0c7562 ("xen: use freeze/restore/thaw PM events for suspend/
> > > resume/chkpt") changed xen_suspend()/do_suspend() from
> > > PMSG_SUSPEND/PMSG_RESUME to PMSG_FREEZE/PMSG_THAW/PMSG_RESTORE, but the
> > > suspend/resume callbacks remained.
> > >
> > > .freeze/restore are used with hibernation where Linux restarts in a new
> > > place in the future. .suspend/resume are useful for runtime power
> > > management for the duration of a boot.
> > >
> > > The current behavior of the callbacks works for an xl save/restore or
> > > live migration where the domain is restored/migrated to a new location
> > > and connecting to a not-already-connected backend.
> > >
> > > Change xenbus_pm_ops to use .freeze/thaw/restore and drop the
> > > .suspend/resume hook. This matches the use in drivers/xen/manage.c for
> > > save/restore and live migration. With .suspend/resume empty, PV devices
> > > are left connected during s2idle and s3, so PV devices are not changed
> > > and work after resume.
> >
> > Is that intended?
> > While it might work for suspend by chance(*), I'm
> > pretty sure not disconnecting + reconnecting PV devices across
> > save/restore/live migration will break them.
>
> save/restore/live migration keep using .freeze/thaw/restore, which
> disconnects and reconnects today. Nothing changes there as
> xen_suspend()/do_suspend() call the power management code with
> PMSG_FREEZE/PMSG_THAW/PMSG_RESTORE.
>
> This patch makes .suspend/resume no-ops for PMSG_SUSPEND/PMSG_RESUME. When
> a domU goes into s2idle/S3, the backend state remains connected. With this
> patch, when the domU wakes up, the frontends do nothing and remain
> connected.

This explanation makes sense.

> > (*) and even that I'm not sure - with driver domains, depending on
> > suspend order this feels like it might result in a deadlock...
>
> I'm not sure. I don't think this patch changes anything with respect to
> them.
>
> Thanks for testing.
>
> Maybe the commit messages should change to highlight this is for domU PV
> devices? struct xen_bus_type xenbus_backend does not define dev_pm_ops.

Good idea.
> Regards,
> Jason
>
> > > Signed-off-by: Jason Andryuk <jason.andryuk@xxxxxxx>
> > > ---
> > >  drivers/xen/xenbus/xenbus_probe_frontend.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c
> > > b/drivers/xen/xenbus/xenbus_probe_frontend.c
> > > index 6d1819269cbe..199917b6f77c 100644
> > > --- a/drivers/xen/xenbus/xenbus_probe_frontend.c
> > > +++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
> > > @@ -148,11 +148,9 @@ static void xenbus_frontend_dev_shutdown(struct
> > > device *_dev)
> > >  }
> > >  static const struct dev_pm_ops xenbus_pm_ops = {
> > > -	.suspend	= xenbus_dev_suspend,
> > > -	.resume		= xenbus_frontend_dev_resume,
> > >  	.freeze		= xenbus_dev_suspend,
> > >  	.thaw		= xenbus_dev_cancel,
> > > -	.restore	= xenbus_dev_resume,
> > > +	.restore	= xenbus_frontend_dev_resume,
> > >  };
> > >  static struct xen_bus_type xenbus_frontend = {
> > > --
> > > 2.34.1
> > >

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
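
For readers following the callback discussion above, here is a rough user-space sketch of how the two xenbus_pm_ops tables behave, as I understand the thread. It is not kernel code. Every model_* and MODEL_* name is hypothetical and made up for illustration. The callbacks are crude stand-ins: model_quiesce for xenbus_dev_suspend, model_cancel for xenbus_dev_cancel, and model_reinit for both resume functions, which the model does not distinguish. The sketch assumes that a missing callback means nothing is done for that event, and that the toolstack path picks PMSG_THAW when the suspend was cancelled and PMSG_RESTORE otherwise, as described in the quoted text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PM events named in the thread. */
enum model_pm_event {
    MODEL_PM_SUSPEND, MODEL_PM_RESUME,                /* s2idle / S3 */
    MODEL_PM_FREEZE, MODEL_PM_THAW, MODEL_PM_RESTORE  /* save / restore / migrate */
};

/* Toy frontend device: only counts what the callbacks did to it. */
struct model_dev {
    int quiesce_count;  /* times the frontend was told to quiesce */
    int reinit_count;   /* times the frontend tore down and reconnected */
};

typedef int (*model_pm_cb)(struct model_dev *dev);

/* Cut-down imitation of struct dev_pm_ops (assumption: only these five). */
struct model_pm_ops {
    model_pm_cb suspend, resume, freeze, thaw, restore;
};

int model_quiesce(struct model_dev *dev) { dev->quiesce_count++; return 0; }
int model_cancel(struct model_dev *dev)  { (void)dev; return 0; }  /* no-op */
int model_reinit(struct model_dev *dev)  { dev->reinit_count++; return 0; }

/* Table before the patch: s2idle/S3 resume reconnects the frontend. */
const struct model_pm_ops model_ops_before = {
    .suspend = model_quiesce, .resume = model_reinit,
    .freeze  = model_quiesce, .thaw   = model_cancel,
    .restore = model_reinit,
};

/* Table after the patch: .suspend/.resume are left unset. */
const struct model_pm_ops model_ops_after = {
    .freeze  = model_quiesce, .thaw = model_cancel,
    .restore = model_reinit,
};

/* Pick the callback for an event; an unset callback does nothing. */
int model_pm_dispatch(const struct model_pm_ops *ops,
                      enum model_pm_event ev, struct model_dev *dev)
{
    model_pm_cb cb = NULL;

    assert(ops != NULL && dev != NULL);
    switch (ev) {
    case MODEL_PM_SUSPEND: cb = ops->suspend; break;
    case MODEL_PM_RESUME:  cb = ops->resume;  break;
    case MODEL_PM_FREEZE:  cb = ops->freeze;  break;
    case MODEL_PM_THAW:    cb = ops->thaw;    break;
    case MODEL_PM_RESTORE: cb = ops->restore; break;
    }
    return cb ? cb(dev) : 0;
}

/* Toolstack path: a cancelled suspend is thawed, otherwise restored. */
enum model_pm_event model_toolstack_resume_event(int cancelled)
{
    return cancelled ? MODEL_PM_THAW : MODEL_PM_RESTORE;
}

/* Run one down/up cycle on a fresh device; return how often it reconnected. */
int model_reinits_after_cycle(const struct model_pm_ops *ops,
                              enum model_pm_event down,
                              enum model_pm_event up)
{
    struct model_dev dev = { 0, 0 };

    if (model_pm_dispatch(ops, down, &dev) != 0)
        return -1;
    if (model_pm_dispatch(ops, up, &dev) != 0)
        return -1;
    return dev.reinit_count;
}
```

In this model, a MODEL_PM_SUSPEND/MODEL_PM_RESUME cycle reconnects the frontend once with model_ops_before and not at all with model_ops_after, while a MODEL_PM_FREEZE/MODEL_PM_RESTORE cycle reconnects once with either table. That is the split the patch is aiming for: guest-initiated s2idle/S3 leaves PV devices alone, and save/restore/migration still reconnects them.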