Re: domU suspend issue - freeze processes failed - Linux 6.16



On Wed, Sep 24, 2025 at 02:28:27PM +0000, Yann Sionneau wrote:
> On 9/24/25 15:30, Marek Marczykowski-Górecki wrote:
> > On Wed, Sep 24, 2025 at 01:17:15PM +0300, Grygorii Strashko wrote:
> >>
> >>
> >> On 22.09.25 13:09, Marek Marczykowski-Górecki wrote:
> >>> On Fri, Aug 22, 2025 at 08:42:30PM +0200, Marek Marczykowski-Górecki wrote:
> >>>> On Fri, Aug 22, 2025 at 05:27:20PM +0200, Jürgen Groß wrote:
> >>>>> On 22.08.25 16:42, Marek Marczykowski-Górecki wrote:
> >>>>>> On Fri, Aug 22, 2025 at 04:39:33PM +0200, Marek Marczykowski-Górecki wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> When suspending domU I get the following issue:
> >>>>>>>
> >>>>>>>        Freezing user space processes
> >>>>>>>        Freezing user space processes failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
> >>>>>>>        task:xl              state:D stack:0     pid:466   tgid:466   ppid:1      task_flags:0x400040 flags:0x00004006
> >>>>>>>        Call Trace:
> >>>>>>>         <TASK>
> >>>>>>>         __schedule+0x2f3/0x780
> >>>>>>>         schedule+0x27/0x80
> >>>>>>>         schedule_preempt_disabled+0x15/0x30
> >>>>>>>         __mutex_lock.constprop.0+0x49f/0x880
> >>>>>>>         unregister_xenbus_watch+0x216/0x230
> >>>>>>>         xenbus_write_watch+0xb9/0x220
> >>>>>>>         xenbus_file_write+0x131/0x1b0
> >>>>>>>         vfs_writev+0x26c/0x3d0
> >>>>>>>         ? do_writev+0xeb/0x110
> >>>>>>>         do_writev+0xeb/0x110
> >>>>>>>         do_syscall_64+0x84/0x2c0
> >>>>>>>         ? do_syscall_64+0x200/0x2c0
> >>>>>>>         ? generic_handle_irq+0x3f/0x60
> >>>>>>>         ? syscall_exit_work+0x108/0x140
> >>>>>>>         ? do_syscall_64+0x200/0x2c0
> >>>>>>>         ? __irq_exit_rcu+0x4c/0xe0
> >>>>>>>         entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >>>>>>>        RIP: 0033:0x79b618138642
> >>>>>>>        RSP: 002b:00007fff9a192fc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
> >>>>>>>        RAX: ffffffffffffffda RBX: 00000000024fd490 RCX: 000079b618138642
> >>>>>>>        RDX: 0000000000000003 RSI: 00007fff9a193120 RDI: 0000000000000014
> >>>>>>>        RBP: 00007fff9a193000 R08: 0000000000000000 R09: 0000000000000000
> >>>>>>>        R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000014
> >>>>>>>        R13: 00007fff9a193120 R14: 0000000000000003 R15: 0000000000000000
> >>>>>>>         </TASK>
> >>>>>>>        OOM killer enabled.
> >>>>>>>        Restarting tasks: Starting
> >>>>>>>        Restarting tasks: Done
> >>>>>>>        xen:manage: do_suspend: freeze processes failed -16
> >>>>>>>
> >>>>>>> The process in question is `xl devd` daemon. It's a domU serving a
> >>>>>>> xenvif backend.
> >>>>>>>
> >>>>>>> I noticed it on 6.16.1, but looking at earlier test logs I see it with
> >>>>>>> 6.16-rc6 already (but interestingly, not 6.16-rc2 yet? feels weird given
> >>>>>>> seemingly no relevant changes between rc2 and rc6).
> >>>>>>
> >>>>>> I forgot to include link for (a little) more details:
> >>>>>> https://github.com/QubesOS/qubes-linux-kernel/pull/1157
> >>>>>>
> >>>>>> Especially, there is another call trace with panic_on_warn enabled -
> >>>>>> slightly different, but looks related.
> >>>>>>
> >>>>>
> >>>>> I'm pretty sure the PV variant for suspending is just wrong: it is calling
> >>>>> dpm_suspend_start() from do_suspend() without taking the required
> >>>>> system_transition_mutex, resulting in the WARN() in pm_restrict_gfp_mask().
> >>>>>
> >>>>> It might be as easy as just adding the mutex() call to do_suspend(), but I'm
> >>>>> really not sure that will be a proper fix.
> >>>>
> >>>> Hm, this might explain the second call trace, but not the freeze failure
> >>>> quoted here above, I think?
> >>>
> >>> While the patch I sent appears to fix this particular issue, it made me
> >>> wonder: is there any fundamental reason why do_suspend() is not using
> >>> pm_suspend() and register Xen-specific actions via platform_suspend_ops
> >>> (and maybe syscore_ops)? From a brief look at the code, it should
> >>> theoretically be possible, and should avoid issues like this.
> >>>
> >>> I tried to do a quick&dirty attempt at that[1], and it failed (panic). I
> >>> surely made several mistakes there (and also left a ton of todo
> >>> comments). But before spending any more time at that, I'd like to ask
> >>> if this is a viable option at all.
> >>
> >> I think it might, but be careful with this, because there are two "System
> >> Low power" paths in Linux:
> >> 1) Suspend2RAM and Co
> >> 2) Hibernation
> >>
> >> The "Suspend2RAM and Co" path is relatively straightforward and is expected
> >> to always be started through pm_suspend(). In general, it's expected to happen
> >>   - from sysfs (User space)
> >>   - from autosuspend (wakelocks).
> >>
> >> The "hibernation" path is more complicated :(
> >> - Genuine Linux hibernation: hibernate()/hibernate_quiet_exec()
> >
> > IIUC hibernation is very different as it puts Linux in charge of dumping
> > all the state to the disk. In case of Xen, the primary use case for
> > suspend is preparing VM for Xen toolstack serializing its state to disk
> > (or migrating to another host).
> > Additionally, VM suspend may be used as preparation for host suspend
> > (this is what I actually do here). This is especially relevant if the VM
> > has some PCI passthrough - to properly suspend (and resume) devices
> > across host suspend.
> >
> >> I'm not sure what path Xen originally implemented :( It seems like "suspend2RAM",
> >> but at the same time "hibernation"-specific stuff is used, like
> >> PMSG_FREEZE/PMSG_THAW/PMSG_RESTORE.
> >> As a result, Linux suspend/hibernation code moves forward while Xen stays
> >> behind and out of sync.
> >
> > Yeah, I think it's supposed to be suspend2RAM. TBH the
> > PMSG_FREEZE/PMSG_THAW/PMSG_RESTORE confuses me too and Qubes OS has a
> > patch[2] to switch it to PMSG_SUSPEND/PMSG_RESUME.
> >
> >> So it sounds reasonable to avoid a custom implementation, but it may not be easy :(
> >>
> >> Suspending Xen features can be split between suspend stages, but
> >> I'm not sure if platform_suspend_ops can be used.
> >>
> >> Generic suspend stages list
> >> - freeze
> >> - prepare
> >> - suspend
> >> - suspend_late
> >> - suspend_noirq (SPIs disabled, except wakeups)
> >>    [most of the Xen-specific stuff has to be suspended at this point]
> >> - disable_secondary_cpus
> >> - arch disable IRQ (from this point no IRQs allowed, no timers, no scheduling)
> >> - syscore_suspend
> >>    [rest here]
> >> - platform->enter() (suspended)
> >>
> >> You can't just overwrite platform_suspend_ops, because ARM64 is expected to enter
> >> suspend through the PSCI FW interface:
> >> drivers/firmware/psci/psci.c
> >>   static const struct platform_suspend_ops psci_suspend_ops = {
> >
> > Does this apply to a VM on ARM64 too? At least on x86, the VM is
> > supposed to make a hypercall to tell Xen it suspended (the hypercall
> > will return only on resume).
> >
> >> As an option, some Xen components could be converted to use syscore_ops (but not xenstore),
> >> and some might need to use DD (dev_pm_ops).
> >>
> >>>
> >>> [1] https://github.com/marmarek/linux/commit/47cfdb991c85566c9c333570511e67bf477a5da6
> >>
> >> --
> >> Best regards,
> >> -grygorii
> >>
> >
> > [2] https://github.com/QubesOS/qubes-linux-kernel/blob/main/xen-pm-use-suspend.patch
> >
> 
> On my setup I get a weird behavior when trying to suspend (s2idle) a
> Linux guest.
> Doing echo freeze > /sys/power/state in the guest seems to "freeze" the
> guest for good, I could not unfreeze it afterward.
> VCPU goes to 100% according to XenOrchestra
> xl list shows state "r" but xl console blocks forever
> xl shutdown would block for some time and then print:
> Shutting down domain 721
> libxl: error: libxl_domain.c:848:pvcontrol_cb: guest didn't acknowledge
> control request: -9
> shutdown failed (rc=-9)
> 
> Do you think it's related to your current issue?

Maybe? Is it an HVM guest or PV?
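
A quick way to check from inside the guest, assuming the guest kernel
exposes /sys/hypervisor/guest_type (on x86 it reports PV, HVM or PVH when
the Xen sysfs interface is built in). The sketch below also dumps what
/sys/power/state and /sys/power/mem_sleep advertise, which should show
whether "freeze" (s2idle) is offered at all. The function names and the
root parameter are only for illustration; none of this comes from the
patches mentioned in this thread.

```python
from pathlib import Path


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def parse_mem_sleep(text):
    """Split e.g. 's2idle [deep]' into (['s2idle', 'deep'], 'deep').

    The kernel marks the currently selected mode with square brackets.
    """
    modes, selected = [], None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        modes.append(token)
    return modes, selected


def guest_report(root="/"):
    """Collect guest type and advertised sleep states.

    `root` is a hypothetical knob so the function can be pointed at a
    copy of the sysfs tree; on a real guest leave it at "/".
    """
    base = Path(root)
    guest_type = read_sysfs(base / "sys/hypervisor/guest_type")
    states = read_sysfs(base / "sys/power/state")
    mem_sleep = read_sysfs(base / "sys/power/mem_sleep")
    modes, selected = parse_mem_sleep(mem_sleep) if mem_sleep else ([], None)
    return {
        "guest_type": guest_type,
        "sleep_states": states.split() if states else [],
        "mem_sleep_modes": modes,
        "mem_sleep_selected": selected,
    }


if __name__ == "__main__":
    for key, value in guest_report().items():
        print(f"{key}: {value}")
```

If guest_type comes back as None, the file is most likely just not
there (not a Xen guest, or the sysfs interface is not enabled), so it
says nothing about the suspend problem itself.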

Regarding s2idle, we have another patch:
https://github.com/QubesOS/qubes-linux-kernel/blob/main/xen-events-Add-wakeup-support-to-xen-pirq.patch
but it fixes wakeup, not really going to sleep.

It was also posted to this ML, but is still waiting for review (see link
in the patch header).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
