Since 256, suspend fails over 50% of the time: Freezing user space processes failed #33626

@mahkoh

Description

systemd version the issue has been seen with

256

Used distribution

Arch

Linux kernel version used

6.10.0-rc6

CPU architectures issue was seen on

x86_64

Component

No response

Expected behaviour you didn't see

systemctl suspend suspends the system

Unexpected behaviour you saw

Suspending fails with the following kernel messages:

Jul 03 17:47:54 desk kernel: Freezing user space processes failed after 20.009 seconds (1 tasks refusing to freeze, wq_busy=0):
Jul 03 17:47:54 desk kernel: task:jay             state:R stack:0     pid:1550  tgid:1550  ppid:1533   flags:0x00024002
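For anyone comparing their own logs against the two lines above, here is a minimal Python sketch that pulls the relevant fields out of them. The regular expressions are fitted only to the lines quoted in this report, and the names `parse_freeze_failure`, `SUMMARY_RE`, `TASK_RE`, and `SAMPLE` are made up for illustration; other kernel versions may format these messages differently.

```python
import re

# Patterns are fitted to the two log lines quoted in this report (an assumption);
# they are not taken from any kernel documentation.
SUMMARY_RE = re.compile(
    r"Freezing user space processes failed after (?P<seconds>[\d.]+) seconds "
    r"\((?P<tasks>\d+) tasks refusing to freeze, wq_busy=(?P<wq_busy>\d+)\)"
)
TASK_RE = re.compile(
    r"task:(?P<comm>\S+)\s+state:(?P<state>\S+)\s+stack:\d+\s+pid:(?P<pid>\d+)"
)


def parse_freeze_failure(log_text):
    """Return (summary, tasks); summary is None if no failure line is found."""
    summary = None
    tasks = []
    for line in log_text.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            summary = {
                "seconds": float(match["seconds"]),
                "tasks": int(match["tasks"]),
                "wq_busy": int(match["wq_busy"]),
            }
            continue
        match = TASK_RE.search(line)
        if match:
            tasks.append(
                {"comm": match["comm"], "state": match["state"], "pid": int(match["pid"])}
            )
    return summary, tasks


# The two lines from this report, copied verbatim.
SAMPLE = (
    "Jul 03 17:47:54 desk kernel: Freezing user space processes failed after 20.009 seconds "
    "(1 tasks refusing to freeze, wq_busy=0):\n"
    "Jul 03 17:47:54 desk kernel: task:jay             state:R stack:0     pid:1550  "
    "tgid:1550  ppid:1533   flags:0x00024002\n"
)

if __name__ == "__main__":
    print(parse_freeze_failure(SAMPLE))
```

Feeding it the output of a kernel-log dump should show which task refused to freeze and in which state it was reported.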

I originally reported this issue here: https://bugzilla.kernel.org/show_bug.cgi?id=219004

In that report I already suspected that systemd might be involved: the issue started appearing immediately after I upgraded systemd from 255 to 256, and systemd restarts itself without requiring a system reboot. I had not rebooted the system for around two weeks at that point.

I have now performed the following experiment, which makes it look more likely that a behavior change in systemd has unintended consequences:

  • I downgraded systemd, systemd-libs, and systemd-sysvcompat to 255 and tried to suspend 4 times in a row. All 4 attempts succeeded.
  • I upgraded to 256 and tried the same. 2 out of 4 attempts failed with the error above.
  • I downgraded to 255 again, and again all attempts succeeded.
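If someone wants to repeat the experiment above and keep score, a rough Python sketch could look like the following. It assumes that a failed attempt always leaves the "Freezing user space processes failed" message in the kernel log, which is what I observed but is not guaranteed. The helper names (`read_kernel_log`, `attempt_failed`, `tally`) are hypothetical, and triggering the suspend itself is left to the user.

```python
import subprocess

# Message seen in the kernel log when an attempt fails (taken from the error above).
FAILURE_MARKER = "Freezing user space processes failed"


def read_kernel_log(since):
    """Return kernel messages logged since `since` (a journalctl time spec such as "-5min")."""
    result = subprocess.run(
        ["journalctl", "-k", "--no-pager", "-o", "cat", f"--since={since}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def attempt_failed(kernel_log):
    """Treat an attempt as failed if the freeze failure message appears in its log."""
    return FAILURE_MARKER in kernel_log


def tally(attempt_logs):
    """Count failed attempts, given one chunk of kernel log text per suspend attempt."""
    failed = sum(attempt_failed(log) for log in attempt_logs)
    return {"attempts": len(attempt_logs), "failed": failed}
```

One way to use it: after each suspend attempt, call `read_kernel_log` with a window that covers only that attempt, collect the returned strings in a list, and pass the list to `tally` once per systemd version.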

The issue seems to happen with at least two stable 6.9.x kernels and also 6.10.0-rc6.

I added logging to the kernel to see why the process cannot be frozen. It shows that the condition at https://github.com/torvalds/linux/blob/795c58e4c7fc6163d8fb9f2baa86cfe898fa4b19/kernel/freezer.c#L112 is always true. More precisely, p->on_rq == TASK_ON_RQ_QUEUED.

Is it possible that systemd 256 has started doing something that prevents a task with p->on_rq == TASK_ON_RQ_QUEUED from leaving that state during suspend?
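p->on_rq is not visible from user space, so checking it requires kernel instrumentation as described above. The closest hint available without patching the kernel is probably the task state letter, which the log reports as R for the affected task. A minimal Python sketch for reading it from /proc/<pid>/status follows; the function names are hypothetical, and an R state is only a rough proxy for the on_rq condition, not an equivalent check.

```python
from pathlib import Path


def parse_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None


def task_state(pid):
    """Read the state of a live process; raises FileNotFoundError if it has exited."""
    return parse_state(Path(f"/proc/{pid}/status").read_text())
```

Polling `task_state` for the compositor's pid shortly before suspending might show whether the task is runnable at that moment, but it cannot confirm what the freezer sees.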

Steps to reproduce the problem

You can try installing https://github.com/mahkoh/jay to reproduce the problem, but the issue might be specific to my system.

Additional program output to the terminal or log subsystem illustrating the issue

No response
