Linux 7.2 Can Significantly Lower Container Exit/Unmount Latency

Written by Michael Larabel in Linux Storage on 16 June 2026 at 11:02 AM EDT.
A patch series merged for the Linux 7.2 kernel addresses a race condition that can occur when a container is exiting, yielding "VFS: Busy inodes after unmount" messages and a possible use-after-free condition. But the patch series also goes further and delivers a very nice optimization to lower the container unmount latency for environments with heavy I/O load.

Alibaba engineer Baokun Li tracked down the possible race condition when a container exits and addressed it with the now-merged patch. That portion of the work should also be back-ported to the current Linux stable kernel series in the near future. What's most exciting, though, is the additional work that eliminates a global serialization penalty and can lead to much lower container exit/unmount latency.

Christian Brauner summed up the situation in this pull request that is now merged for Linux 7.2:
"Fix a race between cgroup_writeback_umount() and inode_switch_wbs()

When a container exits, a race between cgroup_writeback_umount() and inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy inodes after unmount" followed by a use-after-free on percpu counters. There is a window between inode_prepare_wbs_switch() returning true (having passed the SB_ACTIVE check and grabbed the inode) and the subsequent wb_queue_isw() call: if cgroup_writeback_umount() observes the global isw_nr_in_flight counter as non-zero but flush_workqueue() finds nothing queued yet, it returns early - leaving a held inode reference that blocks evict_inodes() and a later iput() that hits freed percpu counters.

The race is closed by covering the window from inode_prepare_wbs_switch() through wb_queue_isw() with an RCU read-side critical section and synchronizing in the umount path. On top of that the now-dead rcu_barrier() left over from the queue_rcu_work() era is removed, and the global synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb in-flight counter plus pin/unpin/drain helpers so umount no longer serializes against switch activity on unrelated superblocks.

Under cgroup writeback churn on a 16 vCPU guest this takes umount latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative cost of cgroup_writeback_umount() from ~62ms to ~4us per call. The initial race fix is kept separate and minimal so it backports cleanly to stable trees that still queue switches via queue_rcu_work()."

Quite a nice improvement for the unmount latency.
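
For a feel of why a per-superblock in-flight counter avoids the global serialization described in the pull request, here is a minimal userspace sketch in Python. It is a toy model, not the kernel code: the `SuperBlock` class, its `pin`/`unpin`/`drain` methods and the `demo` function are hypothetical names chosen for illustration, and a lock with a condition variable stands in for the RCU-based synchronization the actual patches use. The point it tries to show is that the "is this superblock still active?" check and the counter increment happen as one step, and that unmount waits only on its own superblock's counter.

```python
import threading


class SuperBlock:
    """Toy stand-in for a kernel superblock (hypothetical, illustration only)."""

    def __init__(self, name):
        self.name = name
        self.active = True      # loosely models the SB_ACTIVE check
        self.in_flight = 0      # per-sb count of pending switch work
        self._cond = threading.Condition()

    def pin(self):
        """Register pending work, refusing once unmount has started.

        Check and increment share one lock, so there is no window in which
        work has passed the check but is not yet visible to drain().
        """
        with self._cond:
            if not self.active:
                return False
            self.in_flight += 1
            return True

    def unpin(self):
        """Mark one piece of work finished and wake any waiting drain()."""
        with self._cond:
            self.in_flight -= 1
            if self.in_flight == 0:
                self._cond.notify_all()

    def drain(self):
        """Unmount side: block new pins, then wait for this sb's work only."""
        with self._cond:
            self.active = False
            while self.in_flight:
                self._cond.wait()


def demo():
    sb_a, sb_b = SuperBlock("a"), SuperBlock("b")
    assert sb_b.pin()           # unrelated superblock keeps work pending
    assert sb_a.pin()           # work pending on the one being unmounted
    worker = threading.Timer(0.05, sb_a.unpin)
    worker.start()
    sb_a.drain()                # returns once sb_a's own work is done
    worker.join()
    result = (sb_a.in_flight, sb_b.in_flight, sb_a.pin())
    sb_b.unpin()
    return result
```

Calling `demo()` drains superblock `a` while superblock `b` still has work pinned; the drain does not wait on `b`, which is the behavior the per-sb counter is meant to provide. With a single global counter, the same unmount would have had to wait for unrelated activity as well.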

Linux 7.2 unmount latency benchmark


There are also additional benchmark numbers from this patch.

Linux 7.2 unmount latency benchmark


Separately, that same VFS pull request for Linux 7.2 also improves write performance when using the RWF_DONTCACHE flag. Those benchmark numbers and more details can be found within this patch.