From: Michal Hocko <mhocko@suse.com>
To: Minchan Kim <minchan@kernel.org>
Cc: akpm@linux-foundation.org, david@kernel.org, brauner@kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
surenb@google.com, timmurray@google.com
Subject: Re: [RFC 0/3] mm: process_mrelease: expedited reclaim and auto-kill support
Date: Fri, 17 Apr 2026 09:11:21 +0200
Message-ID: <aeHdGf7XDjvOES1V@tiehlicka>
In-Reply-To: <aeHRLv834FCAQlQ8@google.com>

On Thu 16-04-26 23:20:30, Minchan Kim wrote:
> On Thu, Apr 16, 2026 at 08:54:53AM +0200, Michal Hocko wrote:
> > On Wed 15-04-26 16:26:34, Minchan Kim wrote:
> > > On Wed, Apr 15, 2026 at 09:38:05AM +0200, Michal Hocko wrote:
> > > > On Tue 14-04-26 13:00:16, Minchan Kim wrote:
> > > > > On Tue, Apr 14, 2026 at 08:57:57AM +0200, Michal Hocko wrote:
> > > > > > On Mon 13-04-26 15:39:45, Minchan Kim wrote:
> > > > > > > This patch series introduces optimizations to expedite memory reclamation
> > > > > > > in process_mrelease() and provides a secure, race-free "auto-kill"
> > > > > > > mechanism for efficient container shutdown and OOM handling.
> > > > > > >
> > > > > > > Currently, process_mrelease() unmaps pages but leaves clean file folios
> > > > > > > on the LRU list, relying on standard memory reclaim to eventually free
> > > > > > > them. Furthermore, requiring userspace to send a SIGKILL prior to
> > > > > > > invoking process_mrelease() introduces scheduling race conditions where
> > > > > > > the victim task may enter the exit path prematurely, bypassing expedited
> > > > > > > reclamation hooks.
> > > > > > >
> > > > > > > This series addresses these limitations in three logical steps.
> > > > > > >
> > > > > > > Patch #1: mm: process_mrelease: expedite clean file folio reclaim via mmu_gather
> > > > > > > Integrates clean file folio eviction directly into the low-level TLB
> > > > > > > batching (mmu_gather) infrastructure. Symmetrically truncates clean file
> > > > > > > folios alongside anonymous pages during the unmap loop.
> > > > > >
> > > > > > Why do we need to care about clean page cache? Is this a form of
> > > > > > drop_caches?
> > > > >
> > > > > The goal is to ensure the memory is actually freed by the time
> > > > > process_mrelease returns. Currently, process_mrelease unmaps pages, but
> > > > > page caches remain on the LRU, leaving them to be reclaimed later
> > > > > by kswapd or direct reclaim.
> > > >
> > > > Correct. This was the initial design decision because there is not much
> > > > you can assume about page cache pages which are very often shared. Even
> > > > if they are not mapped by all users.
> > >
> > > Fair point. However, that's the trade-off:
> > >
> > > Leaving unmapped caches to be reclaimed asynchronously keeps system memory
> > > pressure high for too long. In Android, this delay forces the LMKD to
> > > unnecessarily kill additional innocent background apps before the memory
> > > from the original victim is recovered.
> >
> > OK, this is really not clear to me. How come you end up triggering LMKD
> > (or any OOM handling) when there is a considerable amount of clean page
> > cache?
>
> It's not simple to explain all the heuristics, but basically, LMKD is triggered
> by PSI pressure (usually contributed by kswapd rather than other components
> like refault, kcompactd, or workingset operations).
>
> It then checks the current free memory against system watermarks. Depending
> on the free memory size, file cache, and free swap, it decides to start
> killing background apps.
>
> In other words, LMKD acts as a "userspace kswapd" to assist kernel kswapd's
> reclamation speed. It is smarter than kswapd because it has high-level knowledge
> of which processes are okay to be killed rather than forcing slow, unnecessary
> paging out.
>
> Whenever LMKD is running, kswapd is usually running alongside it. You might
> wonder why LMKD kills background apps even when there are plenty of clean file
> pages. That's because the system cannot predict current memory allocation rates.
> If the allocation is bursty, kswapd can never catch up with the allocation speed.
> This forces the foreground apps into direct reclaim, resulting in visible
> UI jank. Android prioritizes UI smoothness and chooses to kill background apps.
>
> Furthermore, when LMKD kills a background app, it expects immediate memory relief.
> If the clean file pages of the killed process are left on the LRU to be reclaimed
> asynchronously later, the system's memory pressure (PSI) remains high.
> This forces LMKD to unnecessarily kill *additional* background apps before
> the memory from the first victim is fully recovered.
>
> Again, this is why I want process_mrelease to expedite clean file reclamation
> synchronously.
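
To make the flow described above concrete, here is a rough sketch of it:
a PSI reading triggers a check of free memory against a watermark, and a
kill only helps the next check if the victim's memory has actually become
free. This only models the claim as stated. The function names, the
threshold and all sizes are made up for illustration, and the file cache
and free swap inputs mentioned above are left out. It is not LMKD's actual
code. The only real interface assumed is the line format of
/proc/pressure/memory.

```python
def parse_psi_line(line: str) -> dict:
    """Parse one line of /proc/pressure/memory, e.g.
    'some avg10=7.50 avg60=1.00 avg300=0.20 total=12345'."""
    kind, *fields = line.split()
    values = {"kind": kind}
    for field in fields:
        key, _, raw = field.partition("=")
        # 'total' is an integer counter, the avg* fields are percentages.
        values[key] = int(raw) if key == "total" else float(raw)
    return values


def should_kill(psi_some_avg10: float, free_kb: int, low_wmark_kb: int,
                psi_threshold: float = 5.0) -> bool:
    """Hypothetical policy: kill only when PSI shows memory stalls AND
    free memory is below the watermark. The threshold is invented."""
    if psi_some_avg10 < psi_threshold:
        return False
    return free_kb < low_wmark_kb


def free_after_kill(free_kb: int, victim_anon_kb: int,
                    victim_clean_file_kb: int,
                    sync_file_reclaim: bool) -> int:
    """Free memory right after the victim is reaped. Anonymous memory is
    freed in both cases. Clean file pages only count if they are dropped
    synchronously; otherwise they stay on the LRU for later reclaim."""
    freed = victim_anon_kb
    if sync_file_reclaim:
        freed += victim_clean_file_kb
    return free_kb + freed


psi = parse_psi_line("some avg10=7.50 avg60=1.00 avg300=0.20 total=12345")
for sync in (False, True):
    free_kb = free_after_kill(50_000, 20_000, 40_000, sync_file_reclaim=sync)
    again = should_kill(psi["avg10"], free_kb, low_wmark_kb=80_000)
    print(f"sync_file_reclaim={sync}: free={free_kb} kB, kill again: {again}")
```

With these invented numbers the second check still asks for another kill
when the clean file pages stay on the LRU, and does not when they are
dropped together with the anonymous memory. Whether real workloads look
like this depends entirely on how much clean page cache the victim holds
exclusively.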

How much of a clean page cache do you usually drop this way?

[...]

> > I suspect you are missing my point. I am arguing that those special
> > hacks in the address space release path shouldn't be process_mrelease
>
> I am a bit confused now. Do you mean you want to apply these expedited
> reclamation optimizations to ALL dying processes in the common exit path,
> rather than making them specific to process_mrelease?

Yes, all of those that make sense, really. I am still not convinced about
the clean page cache because that just seems like a hack to work around
wrong userspace OOM heuristics.
--
Michal Hocko
SUSE Labs