linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Minchan Kim <minchan@kernel.org>
Cc: akpm@linux-foundation.org, david@kernel.org, brauner@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	surenb@google.com, timmurray@google.com
Subject: Re: [RFC 0/3]  mm: process_mrelease: expedited reclaim and auto-kill support
Date: Fri, 17 Apr 2026 09:11:21 +0200
Message-ID: <aeHdGf7XDjvOES1V@tiehlicka>
In-Reply-To: <aeHRLv834FCAQlQ8@google.com>

On Thu 16-04-26 23:20:30, Minchan Kim wrote:
> On Thu, Apr 16, 2026 at 08:54:53AM +0200, Michal Hocko wrote:
> > On Wed 15-04-26 16:26:34, Minchan Kim wrote:
> > > On Wed, Apr 15, 2026 at 09:38:05AM +0200, Michal Hocko wrote:
> > > > On Tue 14-04-26 13:00:16, Minchan Kim wrote:
> > > > > On Tue, Apr 14, 2026 at 08:57:57AM +0200, Michal Hocko wrote:
> > > > > > On Mon 13-04-26 15:39:45, Minchan Kim wrote:
> > > > > > > This patch series introduces optimizations to expedite memory reclamation
> > > > > > > in process_mrelease() and provides a secure, race-free "auto-kill"
> > > > > > > mechanism for efficient container shutdown and OOM handling.
> > > > > > > 
> > > > > > > Currently, process_mrelease() unmaps pages but leaves clean file folios
> > > > > > > on the LRU list, relying on standard memory reclaim to eventually free
> > > > > > > them. Furthermore, requiring userspace to send a SIGKILL prior to
> > > > > > > invoking process_mrelease() introduces scheduling race conditions where
> > > > > > > the victim task may enter the exit path prematurely, bypassing expedited
> > > > > > > reclamation hooks.
> > > > > > > 
> > > > > > > This series addresses these limitations in three logical steps.
> > > > > > > 
> > > > > > > Patch #1: mm: process_mrelease: expedite clean file folio reclaim via mmu_gather
> > > > > > > Integrates clean file folio eviction directly into the low-level TLB
> > > > > > > batching (mmu_gather) infrastructure. Symmetrically truncates clean file
> > > > > > > folios alongside anonymous pages during the unmap loop.
> > > > > > 
> > > > > > Why do we need to care about clean page cache? Is this a form of
> > > > > > drop_caches?
> > > > > 
> > > > > The goal is to ensure the memory is actually freed by the time
> > > > > process_mrelease returns. Currently, process_mrelease unmaps pages, but
> > > > > page caches remain on the LRU, leaving them to be reclaimed later
> > > > > by kswapd or direct reclaim.
> > > > 
> > > > Correct. This was the initial design decision because there is not much
> > > > you can assume about page cache pages, which are very often shared even
> > > > if they are not mapped by all users.
> > > 
> > > Fair point. However, that's the trade-off:
> > > 
> > > Leaving unmapped caches to be reclaimed asynchronously keeps system memory
> > > pressure high for too long. In Android, this delay forces the LMKD to
> > > unnecessarily kill additional innocent background apps before the memory
> > > from the original victim is recovered.
> > 
> > OK, this is really not clear to me. How come you end up triggering LMKD
> > (or any OOM handling) when there is a considerable amount of clean page
> > cache?
> 
> It's not simple to explain all the heuristics, but basically, LMKD is triggered
> by PSI pressure (usually contributed by kswapd rather than other components
> like refault, kcompactd, or workingset operations).
> 
> It then checks the current free memory against system watermarks. Depending
> on the free memory size, file cache, and free swap, it decides to start
> killing background apps.
> 
> In other words, LMKD acts as a "userspace kswapd" to assist kernel kswapd's
> reclamation speed. It is smarter than kswapd because it has high-level knowledge
> of which processes are okay to be killed rather than forcing slow, unnecessary
> paging out.
> 
> Whenever LMKD is running, kswapd is usually running alongside it. You might
> wonder why LMKD kills background apps even when there are plenty of clean file
> pages. That's because the system cannot predict current memory allocation rates.
> If the allocation is bursty, kswapd can never catch up with the allocation speed.
> This forces the foreground apps into direct reclaim, resulting in visible
> UI jank. Android prioritizes UI smoothness and chooses to kill background apps.
>
> Furthermore, when LMKD kills a background app, it expects immediate memory relief.
> If the clean file pages of the killed process are left on the LRU to be reclaimed
> asynchronously later, the system's memory pressure (PSI) remains high.
> This forces LMKD to unnecessarily kill *additional* background apps before
> the memory from the first victim is fully recovered.
> 
> Again, this is why I want process_mrelease to expedite clean file reclamation
> synchronously.
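
For reference, the PSI signal described in the quoted text is exported by
the kernel in /proc/pressure/memory as "some" and "full" lines carrying
avg10/avg60/avg300 percentages and a cumulative stall time in microseconds.
Below is a minimal sketch of reading it. The names psi_line, psi_parse_line()
and psi_memory_some_avg10() are hypothetical helpers for illustration only.
This is not how LMKD is implemented; the sketch only shows the raw data the
discussion refers to.

```c
#include <stdio.h>
#include <string.h>

/* One line of /proc/pressure/memory ("some ..." or "full ..."). */
struct psi_line {
	double avg10, avg60, avg300;	/* % of time stalled per window */
	unsigned long long total_us;	/* cumulative stall time in us */
};

/*
 * Hypothetical helper: parse one PSI line of the given kind
 * ("some" or "full"). Returns 0 on success, -1 on mismatch.
 */
static int psi_parse_line(const char *line, const char *kind,
			  struct psi_line *out)
{
	size_t n = strlen(kind);

	if (strncmp(line, kind, n) != 0 || line[n] != ' ')
		return -1;
	if (sscanf(line + n, " avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   &out->avg10, &out->avg60, &out->avg300,
		   &out->total_us) != 4)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: return the "some" avg10 value from
 * /proc/pressure/memory, or -1.0 if PSI is unavailable.
 */
static double psi_memory_some_avg10(void)
{
	char buf[256];
	struct psi_line psi;
	double ret = -1.0;
	FILE *f = fopen("/proc/pressure/memory", "r");

	if (!f)
		return -1.0;
	while (fgets(buf, sizeof(buf), f)) {
		if (psi_parse_line(buf, "some", &psi) == 0) {
			ret = psi.avg10;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

A caller would compare the returned value against a policy threshold of its
own choosing. No particular threshold is implied by this thread.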

How much of a clean page cache do you usually drop this way?
 
[...]
> > I suspect you are missing my point. I am arguing that those special
> > hacks in the address space release path shouldn't be process_mrelease
> > specific.
> 
> I am a bit confused now. Do you mean you want to apply these expedited
> reclamation optimizations to ALL dying processes in the common exit path,
> rather than making them specific to process_mrelease?

Yes, all of those that make sense, really. I am still not convinced about
the clean page cache part because that just seems like a hack to work
around wrong userspace OOM heuristics.
-- 
Michal Hocko
SUSE Labs
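
For readers following the thread, the existing userspace sequence that the
quoted cover letter refers to (send SIGKILL, then call process_mrelease())
can be sketched as below. This is an illustration under stated assumptions,
not code from the series: kill_and_reap() is a hypothetical helper, the
fallback syscall numbers assume the generic syscall table used by most
architectures, and the auto-kill flag proposed by the series is deliberately
not used because its final form is not settled in this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers (assumed generic syscall numbers). */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_mrelease
#define SYS_process_mrelease 448
#endif

/*
 * Hypothetical helper: kill the target and ask the kernel to release
 * its memory from the caller's context.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int kill_and_reap(pid_t pid)
{
	int ret, saved_errno;
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

	if (pidfd < 0)
		return -1;

	/* Step 1: deliver SIGKILL through the pidfd. */
	ret = (int)syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, 0);

	/*
	 * Step 2: release the victim's address space. The victim runs
	 * between step 1 and step 2, which is the window the cover
	 * letter describes as racy.
	 */
	if (ret == 0)
		ret = (int)syscall(SYS_process_mrelease, pidfd, 0);

	/* Preserve errno from the failing syscall across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;
	return ret;
}
```

A failure of the second step does not necessarily mean the memory is still
held: the victim may already have torn down its address space on its own
before the call was made.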


