From: Tal Zussman <tz2294@columbia.edu>
To: Lorenzo Stoakes <ljs@kernel.org>, John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Gregory Price <gourry@gourry.net>, Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	lsf-pc@lists.linux-foundation.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	David Hildenbrand <david@kernel.org>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Chen Ridong <chenridong@huaweicloud.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	Kairui Song <ryncsn@gmail.com>, Nhat Pham <nphamcs@gmail.com>,
	Barry Song <21cnbao@gmail.com>,
	David Stevens <stevensd@google.com>,
	Vernon Yang <vernon2gm@gmail.com>,
	David Rientjes <rientjes@google.com>,
	Kalesh Singh <kaleshsingh@google.com>,
	wangzicheng <wangzicheng@honor.com>,
	"T . J . Mercier" <tjmercier@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Meta kernel team <kernel-team@meta.com>,
	bpf@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Towards Unified and Extensible Memory Reclaim (reclaim_ext)
Date: Tue, 14 Apr 2026 17:38:00 -0400
Message-ID: <5c1205ec-cbcc-4976-85d3-0643083bfbba@columbia.edu>
In-Reply-To: <adddpxFfWIO9a2Ih@lucifer>

On 4/9/26 4:22 AM, Lorenzo Stoakes wrote:
> Yes, thanks for that, it's interesting!
> 
> But I would say for now we need to defer any consideration of bpf being a
> thing until we actually get things into shape in terms of improving and
> modularising the existing reclaim mechanisms.
> 
> mm has been far too keen to take features without paying down technical
> debt first and it's been very costly, so before anything else, we must
> ensure that reclaim is both long-term maintainable and maintained.
>

Completely agreed that cleanup is necessary and modularization is a big step
towards that. I do think it makes sense to think about what eBPF would look
like as part of that future though. It would be a shame to do all of that
modularization work, then decide to integrate eBPF later on and realize that
we need major changes to make that happen (but, given a well-designed
interface, I think that's unlikely to be a significant issue).

> In terms of reclaim bpf as a concept in general - reclaim is so very
> sensitive to even minor changes, and I fear that people might find
> something that appears to dramatically improve matters in one scenario, but
> end up with an unusable system in another.

That concern is precisely why we implemented per-cgroup policies. We found
that running each application with a policy that works best for it yielded
greater overall performance. System-wide policies may need something more
generic, but if you can make them more granular, you can specialize more
without running into such issues. This was simple enough to do given that
the LRU lists are already per-memcg, and eBPF supports (or is about to
support) per-cgroup struct_ops programs.
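
As a rough illustration of that granularity, here is a user-space toy
model, not kernel or eBPF code. Every name in it (Memcg, evict_oldest,
evict_newest, reclaim_one) is hypothetical and chosen only for this
sketch; the two policies are stand-ins and are not the policies
evaluated above. Each cgroup owns its own LRU list and may carry its own
eviction policy, with a generic default for cgroups that have none.

```python
from collections import deque


class Memcg:
    """Toy stand-in for a memory cgroup with its own LRU list."""

    def __init__(self, name, policy=None):
        self.name = name
        self.lru = deque()    # left end = oldest page, right end = newest
        self.policy = policy  # optional per-cgroup eviction policy


def evict_oldest(lru):
    """Generic default: evict the least recently added page."""
    return lru.popleft()


def evict_newest(lru):
    """Specialized policy, e.g. for a streaming scan: evict the newest page."""
    return lru.pop()


def reclaim_one(memcg):
    """Evict one page using the cgroup's policy, or the default if unset."""
    policy = memcg.policy or evict_oldest
    return policy(memcg.lru)


# Two cgroups see the same pages but reclaim them differently.
web = Memcg("web")                          # no policy attached: default
scan = Memcg("scan", policy=evict_newest)   # specialized policy
for page in range(4):
    web.lru.append(page)
    scan.lru.append(page)

web_victim = reclaim_one(web)
scan_victim = reclaim_one(scan)
```

Because the policy is looked up per cgroup, a poor choice for "scan"
cannot change which pages "web" gives up.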

> A bad sched_ext implementation might result in poor responsiveness, but a
> bad reclaim_ext implementation might result in a soft-locked system, and I
> fear that it might be quite easy to do that.

Anecdotally, having implemented about a dozen policies across dozens of
workloads, we never ran into a soft-lockup. That's not a guarantee that it
can't happen, but with a properly implemented fallback/watchdog to ensure
that pages actually get reclaimed as necessary, this should be manageable.
sched_ext actually implements such a watchdog to kick out misbehaving eBPF
schedulers.
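
The fallback idea can be sketched in user space as follows. This is a
toy model under stated assumptions, not the sched_ext watchdog and not
any existing reclaim interface: ReclaimWatchdog, default_policy and
max_failures are hypothetical names, pages are modelled as unique
integers, and "the policy fell short" is the only failure that is
detected.

```python
def default_policy(lru, nr):
    """Built-in fallback: take pages from the cold (front) end of the list."""
    return lru[:nr]


class ReclaimWatchdog:
    """Wraps a pluggable policy; tops up shortfalls and ejects repeat offenders."""

    def __init__(self, policy, max_failures=3):
        self.policy = policy
        self.max_failures = max_failures
        self.failures = 0      # consecutive reclaim calls that fell short
        self.ejected = False   # once True, only the default policy is used

    def reclaim(self, lru, nr):
        nr = min(nr, len(lru))
        victims = []
        if not self.ejected:
            on_list = set(lru)
            seen = set()
            # Accept only pages that are really on the list, without duplicates.
            for page in self.policy(list(lru), nr):
                if page in on_list and page not in seen:
                    seen.add(page)
                    victims.append(page)
            victims = victims[:nr]
            if len(victims) < nr:
                self.failures += 1
                if self.failures >= self.max_failures:
                    self.ejected = True
            else:
                self.failures = 0
        if len(victims) < nr:
            # Make forward progress regardless of what the policy returned.
            chosen = set(victims)
            remaining = [p for p in lru if p not in chosen]
            victims += default_policy(remaining, nr - len(victims))
        evicted = set(victims)
        lru[:] = [p for p in lru if p not in evicted]
        return victims


# A policy that never proposes anything is covered, then ejected.
pages = list(range(10))
wd = ReclaimWatchdog(lambda lru, nr: [], max_failures=2)
first = wd.reclaim(pages, 2)
second = wd.reclaim(pages, 2)
```

The point of the design is that the requested number of pages is always
reclaimed, so a misbehaving policy costs eviction quality rather than
forward progress.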

> In any case, we can look at all that once we are in a better place with
> reclaim, which Shakeel's proposal focuses on and I'm very much in favour
> of! :)
>
> Cheers, Lorenzo
> 


