From: Lorenzo Stoakes <ljs@kernel.org>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Tal Zussman <tz2294@columbia.edu>,
Matthew Wilcox <willy@infradead.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Gregory Price <gourry@gourry.net>,
Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Shakeel Butt <shakeel.butt@linux.dev>,
lsf-pc@lists.linux-foundation.org,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Chen Ridong <chenridong@huaweicloud.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
Alexei Starovoitov <ast@kernel.org>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Kairui Song <ryncsn@gmail.com>, Nhat Pham <nphamcs@gmail.com>,
Barry Song <21cnbao@gmail.com>,
David Stevens <stevensd@google.com>,
Vernon Yang <vernon2gm@gmail.com>,
David Rientjes <rientjes@google.com>,
Kalesh Singh <kaleshsingh@google.com>,
wangzicheng <wangzicheng@honor.com>,
"T . J . Mercier" <tjmercier@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Suren Baghdasaryan <surenb@google.com>,
Meta kernel team <kernel-team@meta.com>,
bpf@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Towards Unified and Extensible Memory Reclaim (reclaim_ext)
Date: Thu, 9 Apr 2026 09:22:33 +0100
Message-ID: <adddpxFfWIO9a2Ih@lucifer>
In-Reply-To: <70fd648a-efa1-465a-8e6a-51411dfd50b8@nvidia.com>
On Wed, Apr 08, 2026 at 05:21:17PM -0700, John Hubbard wrote:
> On 3/27/26 12:12 PM, Tal Zussman wrote:
> > On 3/26/26 11:43 PM, Matthew Wilcox wrote:
> >> On Thu, Mar 26, 2026 at 01:47:43PM -0700, Axel Rasmussen wrote:
> >>> On Thu, Mar 26, 2026 at 1:30 PM Gregory Price <gourry@gourry.net> wrote:
> ...
> > Yeah, unfortunately it's not so straightforward. As a simple illustrative
> > example, consider a file-search workload, where you search through a large
> > number of files over and over again (e.g., a poor kernel developer trying to
> > understand how the page cache works). This follows an MRU, rather than LRU,
> > pattern, and readahead doesn't help much, leading the active/inactive and
> > MGLRU policies to have similar performance (~40s runtime in a specific
> > benchmark we ran). In comparison, using cache_ext (our eBPF-based caching
> > framework), we can run an MRU policy and it goes down to 20s.
>
> That's dramatic!
>
> ...
> > It's been well-known in the academic realm for a while that there isn't
> > really a "one-size-fits-all" policy that works *best* for all workloads.
>
> I think that that point has been less clear, outside of academia. In fact,
> MGLRU (to the extent that we believed we would eventually get rid of LRU,
> in favor of MGLRU) doubled down on the idea of one size fits all. So this
> is interesting.
>
> > Yes, you can make a general policy that works *well*, but if you really care
> > about a workload's performance and want to squeeze out the last 10-20% (or
> > more) of performance, you need to be able to (1) experiment and (2) take
> > advantage of application-level insights. Being able to extend reclaim (in
> > our case with eBPF) enables that.
> >
> > We wrote a paper about this that was published a few months ago [1]. Happy
> > to answer any questions and continue the discussion!
> >
> > [1] https://dl.acm.org/doi/pdf/10.1145/3731569.3764820
> >
>
> Excellent work, I was delighted to find a well-balanced description of
> both older and more recent history of the Linux page cache there.
>
> It's helpful to read this, even if we go with a non-eBPF approach.
Yes, thanks for that, it's interesting!

But I would say for now we need to defer any consideration of bpf being a
thing until we actually get things into shape in terms of improving and
modularising the existing reclaim mechanisms.

mm has been far too keen to take features without paying down technical
debt first and it's been very costly, so before anything else, we must
ensure that reclaim is both long-term maintainable and maintained.

In terms of reclaim bpf as a concept in general - reclaim is so very
sensitive to even minor changes, and I fear that people might find
something that appears to dramatically improve matters in one scenario, but
end up with an unusable system in another.

A bad sched_ext implementation might result in poor responsiveness, but a
bad reclaim_ext implementation might result in a soft-locked system, and I
fear that it might be quite easy to do that.

In any case, we can look at all that once we are in a better place with
reclaim, which Shakeel's proposal focuses on and I'm very much in favour
of! :)
>
>
> thanks,
> --
> John Hubbard
>
Cheers, Lorenzo
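
The file-search example quoted above from Tal (repeated scans over a set of
files larger than the cache) can be illustrated with a toy user-space
simulation. This is only a sketch under stated assumptions: it is not
cache_ext, not the kernel's active/inactive or MGLRU code, and it ignores
readahead entirely. The function name simulate() and the sizes (a 100-entry
cache, 150 files, 10 passes) are hypothetical choices made for illustration.

```python
from collections import OrderedDict


def simulate(policy, cache_size, num_files, passes):
    """Return the hit ratio for repeated sequential scans over num_files items.

    policy is "lru" or "mru" (hypothetical labels for this toy model only).
    """
    cache = OrderedDict()  # ordered from least to most recently used
    hits = 0
    accesses = 0
    for _ in range(passes):
        for f in range(num_files):
            accesses += 1
            if f in cache:
                hits += 1
                cache.move_to_end(f)  # mark as most recently used
                continue
            if len(cache) >= cache_size:
                # LRU evicts the oldest entry (front of the dict);
                # MRU evicts the newest entry (back of the dict).
                cache.popitem(last=(policy == "mru"))
            cache[f] = True
    return hits / accesses


lru_ratio = simulate("lru", cache_size=100, num_files=150, passes=10)
mru_ratio = simulate("mru", cache_size=100, num_files=150, passes=10)
print(f"LRU hit ratio: {lru_ratio:.2f}")
print(f"MRU hit ratio: {mru_ratio:.2f}")
```

In this model LRU never hits, because each entry is evicted shortly before the
scan comes back around to it, while MRU keeps a stable subset of the files
resident. How far that carries over to a real system depends on everything
the model leaves out.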