From: Johannes Weiner <hannes@cmpxchg.org>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>, Yu Zhao <yuzhao@google.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
linux-mm@kvack.org, Yosry Ahmed <yosryahmed@google.com>,
Wei Xu <weixugc@google.com>, Shakeel Butt <shakeelb@google.com>,
Greg Thelen <gthelen@google.com>
Subject: Re: [RFC] Mechanism to induce memory reclaim
Date: Mon, 7 Mar 2022 15:50:36 -0500
Message-ID: <YiZwHHQ0+CFL78Sb@cmpxchg.org>
In-Reply-To: <5df21376-7dd1-bf81-8414-32a73cea45dd@google.com>

On Sun, Mar 06, 2022 at 03:11:23PM -0800, David Rientjes wrote:
> Hi everybody,
>
> We'd like to discuss formalizing a mechanism to induce memory reclaim by
> the kernel.
>
> The current multigenerational LRU proposal introduces a debugfs
> mechanism[1] for this. The "TMO: Transparent Memory Offloading in
> Datacenters" paper also discusses a per-memcg mechanism[2]. While the
> former can be used for debugging of MGLRU, both can quite powerfully be
> used for proactive reclaim.
>
> Google's datacenters use a similar per-memcg mechanism for the same
> purpose. Thus, formalizing the mechanism would allow our userspace to use
> an upstream supported interface that will be stable and consistent.
>
> This could be an incremental addition to MGLRU's lru_gen debugfs mechanism
> but, since the concept has no direct dependency on the work, we believe it
> is useful independent of the reclaim mechanism in use (both with and
> without CONFIG_LRU_GEN).
>
> Idea: introduce a per-node sysfs mechanism for inducing memory reclaim
> that can be useful for global (non-memcg constrained) reclaim and possible
> even if memcg is not enabled in the kernel or mounted. This could
> optionally take a memcg id to induce reclaim for a memcg hierarchy.
>
> IOW, this would be a /sys/devices/system/node/nodeN/reclaim mechanism for
> each NUMA node N on the system. (It would be similar to the existing
> per-node sysfs "compact" mechanism used to trigger compaction from
> userspace.)

I generally think a proactive reclaim interface is a good idea.

A per-cgroup control knob would make more sense to me, as cgroupfs
takes care of delegation, namespacing etc. and so would permit
self-directed proactive reclaim inside containers.

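As a rough illustration of what self-directed reclaim through cgroupfs could look like from inside a container, here is a minimal Python sketch. It assumes a per-cgroup file that accepts a byte count; the file name `memory.reclaim`, the helper `request_reclaim`, and any cgroup path passed to it are hypothetical and are not something this thread has settled on.

```python
import os


def request_reclaim(cgroup_dir, nr_bytes, knob="memory.reclaim"):
    """Write a reclaim request, in bytes, to a per-cgroup control file.

    `knob` is an assumed file name used for illustration only.
    """
    if nr_bytes <= 0:
        raise ValueError("nr_bytes must be positive")
    path = os.path.join(cgroup_dir, knob)
    # O_WRONLY without O_CREAT: fail loudly if the knob does not exist
    # instead of silently creating a regular file.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, str(nr_bytes).encode("ascii"))
    finally:
        os.close(fd)
    return path
```

A delegated container could call something like `request_reclaim("/sys/fs/cgroup/mygroup", 64 << 20)` on its own subtree (path hypothetical); ordinary cgroupfs file permissions would decide who may do so.
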
> Userspace would write the following to this file:
> - nr_to_reclaim pages

This makes sense, although (and you hinted at this below), I'm
thinking it should be in bytes, especially if part of cgroupfs.

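One reason a byte count travels better than a page count: the same number of pages means different amounts of memory on systems with different base page sizes. Below is a small sketch of the conversion a caller would otherwise have to do; `pages_to_bytes` is a hypothetical helper, not an existing interface.

```python
import os


def pages_to_bytes(nr_pages, page_size=None):
    """Convert a page count to bytes.

    Uses the running system's base page size unless one is given.
    """
    if nr_pages < 0:
        raise ValueError("nr_pages must be non-negative")
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return nr_pages * page_size


# The same 256-page request is 1 MiB with 4 KiB pages
# but 16 MiB with 64 KiB pages.
small = pages_to_bytes(256, page_size=4096)
large = pages_to_bytes(256, page_size=65536)
```
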
> - swappiness factor

This I'm not sure about.

Mostly because I'm not sure about swappiness in general. It balances
between anon and file, but both of them are aged according to the same
LRU rules. The only reason to prefer one over the other seems to be
when the cost of reloading one (refault vs swapin) isn't the same as
the other. That's usually a hardware property, which in a perfect
world we'd auto-tune inside the kernel based on observed IO
performance. Not sure why you'd want this per reclaim request.

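To make the cost argument concrete, here is a toy model in Python. It splits reclaim pressure between anon and file in inverse proportion to how expensive each is to bring back. The function `scan_split` and its cost inputs are hypothetical; this only illustrates the reasoning above and is not the kernel's actual balancing code.

```python
def scan_split(file_refault_cost, anon_swapin_cost):
    """Return (anon_fraction, file_fraction) of reclaim pressure.

    Each type gets pressure inversely proportional to its reload cost,
    so the cheaper-to-reload type is reclaimed more aggressively.
    """
    if file_refault_cost <= 0 or anon_swapin_cost <= 0:
        raise ValueError("costs must be positive")
    total = file_refault_cost + anon_swapin_cost
    anon_fraction = file_refault_cost / total
    file_fraction = anon_swapin_cost / total
    return anon_fraction, file_fraction
```

With equal costs the split is even; if a swapin costs three times as much as a file refault, anon receives a quarter of the pressure. Since the costs come from the hardware rather than from the caller, nothing in this model varies per reclaim request.
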
> - flags to specify context, if any[**]
>
> [**] this is offered for extensibility to specify the context in which
> reclaim is being done (clean file pages only, demotion for memory
> tiering vs eviction, etc), otherwise 0

This one is curious. I don't understand the use cases for either of
these examples, and I can't think of other flags a user may pass on a
per-invocation basis. Would you care to elaborate some?