* [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
@ 2025-03-12 9:43 Zhongkun He
2025-03-12 22:36 ` Andrew Morton
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Zhongkun He @ 2025-03-12 9:43 UTC (permalink / raw)
To: akpm; +Cc: mhocko, hannes, muchun.song, linux-mm, linux-kernel, Zhongkun He
Commit 68cd9050d871 ("mm: add swappiness= arg to memory.reclaim") lets us
pass an additional swappiness=<val> argument to memory.reclaim. It is very
useful because we can dynamically adjust the reclamation ratio based on the
anonymous folios and file folios of each cgroup. For example, when
swappiness is set to 0, we only reclaim from file pages.
However, we have also encountered a new issue: when swappiness is set to
MAX_SWAPPINESS, reclaim may still only take file folios. This is due to
cache_trim_mode, which depends solely on the ratio of inactive file folios,
regardless of whether there are a large number of cold folios on the
anonymous folio list.
So, add new control logic so that proactive memory reclaim only reclaims
from anonymous folios when swappiness is set to MAX_SWAPPINESS.
For example, something like this:

    echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim

will perform reclaim on the root cgroup with a swappiness setting of 200
(max swappiness) regardless of the file folios. This is reasonable because
users have a more comprehensive view of the application's memory
distribution, given the many metrics available to them.
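As an illustration of the interface described above, here is a minimal userspace sketch in C that builds the same request string as the echo example and writes it to a memory.reclaim file. The helper names format_reclaim_request() and request_reclaim() are hypothetical and exist only for this example; the 0..200 range check assumes MAX_SWAPPINESS is 200, as stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed upper bound of the swappiness argument (200 per the message). */
#define MAX_SWAPPINESS 200

/*
 * Hypothetical helper: format a request such as "2M swappiness=200".
 * Returns the string length, or -1 if swappiness is out of range or
 * the buffer is too small.
 */
int format_reclaim_request(char *buf, size_t len, const char *amount,
			   int swappiness)
{
	int n;

	assert(buf != NULL && amount != NULL);
	if (swappiness < 0 || swappiness > MAX_SWAPPINESS)
		return -1;
	n = snprintf(buf, len, "%s swappiness=%d", amount, swappiness);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Hypothetical helper: write the request to a cgroup's memory.reclaim
 * file. Returns 0 on success and -1 on any error.
 */
int request_reclaim(const char *reclaim_path, const char *amount,
		    int swappiness)
{
	char req[64];
	FILE *f;
	int ok;

	if (format_reclaim_request(req, sizeof(req), amount, swappiness) < 0)
		return -1;
	f = fopen(reclaim_path, "w");
	if (!f)
		return -1;
	ok = fputs(req, f) >= 0;
	/* A write error may only surface when the stream is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling request_reclaim("/sys/fs/cgroup/memory.reclaim", "2M", 200) would issue the same request as the echo command above; it needs sufficient privileges and a kernel that accepts the swappiness= argument.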
With this patch, the swappiness argument of memory.reclaim has more
precise semantics: 0 means reclaiming only from file pages, while 200
means reclaiming only from anonymous pages.
Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
---
mm/vmscan.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c767d71c43d7..f4312b41e0e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2438,6 +2438,16 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		goto out;
 	}
 
+	/*
+	 * Do not bother scanning file folios if the memory reclaim
+	 * invoked by userspace through memory.reclaim and the
+	 * swappiness is MAX_SWAPPINESS.
+	 */
+	if (sc->proactive && (swappiness == MAX_SWAPPINESS)) {
+		scan_balance = SCAN_ANON;
+		goto out;
+	}
+
 	/*
 	 * Do not apply any pressure balancing cleverness when the
 	 * system is close to OOM, scan both anon and file equally
--
2.39.5
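To make the effect of the new check easier to see, here is a simplified userspace model of the decision order in get_scan_count(). It is a sketch, not the kernel code: struct scan_model and pick_scan_balance() are hypothetical names, the "patched" flag only exists to compare behaviour with and without the change, and the ordering of the checks that are not visible in the diff context (no swap, file_is_tiny, cache_trim_mode) is an assumption about mm/vmscan.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed maximum swappiness (200 per the commit message). */
#define MAX_SWAPPINESS 200

enum scan_balance { SCAN_EQUAL, SCAN_FRACT, SCAN_ANON, SCAN_FILE };

/* Hypothetical, trimmed-down stand-in for the kernel's scan_control. */
struct scan_model {
	bool proactive;        /* reclaim requested through memory.reclaim */
	bool cgroup_reclaim;   /* reclaim targets a cgroup */
	bool can_reclaim_anon; /* anonymous folios can be swapped out */
	bool file_is_tiny;     /* almost no file folios are left */
	bool cache_trim_mode;  /* plenty of inactive file folios */
	int priority;          /* 0 means reclaim is close to OOM */
};

enum scan_balance pick_scan_balance(const struct scan_model *sc,
				    int swappiness, bool patched)
{
	assert(swappiness >= 0 && swappiness <= MAX_SWAPPINESS);

	/* Without swap there is no point in scanning anon folios. */
	if (!sc->can_reclaim_anon)
		return SCAN_FILE;
	/* swappiness=0 on cgroup reclaim means file folios only. */
	if (sc->cgroup_reclaim && swappiness == 0)
		return SCAN_FILE;
	/* The check added by the patch: taken before cache_trim_mode. */
	if (patched && sc->proactive && swappiness == MAX_SWAPPINESS)
		return SCAN_ANON;
	/* Close to OOM: scan both lists equally. */
	if (sc->priority == 0 && swappiness != 0)
		return SCAN_EQUAL;
	if (sc->file_is_tiny)
		return SCAN_ANON;
	/* Enough inactive page cache: leave anon folios alone. */
	if (sc->cache_trim_mode)
		return SCAN_FILE;
	return SCAN_FRACT;
}
```

In this model, a proactive request with swappiness=200 and cache_trim_mode set yields SCAN_FILE without the new check and SCAN_ANON with it, which is the behaviour change the commit message describes.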
* Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-12 9:43 [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX Zhongkun He
@ 2025-03-12 22:36 ` Andrew Morton
2025-03-13 2:29 ` [External] " Zhongkun He
2025-03-12 23:16 ` Johannes Weiner
2025-03-14 0:17 ` Yosry Ahmed
2 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2025-03-12 22:36 UTC (permalink / raw)
To: Zhongkun He; +Cc: mhocko, hannes, muchun.song, linux-mm, linux-kernel
On Wed, 12 Mar 2025 17:43:37 +0800 Zhongkun He <hezhongkun.hzk@bytedance.com> wrote:
> With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> memory.reclaim")', we can submit an additional swappiness=<val> argument
> to memory.reclaim. It is very useful because we can dynamically adjust
> the reclamation ratio based on the anonymous folios and file folios of
> each cgroup. For example, when swappiness is set to 0, we only reclaim
> from file pages.
>
> However, we have also encountered a new issue: when swappiness is set to
> the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> to the knob of cache_trim_mode, which depends solely on the ratio of
> inactive folios, regardless of whether there are a large number of cold
> folios in anonymous folio list.
>
> So, we hope to add a new control logic where proactive memory reclaim only
> reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> For example, something like this:
>
> echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
>
> will perform reclaim on the rootcg with a swappiness setting of 200 (max
> swappiness) regardless of the file folios. Users have a more comprehensive
> view of the application's memory distribution because there are many
> metrics available.
>
> With this patch, the swappiness argument of memory.reclaim has a more
> precise semantics: 0 means reclaiming only from file pages, while 200
> means reclaiming just from anonymous pages.
Please update Documentation/admin-guide/cgroup-v2.rst for this.
* Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-12 9:43 [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX Zhongkun He
2025-03-12 22:36 ` Andrew Morton
@ 2025-03-12 23:16 ` Johannes Weiner
2025-03-13 2:34 ` [External] " Zhongkun He
2025-03-14 0:17 ` Yosry Ahmed
2 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2025-03-12 23:16 UTC (permalink / raw)
To: Zhongkun He; +Cc: akpm, mhocko, muchun.song, linux-mm, linux-kernel
On Wed, Mar 12, 2025 at 05:43:37PM +0800, Zhongkun He wrote:
> With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> memory.reclaim")', we can submit an additional swappiness=<val> argument
> to memory.reclaim. It is very useful because we can dynamically adjust
> the reclamation ratio based on the anonymous folios and file folios of
> each cgroup. For example, when swappiness is set to 0, we only reclaim
> from file pages.
>
> However, we have also encountered a new issue: when swappiness is set to
> the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> to the knob of cache_trim_mode, which depends solely on the ratio of
> inactive folios, regardless of whether there are a large number of cold
> folios in anonymous folio list.
>
> So, we hope to add a new control logic where proactive memory reclaim only
> reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> For example, something like this:
>
> echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
>
> will perform reclaim on the rootcg with a swappiness setting of 200 (max
> swappiness) regardless of the file folios. Users have a more comprehensive
> view of the application's memory distribution because there are many
> metrics available.
I'm not opposed but can you be a bit more specific on your usecase?
Presumably this is with zram/zswap, where the IO tradeoff that
cache_trim_mode is making doesn't hold - file refaults will cause IO,
whereas anon decompression will not.
> With this patch, the swappiness argument of memory.reclaim has a more
> precise semantics: 0 means reclaiming only from file pages, while 200
> means reclaiming just from anonymous pages.
>
> Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
Makes sense to me. With the doc update Andrew had asked for,
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
* Re: [External] Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-12 22:36 ` Andrew Morton
@ 2025-03-13 2:29 ` Zhongkun He
0 siblings, 0 replies; 7+ messages in thread
From: Zhongkun He @ 2025-03-13 2:29 UTC (permalink / raw)
To: Andrew Morton; +Cc: mhocko, hannes, muchun.song, linux-mm, linux-kernel
On Thu, Mar 13, 2025 at 6:36 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 12 Mar 2025 17:43:37 +0800 Zhongkun He <hezhongkun.hzk@bytedance.com> wrote:
>
> > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > memory.reclaim")', we can submit an additional swappiness=<val> argument
> > to memory.reclaim. It is very useful because we can dynamically adjust
> > the reclamation ratio based on the anonymous folios and file folios of
> > each cgroup. For example, when swappiness is set to 0, we only reclaim
> > from file pages.
> >
> > However, we have also encountered a new issue: when swappiness is set to
> > the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> > to the knob of cache_trim_mode, which depends solely on the ratio of
> > inactive folios, regardless of whether there are a large number of cold
> > folios in anonymous folio list.
> >
> > So, we hope to add a new control logic where proactive memory reclaim only
> > reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> > For example, something like this:
> >
> > echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
> >
> > will perform reclaim on the rootcg with a swappiness setting of 200 (max
> > swappiness) regardless of the file folios. Users have a more comprehensive
> > view of the application's memory distribution because there are many
> > metrics available.
> >
> > With this patch, the swappiness argument of memory.reclaim has a more
> > precise semantics: 0 means reclaiming only from file pages, while 200
> > means reclaiming just from anonymous pages.
>
> Please update Documentation/admin-guide/cgroup-v2.rst for this.
OK, thanks. I will add it in the next version.
* Re: [External] Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-12 23:16 ` Johannes Weiner
@ 2025-03-13 2:34 ` Zhongkun He
0 siblings, 0 replies; 7+ messages in thread
From: Zhongkun He @ 2025-03-13 2:34 UTC (permalink / raw)
To: Johannes Weiner; +Cc: akpm, mhocko, muchun.song, linux-mm, linux-kernel
On Thu, Mar 13, 2025 at 7:17 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Mar 12, 2025 at 05:43:37PM +0800, Zhongkun He wrote:
> > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > memory.reclaim")', we can submit an additional swappiness=<val> argument
> > to memory.reclaim. It is very useful because we can dynamically adjust
> > the reclamation ratio based on the anonymous folios and file folios of
> > each cgroup. For example, when swappiness is set to 0, we only reclaim
> > from file pages.
> >
> > However, we have also encountered a new issue: when swappiness is set to
> > the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> > to the knob of cache_trim_mode, which depends solely on the ratio of
> > inactive folios, regardless of whether there are a large number of cold
> > folios in anonymous folio list.
> >
> > So, we hope to add a new control logic where proactive memory reclaim only
> > reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> > For example, something like this:
> >
> > echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
> >
> > will perform reclaim on the rootcg with a swappiness setting of 200 (max
> > swappiness) regardless of the file folios. Users have a more comprehensive
> > view of the application's memory distribution because there are many
> > metrics available.
>
> I'm not opposed but can you be a bit more specific on your usecase?
>
> Presumably this is with zram/zswap, where the IO tradeoff that
> cache_trim_mode is making doesn't hold - file refaults will cause IO,
> whereas anon decompression will not.
Indeed, I will add this description in the comments.
>
> > With this patch, the swappiness argument of memory.reclaim has a more
> > precise semantics: 0 means reclaiming only from file pages, while 200
> > means reclaiming just from anonymous pages.
> >
> > Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
>
> Makes sense to me. With the doc update Andrew had asked for,
>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Got it, thanks.
* Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-12 9:43 [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX Zhongkun He
2025-03-12 22:36 ` Andrew Morton
2025-03-12 23:16 ` Johannes Weiner
@ 2025-03-14 0:17 ` Yosry Ahmed
2025-03-14 2:20 ` [External] " Zhongkun He
2 siblings, 1 reply; 7+ messages in thread
From: Yosry Ahmed @ 2025-03-14 0:17 UTC (permalink / raw)
To: Zhongkun He; +Cc: akpm, mhocko, hannes, muchun.song, linux-mm, linux-kernel
On Wed, Mar 12, 2025 at 05:43:37PM +0800, Zhongkun He wrote:
> With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> memory.reclaim")', we can submit an additional swappiness=<val> argument
> to memory.reclaim. It is very useful because we can dynamically adjust
> the reclamation ratio based on the anonymous folios and file folios of
> each cgroup. For example, when swappiness is set to 0, we only reclaim
> from file pages.
>
> However, we have also encountered a new issue: when swappiness is set to
> the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> to the knob of cache_trim_mode, which depends solely on the ratio of
> inactive folios, regardless of whether there are a large number of cold
> folios in anonymous folio list.
>
> So, we hope to add a new control logic where proactive memory reclaim only
> reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> For example, something like this:
>
> echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
>
> will perform reclaim on the rootcg with a swappiness setting of 200 (max
> swappiness) regardless of the file folios. Users have a more comprehensive
> view of the application's memory distribution because there are many
> metrics available.
>
> With this patch, the swappiness argument of memory.reclaim has a more
> precise semantics: 0 means reclaiming only from file pages, while 200
> means reclaiming just from anonymous pages.
>
> Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> ---
> mm/vmscan.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c767d71c43d7..f4312b41e0e0 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2438,6 +2438,16 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  		goto out;
>  	}
>
> +	/*
> +	 * Do not bother scanning file folios if the memory reclaim
> +	 * invoked by userspace through memory.reclaim and the
> +	 * swappiness is MAX_SWAPPINESS.
> +	 */
> +	if (sc->proactive && (swappiness == MAX_SWAPPINESS)) {
> +		scan_balance = SCAN_ANON;
> +		goto out;
> +	}
> +
>  	/*
>  	 * Do not apply any pressure balancing cleverness when the
>  	 * system is close to OOM, scan both anon and file equally
> --
> 2.39.5
>
>
* Re: [External] Re: [PATCH] mm: vmscan: skip the file folios in proactive reclaim if swappiness is MAX
2025-03-14 0:17 ` Yosry Ahmed
@ 2025-03-14 2:20 ` Zhongkun He
0 siblings, 0 replies; 7+ messages in thread
From: Zhongkun He @ 2025-03-14 2:20 UTC (permalink / raw)
To: Yosry Ahmed; +Cc: akpm, mhocko, hannes, muchun.song, linux-mm, linux-kernel
On Fri, Mar 14, 2025 at 8:17 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> On Wed, Mar 12, 2025 at 05:43:37PM +0800, Zhongkun He wrote:
> > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > memory.reclaim")', we can submit an additional swappiness=<val> argument
> > to memory.reclaim. It is very useful because we can dynamically adjust
> > the reclamation ratio based on the anonymous folios and file folios of
> > each cgroup. For example, when swappiness is set to 0, we only reclaim
> > from file pages.
> >
> > However, we have also encountered a new issue: when swappiness is set to
> > the MAX_SWAPPINESS, it may still only reclaim file folios. This is due
> > to the knob of cache_trim_mode, which depends solely on the ratio of
> > inactive folios, regardless of whether there are a large number of cold
> > folios in anonymous folio list.
> >
> > So, we hope to add a new control logic where proactive memory reclaim only
> > reclaims from anonymous folios when swappiness is set to MAX_SWAPPINESS.
> > For example, something like this:
> >
> > echo "2M swappiness=200" > /sys/fs/cgroup/memory.reclaim
> >
> > will perform reclaim on the rootcg with a swappiness setting of 200 (max
> > swappiness) regardless of the file folios. Users have a more comprehensive
> > view of the application's memory distribution because there are many
> > metrics available.
> >
> > With this patch, the swappiness argument of memory.reclaim has a more
> > precise semantics: 0 means reclaiming only from file pages, while 200
> > means reclaiming just from anonymous pages.
> >
> > Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
>
> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Thanks for your time, Yosry.
>
> > ---
> > mm/vmscan.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c767d71c43d7..f4312b41e0e0 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2438,6 +2438,16 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
> >  		goto out;
> >  	}
> >
> > +	/*
> > +	 * Do not bother scanning file folios if the memory reclaim
> > +	 * invoked by userspace through memory.reclaim and the
> > +	 * swappiness is MAX_SWAPPINESS.
> > +	 */
> > +	if (sc->proactive && (swappiness == MAX_SWAPPINESS)) {
> > +		scan_balance = SCAN_ANON;
> > +		goto out;
> > +	}
> > +
> >  	/*
> >  	 * Do not apply any pressure balancing cleverness when the
> >  	 * system is close to OOM, scan both anon and file equally
> > --
> > 2.39.5
> >
> >