* [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable
@ 2024-10-14 22:12 Wei Xu
2024-10-14 23:25 ` Andrew Morton
2024-10-16 4:56 ` Yu Zhao
0 siblings, 2 replies; 5+ messages in thread
From: Wei Xu @ 2024-10-14 22:12 UTC (permalink / raw)
To: Yu Zhao; +Cc: Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel, Wei Xu
lru_gen_shrink_node() unconditionally clears kswapd_failures, which
can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
when kswapd repeatedly fails to make progress in reclaim.

Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
some progress, similar to shrink_node().

Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
Signed-off-by: Wei Xu <weixugc@google.com>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 50dc06d55b1d..9d1e1c4e383d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4970,8 +4970,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *
 
 	blk_finish_plug(&plug);
 done:
-	/* kswapd should never fail */
-	pgdat->kswapd_failures = 0;
+	if (sc->nr_reclaimed > reclaimed)
+		pgdat->kswapd_failures = 0;
 }
 
 /******************************************************************************
--
2.47.0.rc1.288.g06298d1525-goog
* Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable
2024-10-14 22:12 [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable Wei Xu
@ 2024-10-14 23:25 ` Andrew Morton
2024-10-14 23:41 ` Wei Xu
2024-10-16 4:56 ` Yu Zhao
1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2024-10-14 23:25 UTC (permalink / raw)
To: Wei Xu; +Cc: Yu Zhao, Axel Rasmussen, linux-mm, linux-kernel
On Mon, 14 Oct 2024 22:12:11 +0000 Wei Xu <weixugc@google.com> wrote:
> lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> when kswapd repeatedly fails to make progress in reclaim.
>
> Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
> some progress, similar to shrink_node().
That sounds bad. What triggers this? Can you suggest why it has just
been discovered, after 1.5 years? And should the fix be backported into
-stable kernels?
* Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable
2024-10-14 23:25 ` Andrew Morton
@ 2024-10-14 23:41 ` Wei Xu
0 siblings, 0 replies; 5+ messages in thread
From: Wei Xu @ 2024-10-14 23:41 UTC (permalink / raw)
To: Andrew Morton; +Cc: Yu Zhao, Axel Rasmussen, linux-mm, linux-kernel
On Mon, Oct 14, 2024 at 4:25 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 14 Oct 2024 22:12:11 +0000 Wei Xu <weixugc@google.com> wrote:
>
> > lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> > can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> > when kswapd repeatedly fails to make progress in reclaim.
> >
> > Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
> > some progress, similar to shrink_node().
>
> That sounds bad. What triggers this? Can you suggest why it has just
> been discovered, after 1.5 years? And should the fix be backported into
> -stable kernels?
>
I happened to run into this problem in one of my tests recently. It
requires a combination of several conditions: the allocator needs to
allocate just the right amount of pages such that it can wake up kswapd
without itself being OOM killed; there is no memory for kswapd to
reclaim (my test disables swap and cleans the page cache first); and no
other process frees enough memory at the same time.

I think the fix is a good candidate for stable kernels.
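
For readers following along, the effect of the unconditional clear can be
illustrated with a small user-space model. This is only a sketch, not kernel
code: node_model, scan_model, shrink_node_model() and passes_until_sleep()
are hypothetical names, the value 16 for MAX_RECLAIM_RETRIES is assumed, and
the caller-side "count a failure when nothing was reclaimed" step is a
simplification of what kswapd does around the shrink call.

```c
#include <stdbool.h>

/* Assumed value of the retry limit; the real constant lives in the kernel. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical, minimal stand-ins for pglist_data and scan_control. */
struct node_model {
	int kswapd_failures;
};

struct scan_model {
	unsigned long nr_reclaimed;
};

/*
 * Hypothetical stand-in for lru_gen_shrink_node(): "reclaims" `freed` pages,
 * then clears the failure counter either always (old behaviour) or only when
 * this call made progress (behaviour proposed by the patch).
 */
static void shrink_node_model(struct node_model *pgdat, struct scan_model *sc,
			      unsigned long freed, bool clear_unconditionally)
{
	unsigned long reclaimed = sc->nr_reclaimed;

	sc->nr_reclaimed += freed;

	if (clear_unconditionally || sc->nr_reclaimed > reclaimed)
		pgdat->kswapd_failures = 0;
}

/*
 * Run kswapd-like passes on a node where nothing is reclaimable.
 * Returns the number of passes executed before the failure counter reaches
 * MAX_RECLAIM_RETRIES, or max_passes if that never happens.
 */
static int passes_until_sleep(bool clear_unconditionally, int max_passes)
{
	struct node_model pgdat = { .kswapd_failures = 0 };
	int pass;

	for (pass = 0; pass < max_passes; pass++) {
		struct scan_model sc = { .nr_reclaimed = 0 };

		/* Give up once too many consecutive passes have failed. */
		if (pgdat.kswapd_failures >= MAX_RECLAIM_RETRIES)
			return pass;

		shrink_node_model(&pgdat, &sc, 0, clear_unconditionally);

		/* Simplified caller-side accounting: no progress is a failure. */
		if (!sc.nr_reclaimed)
			pgdat.kswapd_failures++;
	}
	return max_passes;
}
```

In this model, passes_until_sleep(true, 1000) runs all 1000 passes because
the counter is reset on every pass and never climbs past 1, while
passes_until_sleep(false, 1000) stops after MAX_RECLAIM_RETRIES passes.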
* Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable
2024-10-14 22:12 [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable Wei Xu
2024-10-14 23:25 ` Andrew Morton
@ 2024-10-16 4:56 ` Yu Zhao
2024-10-16 5:29 ` Wei Xu
1 sibling, 1 reply; 5+ messages in thread
From: Yu Zhao @ 2024-10-16 4:56 UTC (permalink / raw)
To: Wei Xu; +Cc: Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel
On Mon, Oct 14, 2024 at 4:12 PM Wei Xu <weixugc@google.com> wrote:
>
> lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> when kswapd repeatedly fails to make progress in reclaim.
>
> Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
> some progress, similar to shrink_node().
>
> Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
> Signed-off-by: Wei Xu <weixugc@google.com>
> ---
> mm/vmscan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 50dc06d55b1d..9d1e1c4e383d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4970,8 +4970,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *
>
> blk_finish_plug(&plug);
> done:
Nit: the "done:" isn't used anymore, so better just remove it.
> - /* kswapd should never fail */
> - pgdat->kswapd_failures = 0;
> + if (sc->nr_reclaimed > reclaimed)
> + pgdat->kswapd_failures = 0;
> }
>
> /******************************************************************************
> --
> 2.47.0.rc1.288.g06298d1525-goog
>
>
* Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable
2024-10-16 4:56 ` Yu Zhao
@ 2024-10-16 5:29 ` Wei Xu
0 siblings, 0 replies; 5+ messages in thread
From: Wei Xu @ 2024-10-16 5:29 UTC (permalink / raw)
To: Yu Zhao; +Cc: Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel
On Tue, Oct 15, 2024 at 9:57 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Mon, Oct 14, 2024 at 4:12 PM Wei Xu <weixugc@google.com> wrote:
> >
> > lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> > can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> > when kswapd repeatedly fails to make progress in reclaim.
> >
> > Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
> > some progress, similar to shrink_node().
> >
> > Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
> > Signed-off-by: Wei Xu <weixugc@google.com>
> > ---
> > mm/vmscan.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 50dc06d55b1d..9d1e1c4e383d 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -4970,8 +4970,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *
> >
> > blk_finish_plug(&plug);
> > done:
>
> Nit: the "done:" isn't used anymore, so better just remove it.
>
"goto done" is still used at the beginning of lru_gen_shrink_node().
We could refactor the code to remove it, but that is better handled
in a separate change.
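
To make the control flow concrete, here is a rough user-space sketch of why
the label is still reachable. It is not the kernel function: flow_node,
flow_scan and shrink_node_flow() are hypothetical names, and skip_reclaim is
a made-up flag standing in for whatever early-exit check the real function
performs before it jumps to "done".

```c
#include <stdbool.h>

/* Hypothetical, minimal stand-ins for pglist_data and scan_control. */
struct flow_node {
	int kswapd_failures;
};

struct flow_scan {
	unsigned long nr_reclaimed;
	bool skip_reclaim;	/* hypothetical: models the early-exit check */
};

static void shrink_node_flow(struct flow_node *pgdat, struct flow_scan *sc,
			     unsigned long freed)
{
	/* Snapshot taken on entry, as in the patched comparison. */
	unsigned long reclaimed = sc->nr_reclaimed;

	/* Early exit at the top of the function: this is the "goto done". */
	if (sc->skip_reclaim)
		goto done;

	/* Stand-in for the actual reclaim work. */
	sc->nr_reclaimed += freed;
done:
	/* Only reset the counter if this call made progress. */
	if (sc->nr_reclaimed > reclaimed)
		pgdat->kswapd_failures = 0;
}
```

With the conditional check, the early-exit path leaves kswapd_failures
untouched because nr_reclaimed still equals the snapshot taken on entry.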
> > - /* kswapd should never fail */
> > - pgdat->kswapd_failures = 0;
> > + if (sc->nr_reclaimed > reclaimed)
> > + pgdat->kswapd_failures = 0;
> > }
> >
> > /******************************************************************************
> > --
> > 2.47.0.rc1.288.g06298d1525-goog
> >
> >