Re: [PATCH 0/8] mm/mglru: improve reclaim loop and dirty folio handling

From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Eric Naim <dnaim@cachyos.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	 Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	David Hildenbrand <david@kernel.org>,
	 Michal Hocko <mhocko@kernel.org>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	Lorenzo Stoakes <ljs@kernel.org>, Barry Song <baohua@kernel.org>,
	 David Stevens <stevensd@google.com>,
	Chen Ridong <chenridong@huaweicloud.com>,
	 Leno Hou <lenohou@gmail.com>, Yafang Shao <laoar.shao@gmail.com>,
	Yu Zhao <yuzhao@google.com>,
	 Zicheng Wang <wangzicheng@honor.com>,
	Kalesh Singh <kaleshsingh@google.com>,
	 Suren Baghdasaryan <surenb@google.com>,
	Chris Li <chrisl@kernel.org>, Vernon Yang <vernon2gm@gmail.com>,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8] mm/mglru: improve reclaim loop and dirty folio handling
Date: Sun, 29 Mar 2026 01:30:38 +0800
Message-ID: <acgNCzRDVmSbXrOE@KASONG-MC4>
In-Reply-To: <CAMgjq7AQeP8maeMWNun=60oyq_KDu18MwXfGEyK4bwj_k92NgQ@mail.gmail.com>

On Wed, Mar 25, 2026 at 05:47:41PM +0800, Kairui Song wrote:
> On Wed, Mar 25, 2026 at 5:27 PM Eric Naim <dnaim@cachyos.org> wrote:
> >
> > On 3/25/26 1:47 PM, Kairui Song wrote:
> > > On Wed, Mar 25, 2026 at 1:04 PM Eric Naim <dnaim@cachyos.org> wrote:
> > >>
> > >> Hi Kairui,
> > >>
> > >> On 3/18/26 3:08 AM, Kairui Song via B4 Relay wrote:
> > >>> This series cleans up and slightly improves MGLRU's reclaim loop and
> > >>> dirty flush logic. As a result, we can see an up to ~50% reduce of file
> > >>> faults and 30% increase in MongoDB throughput with YCSB and no swap
> > >>> involved, other common benchmarks have no regression, and LOC is
> > >>> reduced, with less unexpected OOM in our production environment.
> > >>>
> > >
> > > ...
> > >
> > >>
> > >> I applied this patch set to 7.0-rc5 and noticed the system locking up when performing the below test.
> > >>
> > >> fallocate -l 5G 5G
> > >> while true; do tail /dev/zero; done
> > >> while true; do time cat 5G > /dev/null; sleep $(($(cat /sys/kernel/mm/lru_gen/min_ttl_ms)/1000+1)); done
> > >>
> > >> After reading [1], I suspect that this was because the system was using zram as swap, and yes if zram is disabled then the lock up does not occur.
> > >
> > > Hi Eric,
> > >
> > > Thanks for the report, I was about to send V2 but noticing your report
> > > I'll try to reproduce your issue first.
> > >
> > > So far I didn't notice any regression, is this an issue caused by this
> > > patch or is it an existing issue? I don't have any context about how
> > > you are doing the test. BTW the calculation in patch "mm/mglru:
> > > restructure the reclaim loop" needs to have a lowest bar
> > > "max(nr_to_scan, SWAP_CLUSTER_MAX)" for small machines, not sure if
> > > related but will add to V2.
> > >
> >
> > As of writing this, I got some new information that makes this a bit more confusing. The kernel that doesn't have the issue was patched with [1] as a means of protecting the working set (similar to lru_gen_min_ttl_ms).
> >
> > So this time on an unpatched kernel, the system still freezes but quickly recovers itself after about 2 seconds. With this patchset applied, the system freezes but it doesn't quickly recover (if at all).
> >
> > Curiously, I had the user test again but this time with lru_gen_min_ttl_ms = 100. With this set, the system doesn't freeze at all with or without this patchset.
> 
> Ah thanks, that makes sense now, the downstream patch you mentioned
> limits the reclaim of file pages to avoid thrashing, and your test
> cases exhaust the memory on purpose which forces the kernel to reclaim
> all reclaimable folios including page cache.
> 
> A thrashing page cache causes desktop hangs easily, using TTL is an
> effective way to avoid thrashing and trigger OOM early. That's why the
> problem is gone with lru_gen_min_ttl_ms = 100 or le9.
> 
> > > And about the test you posted:
> > > while true; do tail /dev/zero; done
> > >
> > > I believe this will just consume all memory with zero pages and then
> > > get OOM killed, that's exactly what the test is meant to do. By lockup
> > > I'm not sure you mean since you mentioned OOM kill. The system
> > > actually hung or the desktop is dead?
> >
> > The system actually hung. They needed a hard reset to recover the system. (pure speculation: given a few minutes the system would likely recover itself as this seems to be a common scenario)
> 
> Yeah I believe so.
> 
> Thrashing prevention is why MGLRU's TTL is introduced, so I do suggest
> using that. It can be further improved too.
> 
> Will keep that in mind and try to make some test cases to cover your
> case too and make some adjustments.
> 
> BTW how does the kernel behave with MGLRU disabled for your case?

Hi all,

I tested it multiple times on my Fedora system, comparing MGLRU to classic
LRU (using v2 of this series, which also includes some minor improvements).

I modified the reproducer a bit just to test the OOM behavior:

- Run the following commands in console A:
fallocate -l 5G 5G
while true; do time cat 5G > /dev/null; done

- Then run the following command in console B:
while true; do tail /dev/zero; done

The console A output is below:

With MGLRU disabled:
...
real    0m4.925s user    0m0.016s sys     0m4.904s # Under pressure
real    0m5.544s user    0m0.015s sys     0m5.521s
real    0m5.444s user    0m0.012s sys     0m5.425s
real    0m7.607s user    0m0.016s sys     0m7.561s
real    0m7.268s user    0m0.017s sys     0m7.240s
real    0m6.686s user    0m0.016s sys     0m6.656s
real    0m9.919s user    0m0.014s sys     0m9.831s # <- OOM in B triggers
real    0m4.559s user    0m0.012s sys     0m4.539s
real    0m1.381s user    0m0.009s sys     0m1.362s
real    0m11.816s user    0m0.010s sys     0m11.795s
real    0m6.797s user    0m0.021s sys     0m6.753s
real    0m0.944s user    0m0.013s sys     0m0.931s # <- OOM kill in B ends
real    0m0.285s user    0m0.013s sys     0m0.272s

MGLRU enabled, before this series:
...
real    0m0.355s user    0m0.009s sys     0m0.346s # Under pressure
real    0m0.352s user    0m0.008s sys     0m0.344s
real    0m0.549s user    0m0.014s sys     0m0.535s
real    0m0.628s user    0m0.009s sys     0m0.619s
real    0m0.651s user    0m0.009s sys     0m0.642s
real    0m5.294s user    0m0.010s sys     0m5.280s # <- OOM in B triggers
real    0m1.041s user    0m0.014s sys     0m1.026s
real    0m0.837s user    0m0.011s sys     0m0.826s
real    0m2.450s user    0m0.013s sys     0m2.435s
real    0m2.499s user    0m0.012s sys     0m2.485s
real    0m1.857s user    0m0.015s sys     0m1.841s
real    0m0.512s user    0m0.015s sys     0m0.497s
real    0m0.418s user    0m0.011s sys     0m0.407s # <- OOM kill in B ends
real    0m0.282s user    0m0.010s sys     0m0.272s

MGLRU enabled, after this series:
...
real    0m0.280s user    0m0.015s sys     0m0.265s # Under pressure
real    0m0.283s user    0m0.010s sys     0m0.273s
real    0m0.278s user    0m0.012s sys     0m0.266s
real    0m0.315s user    0m0.018s sys     0m0.297s
real    0m0.679s user    0m0.014s sys     0m0.663s
real    0m0.716s user    0m0.011s sys     0m0.705s
real    0m0.657s user    0m0.009s sys     0m0.648s
real    0m6.615s user    0m0.007s sys     0m6.453s # <- OOM in B triggers
real    0m1.244s user    0m0.018s sys     0m1.226s
real    0m1.290s user    0m0.014s sys     0m1.276s
real    0m1.119s user    0m0.011s sys     0m1.108s
real    0m0.882s user    0m0.010s sys     0m0.872s
real    0m0.855s user    0m0.007s sys     0m0.848s
real    0m0.933s user    0m0.005s sys     0m0.928s
real    0m0.833s user    0m0.009s sys     0m0.823s
real    0m0.279s user    0m0.012s sys     0m0.267s # <- OOM killed in B
real    0m0.273s user    0m0.010s sys     0m0.263s

With MGLRU enabled, both performance and OOM jitter seem better
than with classic LRU.

As for this series, it either has no significant effect or slightly
changes the jitter pattern, and I can't say whether that is better or
worse. The peak latency seems slightly higher, but the system seems to
recover faster. Or maybe that's just noise.
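
To make the "noise or not" question a little less eyeball-driven, the
timing lines above can be reduced to a few numbers per run. The following
is only a sketch: it assumes the one-line "real ... user ... sys ..."
layout used in this mail, and summarize_real is a hypothetical helper
name, not an existing tool.

```shell
#!/bin/sh
# Summarize the "real" column of flattened `time` output lines such as:
#   real    0m4.925s user    0m0.016s sys     0m4.904s
# summarize_real is a hypothetical helper; it reads the lines on stdin.
summarize_real() {
    awk '
    $1 == "real" {
        # Split "0m4.925s" into minutes ("0") and seconds ("4.925s").
        split($2, t, "m")
        sub(/s$/, "", t[2])
        secs = t[1] * 60 + t[2]
        if (n == 0 || secs > max) max = secs
        if (n == 0 || secs < min) min = secs
        sum += secs
        n++
    }
    END {
        if (n == 0) { print "no samples"; exit 1 }
        printf "n=%d min=%.3f max=%.3f mean=%.3f\n", n, min, max, sum / n
    }'
}
```

Pasting one of the result blocks into a file (say mglru-after.txt, a
made-up name) and running "summarize_real < mglru-after.txt" prints the
sample count plus min, max and mean of the real time in seconds. Lines
that do not start with "real", such as "...", are ignored. With this few
samples per run, the numbers are only a rough guide.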

The OOM behavior is not perfect in any case, but with MGLRU's TTL
enabled, I got confirmation that the jitter is almost completely
gone (only a few frames).
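
For anyone repeating the comparison, the two knobs involved live under
/sys/kernel/mm/lru_gen (enabled and min_ttl_ms). Below is a minimal
sketch of switching between the configurations tested above. The helper
names and the LRU_GEN_DIR override are assumptions made for illustration;
writing to the real sysfs files needs root.

```shell
#!/bin/sh
# LRU_GEN_DIR can be overridden to try the helpers against a scratch
# directory; the default is the real MGLRU sysfs directory.
LRU_GEN_DIR=${LRU_GEN_DIR:-/sys/kernel/mm/lru_gen}

# Hypothetical helper: print the current settings. On real sysfs,
# "enabled" reads back as a bitmask such as 0x0007 rather than y/n.
lru_gen_show() {
    printf 'enabled=%s min_ttl_ms=%s\n' \
        "$(cat "$LRU_GEN_DIR/enabled")" \
        "$(cat "$LRU_GEN_DIR/min_ttl_ms")"
}

# Hypothetical helper: set the working-set TTL in milliseconds
# (0 turns the TTL protection off).
lru_gen_set_ttl() {
    case $1 in
        ''|*[!0-9]*)
            echo "usage: lru_gen_set_ttl <milliseconds>" >&2
            return 1 ;;
    esac
    printf '%s\n' "$1" > "$LRU_GEN_DIR/min_ttl_ms"
}

# Hypothetical helper: switch MGLRU on ("y") or off ("n").
lru_gen_set_enabled() {
    case $1 in
        y|n) printf '%s\n' "$1" > "$LRU_GEN_DIR/enabled" ;;
        *)
            echo "usage: lru_gen_set_enabled y|n" >&2
            return 1 ;;
    esac
}
```

For example, "lru_gen_set_enabled n" gives the classic LRU baseline, and
"lru_gen_set_enabled y" followed by "lru_gen_set_ttl 100" matches the
TTL value mentioned earlier in the thread.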


Thread overview: 45+ messages
2026-03-17 19:08 Kairui Song via B4 Relay
2026-03-17 19:08 ` [PATCH 1/8] mm/mglru: consolidate common code for retrieving evitable size Kairui Song via B4 Relay
2026-03-17 19:55   ` Yuanchu Xie
2026-03-18  9:42   ` Barry Song
2026-03-18  9:57     ` Kairui Song
2026-03-19  1:40   ` Chen Ridong
2026-03-20 19:51     ` Axel Rasmussen
2026-03-22 16:10       ` Kairui Song
2026-03-26  6:25   ` Baolin Wang
2026-03-17 19:08 ` [PATCH 2/8] mm/mglru: relocate the LRU scan batch limit to callers Kairui Song via B4 Relay
2026-03-19  2:00   ` Chen Ridong
2026-03-19  4:12     ` Kairui Song
2026-03-20 21:00   ` Axel Rasmussen
2026-03-22  8:14   ` Barry Song
2026-03-24  6:05     ` Kairui Song
2026-03-17 19:08 ` [PATCH 3/8] mm/mglru: restructure the reclaim loop Kairui Song via B4 Relay
2026-03-20 20:09   ` Axel Rasmussen
2026-03-22 16:11     ` Kairui Song
2026-03-24  6:41   ` Chen Ridong
2026-03-26  7:31   ` Baolin Wang
2026-03-26  8:37     ` Kairui Song
2026-03-17 19:09 ` [PATCH 4/8] mm/mglru: scan and count the exact number of folios Kairui Song via B4 Relay
2026-03-20 20:57   ` Axel Rasmussen
2026-03-22 16:20     ` Kairui Song
2026-03-24  7:22       ` Chen Ridong
2026-03-24  8:05         ` Kairui Song
2026-03-24  9:10           ` Chen Ridong
2026-03-24  9:29             ` Kairui Song
2026-03-17 19:09 ` [PATCH 5/8] mm/mglru: use a smaller batch for reclaim Kairui Song via B4 Relay
2026-03-20 20:58   ` Axel Rasmussen
2026-03-24  7:51   ` Chen Ridong
2026-03-17 19:09 ` [PATCH 6/8] mm/mglru: don't abort scan immediately right after aging Kairui Song via B4 Relay
2026-03-17 19:09 ` [PATCH 7/8] mm/mglru: simplify and improve dirty writeback handling Kairui Song via B4 Relay
2026-03-20 21:18   ` Axel Rasmussen
2026-03-22 16:22     ` Kairui Song
2026-03-24  8:57   ` Chen Ridong
2026-03-24 11:09     ` Kairui Song
2026-03-26  7:56   ` Baolin Wang
2026-03-17 19:09 ` [PATCH 8/8] mm/vmscan: remove sc->file_taken Kairui Song via B4 Relay
2026-03-20 21:19   ` Axel Rasmussen
2026-03-25  4:49 ` [PATCH 0/8] mm/mglru: improve reclaim loop and dirty folio handling Eric Naim
2026-03-25  5:47   ` Kairui Song
2026-03-25  9:26     ` Eric Naim
2026-03-25  9:47       ` Kairui Song
2026-03-28 17:30         ` Kairui Song [this message]
