From: Barry Song <21cnbao@gmail.com>
To: Leno Hou <lenohou@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Jialing Wang <wjl.linux@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>, Yu Zhao <yuzhao@google.com>,
Kairui Song <ryncsn@gmail.com>, Bingfang Guo <bfguo@icloud.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] mm/mglru: fix cgroup OOM during MGLRU state switching
Date: Wed, 18 Mar 2026 16:30:56 +0800
Message-ID: <CAGsJ_4y+SwwCtDu8zPYnuSdyS+1=q77d1Fwq6eHbqrnw8-6K=w@mail.gmail.com>
In-Reply-To: <8c01a707-f798-4649-8441-d82dd0dac7b9@gmail.com>

On Wed, Mar 18, 2026 at 4:17 PM Leno Hou <lenohou@gmail.com> wrote:
>
> On 3/18/26 3:16 PM, Barry Song wrote:
> > On Wed, Mar 18, 2026 at 11:29 AM Leno Hou via B4 Relay
> > <devnull+lenohou.gmail.com@kernel.org> wrote:
> >>
> >> From: Leno Hou <lenohou@gmail.com>
> >
> > [...]
> >
> >>
> >> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> >> index ad50688d89db..1f6b19bf365b 100644
> >> --- a/include/linux/mm_inline.h
> >> +++ b/include/linux/mm_inline.h
> >> @@ -102,6 +102,12 @@ static __always_inline enum lru_list folio_lru_list(const struct folio *folio)
> >>
> >> #ifdef CONFIG_LRU_GEN
> >>
> >> +static inline bool lru_gen_draining(void)
> >> +{
> >> + DECLARE_STATIC_KEY_FALSE(lru_drain_core);
> >> +
> >> + return static_branch_unlikely(&lru_drain_core);
> >> +}
> >
> > Can we name it lru_gen_switch() or lru_switch?
> > Since “drain” implies disabling MGLRU, the operation
> > could just as well be enabling it. Also, can we drop
> > the _core suffix?
>
> OK. The next version (v5) of the patch will be:
>
> +static inline bool lru_gen_switching(void)
> +{
> + DECLARE_STATIC_KEY_FALSE(lru_switch);
> +
> + return static_branch_unlikely(&lru_switch);
> +}
>
> >
> >
> >> #ifdef CONFIG_LRU_GEN_ENABLED
> >> static inline bool lru_gen_enabled(void)
> >> {
> >> @@ -316,6 +322,11 @@ static inline bool lru_gen_enabled(void)
> >> return false;
> >> }
> >>
> >> +static inline bool lru_gen_draining(void)
> >
> > lru_gen_switching()?
> >
> >> +{
> >> + return false;
> >> +}
> >> +
> >> static inline bool lru_gen_in_fault(void)
> >> {
> >> return false;
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index 6398d7eef393..0b5f663f3062 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -966,7 +966,7 @@ static bool folio_referenced_one(struct folio *folio,
> >> nr = folio_pte_batch(folio, pvmw.pte, pteval, max_nr);
> >> }
>
> OK. I'll add the following documentation, as you suggested:
> /*
>  * When LRU is switching, we don't know where the surrounding folios
>  * are: they could be on active/inactive lists or on MGLRU. So the
>  * simplest approach is to disable this look-around optimization.
>  */
> >> - if (lru_gen_enabled() && pvmw.pte) {
> >> + if (lru_gen_enabled() && !lru_gen_draining() && pvmw.pte) {
> >
> > Ack. When LRU is switching, we don’t know where the
> > surrounding folios are—they could be on active/inactive
> > lists or on MGLRU. So the simplest approach is to
> > disable this look-around optimization.
> > But please add a comment here explaining it.
> >
> >
> >> if (lru_gen_look_around(&pvmw, nr))
> >> referenced++;
> >> } else if (pvmw.pte) {
> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> index 33287ba4a500..88b9db06e331 100644
> >> --- a/mm/vmscan.c
> >> +++ b/mm/vmscan.c
> >> @@ -886,7 +886,7 @@ static enum folio_references folio_check_references(struct folio *folio,
> >> if (referenced_ptes == -1)
> >> return FOLIOREF_KEEP;
> >>
> >> - if (lru_gen_enabled()) {
>
> Documentation as follows:
>
> /*
> * During the MGLRU state transition (lru_gen_switching), we force
> * folios to follow the traditional active/inactive reference checking.
> *
> * While MGLRU is switching, the generational state of folios is in flux.
> * Falling back to the traditional logic (which relies on PG_referenced/
> * PG_active flags that are consistent across both mechanisms) provides
> * a stable, safe behavior for the folio until it is fully migrated back
> * to the traditional LRU lists. This avoids relying on potentially
> * inconsistent MGLRU generational metadata during the transition.
> */
>
> >> + if (lru_gen_enabled() && !lru_gen_draining()) {
> >
> > I’m curious what prompted you to do this.
> >
> > This feels a bit odd. I assume this effectively makes
> > folios on MGLRU, as well as those on active/inactive
> > lists, always follow the active/inactive logic.
> >
> > It might be fine, but it needs thorough documentation here.
> >
> > another approach would be:
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 33287ba4a500..91b60664b652 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -122,6 +122,9 @@ struct scan_control {
> > /* Proactive reclaim invoked by userspace */
> > unsigned int proactive:1;
> >
> > + /* Are we reclaiming from MGLRU */
> > + unsigned int lru_gen:1;
> > +
> > /*
> > * Cgroup memory below memory.low is protected as long as we
> > * don't threaten to OOM. If any cgroup is reclaimed at
> > @@ -886,7 +889,7 @@ static enum folio_references
> > folio_check_references(struct folio *folio,
> > if (referenced_ptes == -1)
> > return FOLIOREF_KEEP;
> >
> > - if (lru_gen_enabled()) {
> > + if (sc->lru_gen) {
> > if (!referenced_ptes)
> > return FOLIOREF_RECLAIM;
> >
> > This makes the logic perfectly correct (you know exactly
> > where your folios come from), but I’m not sure it’s worth it.
> >
> > Anyway, I’d like to understand why you always need to
> > use the active/inactive logic even for folios from MGLRU.
> > To me, it seems to work only by coincidence, which isn’t good.
> >
> > Thanks
> > Barry
>
> Hi Barry,
>
> I agree that using !lru_gen_draining() feels a bit like a fallback path.
> However, after considering your suggestion for sc->lru_gen, I’m
> concerned about the broad impact of modifying struct scan_control. Since
> lru_drain_core is a very transient state, I prefer a localized fix that
> doesn't propagate architectural changes throughout the entire reclaim stack.
>
> You mentioned that using the active/inactive logic feels like it works
> by 'coincidence'. To clarify, this is an intentional fallback: because
> the generational metadata in MGLRU becomes unreliable during draining,
> we intentionally downgrade these folios to the traditional logic. Since
> the PG_referenced and PG_active bits are maintained by the core VM and
> are consistent regardless of whether MGLRU is active, this fallback is
> technically sound and robust.
>
> I have added detailed documentation to the code to explain this design
> choice, clarifying that it's a deliberate transition strategy rather
> than a coincidence.

Nope. You still haven’t explained why the active/inactive LRU
logic makes it work. MGLRU and active/inactive use different
methods to determine whether a folio is hot or cold. You’re
forcing active/inactive logic to decide hot/cold for an MGLRU
folio. It’s not that simple—PG_referenced isn’t maintained
by the core; it’s specific to active/inactive. See folio_mark_accessed().

Best Regards
Barry
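
The point about folio_mark_accessed() can be illustrated with a small user-space toy model. This is only a sketch, not kernel code: struct toy_folio, its fields, and toy_mark_accessed() are hypothetical names, and the MGLRU branch is reduced to a plain counter, which is a deliberate simplification. What it mirrors is the early return taken when lru_gen_enabled() is true, so the two-step "referenced, then active" promotion of the active/inactive LRU never runs for accesses recorded in that mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct folio; all field names are hypothetical. */
struct toy_folio {
	bool referenced;   /* models PG_referenced */
	bool active;       /* models PG_active */
	unsigned int refs; /* simplified model of MGLRU's own access accounting */
};

/*
 * Simplified model of the branch structure in folio_mark_accessed().
 * The real function handles more cases (unevictable, idle, workingset).
 */
static void toy_mark_accessed(struct toy_folio *f, bool lru_gen_enabled)
{
	assert(f != NULL);

	if (lru_gen_enabled) {
		/* MGLRU path: record the access its own way, then return early. */
		f->refs++;
		return;
	}

	/* Active/inactive path: first access only marks the folio referenced. */
	if (!f->referenced) {
		f->referenced = true;
	} else if (!f->active) {
		/* Second access promotes the folio and clears the referenced bit. */
		f->active = true;
		f->referenced = false;
	}
}
```

In this model, two accesses recorded with lru_gen_enabled set leave referenced and active both false, while the same two accesses on the other path leave the folio active. A reference check written for the active/inactive state would therefore read the first folio as cold, which is the mismatch raised above.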