Re: [PATCH v4] mm/mglru: fix cgroup OOM during MGLRU state switching

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Leno Hou <lenohou@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	Jialing Wang <wjl.linux@gmail.com>,
	Yafang Shao <laoar.shao@gmail.com>, Yu Zhao <yuzhao@google.com>,
	Kairui Song <ryncsn@gmail.com>, Bingfang Guo <bfguo@icloud.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] mm/mglru: fix cgroup OOM during MGLRU state switching
Date: Wed, 18 Mar 2026 16:16:58 +0800	[thread overview]
Message-ID: <8c01a707-f798-4649-8441-d82dd0dac7b9@gmail.com> (raw)
In-Reply-To: <CAGsJ_4xHvtjkPp5Ar16UXqZhQYHpdYcqCWMmFbGGUvK-kCH+uw@mail.gmail.com>

On 3/18/26 3:16 PM, Barry Song wrote:
> On Wed, Mar 18, 2026 at 11:29 AM Leno Hou via B4 Relay
> <devnull+lenohou.gmail.com@kernel.org> wrote:
>>
>> From: Leno Hou <lenohou@gmail.com>
> 
> [...]
> 
>>
>> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
>> index ad50688d89db..1f6b19bf365b 100644
>> --- a/include/linux/mm_inline.h
>> +++ b/include/linux/mm_inline.h
>> @@ -102,6 +102,12 @@ static __always_inline enum lru_list folio_lru_list(const struct folio *folio)
>>
>>   #ifdef CONFIG_LRU_GEN
>>
>> +static inline bool lru_gen_draining(void)
>> +{
>> +       DECLARE_STATIC_KEY_FALSE(lru_drain_core);
>> +
>> +       return static_branch_unlikely(&lru_drain_core);
>> +}
> 
> Can we name it lru_gen_switch() or lru_switch?
> Since “drain” implies disabling MGLRU, the operation
> could just as well be enabling it. Also, can we drop
> the _core suffix?

OK. Next V5 patch will be:

+static inline bool lru_gen_switching(void)
+{
+       DECLARE_STATIC_KEY_FALSE(lru_switch);
+
+       return static_branch_unlikely(&lru_switch);
+}

> 
> 
>>   #ifdef CONFIG_LRU_GEN_ENABLED
>>   static inline bool lru_gen_enabled(void)
>>   {
>> @@ -316,6 +322,11 @@ static inline bool lru_gen_enabled(void)
>>          return false;
>>   }
>>
>> +static inline bool lru_gen_draining(void)
> 
> lru_gen_switching()? >
>> +{
>> +       return false;
>> +}
>> +
>>   static inline bool lru_gen_in_fault(void)
>>   {
>>          return false;
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 6398d7eef393..0b5f663f3062 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -966,7 +966,7 @@ static bool folio_referenced_one(struct folio *folio,
>>                          nr = folio_pte_batch(folio, pvmw.pte, pteval, max_nr);
>>                  }

OK. I'll be add following ducumentation that just you said.
/* When LRU is switching, we don’t know where the surrounding folios
are. —they could be on active/inactive lists or on MGLRU. So the 
simplest approach is to disable this look-around optimization.
*/
>> -               if (lru_gen_enabled() && pvmw.pte) {
>> +               if (lru_gen_enabled() && !lru_gen_draining() && pvmw.pte) {
> 
> Ack. When LRU is switching, we don’t know where the
> surrounding folios are—they could be on active/inactive
> lists or on MGLRU. So the simplest approach is to
> disable this look-around optimization.
> But please add a comment here explaining it.
> 
> 
>>                          if (lru_gen_look_around(&pvmw, nr))
>>                                  referenced++;
>>                  } else if (pvmw.pte) {
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 33287ba4a500..88b9db06e331 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -886,7 +886,7 @@ static enum folio_references folio_check_references(struct folio *folio,
>>          if (referenced_ptes == -1)
>>                  return FOLIOREF_KEEP;
>>
>> -       if (lru_gen_enabled()) {

documentation as following:

/*
  * During the MGLRU state transition (lru_gen_switching), we force
  * folios to follow the traditional active/inactive reference checking.
  *
  * While MGLRU is switching,the generational state of folios is in flux.
  * Falling back to the traditional logic (which relies on PG_referenced/
  * PG_active flags that are consistent across both mechanisms) provides
  * a stable, safe behavior for the folio until it is fully migrated back
  * to the traditional LRU lists. This avoids relying on potentially
  * inconsistent MGLRU generational metadata during the transition.
  */

>> +       if (lru_gen_enabled() && !lru_gen_draining()) {
> 
> I’m curious what prompted you to do this.
> 
> This feels a bit odd. I assume this effectively makes
> folios on MGLRU, as well as those on active/inactive
> lists, always follow the active/inactive logic.
> 
> It might be fine, but it needs thorough documentation here.
> 
> another approach would be:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 33287ba4a500..91b60664b652 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -122,6 +122,9 @@ struct scan_control {
>          /* Proactive reclaim invoked by userspace */
>          unsigned int proactive:1;
> 
> +       /* Are we reclaiming from MGLRU */
> +       unsigned int lru_gen:1;
> +
>          /*
>           * Cgroup memory below memory.low is protected as long as we
>           * don't threaten to OOM. If any cgroup is reclaimed at
> @@ -886,7 +889,7 @@ static enum folio_references
> folio_check_references(struct folio *folio,
>          if (referenced_ptes == -1)
>                  return FOLIOREF_KEEP;
> 
> -       if (lru_gen_enabled()) {
> +       if (sc->lru_gen) {
>                  if (!referenced_ptes)
>                          return FOLIOREF_RECLAIM;
> 
> This makes the logic perfectly correct (you know exactly
> where your folios come from), but I’m not sure it’s worth it.
> 
> Anyway, I’d like to understand why you always need to
> use the active/inactive logic even for folios from MGLRU.
> To me, it seems to work only by coincidence, which isn’t good.
> 
> Thanks
> Barry

Hi Barry,

I agree that using !lru_gen_draining() feels a bit like a fallback path. 
However, after considering your suggestion for sc->lru_gen, I’m 
concerned about the broad impact of modifying struct scan_control.Since 
lru_drain_core is a very transient state, I prefer a localized fix that 
doesn't propagate architectural changes throughout the entire reclaim stack.

You mentioned that using the active/inactive logic feels like it works 
by 'coincidence'. To clarify, this is an intentional fallback: because 
the generational metadata in MGLRU becomes unreliable during draining, 
we intentionally downgrade these folios to the traditional logic. Since 
the PG_referenced and PG_active bits are maintained by the core VM and 
are consistent regardless of whether MGLRU is active, this fallback is 
technically sound and robust.

I have added detailed documentation to the code to explain this design 
choice, clarifying that it's a deliberate transition strategy rather 
than a coincidence."



Best Regards,
Leno Hou

next prev parent reply	other threads:[~2026-03-18  8:17 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-17 17:43 Leno Hou via B4 Relay
2026-03-18  7:16 ` Barry Song
2026-03-18  8:16   ` Leno Hou [this message]
2026-03-18  8:30     ` Barry Song
2026-03-18 12:56       ` Leno Hou
2026-03-18 21:29         ` Barry Song
2026-03-19  3:14           ` Leno Hou
2026-03-18 12:59       ` Leno Hou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8c01a707-f798-4649-8441-d82dd0dac7b9@gmail.com \
    --to=lenohou@gmail.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=bfguo@icloud.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ryncsn@gmail.com \
    --cc=weixugc@google.com \
    --cc=wjl.linux@gmail.com \
    --cc=yuanchu@google.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox