From: Barry Song <21cnbao@gmail.com>
To: Leno Hou <lenohou@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Jialing Wang <wjl.linux@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>, Yu Zhao <yuzhao@google.com>,
Kairui Song <ryncsn@gmail.com>, Bingfang Guo <bfguo@icloud.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] mm/mglru: fix cgroup OOM during MGLRU state switching
Date: Fri, 13 Mar 2026 04:08:29 +0800 [thread overview]
Message-ID: <CAGsJ_4zeEpdGmDLKb8Z1NP7q+o2uqvzb1-u1oXZ8SaqOw_CJkg@mail.gmail.com> (raw)
In-Reply-To: <CAGQVrL9AmqOeGa4Ovv6v-0qdy_+4MeupEeU_HQ=-YWm5wRV_ag@mail.gmail.com>
On Fri, Mar 13, 2026 at 12:44 AM Leno Hou <lenohou@gmail.com> wrote:
>
> On 3/12/26 2:02 PM, Barry Song wrote:
> [...]
> >> This effectively eliminates the race window that previously triggered OOMs
> >> under high memory pressure.
> >
> > I don't think this eliminates the race window, but it does reduce it.
> > There is nothing preventing the draining state from changing while
> > you are shrinking.
> >
> Hi Barry,
>
> You raise a valid point about the race window.
>
> While the window exists, I believe the impact is effectively mitigated by
> The reclaim retry mechanism: Even if the reclaimer makes a suboptimal path
> decision due to the race, the kernel's reclaim loop (in do_try_to_free_pages)
> provides multiple retries. If the first pass fails to reclaim memory
> due to the
> race or an inconsistent state, the subsequent retry will observe the updated
> draining state and correctly direct the scan to the appropriate LRU lists.
>
> Anyway,thanks correct me and i'll updating the commit message.
I’m fine with this race condition as long as it is clearly documented
in the commit message.
>
> >>
> >> The issue was consistently reproduced on v6.1.157 and v6.18.3 using
> >> a high-pressure memory cgroup (v1) environment.
> >>
> >> To: Andrew Morton <akpm@linux-foundation.org>
> >> To: Axel Rasmussen <axelrasmussen@google.com>
> >> To: Yuanchu Xie <yuanchu@google.com>
> >> To: Wei Xu <weixugc@google.com>
> >> To: Barry Song <21cnbao@gmail.com>
> >> To: Jialing Wang <wjl.linux@gmail.com>
> >> To: Yafang Shao <laoar.shao@gmail.com>
> >> To: Yu Zhao <yuzhao@google.com>
> >> To: Kairui Song <ryncsn@gmail.com>
> >> To: Bingfang Guo <bfguo@icloud.com>
> >> Cc: linux-mm@kvack.org
> >> Cc: linux-kernel@vger.kernel.org
> >> Signed-off-by: Leno Hou <lenohou@gmail.com>
> >> ---
> >> include/linux/mm_inline.h | 5 +++++
> >> mm/rmap.c | 2 +-
> >> mm/swap.c | 14 ++++++++------
> >> mm/vmscan.c | 49 ++++++++++++++++++++++++++++++++++++++---------
> >> 4 files changed, 54 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> >> index fa2d6ba811b5..e6443e22bf67 100644
> >> --- a/include/linux/mm_inline.h
> >> +++ b/include/linux/mm_inline.h
> >> @@ -321,6 +321,11 @@ static inline bool lru_gen_in_fault(void)
> >> return false;
> >> }
> >>
> >> +static inline int folio_lru_gen(const struct folio *folio)
> >> +{
> >> + return -1;
> >> +}
> >> +
> >> static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
> >> {
> >> return false;
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index 0f00570d1b9e..488bcdca65ed 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -958,7 +958,7 @@ static bool folio_referenced_one(struct folio *folio,
> >> return false;
> >> }
> >>
> >> - if (lru_gen_enabled() && pvmw.pte) {
> >> + if ((folio_lru_gen(folio) != -1) && pvmw.pte) {
> >
> > I am not quite sure if a folio's gen is set to -1 when it is isolated
> > from MGLRU for reclamation. If so, I don't think this would work.
> >
>
> Thanks for your feedback. You are absolutely right—relying solely on
> folio_lru_gen(folio) != -1 is insufficient because the generation tag
> is cleared during isolation in lru_gen_del_folio().
>
> In the current implementation, once a page is isolated, its generation
> information is lost, causing the logic to fall back to the traditional LRU
> path prematurely.
>
> To address this, I am changing the logic to:
> ```c
> @@ -966,7 +966,7 @@ static bool folio_referenced_one(struct folio *folio,
> nr = folio_pte_batch(folio, pvmw.pte, pteval, max_nr);
> }
>
> - if (lru_gen_enabled() && pvmw.pte) {
> + if (lru_gen_enabled() && !lru_gen_draining() && pvmw.pte) {
> if (lru_gen_look_around(&pvmw, nr))
> referenced++;
> } else if (pvmw.pte) {
> ```
> The rationale is that lru_gen_look_around() is an MGLRU-specific optimization
> that promotes pages based on generation aging. During the draining
> transition, we should strictly rely on traditional RMAP-based reference
> auditing to ensure that pages are correctly migrated to traditional LRU
> lists without being spuriously promoted by MGLRU logic. This avoids
> mixing MGLRU's generation-based promotion with traditional LRU's 'active'
> status, preventing potential reclaim state inconsistencies during the
> transition period."
Right, using folio_lru_gen(folio) != -1 is dangerous because it
implicitly depends on whether the folio is still on the LRU list.
Even if we never switch between MGLRU and the active/inactive LRU,
the check can become invalid once the folio is removed from the LRU
for various reasons—for example, during isolate_folio().
So if possible, we should identify the real issue that led us to rely
on folio_lru_gen(folio) != -1 in the first place, and avoid inventing
this kind of unstable condition check.
Please DO NOT depend on folio_lru_gen(folio) != -1. If it is
unavoidable, limit its use to the smallest scope possible.
Thanks
Barry
next prev parent reply other threads:[~2026-03-12 20:08 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-11 12:09 [PATCH v2 0/2] " Leno Hou via B4 Relay
2026-03-11 12:09 ` [PATCH v2 1/2] " Leno Hou via B4 Relay
2026-03-12 6:02 ` Barry Song
2026-03-12 16:44 ` Leno Hou
2026-03-12 20:08 ` Barry Song [this message]
2026-03-11 12:09 ` [PATCH v2 2/2] mm/mglru: maintain workingset refault context across state transitions Leno Hou via B4 Relay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGsJ_4zeEpdGmDLKb8Z1NP7q+o2uqvzb1-u1oXZ8SaqOw_CJkg@mail.gmail.com \
--to=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=axelrasmussen@google.com \
--cc=bfguo@icloud.com \
--cc=laoar.shao@gmail.com \
--cc=lenohou@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ryncsn@gmail.com \
--cc=weixugc@google.com \
--cc=wjl.linux@gmail.com \
--cc=yuanchu@google.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox