From: Leno Hou <lenohou@gmail.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Jialing Wang <wjl.linux@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>, Yu Zhao <yuzhao@google.com>,
Bingfang Guo <bfguo@icloud.com>, Barry Song <baohua@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] mm/mglru: maintain workingset refault context across state transitions
Date: Wed, 18 Mar 2026 11:41:40 +0800
Message-ID: <a02ce4c3-c443-4165-9dc3-3ad4d9816dfa@gmail.com>
In-Reply-To: <CAMgjq7ACdzXbte3ESz4MDAnBExop_zFvCi4PXwQXRPD8dvnmtQ@mail.gmail.com>
On 3/18/26 11:30 AM, Kairui Song wrote:
> On Mon, Mar 16, 2026 at 1:56 PM Leno Hou via B4 Relay
> <devnull+lenohou.gmail.com@kernel.org> wrote:
>>
>> From: Leno Hou <lenohou@gmail.com>
>>
>> When MGLRU state is toggled dynamically, existing shadow entries (eviction
>> tokens) lose their context. Traditional LRU and MGLRU handle workingset
>> refaults using different logic. Without context, shadow entries
>> re-activated by the "wrong" reclaim logic trigger excessive page
>> activations (pgactivate) and system thrashing, as the kernel cannot
>> correctly distinguish if a refaulted page was originally managed by
>> MGLRU or the traditional LRU.
>>
>> This patch introduces shadow entry context tracking:
>>
>> - Encode MGLRU origin: Introduce WORKINGSET_MGLRU_SHIFT into the shadow
>> entry (eviction token) encoding. This adds an 'is_mglru' bit to shadow
>> entries, allowing the kernel to correctly identify the originating
>> reclaim logic for a page even after the global MGLRU state has been
>> toggled.
>
> Hi Leno,
>
> I really don't think it's a good idea to waste one bit there just for
> the transition state, which is rarely used. And if you switched between
> MGLRU / non-MGLRU then the refault distance check is already kind of
> meaningless unless we unify their reactivation logic.
>
> BTW I tried that sometime ago: https://lwn.net/Articles/945266/
>
>>
>> - Refault logic dispatch: Use this 'is_mglru' bit in workingset_refault()
>> and workingset_test_recent() to dispatch refault events to the correct
>> handler (lru_gen_refault vs. traditional workingset refault).
>
> Hmm, restoring the folio ref count in MGLRU is not the same thing as
> reactivation or restoring the workingset flag in non-MGLRU case, and
> not really comparable. Not sure this will be helpful.
>
> Maybe for now we just ignore this part. The shadow is just a hint after
> all; switching the LRU at runtime already has a huge performance impact
> and is not recommended, so the shadow part is trivial compared to
> that.

Hi Kairui,

Thank you for the insightful feedback. I completely agree with your
assessment: the workingset refault context is indeed just a hint, and
trying to align or convert these tokens between MGLRU and non-MGLRU
states is overly complex and likely unnecessary, especially given that
runtime switching is an extreme and infrequent operation.

I have decided to take your advice and completely remove the patches
related to workingset refault context tracking and folio_lru_gen state
checking.

My revised patch will focus solely on the lru_drain_core state machine,
which is the minimal and robust approach to address the primary issue:
preventing cgroup OOMs caused by the race condition during state
transitions. This should significantly reduce the complexity and risk of
the patch series.

I've sent a simplified v4 patch series that focuses strictly on the
lru_drain_core logic, removing all the disputed context-tracking code.
It was tested on the latest 7.0.0-rc1 with 1000 iterations of toggling
MGLRU on and off, and no OOM occurred.

Thank you for helping me sharpen the focus of this fix.

Best regards,
Leno Hou
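
For readers who want to try a toggle loop like the one mentioned in the
test note above, here is a minimal sketch. It is not the script used for
the reported test, which is not shown in this thread. It assumes the
documented MGLRU sysfs switch at /sys/kernel/mm/lru_gen/enabled, which
accepts "y" and "n"; the function name toggle_mglru and its parameters are
hypothetical. On its own the loop only flips the switch. Reproducing the
cgroup OOM would presumably also need a memory-hungry workload running in a
limited cgroup at the same time.

```python
from pathlib import Path

# Documented MGLRU switch; writing to it needs root and CONFIG_LRU_GEN.
LRU_GEN_ENABLED = Path("/sys/kernel/mm/lru_gen/enabled")


def toggle_mglru(iterations, knob=LRU_GEN_ENABLED):
    """Write 'n' then 'y' to the knob once per iteration.

    Returns the total number of writes issued. The knob path is a
    parameter so the loop can be dry-run against an ordinary file.
    """
    writes = 0
    for _ in range(iterations):
        for value in ("n", "y"):
            knob.write_text(value)  # one write per state change
            writes += 1
    return writes
```

Calling toggle_mglru(1000) as root would mirror the iteration count quoted
above and leaves MGLRU enabled when it finishes, since each iteration ends
by writing "y".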

Thread overview: 7+ messages
2026-03-15 18:18 [PATCH v3 0/2] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou via B4 Relay
2026-03-15 18:18 ` [PATCH v3 1/2] " Leno Hou via B4 Relay
2026-03-17 7:52 ` Barry Song
2026-03-15 18:18 ` [PATCH v3 2/2] mm/mglru: maintain workingset refault context across state transitions Leno Hou via B4 Relay
2026-03-18 3:30 ` Kairui Song
2026-03-18 3:41 ` Leno Hou [this message]
2026-03-18 8:14 ` kernel test robot