From: Barry Song <21cnbao@gmail.com>
To: lenohou@gmail.com
Cc: 21cnbao@gmail.com, akpm@linux-foundation.org,
axelrasmussen@google.com, laoar.shao@gmail.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
weixugc@google.com, wjl.linux@gmail.com, yuanchu@google.com,
yuzhao@google.com
Subject: Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
Date: Sun, 1 Mar 2026 12:10:22 +0800
Message-ID: <20260301041022.66111-1-21cnbao@gmail.com>
In-Reply-To: <CAGsJ_4yCA008MKV8OKoV=Ge3B_KS1gnVxEPED7VQBmxJxR35hQ@mail.gmail.com>
On Sun, Mar 1, 2026 at 6:41 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Sun, Mar 1, 2026 at 5:28 AM Barry Song <21cnbao@gmail.com> wrote:
> [...]
> >
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 3e51190a55e4..ba306e986050 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -509,6 +509,8 @@ struct lru_gen_folio {
> > atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
> > /* whether the multi-gen LRU is enabled */
> > bool enabled;
> > + /* whether the multi-gen LRU is switching from/to active/inactive LRU */
> > + bool switching;
> > /* the memcg generation this lru_gen_folio belongs to */
> > u8 gen;
> > /* the list segment this lru_gen_folio belongs to */
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 0fc9373e8251..60fc611067c7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -5196,6 +5196,7 @@ static void lru_gen_change_state(bool enabled)
> > VM_WARN_ON_ONCE(!state_is_valid(lruvec));
> >
> > lruvec->lrugen.enabled = enabled;
> > + smp_store_release(&lruvec->lrugen.switching, true);
>
> Sorry, I actually meant:
>
> + smp_store_release(&lruvec->lrugen.switching, true);
> lruvec->lrugen.enabled = enabled;
>
> But I guess we could still hit a race condition in extreme cases—switching
> MGLRU on or off as frequently as possible. The only reliable way is to check
> enabled during shrinking while holding the lruvec’s lock.

Sorry, I was talking to myself. Since neither 'switching' nor 'enabled'
is serialized against shrink_lruvec(), both values can change at any
point while reclaim is running, so the race remains however the two
stores are ordered.

Therefore, I believe the only safe approach is:

1. Do not allow enabling or disabling MGLRU on an lruvec while
   shrink_lruvec() is running on it.
2. Do not allow shrink_lruvec() to run on an lruvec while MGLRU is
   being enabled or disabled on it.

Something like the following:

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3e51190a55e4..c4b07159577e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -509,6 +509,7 @@ struct lru_gen_folio {
 	atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
 	/* whether the multi-gen LRU is enabled */
 	bool enabled;
+	struct rw_semaphore switch_lock;
 	/* the memcg generation this lru_gen_folio belongs to */
 	u8 gen;
 	/* the list segment this lru_gen_folio belongs to */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fc9373e8251..aadf1e7c31cf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5190,6 +5190,7 @@ static void lru_gen_change_state(bool enabled)
 		for_each_node(nid) {
 			struct lruvec *lruvec = get_lruvec(memcg, nid);
+			down_write(&lruvec->lrugen.switch_lock);
 			spin_lock_irq(&lruvec->lru_lock);
 			VM_WARN_ON_ONCE(!seq_is_valid(lruvec));
@@ -5204,6 +5205,7 @@ static void lru_gen_change_state(bool enabled)
 			}
 			spin_unlock_irq(&lruvec->lru_lock);
+			up_write(&lruvec->lrugen.switch_lock);
 		}
 		cond_resched();
@@ -5680,6 +5682,7 @@ void lru_gen_init_lruvec(struct lruvec *lruvec)
 	lrugen->max_seq = MIN_NR_GENS + 1;
 	lrugen->enabled = lru_gen_enabled();
+	init_rwsem(&lrugen->switch_lock);
 	for (i = 0; i <= MIN_NR_GENS + 1; i++)
 		lrugen->timestamps[i] = jiffies;
@@ -5780,10 +5783,14 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 	bool proportional_reclaim;
 	struct blk_plug plug;
-	if (lru_gen_enabled() && !root_reclaim(sc)) {
+#ifdef CONFIG_LRU_GEN
+	down_read(&lruvec->lrugen.switch_lock);
+	if (lruvec->lrugen.enabled && !root_reclaim(sc)) {
 		lru_gen_shrink_lruvec(lruvec, sc);
+		up_read(&lruvec->lrugen.switch_lock);
 		return;
 	}
+#endif
 	get_scan_count(lruvec, sc, nr);
@@ -5885,6 +5892,9 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 	    inactive_is_low(lruvec, LRU_INACTIVE_ANON))
 		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 				   sc, LRU_ACTIVE_ANON);
+#ifdef CONFIG_LRU_GEN
+	up_read(&lruvec->lrugen.switch_lock);
+#endif
 }
 /* Use reclaim/compaction for costly allocs or under memory pressure */
--
Thanks
Barry