* [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
@ 2026-02-28 16:10 Leno Hou
2026-02-28 18:58 ` Andrew Morton
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Leno Hou @ 2026-02-28 16:10 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: Leno Hou, Andrew Morton, Axel Rasmussen, Yuanchu Xie, Wei Xu,
Barry Song, Jialing Wang, Yafang Shao, Yu Zhao
When the Multi-Gen LRU (MGLRU) state is toggled dynamically, a race
condition exists between the state switching and the memory reclaim
path. This can lead to unexpected cgroup OOM kills, even when plenty of
reclaimable memory is available.
*** Problem Description ***
The issue arises from a "reclaim vacuum" during the transition:
1. When disabling MGLRU, lru_gen_change_state() sets lrugen->enabled to
false before the pages are drained from MGLRU lists back to
traditional LRU lists.
2. Concurrent reclaimers in shrink_lruvec() see lrugen->enabled as false
and skip the MGLRU path.
3. However, these pages might not have reached the traditional LRU lists
yet, or the changes are not yet visible to all CPUs due to a lack of
synchronization.
4. get_scan_count() subsequently finds traditional LRU lists empty,
concludes there is no reclaimable memory, and triggers an OOM kill.
A similar race can occur during enablement, where the reclaimer sees
the new state but the MGLRU lists haven't been populated via
fill_evictable() yet.
*** Solution ***
Introduce a 'draining' state to bridge the gap during transitions:
- Use smp_store_release() and smp_load_acquire() to ensure that updates to
the 'enabled' and 'draining' flags are ordered and visible across CPUs.
- Modify shrink_lruvec() to allow a "joint reclaim" period. If an lruvec
is in the 'draining' state, the reclaimer will attempt to scan MGLRU
lists first, and then fall through to traditional LRU lists instead
of returning early. This ensures that folios are visible to at least
one reclaim path at any given time.
*** Reproduction ***
The issue was consistently reproduced on v6.1.157 and v6.18.3 using
a high-pressure memory cgroup (v1) environment.
Reproduction steps:
1. Create a 16GB memcg and populate it with 10GB file cache (5GB active)
and 8GB active anonymous memory.
2. Toggle MGLRU state while performing new memory allocations to force
direct reclaim.
Reproduction script:
---
#!/bin/bash
# Fixed reproduction for memcg OOM during MGLRU toggle
set -euo pipefail
MGLRU_FILE="/sys/kernel/mm/lru_gen/enabled"
CGROUP_PATH="/sys/fs/cgroup/memory/memcg_oom_test"
# Switch MGLRU dynamically in the background
switch_mglru() {
    local orig_val
    orig_val=$(cat "$MGLRU_FILE")
    if [[ "$orig_val" != "0x0000" ]]; then
        echo n > "$MGLRU_FILE" &
    else
        echo y > "$MGLRU_FILE" &
    fi
}
# Setup 16G memcg
mkdir -p "$CGROUP_PATH"
echo $((16 * 1024 * 1024 * 1024)) > "$CGROUP_PATH/memory.limit_in_bytes"
echo $$ > "$CGROUP_PATH/cgroup.procs"
# 1. Build memory pressure (File + Anon)
dd if=/dev/urandom of=/tmp/test_file bs=1M count=10240
dd if=/tmp/test_file of=/dev/null bs=1M # Warm up cache
stress-ng --vm 1 --vm-bytes 8G --vm-keep -t 600 &
sleep 5
# 2. Trigger switch and concurrent allocation
switch_mglru
stress-ng --vm 1 --vm-bytes 2G --vm-populate --timeout 5s || echo "OOM Triggered"
# Check OOM counter
grep oom_kill "$CGROUP_PATH/memory.oom_control"
---
Signed-off-by: Leno Hou <lenohou@gmail.com>
---
To: linux-mm@kvack.org
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Jialing Wang <wjl.linux@gmail.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
include/linux/mmzone.h | 2 ++
mm/vmscan.c | 14 +++++++++++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7fb7331c5725..0648ce91dbc6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -509,6 +509,8 @@ struct lru_gen_folio {
 	atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
 	/* whether the multi-gen LRU is enabled */
 	bool enabled;
+	/* whether the multi-gen LRU is draining to LRU */
+	bool draining;
 	/* the memcg generation this lru_gen_folio belongs to */
 	u8 gen;
 	/* the list segment this lru_gen_folio belongs to */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 06071995dacc..629a00681163 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5222,7 +5222,8 @@ static void lru_gen_change_state(bool enabled)
 			VM_WARN_ON_ONCE(!seq_is_valid(lruvec));
 			VM_WARN_ON_ONCE(!state_is_valid(lruvec));
 
-			lruvec->lrugen.enabled = enabled;
+			smp_store_release(&lruvec->lrugen.enabled, enabled);
+			smp_store_release(&lruvec->lrugen.draining, true);
 
 			while (!(enabled ? fill_evictable(lruvec) : drain_evictable(lruvec))) {
 				spin_unlock_irq(&lruvec->lru_lock);
@@ -5230,6 +5231,8 @@ static void lru_gen_change_state(bool enabled)
 				spin_lock_irq(&lruvec->lru_lock);
 			}
 
+			smp_store_release(&lruvec->lrugen.draining, false);
+
 			spin_unlock_irq(&lruvec->lru_lock);
 		}
@@ -5813,10 +5816,15 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	bool proportional_reclaim;
 	struct blk_plug plug;
+	bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
+	bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
 
-	if (lru_gen_enabled() && !root_reclaim(sc)) {
+	if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
 		lru_gen_shrink_lruvec(lruvec, sc);
-		return;
+
+		if (!lru_draining)
+			return;
+
 	}
 
 	get_scan_count(lruvec, sc, nr);
--
2.52.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
@ 2026-02-28 18:58 ` Andrew Morton
2026-02-28 19:12 ` kernel test robot
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2026-02-28 18:58 UTC (permalink / raw)
To: Leno Hou
Cc: linux-mm, linux-kernel, Axel Rasmussen, Yuanchu Xie, Wei Xu,
Barry Song, Jialing Wang, Yafang Shao, Yu Zhao
On Sun, 1 Mar 2026 00:10:08 +0800 Leno Hou <lenohou@gmail.com> wrote:
> When the Multi-Gen LRU (MGLRU) state is toggled dynamically, a race
> condition exists between the state switching and the memory reclaim
> path. This can lead to unexpected cgroup OOM kills, even when plenty of
> reclaimable memory is available.
>
> ...
>
Nice description, thanks. I'll queue this for testing while we await
comments.
>
> Reproduction script:
> ---
Please avoid using the ^---$ separator in changelogs - it means "end of
changelog text"!
> Signed-off-by: Leno Hou <lenohou@gmail.com>
>
> ---
Ditto.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
2026-02-28 18:58 ` Andrew Morton
@ 2026-02-28 19:12 ` kernel test robot
2026-02-28 19:23 ` kernel test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2026-02-28 19:12 UTC (permalink / raw)
To: Leno Hou, linux-mm, linux-kernel
Cc: oe-kbuild-all, Leno Hou, Andrew Morton,
Linux Memory Management List, Axel Rasmussen, Yuanchu Xie,
Wei Xu, Barry Song, Jialing Wang, Yafang Shao, Yu Zhao
Hi Leno,
kernel test robot noticed the following build errors:
[auto build test ERROR on v7.0-rc1]
[also build test ERROR on linus/master next-20260227]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Leno-Hou/mm-mglru-fix-cgroup-OOM-during-MGLRU-state-switching/20260301-001148
base: v7.0-rc1
patch link: https://lore.kernel.org/r/20260228161008.707-1-lenohou%40gmail.com
patch subject: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
config: x86_64-randconfig-001-20260301 (https://download.01.org/0day-ci/archive/20260301/202603010315.rTOWjv41-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260301/202603010315.rTOWjv41-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603010315.rTOWjv41-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
In file included from include/asm-generic/bitops/generic-non-atomic.h:7,
from include/linux/bitops.h:28,
from include/linux/thread_info.h:27,
from include/linux/spinlock.h:60,
from include/linux/mmzone.h:8,
from include/linux/gfp.h:7,
from include/linux/mm.h:8,
from mm/vmscan.c:15:
mm/vmscan.c: In function 'shrink_lruvec':
>> mm/vmscan.c:5785:55: error: 'struct lruvec' has no member named 'lrugen'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~
arch/x86/include/asm/barrier.h:68:17: note: in definition of macro '__smp_load_acquire'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^
mm/vmscan.c:5785:31: note: in expansion of macro 'smp_load_acquire'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~~~~~~~~~~~~~~~
In file included from <command-line>:
>> mm/vmscan.c:5785:55: error: 'struct lruvec' has no member named 'lrugen'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~
include/linux/compiler_types.h:686:23: note: in definition of macro '__compiletime_assert'
686 | if (!(condition)) \
| ^~~~~~~~~
include/linux/compiler_types.h:706:9: note: in expansion of macro '_compiletime_assert'
706 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/barrier.h:68:28: note: in expansion of macro 'READ_ONCE'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^~~~~~~~~
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
mm/vmscan.c:5785:31: note: in expansion of macro 'smp_load_acquire'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~~~~~~~~~~~~~~~
>> mm/vmscan.c:5785:55: error: 'struct lruvec' has no member named 'lrugen'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~
include/linux/compiler_types.h:642:53: note: in definition of macro '__unqual_scalar_typeof'
642 | #define __unqual_scalar_typeof(x) __typeof_unqual__(x)
| ^
include/asm-generic/rwonce.h:50:9: note: in expansion of macro '__READ_ONCE'
50 | __READ_ONCE(x); \
| ^~~~~~~~~~~
arch/x86/include/asm/barrier.h:68:28: note: in expansion of macro 'READ_ONCE'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^~~~~~~~~
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
mm/vmscan.c:5785:31: note: in expansion of macro 'smp_load_acquire'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~~~~~~~~~~~~~~~
In file included from ./arch/x86/include/generated/asm/rwonce.h:1,
from include/linux/compiler.h:372,
from include/linux/static_call_types.h:7,
from arch/x86/include/asm/bug.h:141,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/mm.h:7:
>> mm/vmscan.c:5785:55: error: 'struct lruvec' has no member named 'lrugen'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~
include/asm-generic/rwonce.h:44:73: note: in definition of macro '__READ_ONCE'
44 | #define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
| ^
arch/x86/include/asm/barrier.h:68:28: note: in expansion of macro 'READ_ONCE'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^~~~~~~~~
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
mm/vmscan.c:5785:31: note: in expansion of macro 'smp_load_acquire'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~~~~~~~~~~~~~~~
>> mm/vmscan.c:5785:55: error: 'struct lruvec' has no member named 'lrugen'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~
include/linux/compiler_types.h:686:23: note: in definition of macro '__compiletime_assert'
686 | if (!(condition)) \
| ^~~~~~~~~
include/linux/compiler_types.h:706:9: note: in expansion of macro '_compiletime_assert'
706 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:709:9: note: in expansion of macro 'compiletime_assert'
709 | compiletime_assert(__native_word(t), \
| ^~~~~~~~~~~~~~~~~~
include/linux/compiler_types.h:709:28: note: in expansion of macro '__native_word'
709 | compiletime_assert(__native_word(t), \
| ^~~~~~~~~~~~~
arch/x86/include/asm/barrier.h:69:9: note: in expansion of macro 'compiletime_assert_atomic_type'
69 | compiletime_assert_atomic_type(*p); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
mm/vmscan.c:5785:31: note: in expansion of macro 'smp_load_acquire'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ^~~~~~~~~~~~~~~~
mm/vmscan.c:5786:53: error: 'struct lruvec' has no member named 'lrugen'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ^~
arch/x86/include/asm/barrier.h:68:17: note: in definition of macro '__smp_load_acquire'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^
mm/vmscan.c:5786:29: note: in expansion of macro 'smp_load_acquire'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ^~~~~~~~~~~~~~~~
mm/vmscan.c:5786:53: error: 'struct lruvec' has no member named 'lrugen'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ^~
include/linux/compiler_types.h:686:23: note: in definition of macro '__compiletime_assert'
686 | if (!(condition)) \
| ^~~~~~~~~
include/linux/compiler_types.h:706:9: note: in expansion of macro '_compiletime_assert'
706 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:28: note: in expansion of macro '__native_word'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/barrier.h:68:28: note: in expansion of macro 'READ_ONCE'
68 | typeof(*p) ___p1 = READ_ONCE(*p); \
| ^~~~~~~~~
include/asm-generic/barrier.h:176:29: note: in expansion of macro '__smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
mm/vmscan.c:5786:29: note: in expansion of macro 'smp_load_acquire'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ^~~~~~~~~~~~~~~~
..
vim +5785 mm/vmscan.c
5774
5775 static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
5776 {
5777 unsigned long nr[NR_LRU_LISTS];
5778 unsigned long targets[NR_LRU_LISTS];
5779 unsigned long nr_to_scan;
5780 enum lru_list lru;
5781 unsigned long nr_reclaimed = 0;
5782 unsigned long nr_to_reclaim = sc->nr_to_reclaim;
5783 bool proportional_reclaim;
5784 struct blk_plug plug;
> 5785 bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
5786 bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
5787
> 5788 if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
5789 lru_gen_shrink_lruvec(lruvec, sc);
5790
5791 if (!lru_draining)
5792 return;
5793
5794 }
5795
5796 get_scan_count(lruvec, sc, nr);
5797
5798 /* Record the original scan target for proportional adjustments later */
5799 memcpy(targets, nr, sizeof(nr));
5800
5801 /*
5802 * Global reclaiming within direct reclaim at DEF_PRIORITY is a normal
5803 * event that can occur when there is little memory pressure e.g.
5804 * multiple streaming readers/writers. Hence, we do not abort scanning
5805 * when the requested number of pages are reclaimed when scanning at
5806 * DEF_PRIORITY on the assumption that the fact we are direct
5807 * reclaiming implies that kswapd is not keeping up and it is best to
5808 * do a batch of work at once. For memcg reclaim one check is made to
5809 * abort proportional reclaim if either the file or anon lru has already
5810 * dropped to zero at the first pass.
5811 */
5812 proportional_reclaim = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
5813 sc->priority == DEF_PRIORITY);
5814
5815 blk_start_plug(&plug);
5816 while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
5817 nr[LRU_INACTIVE_FILE]) {
5818 unsigned long nr_anon, nr_file, percentage;
5819 unsigned long nr_scanned;
5820
5821 for_each_evictable_lru(lru) {
5822 if (nr[lru]) {
5823 nr_to_scan = min(nr[lru], SWAP_CLUSTER_MAX);
5824 nr[lru] -= nr_to_scan;
5825
5826 nr_reclaimed += shrink_list(lru, nr_to_scan,
5827 lruvec, sc);
5828 }
5829 }
5830
5831 cond_resched();
5832
5833 if (nr_reclaimed < nr_to_reclaim || proportional_reclaim)
5834 continue;
5835
5836 /*
5837 * For kswapd and memcg, reclaim at least the number of pages
5838 * requested. Ensure that the anon and file LRUs are scanned
5839 * proportionally what was requested by get_scan_count(). We
5840 * stop reclaiming one LRU and reduce the amount scanning
5841 * proportional to the original scan target.
5842 */
5843 nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
5844 nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
5845
5846 /*
5847 * It's just vindictive to attack the larger once the smaller
5848 * has gone to zero. And given the way we stop scanning the
5849 * smaller below, this makes sure that we only make one nudge
5850 * towards proportionality once we've got nr_to_reclaim.
5851 */
5852 if (!nr_file || !nr_anon)
5853 break;
5854
5855 if (nr_file > nr_anon) {
5856 unsigned long scan_target = targets[LRU_INACTIVE_ANON] +
5857 targets[LRU_ACTIVE_ANON] + 1;
5858 lru = LRU_BASE;
5859 percentage = nr_anon * 100 / scan_target;
5860 } else {
5861 unsigned long scan_target = targets[LRU_INACTIVE_FILE] +
5862 targets[LRU_ACTIVE_FILE] + 1;
5863 lru = LRU_FILE;
5864 percentage = nr_file * 100 / scan_target;
5865 }
5866
5867 /* Stop scanning the smaller of the LRU */
5868 nr[lru] = 0;
5869 nr[lru + LRU_ACTIVE] = 0;
5870
5871 /*
5872 * Recalculate the other LRU scan count based on its original
5873 * scan target and the percentage scanning already complete
5874 */
5875 lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
5876 nr_scanned = targets[lru] - nr[lru];
5877 nr[lru] = targets[lru] * (100 - percentage) / 100;
5878 nr[lru] -= min(nr[lru], nr_scanned);
5879
5880 lru += LRU_ACTIVE;
5881 nr_scanned = targets[lru] - nr[lru];
5882 nr[lru] = targets[lru] * (100 - percentage) / 100;
5883 nr[lru] -= min(nr[lru], nr_scanned);
5884 }
5885 blk_finish_plug(&plug);
5886 sc->nr_reclaimed += nr_reclaimed;
5887
5888 /*
5889 * Even if we did not try to evict anon pages at all, we want to
5890 * rebalance the anon lru active/inactive ratio.
5891 */
5892 if (can_age_anon_pages(lruvec, sc) &&
5893 inactive_is_low(lruvec, LRU_INACTIVE_ANON))
5894 shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
5895 sc, LRU_ACTIVE_ANON);
5896 }
5897
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
2026-02-28 18:58 ` Andrew Morton
2026-02-28 19:12 ` kernel test robot
@ 2026-02-28 19:23 ` kernel test robot
2026-02-28 20:15 ` kernel test robot
2026-02-28 21:28 ` Barry Song
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2026-02-28 19:23 UTC (permalink / raw)
To: Leno Hou, linux-mm, linux-kernel
Cc: llvm, oe-kbuild-all, Leno Hou, Andrew Morton,
Linux Memory Management List, Axel Rasmussen, Yuanchu Xie,
Wei Xu, Barry Song, Jialing Wang, Yafang Shao, Yu Zhao
Hi Leno,
kernel test robot noticed the following build warnings:
[auto build test WARNING on v7.0-rc1]
[also build test WARNING on linus/master next-20260227]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Leno-Hou/mm-mglru-fix-cgroup-OOM-during-MGLRU-state-switching/20260301-001148
base: v7.0-rc1
patch link: https://lore.kernel.org/r/20260228161008.707-1-lenohou%40gmail.com
patch subject: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
config: um-defconfig (https://download.01.org/0day-ci/archive/20260301/202603010300.t6GYRWjK-lkp@intel.com/config)
compiler: clang version 23.0.0git (https://github.com/llvm/llvm-project 9a109fbb6e184ec9bcce10615949f598f4c974a9)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260301/202603010300.t6GYRWjK-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603010300.t6GYRWjK-lkp@intel.com/
All warnings (new ones prefixed by >>):
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
>> mm/vmscan.c:5788:37: warning: '&&' within '||' [-Wlogical-op-parentheses]
5788 | if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
| ~~ ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
mm/vmscan.c:5788:37: note: place parentheses around the '&&' expression to silence this warning
5788 | if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
| ^
| ( )
1 warning and 18 errors generated.
vim +5788 mm/vmscan.c
5774
5775 static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
5776 {
5777 unsigned long nr[NR_LRU_LISTS];
5778 unsigned long targets[NR_LRU_LISTS];
5779 unsigned long nr_to_scan;
5780 enum lru_list lru;
5781 unsigned long nr_reclaimed = 0;
5782 unsigned long nr_to_reclaim = sc->nr_to_reclaim;
5783 bool proportional_reclaim;
5784 struct blk_plug plug;
5785 bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
5786 bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
5787
> 5788 if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
5789 lru_gen_shrink_lruvec(lruvec, sc);
5790
5791 if (!lru_draining)
5792 return;
5793
5794 }
5795
5796 get_scan_count(lruvec, sc, nr);
5797
5798 /* Record the original scan target for proportional adjustments later */
5799 memcpy(targets, nr, sizeof(nr));
5800
5801 /*
5802 * Global reclaiming within direct reclaim at DEF_PRIORITY is a normal
5803 * event that can occur when there is little memory pressure e.g.
5804 * multiple streaming readers/writers. Hence, we do not abort scanning
5805 * when the requested number of pages are reclaimed when scanning at
5806 * DEF_PRIORITY on the assumption that the fact we are direct
5807 * reclaiming implies that kswapd is not keeping up and it is best to
5808 * do a batch of work at once. For memcg reclaim one check is made to
5809 * abort proportional reclaim if either the file or anon lru has already
5810 * dropped to zero at the first pass.
5811 */
5812 proportional_reclaim = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
5813 sc->priority == DEF_PRIORITY);
5814
5815 blk_start_plug(&plug);
5816 while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
5817 nr[LRU_INACTIVE_FILE]) {
5818 unsigned long nr_anon, nr_file, percentage;
5819 unsigned long nr_scanned;
5820
5821 for_each_evictable_lru(lru) {
5822 if (nr[lru]) {
5823 nr_to_scan = min(nr[lru], SWAP_CLUSTER_MAX);
5824 nr[lru] -= nr_to_scan;
5825
5826 nr_reclaimed += shrink_list(lru, nr_to_scan,
5827 lruvec, sc);
5828 }
5829 }
5830
5831 cond_resched();
5832
5833 if (nr_reclaimed < nr_to_reclaim || proportional_reclaim)
5834 continue;
5835
5836 /*
5837 * For kswapd and memcg, reclaim at least the number of pages
5838 * requested. Ensure that the anon and file LRUs are scanned
5839 * proportionally what was requested by get_scan_count(). We
5840 * stop reclaiming one LRU and reduce the amount scanning
5841 * proportional to the original scan target.
5842 */
5843 nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
5844 nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
5845
5846 /*
5847 * It's just vindictive to attack the larger once the smaller
5848 * has gone to zero. And given the way we stop scanning the
5849 * smaller below, this makes sure that we only make one nudge
5850 * towards proportionality once we've got nr_to_reclaim.
5851 */
5852 if (!nr_file || !nr_anon)
5853 break;
5854
5855 if (nr_file > nr_anon) {
5856 unsigned long scan_target = targets[LRU_INACTIVE_ANON] +
5857 targets[LRU_ACTIVE_ANON] + 1;
5858 lru = LRU_BASE;
5859 percentage = nr_anon * 100 / scan_target;
5860 } else {
5861 unsigned long scan_target = targets[LRU_INACTIVE_FILE] +
5862 targets[LRU_ACTIVE_FILE] + 1;
5863 lru = LRU_FILE;
5864 percentage = nr_file * 100 / scan_target;
5865 }
5866
5867 /* Stop scanning the smaller of the LRU */
5868 nr[lru] = 0;
5869 nr[lru + LRU_ACTIVE] = 0;
5870
5871 /*
5872 * Recalculate the other LRU scan count based on its original
5873 * scan target and the percentage scanning already complete
5874 */
5875 lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
5876 nr_scanned = targets[lru] - nr[lru];
5877 nr[lru] = targets[lru] * (100 - percentage) / 100;
5878 nr[lru] -= min(nr[lru], nr_scanned);
5879
5880 lru += LRU_ACTIVE;
5881 nr_scanned = targets[lru] - nr[lru];
5882 nr[lru] = targets[lru] * (100 - percentage) / 100;
5883 nr[lru] -= min(nr[lru], nr_scanned);
5884 }
5885 blk_finish_plug(&plug);
5886 sc->nr_reclaimed += nr_reclaimed;
5887
5888 /*
5889 * Even if we did not try to evict anon pages at all, we want to
5890 * rebalance the anon lru active/inactive ratio.
5891 */
5892 if (can_age_anon_pages(lruvec, sc) &&
5893 inactive_is_low(lruvec, LRU_INACTIVE_ANON))
5894 shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
5895 sc, LRU_ACTIVE_ANON);
5896 }
5897
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
` (2 preceding siblings ...)
2026-02-28 19:23 ` kernel test robot
@ 2026-02-28 20:15 ` kernel test robot
2026-02-28 21:28 ` Barry Song
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2026-02-28 20:15 UTC (permalink / raw)
To: Leno Hou, linux-mm, linux-kernel
Cc: llvm, oe-kbuild-all, Leno Hou, Andrew Morton,
Linux Memory Management List, Axel Rasmussen, Yuanchu Xie,
Wei Xu, Barry Song, Jialing Wang, Yafang Shao, Yu Zhao
Hi Leno,
kernel test robot noticed the following build errors:
[auto build test ERROR on v7.0-rc1]
[also build test ERROR on linus/master next-20260227]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Leno-Hou/mm-mglru-fix-cgroup-OOM-during-MGLRU-state-switching/20260301-001148
base: v7.0-rc1
patch link: https://lore.kernel.org/r/20260228161008.707-1-lenohou%40gmail.com
patch subject: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
config: um-defconfig (https://download.01.org/0day-ci/archive/20260301/202603010435.MBtvBCTp-lkp@intel.com/config)
compiler: clang version 23.0.0git (https://github.com/llvm/llvm-project 9a109fbb6e184ec9bcce10615949f598f4c974a9)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260301/202603010435.MBtvBCTp-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603010435.MBtvBCTp-lkp@intel.com/
All errors (new ones prefixed by >>):
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
>> mm/vmscan.c:5785:50: error: no member named 'lrugen' in 'struct lruvec'
5785 | bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5786:48: error: no member named 'lrugen' in 'struct lruvec'
5786 | bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
| ~~~~~~ ^
mm/vmscan.c:5788:37: warning: '&&' within '||' [-Wlogical-op-parentheses]
5788 | if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
| ~~ ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
mm/vmscan.c:5788:37: note: place parentheses around the '&&' expression to silence this warning
5788 | if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
| ^
| ( )
1 warning and 18 errors generated.
vim +5785 mm/vmscan.c
5774
5775 static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
5776 {
5777 unsigned long nr[NR_LRU_LISTS];
5778 unsigned long targets[NR_LRU_LISTS];
5779 unsigned long nr_to_scan;
5780 enum lru_list lru;
5781 unsigned long nr_reclaimed = 0;
5782 unsigned long nr_to_reclaim = sc->nr_to_reclaim;
5783 bool proportional_reclaim;
5784 struct blk_plug plug;
> 5785 bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
5786 bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
5787
5788 if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
5789 lru_gen_shrink_lruvec(lruvec, sc);
5790
5791 if (!lru_draining)
5792 return;
5793
5794 }
5795
5796 get_scan_count(lruvec, sc, nr);
5797
5798 /* Record the original scan target for proportional adjustments later */
5799 memcpy(targets, nr, sizeof(nr));
5800
5801 /*
5802 * Global reclaiming within direct reclaim at DEF_PRIORITY is a normal
5803 * event that can occur when there is little memory pressure e.g.
5804 * multiple streaming readers/writers. Hence, we do not abort scanning
5805 * when the requested number of pages are reclaimed when scanning at
5806 * DEF_PRIORITY on the assumption that the fact we are direct
5807 * reclaiming implies that kswapd is not keeping up and it is best to
5808 * do a batch of work at once. For memcg reclaim one check is made to
5809 * abort proportional reclaim if either the file or anon lru has already
5810 * dropped to zero at the first pass.
5811 */
5812 proportional_reclaim = (!cgroup_reclaim(sc) && !current_is_kswapd() &&
5813 sc->priority == DEF_PRIORITY);
5814
5815 blk_start_plug(&plug);
5816 while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
5817 nr[LRU_INACTIVE_FILE]) {
5818 unsigned long nr_anon, nr_file, percentage;
5819 unsigned long nr_scanned;
5820
5821 for_each_evictable_lru(lru) {
5822 if (nr[lru]) {
5823 nr_to_scan = min(nr[lru], SWAP_CLUSTER_MAX);
5824 nr[lru] -= nr_to_scan;
5825
5826 nr_reclaimed += shrink_list(lru, nr_to_scan,
5827 lruvec, sc);
5828 }
5829 }
5830
5831 cond_resched();
5832
5833 if (nr_reclaimed < nr_to_reclaim || proportional_reclaim)
5834 continue;
5835
5836 /*
5837 * For kswapd and memcg, reclaim at least the number of pages
5838 * requested. Ensure that the anon and file LRUs are scanned
5839 * proportionally what was requested by get_scan_count(). We
5840 * stop reclaiming one LRU and reduce the amount scanning
5841 * proportional to the original scan target.
5842 */
5843 nr_file = nr[LRU_INACTIVE_FILE] + nr[LRU_ACTIVE_FILE];
5844 nr_anon = nr[LRU_INACTIVE_ANON] + nr[LRU_ACTIVE_ANON];
5845
5846 /*
5847 * It's just vindictive to attack the larger once the smaller
5848 * has gone to zero. And given the way we stop scanning the
5849 * smaller below, this makes sure that we only make one nudge
5850 * towards proportionality once we've got nr_to_reclaim.
5851 */
5852 if (!nr_file || !nr_anon)
5853 break;
5854
5855 if (nr_file > nr_anon) {
5856 unsigned long scan_target = targets[LRU_INACTIVE_ANON] +
5857 targets[LRU_ACTIVE_ANON] + 1;
5858 lru = LRU_BASE;
5859 percentage = nr_anon * 100 / scan_target;
5860 } else {
5861 unsigned long scan_target = targets[LRU_INACTIVE_FILE] +
5862 targets[LRU_ACTIVE_FILE] + 1;
5863 lru = LRU_FILE;
5864 percentage = nr_file * 100 / scan_target;
5865 }
5866
5867 /* Stop scanning the smaller of the LRU */
5868 nr[lru] = 0;
5869 nr[lru + LRU_ACTIVE] = 0;
5870
5871 /*
5872 * Recalculate the other LRU scan count based on its original
5873 * scan target and the percentage scanning already complete
5874 */
5875 lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
5876 nr_scanned = targets[lru] - nr[lru];
5877 nr[lru] = targets[lru] * (100 - percentage) / 100;
5878 nr[lru] -= min(nr[lru], nr_scanned);
5879
5880 lru += LRU_ACTIVE;
5881 nr_scanned = targets[lru] - nr[lru];
5882 nr[lru] = targets[lru] * (100 - percentage) / 100;
5883 nr[lru] -= min(nr[lru], nr_scanned);
5884 }
5885 blk_finish_plug(&plug);
5886 sc->nr_reclaimed += nr_reclaimed;
5887
5888 /*
5889 * Even if we did not try to evict anon pages at all, we want to
5890 * rebalance the anon lru active/inactive ratio.
5891 */
5892 if (can_age_anon_pages(lruvec, sc) &&
5893 inactive_is_low(lruvec, LRU_INACTIVE_ANON))
5894 shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
5895 sc, LRU_ACTIVE_ANON);
5896 }
5897
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
` (3 preceding siblings ...)
2026-02-28 20:15 ` kernel test robot
@ 2026-02-28 21:28 ` Barry Song
4 siblings, 0 replies; 6+ messages in thread
From: Barry Song @ 2026-02-28 21:28 UTC (permalink / raw)
To: lenohou
Cc: 21cnbao, akpm, axelrasmussen, laoar.shao, linux-kernel, linux-mm,
weixugc, wjl.linux, yuanchu, yuzhao
On Sun, Mar 1, 2026 at 12:10 AM Leno Hou <lenohou@gmail.com> wrote:
>
> When the Multi-Gen LRU (MGLRU) state is toggled dynamically, a race
> condition exists between the state switching and the memory reclaim
> path. This can lead to unexpected cgroup OOM kills, even when plenty of
> reclaimable memory is available.
>
> *** Problem Description ***
>
> The issue arises from a "reclaim vacuum" during the transition:
>
> 1. When disabling MGLRU, lru_gen_change_state() sets lrugen->enabled to
> false before the pages are drained from MGLRU lists back to
> traditional LRU lists.
> 2. Concurrent reclaimers in shrink_lruvec() see lrugen->enabled as false
> and skip the MGLRU path.
> 3. However, these pages might not have reached the traditional LRU lists
> yet, or the changes are not yet visible to all CPUs due to a lack of
> synchronization.
> 4. get_scan_count() subsequently finds traditional LRU lists empty,
> concludes there is no reclaimable memory, and triggers an OOM kill.
>
> A similar race can occur during enablement, where the reclaimer sees
> the new state but the MGLRU lists haven't been populated via
> fill_evictable() yet.
>
> *** Solution ***
>
> Introduce a 'draining' state to bridge the gap during transitions:
>
> - Use smp_store_release() and smp_load_acquire() to ensure the visibility
> of 'enabled' and 'draining' flags across CPUs.
> - Modify shrink_lruvec() to allow a "joint reclaim" period. If an lruvec
> is in the 'draining' state, the reclaimer will attempt to scan MGLRU
> lists first, and then fall through to traditional LRU lists instead
> of returning early. This ensures that folios are visible to at least
> one reclaim path at any given time.
>
> *** Reproduction ***
>
> The issue was consistently reproduced on v6.1.157 and v6.18.3 using
> a high-pressure memory cgroup (v1) environment.
>
> Reproduction steps:
> 1. Create a 16GB memcg and populate it with 10GB file cache (5GB active)
> and 8GB active anonymous memory.
> 2. Toggle MGLRU state while performing new memory allocations to force
> direct reclaim.
>
> Reproduction script:
> ---
> #!/bin/bash
> # Fixed reproduction for memcg OOM during MGLRU toggle
> set -euo pipefail
>
> MGLRU_FILE="/sys/kernel/mm/lru_gen/enabled"
> CGROUP_PATH="/sys/fs/cgroup/memory/memcg_oom_test"
>
> # Switch MGLRU dynamically in the background
> switch_mglru() {
> local orig_val=$(cat "$MGLRU_FILE")
> if [[ "$orig_val" != "0x0000" ]]; then
> echo n > "$MGLRU_FILE" &
> else
> echo y > "$MGLRU_FILE" &
> fi
> }
>
> # Setup 16G memcg
> mkdir -p "$CGROUP_PATH"
> echo $((16 * 1024 * 1024 * 1024)) > "$CGROUP_PATH/memory.limit_in_bytes"
> echo $$ > "$CGROUP_PATH/cgroup.procs"
>
> # 1. Build memory pressure (File + Anon)
> dd if=/dev/urandom of=/tmp/test_file bs=1M count=10240
> dd if=/tmp/test_file of=/dev/null bs=1M # Warm up cache
>
> stress-ng --vm 1 --vm-bytes 8G --vm-keep -t 600 &
> sleep 5
>
> # 2. Trigger switch and concurrent allocation
> switch_mglru
> stress-ng --vm 1 --vm-bytes 2G --vm-populate --timeout 5s || echo "OOM Triggered"
>
> # Check OOM counter
> grep oom_kill "$CGROUP_PATH/memory.oom_control"
> ---
>
> Signed-off-by: Leno Hou <lenohou@gmail.com>
>
> ---
> To: linux-mm@kvack.org
> To: linux-kernel@vger.kernel.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Axel Rasmussen <axelrasmussen@google.com>
> Cc: Yuanchu Xie <yuanchu@google.com>
> Cc: Wei Xu <weixugc@google.com>
> Cc: Barry Song <21cnbao@gmail.com>
> Cc: Jialing Wang <wjl.linux@gmail.com>
> Cc: Yafang Shao <laoar.shao@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> include/linux/mmzone.h | 2 ++
> mm/vmscan.c | 14 +++++++++++---
> 2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7fb7331c5725..0648ce91dbc6 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -509,6 +509,8 @@ struct lru_gen_folio {
> atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
> /* whether the multi-gen LRU is enabled */
> bool enabled;
> + /* whether the multi-gen LRU is draining to LRU */
> + bool draining;
> /* the memcg generation this lru_gen_folio belongs to */
> u8 gen;
> /* the list segment this lru_gen_folio belongs to */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 06071995dacc..629a00681163 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -5222,7 +5222,8 @@ static void lru_gen_change_state(bool enabled)
> VM_WARN_ON_ONCE(!seq_is_valid(lruvec));
> VM_WARN_ON_ONCE(!state_is_valid(lruvec));
>
> - lruvec->lrugen.enabled = enabled;
> + smp_store_release(&lruvec->lrugen.enabled, enabled);
> + smp_store_release(&lruvec->lrugen.draining, true);
>
> while (!(enabled ? fill_evictable(lruvec) : drain_evictable(lruvec))) {
> spin_unlock_irq(&lruvec->lru_lock);
> @@ -5230,6 +5231,8 @@ static void lru_gen_change_state(bool enabled)
> spin_lock_irq(&lruvec->lru_lock);
> }
>
> + smp_store_release(&lruvec->lrugen.draining, false);
> +
> spin_unlock_irq(&lruvec->lru_lock);
> }
>
> @@ -5813,10 +5816,15 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
> unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> bool proportional_reclaim;
> struct blk_plug plug;
> + bool lrugen_enabled = smp_load_acquire(&lruvec->lrugen.enabled);
> + bool lru_draining = smp_load_acquire(&lruvec->lrugen.draining);
>
> - if (lru_gen_enabled() && !root_reclaim(sc)) {
> + if (lrugen_enabled || lru_draining && !root_reclaim(sc)) {
> lru_gen_shrink_lruvec(lruvec, sc);
> - return;
Is it possible to simply wait for draining to finish instead of performing
an lru_gen/lru shrink while lru_gen is being disabled or enabled?
Performing a shrink in an intermediate state may still involve a lot of
uncertainty, depending on how far the transition has progressed and how
much remains on each side's LRU.
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3e51190a55e4..ba306e986050 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -509,6 +509,8 @@ struct lru_gen_folio {
atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS];
/* whether the multi-gen LRU is enabled */
bool enabled;
+ /* whether the multi-gen LRU is switching from/to active/inactive LRU */
+ bool switching;
/* the memcg generation this lru_gen_folio belongs to */
u8 gen;
/* the list segment this lru_gen_folio belongs to */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fc9373e8251..60fc611067c7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5196,6 +5196,7 @@ static void lru_gen_change_state(bool enabled)
VM_WARN_ON_ONCE(!state_is_valid(lruvec));
lruvec->lrugen.enabled = enabled;
+ smp_store_release(&lruvec->lrugen.switching, true);
while (!(enabled ? fill_evictable(lruvec) : drain_evictable(lruvec))) {
spin_unlock_irq(&lruvec->lru_lock);
@@ -5203,6 +5204,8 @@ static void lru_gen_change_state(bool enabled)
spin_lock_irq(&lruvec->lru_lock);
}
+ smp_store_release(&lruvec->lrugen.switching, false);
+
spin_unlock_irq(&lruvec->lru_lock);
}
@@ -5780,6 +5783,10 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
bool proportional_reclaim;
struct blk_plug plug;
+#ifdef CONFIG_LRU_GEN
+ while (smp_load_acquire(&lruvec->lrugen.switching))
+ schedule_timeout_uninterruptible(HZ/100);
+#endif
if (lru_gen_enabled() && !root_reclaim(sc)) {
lru_gen_shrink_lruvec(lruvec, sc);
return;
--
> +
> + if (!lru_draining)
> + return;
> +
> }
>
> get_scan_count(lruvec, sc, nr);
> --
> 2.52.0
>
Thanks
Barry
end of thread, other threads:[~2026-02-28 21:28 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-28 16:10 [PATCH] mm/mglru: fix cgroup OOM during MGLRU state switching Leno Hou
2026-02-28 18:58 ` Andrew Morton
2026-02-28 19:12 ` kernel test robot
2026-02-28 19:23 ` kernel test robot
2026-02-28 20:15 ` kernel test robot
2026-02-28 21:28 ` Barry Song