* [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
@ 2024-08-14 9:18 liuye
2024-08-14 21:27 ` Andrew Morton
2024-09-25 0:22 ` [PATCH] " Andrew Morton
0 siblings, 2 replies; 13+ messages in thread
From: liuye @ 2024-08-14 9:18 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel, liuye
This fixes the following hard lockup in isolate_lru_folios() during
memory reclaim. If the LRU mostly contains ineligible folios, the
watchdog may be triggered.
watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30
lruvec->lru_lock owner:
PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
#0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
#1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
#2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
#3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
#4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
[exception RIP: isolate_lru_folios+403]
RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
#6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
#7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
#8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
#9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: liuye <liuye@kylinos.cn>
---
include/linux/swap.h | 1 +
mm/vmscan.c | 7 +++++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index ba7ea95d1c57..afb3274c90ef 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ enum {
};
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
/* Bit flag in swap_map */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..02a8f86d4883 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1655,6 +1655,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
+ unsigned long max_nr_skipped = 0;
LIST_HEAD(folios_skipped);
total_scan = 0;
@@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx ||
- skip_cma(folio, sc)) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
+ (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
--
2.25.1
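
To make the failure mode concrete, below is a minimal user-space sketch
(not the kernel code itself; the names scan_once and zone_of are purely
illustrative). It models an LRU in which every folio belongs to a zone
above sc->reclaim_idx and compares the unbounded skip path with the
SWAP_CLUSTER_MAX_SKIPPED cap proposed above.

/* Illustrative user-space model only -- not the kernel implementation. */
#include <stdio.h>
#include <stdlib.h>

#define SWAP_CLUSTER_MAX         32UL
#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)

/*
 * Walk a fake LRU until SWAP_CLUSTER_MAX folios have been taken or the
 * list is exhausted.  Skipped folios do not count as "taken", which is
 * why an LRU full of ineligible folios is walked in its entirety when
 * no cap is applied.  Returns the number of folios walked.
 */
static unsigned long scan_once(const int *zone_of, unsigned long lru_len,
                               int reclaim_idx, int use_cap)
{
        unsigned long taken = 0, skipped = 0, walked = 0, i;

        for (i = 0; i < lru_len && taken < SWAP_CLUSTER_MAX; i++) {
                walked++;
                if ((!use_cap || skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
                    zone_of[i] > reclaim_idx) {
                        skipped++;      /* kernel: moved to folios_skipped */
                        continue;
                }
                taken++;                /* kernel: isolated onto dst */
        }
        return walked;
}

int main(void)
{
        unsigned long lru_len = 10UL * 1000 * 1000;     /* 10M folios */
        int *zone_of = malloc(lru_len * sizeof(*zone_of));
        unsigned long i;

        if (!zone_of)
                return 1;
        for (i = 0; i < lru_len; i++)
                zone_of[i] = 2;         /* all from a "ZONE_NORMAL"-like zone */

        /* reclaim_idx = 1 ("ZONE_DMA32"-like): every folio is ineligible */
        printf("unbounded skip: %lu folios walked under lru_lock\n",
               scan_once(zone_of, lru_len, 1, 0));
        printf("capped skip:    %lu folios walked under lru_lock\n",
               scan_once(zone_of, lru_len, 1, 1));
        free(zone_of);
        return 0;
}

With no cap, the walk covers all ten million modelled folios while taking
nothing; with the cap, it stops after roughly 32768 + 32 folios, which is
the kind of bound on lru_lock hold time the patch is aiming for.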
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-14 9:18 [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
@ 2024-08-14 21:27 ` Andrew Morton
2024-09-19 2:14 ` [PATCH v2] " liuye
2024-09-25 0:22 ` [PATCH] " Andrew Morton
1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2024-08-14 21:27 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> This fixes the following hard lockup in function isolate_lru_folios
> when memory reclaim.If the LRU mostly contains ineligible folios
> May trigger watchdog.
>
> watchdog: Watchdog detected hard LOCKUP on cpu 173
> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
> Call Trace:
> _raw_spin_lock_irqsave+0x31/0x40
> folio_lruvec_lock_irqsave+0x5f/0x90
> folio_batch_move_lru+0x91/0x150
> lru_add_drain_per_cpu+0x1c/0x40
> process_one_work+0x17d/0x350
> worker_thread+0x27b/0x3a0
> kthread+0xe8/0x120
> ret_from_fork+0x34/0x50
> ret_from_fork_asm+0x1b/0x30
>
> lruvec->lru_lock owner:
>
> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
> [exception RIP: isolate_lru_folios+403]
> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> <NMI exception stack>
> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
> crash>
Well that's bad.
> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Merged in 2016.
Can you please describe how to reproduce this? Under what circumstances
does it occur? Why do you think it took eight years to be discovered?
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1655,6 +1655,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
> unsigned long skipped = 0;
> unsigned long scan, total_scan, nr_pages;
> + unsigned long max_nr_skipped = 0;
> LIST_HEAD(folios_skipped);
>
> total_scan = 0;
> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> - skip_cma(folio, sc)) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
> }
It looks like that will fix, but perhaps something more fundamental
needs to be done - we're doing a tremendous amount of pretty pointless
work here. Answers to my above questions will help us resolve this.
Thanks.
* [PATCH v2] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-14 21:27 ` Andrew Morton
@ 2024-09-19 2:14 ` liuye
2024-09-20 6:31 ` Bharata B Rao
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: liuye @ 2024-09-19 2:14 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, linux-mm, liuye
This fixes the following hard lockup in isolate_lru_folios() during
memory reclaim. If the LRU mostly contains ineligible folios, the
watchdog may be triggered.
watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30
lruvec->lru_lock owner:
PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
#0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
#1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
#2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
#3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
#4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
[exception RIP: isolate_lru_folios+403]
RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
#6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
#7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
#8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
#9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>
Scenario:
User processes are requesting a large amount of memory and keeping pages
active. Then a module continuously requests memory from the ZONE_DMA32
area. Memory reclaim is triggered because the ZONE_DMA32 watermark is
reached. However, the pages in the LRU (active_anon) list are mostly from
the ZONE_NORMAL area.
Reproduce:
Terminal 1: construct a workload that continuously increases
active(anon) pages.
mkdir /tmp/memory
mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
dd if=/dev/zero of=/tmp/memory/block bs=4M
tail /tmp/memory/block
Terminal 2:
vmstat -a 1
active will increase.
procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
r b swpd free inact active si so bi bo
1 0 0 1445623076 45898836 83646008 0 0 0
1 0 0 1445623076 43450228 86094616 0 0 0
1 0 0 1445623076 41003480 88541364 0 0 0
1 0 0 1445623076 38557088 90987756 0 0 0
1 0 0 1445623076 36109688 93435156 0 0 0
1 0 0 1445619552 33663256 95881632 0 0 0
1 0 0 1445619804 31217140 98327792 0 0 0
1 0 0 1445619804 28769988 100774944 0 0 0
1 0 0 1445619804 26322348 103222584 0 0 0
1 0 0 1445619804 23875592 105669340 0 0 0
cat /proc/meminfo | head
Active(anon) increases.
MemTotal: 1579941036 kB
MemFree: 1445618500 kB
MemAvailable: 1453013224 kB
Buffers: 6516 kB
Cached: 128653956 kB
SwapCached: 0 kB
Active: 118110812 kB
Inactive: 11436620 kB
Active(anon): 115345744 kB
Inactive(anon): 945292 kB
When Active(anon) reaches 115345744 kB, loading the module with insmod
triggers the ZONE_DMA32 watermark.
perf record -e vmscan:mm_vmscan_lru_isolate -aR
perf script
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
nr_skipped=2 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
nr_skipped=29 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon
See nr_scanned=28835844.
28835844 * 4 KB = 115343376 KB, approximately equal to 115345744 kB.
If Active(anon) is increased to 1000 GB and the module is then loaded to
trigger the ZONE_DMA32 watermark, a hard lockup will occur.
On my device, nr_scanned was 0x0000000003e3e937 at the time of the hard
lockup, which converts to 0x0000000003e3e937 * 4 KB = 261072092 KB.
[ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
ffffc90006fb7c30: 0000000000000020 0000000000000000
ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
ffffc90006fb7c70: 0000000000000000 0000000000000000
ffffc90006fb7c80: 0000000000000000 0000000000000000
ffffc90006fb7c90: 0000000000000000 0000000000000000
ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
ffffc90006fb7cb0: 0000000000000000 0000000000000000
ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
About the Fixes:
Why did it take eight years to be discovered?
The problem requires the following conditions to occur:
1. The device memory should be large enough.
2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
3. The memory in ZONE_DMA32 needs to reach the watermark.
If the memory is not large enough, or if ZONE_DMA32 memory is used
reasonably, this problem is difficult to detect.
Notes:
The problem is most likely to occur with ZONE_DMA32 and ZONE_NORMAL,
but other comparable configurations may also trigger it.
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: liuye <liuye@kylinos.cn>
---
V1->V2: Adjust code format; add scenario description and reproduction method.
---
---
include/linux/swap.h | 1 +
mm/vmscan.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index ba7ea95d1c57..afb3274c90ef 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ enum {
};
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
/* Bit flag in swap_map */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bd489c1af228..d2e436a4f47d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1636,6 +1636,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
+ unsigned long max_nr_skipped = 0;
LIST_HEAD(folios_skipped);
total_scan = 0;
@@ -1650,9 +1651,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
+ (folio_zonenum(folio) > sc->reclaim_idx)) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
--
2.25.1
* Re: [PATCH v2] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-19 2:14 ` [PATCH v2] " liuye
@ 2024-09-20 6:31 ` Bharata B Rao
[not found] ` <1727070383769353.48.seg@mailgw.kylinos.cn>
2024-11-19 6:08 ` [PATCH v2 RESEND] " liuye
2 siblings, 0 replies; 13+ messages in thread
From: Bharata B Rao @ 2024-09-20 6:31 UTC (permalink / raw)
To: liuye, akpm
Cc: linux-kernel, linux-mm, Johannes Weiner, Dadhania, Nikunj,
Usama Arif, Yu Zhao, Zhaoyang Huang, Breno Leitao
On 19-Sep-24 7:44 AM, liuye wrote:
> This fixes the following hard lockup in function isolate_lru_folios
> when memory reclaim.If the LRU mostly contains ineligible folios
> May trigger watchdog.
>
> watchdog: Watchdog detected hard LOCKUP on cpu 173
> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
> Call Trace:
> _raw_spin_lock_irqsave+0x31/0x40
> folio_lruvec_lock_irqsave+0x5f/0x90
> folio_batch_move_lru+0x91/0x150
> lru_add_drain_per_cpu+0x1c/0x40
> process_one_work+0x17d/0x350
> worker_thread+0x27b/0x3a0
> kthread+0xe8/0x120
> ret_from_fork+0x34/0x50
> ret_from_fork_asm+0x1b/0x30
>
> lruvec->lru_lock owner:
>
> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
> [exception RIP: isolate_lru_folios+403]
> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> <NMI exception stack>
> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
> crash>
>
> Scenario:
> User processe are requesting a large amount of memory and keep page active.
> Then a module continuously requests memory from ZONE_DMA32 area.
> Memory reclaim will be triggered due to ZONE_DMA32 watermark alarm reached.
> However pages in the LRU(active_anon) list are mostly from
> the ZONE_NORMAL area.
>
> Reproduce:
> Terminal 1: Construct to continuously increase pages active(anon).
> mkdir /tmp/memory
> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> dd if=/dev/zero of=/tmp/memory/block bs=4M
> tail /tmp/memory/block
>
> Terminal 2:
> vmstat -a 1
> active will increase.
> procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
> r b swpd free inact active si so bi bo
> 1 0 0 1445623076 45898836 83646008 0 0 0
> 1 0 0 1445623076 43450228 86094616 0 0 0
> 1 0 0 1445623076 41003480 88541364 0 0 0
> 1 0 0 1445623076 38557088 90987756 0 0 0
> 1 0 0 1445623076 36109688 93435156 0 0 0
> 1 0 0 1445619552 33663256 95881632 0 0 0
> 1 0 0 1445619804 31217140 98327792 0 0 0
> 1 0 0 1445619804 28769988 100774944 0 0 0
> 1 0 0 1445619804 26322348 103222584 0 0 0
> 1 0 0 1445619804 23875592 105669340 0 0 0
>
> cat /proc/meminfo | head
> Active(anon) increase.
> MemTotal: 1579941036 kB
> MemFree: 1445618500 kB
> MemAvailable: 1453013224 kB
> Buffers: 6516 kB
> Cached: 128653956 kB
> SwapCached: 0 kB
> Active: 118110812 kB
> Inactive: 11436620 kB
> Active(anon): 115345744 kB
> Inactive(anon): 945292 kB
>
> When the Active(anon) is 115345744 kB, insmod module triggers
> the ZONE_DMA32 watermark.
>
> perf record -e vmscan:mm_vmscan_lru_isolate -aR
> perf script
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
> nr_skipped=2 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
> nr_skipped=0 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
> nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
> nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
> nr_skipped=29 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
> nr_skipped=0 nr_taken=0 lru=active_anon
>
> See nr_scanned=28835844.
> 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
>
> If increase Active(anon) to 1000G then insmod module triggers
> the ZONE_DMA32 watermark. hard lockup will occur.
>
> In my device nr_scanned = 0000000003e3e937 when hard lockup.
> Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
>
> [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> ffffc90006fb7c30: 0000000000000020 0000000000000000
> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> ffffc90006fb7c70: 0000000000000000 0000000000000000
> ffffc90006fb7c80: 0000000000000000 0000000000000000
> ffffc90006fb7c90: 0000000000000000 0000000000000000
> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> ffffc90006fb7cb0: 0000000000000000 0000000000000000
> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
>
> About the Fixes:
> Why did it take eight years to be discovered?
>
> The problem requires the following conditions to occur:
> 1. The device memory should be large enough.
> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> 3. The memory in ZONE_DMA32 needs to reach the watermark.
>
> If the memory is not large enough, or if the usage design of ZONE_DMA32
> area memory is reasonable, this problem is difficult to detect.
>
> notes:
> The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL,
> but other suitable scenarios may also trigger the problem.
This problem appears very similar to the one we reported sometime back at
https://lore.kernel.org/linux-mm/d2841226-e27b-4d3d-a578-63587a3aa4f3@amd.com/
where ~150 million folios were being skipped to isolate a few ZONE_DMA
folios.
>
> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
> Signed-off-by: liuye <liuye@kylinos.cn>
>
> ---
> V1->V2 : Adjust code format and add scenario description, reproduction method.
> ---
> ---
> include/linux/swap.h | 1 +
> mm/vmscan.c | 6 +++++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index ba7ea95d1c57..afb3274c90ef 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -223,6 +223,7 @@ enum {
> };
>
> #define SWAP_CLUSTER_MAX 32UL
> +#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
> #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
>
> /* Bit flag in swap_map */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bd489c1af228..d2e436a4f47d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1636,6 +1636,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
> unsigned long skipped = 0;
> unsigned long scan, total_scan, nr_pages;
> + unsigned long max_nr_skipped = 0;
> LIST_HEAD(folios_skipped);
>
> total_scan = 0;
> @@ -1650,9 +1651,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
> + (folio_zonenum(folio) > sc->reclaim_idx)) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
> }
I am not sure if the above would help in all scenarios, as limiting the
skipped folios list to 1 million entries could not fix the soft/hard
lockup issue.
In fact, what helped was the fix by Yu Zhao which released the lruvec
lock. This was posted for consideration at
https://lore.kernel.org/lkml/ZsTOwBffg5xSCUbP@gmail.com/T/
However, this posting eventually resulted in the revert of
5da226dbfce3a2. Also, Johannes raised concerns about hoarding a large
number of folios on the skipped list and about the effect (on compaction)
of releasing the lruvec spinlock without clearing the LRU flag.
Regards,
Bharata.
* Re: [PATCH v2] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
[not found] ` <1727070383769353.48.seg@mailgw.kylinos.cn>
@ 2024-09-23 6:03 ` liuye
0 siblings, 0 replies; 13+ messages in thread
From: liuye @ 2024-09-23 6:03 UTC (permalink / raw)
To: Bharata B Rao, akpm
Cc: linux-kernel, linux-mm, Johannes Weiner, Dadhania, Nikunj,
Usama Arif, Yu Zhao, Zhaoyang Huang, Breno Leitao
On 2024/9/20 2:31 PM, Bharata B Rao wrote:
> On 19-Sep-24 7:44 AM, liuye wrote:
>> This fixes the following hard lockup in function isolate_lru_folios
>> when memory reclaim.If the LRU mostly contains ineligible folios
>> May trigger watchdog.
>>
>> watchdog: Watchdog detected hard LOCKUP on cpu 173
>> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
>> Call Trace:
>> _raw_spin_lock_irqsave+0x31/0x40
>> folio_lruvec_lock_irqsave+0x5f/0x90
>> folio_batch_move_lru+0x91/0x150
>> lru_add_drain_per_cpu+0x1c/0x40
>> process_one_work+0x17d/0x350
>> worker_thread+0x27b/0x3a0
>> kthread+0xe8/0x120
>> ret_from_fork+0x34/0x50
>> ret_from_fork_asm+0x1b/0x30
>>
>> lruvec->lru_lock owner:
>>
>> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
>> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
>> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
>> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
>> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
>> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
>> [exception RIP: isolate_lru_folios+403]
>> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
>> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
>> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
>> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
>> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
>> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>> <NMI exception stack>
>> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
>> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
>> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
>> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
>> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
>> crash>
>>
>> Scenario:
>> User processe are requesting a large amount of memory and keep page active.
>> Then a module continuously requests memory from ZONE_DMA32 area.
>> Memory reclaim will be triggered due to ZONE_DMA32 watermark alarm reached.
>> However pages in the LRU(active_anon) list are mostly from
>> the ZONE_NORMAL area.
>>
>> Reproduce:
>> Terminal 1: Construct to continuously increase pages active(anon).
>> mkdir /tmp/memory
>> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
>> dd if=/dev/zero of=/tmp/memory/block bs=4M
>> tail /tmp/memory/block
>>
>> Terminal 2:
>> vmstat -a 1
>> active will increase.
>> procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
>> r b swpd free inact active si so bi bo
>> 1 0 0 1445623076 45898836 83646008 0 0 0
>> 1 0 0 1445623076 43450228 86094616 0 0 0
>> 1 0 0 1445623076 41003480 88541364 0 0 0
>> 1 0 0 1445623076 38557088 90987756 0 0 0
>> 1 0 0 1445623076 36109688 93435156 0 0 0
>> 1 0 0 1445619552 33663256 95881632 0 0 0
>> 1 0 0 1445619804 31217140 98327792 0 0 0
>> 1 0 0 1445619804 28769988 100774944 0 0 0
>> 1 0 0 1445619804 26322348 103222584 0 0 0
>> 1 0 0 1445619804 23875592 105669340 0 0 0
>>
>> cat /proc/meminfo | head
>> Active(anon) increase.
>> MemTotal: 1579941036 kB
>> MemFree: 1445618500 kB
>> MemAvailable: 1453013224 kB
>> Buffers: 6516 kB
>> Cached: 128653956 kB
>> SwapCached: 0 kB
>> Active: 118110812 kB
>> Inactive: 11436620 kB
>> Active(anon): 115345744 kB
>> Inactive(anon): 945292 kB
>>
>> When the Active(anon) is 115345744 kB, insmod module triggers
>> the ZONE_DMA32 watermark.
>>
>> perf record -e vmscan:mm_vmscan_lru_isolate -aR
>> perf script
>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
>> nr_skipped=2 nr_taken=0 lru=active_anon
>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
>> nr_skipped=0 nr_taken=0 lru=active_anon
>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
>> nr_skipped=28835844 nr_taken=0 lru=active_anon
>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
>> nr_skipped=28835844 nr_taken=0 lru=active_anon
>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
>> nr_skipped=29 nr_taken=0 lru=active_anon
>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
>> nr_skipped=0 nr_taken=0 lru=active_anon
>>
>> See nr_scanned=28835844.
>> 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
>>
>> If increase Active(anon) to 1000G then insmod module triggers
>> the ZONE_DMA32 watermark. hard lockup will occur.
>>
>> In my device nr_scanned = 0000000003e3e937 when hard lockup.
>> Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
>>
>> [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
>> ffffc90006fb7c30: 0000000000000020 0000000000000000
>> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
>> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
>> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
>> ffffc90006fb7c70: 0000000000000000 0000000000000000
>> ffffc90006fb7c80: 0000000000000000 0000000000000000
>> ffffc90006fb7c90: 0000000000000000 0000000000000000
>> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
>> ffffc90006fb7cb0: 0000000000000000 0000000000000000
>> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
>>
>> About the Fixes:
>> Why did it take eight years to be discovered?
>>
>> The problem requires the following conditions to occur:
>> 1. The device memory should be large enough.
>> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
>> 3. The memory in ZONE_DMA32 needs to reach the watermark.
>>
>> If the memory is not large enough, or if the usage design of ZONE_DMA32
>> area memory is reasonable, this problem is difficult to detect.
>>
>> notes:
>> The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL,
>> but other suitable scenarios may also trigger the problem.
>
> This problem appears very similar to the one we reported sometime back at
>
> https://lore.kernel.org/linux-mm/d2841226-e27b-4d3d-a578-63587a3aa4f3@amd.com/
>
> where ~150 million folios were being skipped to isolate a few ZONE_DMA folios.
>
Yes, similar to this scenario.
>>
>> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
>> Signed-off-by: liuye <liuye@kylinos.cn>
>>
>> ---
>> V1->V2 : Adjust code format and add scenario description, reproduction method.
>> ---
>> ---
>> include/linux/swap.h | 1 +
>> mm/vmscan.c | 6 +++++-
>> 2 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index ba7ea95d1c57..afb3274c90ef 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -223,6 +223,7 @@ enum {
>> };
>> #define SWAP_CLUSTER_MAX 32UL
>> +#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
>> #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
>> /* Bit flag in swap_map */
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index bd489c1af228..d2e436a4f47d 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1636,6 +1636,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
>> unsigned long skipped = 0;
>> unsigned long scan, total_scan, nr_pages;
>> + unsigned long max_nr_skipped = 0;
>> LIST_HEAD(folios_skipped);
>> total_scan = 0;
>> @@ -1650,9 +1651,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> nr_pages = folio_nr_pages(folio);
>> total_scan += nr_pages;
>> - if (folio_zonenum(folio) > sc->reclaim_idx) {
>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>> + if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
>> + (folio_zonenum(folio) > sc->reclaim_idx)) {
>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>> move_to = &folios_skipped;
>> + max_nr_skipped++;
>> goto move;
>> }
>
> I am not sure if the above would help in all scenarios as limiting the skipped folios list to 1 million entries couldn't fix the soft/hard lockup issue.
>
This value should not be too large; the earliest value, before b2e18757f2c9, was 32:
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
To prevent lock contention and lockup, this value should be neither too small nor too large.
Depending on the CPU frequency, the time to trigger the lockup will vary.
Not sure if this value of SWAP_CLUSTER_MAX_SKIPPED is the most appropriate, but it does work.
My patch works for all scenarios and does not change the earlier code logic.
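For reference, the proposed cap works out to:

    SWAP_CLUSTER_MAX_SKIPPED = SWAP_CLUSTER_MAX << 10 = 32 << 10 = 32768

so at most 32768 folios can be diverted to folios_skipped in a single
isolate_lru_folios() call; once that cap is reached, the skip branch is
no longer taken and the scan can complete.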
> In fact what helped was the fix by Yu Zhao which released the lruvec lock. This was posted for consideration at
>
> https://lore.kernel.org/lkml/ZsTOwBffg5xSCUbP@gmail.com/T/
>
> However this posting eventually resulted in the revert of
> 5da226dbfce3a2. Also some concerns about hoarding large number of folios in skipped list and effect (on compaction) of releasing of lruvec spinlock without clearing LRU flag were raised by Johannes.
>
Regarding Yu Zhao's patch: unlocking and rescheduling may allow the LRU list to change underneath us and is more likely to cause data corruption, in addition to the other concerns you mentioned.
Of course, that approach would be great if it could solve all the problems in all scenarios.
Please also keep me informed of (and Cc me on) any other emails regarding this discussion.
Thanks,
Liuye
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-14 9:18 [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
2024-08-14 21:27 ` Andrew Morton
@ 2024-09-25 0:22 ` Andrew Morton
2024-09-25 8:37 ` liuye
1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2024-09-25 0:22 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> - skip_cma(folio, sc)) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
This hunk is not applicable to current mainline.
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 0:22 ` [PATCH] " Andrew Morton
@ 2024-09-25 8:37 ` liuye
2024-09-25 9:29 ` Andrew Morton
0 siblings, 1 reply; 13+ messages in thread
From: liuye @ 2024-09-25 8:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
On 2024/9/25 8:22 AM, Andrew Morton wrote:
> On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
>
>> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> nr_pages = folio_nr_pages(folio);
>> total_scan += nr_pages;
>>
>> - if (folio_zonenum(folio) > sc->reclaim_idx ||
>> - skip_cma(folio, sc)) {
>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
>> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>> move_to = &folios_skipped;
>> + max_nr_skipped++;
>> goto move;
>
> This hunk is not applicable to current mainline.
>
Please see the PATCH v2 in link [1], and the related discussion in link [2].
Then please explain why it is not applicable. Thank you.
[1]:https://lore.kernel.org/all/20240919021443.9170-1-liuye@kylinos.cn/
[2]:https://lore.kernel.org/all/e878653e-d380-81c2-90a8-fd2d1d4e7287@kylinos.cn/
Thanks,
liuye
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 8:37 ` liuye
@ 2024-09-25 9:29 ` Andrew Morton
2024-09-25 9:53 ` liuye
0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2024-09-25 9:29 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 25 Sep 2024 16:37:14 +0800 liuye <liuye@kylinos.cn> wrote:
>
>
> On 2024/9/25 上午8:22, Andrew Morton wrote:
> > On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> >
> >> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> >> nr_pages = folio_nr_pages(folio);
> >> total_scan += nr_pages;
> >>
> >> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> >> - skip_cma(folio, sc)) {
> >> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> >> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> >> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> >> nr_skipped[folio_zonenum(folio)] += nr_pages;
> >> move_to = &folios_skipped;
> >> + max_nr_skipped++;
> >> goto move;
> >
> > This hunk is not applicable to current mainline.
> >
>
> Please see the PATCH v2 in link [1], and the related discussion in link [2].
> Then please explain why it is not applicable,thank you.
What I mean is that the patch doesn't apply.
Current mainline has
if (folio_zonenum(folio) > sc->reclaim_idx) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
goto move;
}
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 9:29 ` Andrew Morton
@ 2024-09-25 9:53 ` liuye
0 siblings, 0 replies; 13+ messages in thread
From: liuye @ 2024-09-25 9:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
On 2024/9/25 5:29 PM, Andrew Morton wrote:
> On Wed, 25 Sep 2024 16:37:14 +0800 liuye <liuye@kylinos.cn> wrote:
>
>>
>>
>> On 2024/9/25 上午8:22, Andrew Morton wrote:
>>> On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
>>>
>>>> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>>>> nr_pages = folio_nr_pages(folio);
>>>> total_scan += nr_pages;
>>>>
>>>> - if (folio_zonenum(folio) > sc->reclaim_idx ||
>>>> - skip_cma(folio, sc)) {
>>>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>>>> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
>>>> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
>>>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>>>> move_to = &folios_skipped;
>>>> + max_nr_skipped++;
>>>> goto move;
>>>
>>> This hunk is not applicable to current mainline.
>>>
>>
>> Please see the PATCH v2 in link [1], and the related discussion in link [2].
>> Then please explain why it is not applicable,thank you.
>
> What I mean is that the patch doesn't apply.
>
> Current mainline has
>
> if (folio_zonenum(folio) > sc->reclaim_idx) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> goto move;
> }
>
PATCH v2 is based on current mainline.
@@ -1650,9 +1651,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
+ (folio_zonenum(folio) > sc->reclaim_idx)) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
* [PATCH v2 RESEND] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-19 2:14 ` [PATCH v2] " liuye
2024-09-20 6:31 ` Bharata B Rao
[not found] ` <1727070383769353.48.seg@mailgw.kylinos.cn>
@ 2024-11-19 6:08 ` liuye
2024-11-30 3:22 ` Andrew Morton
2 siblings, 1 reply; 13+ messages in thread
From: liuye @ 2024-11-19 6:08 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, linux-mm, liuye
This fixes the following hard lockup in isolate_lru_folios() during
memory reclaim. If the LRU mostly contains ineligible folios, the
watchdog may be triggered.
watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30
lruvec->lru_lock owner:
PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
#0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
#1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
#2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
#3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
#4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
[exception RIP: isolate_lru_folios+403]
RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
#6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
#7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
#8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
#9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>
Scenario:
User processes are requesting a large amount of memory and keeping pages
active. Then a module continuously requests memory from the ZONE_DMA32
area. Memory reclaim is triggered because the ZONE_DMA32 watermark is
reached. However, the pages in the LRU (active_anon) list are mostly from
the ZONE_NORMAL area.
Reproduce:
Terminal 1: construct a workload that continuously increases
active(anon) pages.
mkdir /tmp/memory
mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
dd if=/dev/zero of=/tmp/memory/block bs=4M
tail /tmp/memory/block
Terminal 2:
vmstat -a 1
active will increase.
procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
r b swpd free inact active si so bi bo
1 0 0 1445623076 45898836 83646008 0 0 0
1 0 0 1445623076 43450228 86094616 0 0 0
1 0 0 1445623076 41003480 88541364 0 0 0
1 0 0 1445623076 38557088 90987756 0 0 0
1 0 0 1445623076 36109688 93435156 0 0 0
1 0 0 1445619552 33663256 95881632 0 0 0
1 0 0 1445619804 31217140 98327792 0 0 0
1 0 0 1445619804 28769988 100774944 0 0 0
1 0 0 1445619804 26322348 103222584 0 0 0
1 0 0 1445619804 23875592 105669340 0 0 0
cat /proc/meminfo | head
Active(anon) increases.
MemTotal: 1579941036 kB
MemFree: 1445618500 kB
MemAvailable: 1453013224 kB
Buffers: 6516 kB
Cached: 128653956 kB
SwapCached: 0 kB
Active: 118110812 kB
Inactive: 11436620 kB
Active(anon): 115345744 kB
Inactive(anon): 945292 kB
When Active(anon) reaches 115345744 kB, loading the module with insmod
triggers the ZONE_DMA32 watermark.
perf record -e vmscan:mm_vmscan_lru_isolate -aR
perf script
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
nr_skipped=2 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
nr_skipped=28835844 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
nr_skipped=29 nr_taken=0 lru=active_anon
isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
nr_skipped=0 nr_taken=0 lru=active_anon
See nr_scanned=28835844.
28835844 * 4 KB = 115343376 KB, approximately equal to 115345744 kB.
If Active(anon) is increased to 1000 GB and the module is then loaded to
trigger the ZONE_DMA32 watermark, a hard lockup will occur.
On my device, nr_scanned was 0x0000000003e3e937 at the time of the hard
lockup, which converts to 0x0000000003e3e937 * 4 KB = 261072092 KB.
[ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
ffffc90006fb7c30: 0000000000000020 0000000000000000
ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
ffffc90006fb7c70: 0000000000000000 0000000000000000
ffffc90006fb7c80: 0000000000000000 0000000000000000
ffffc90006fb7c90: 0000000000000000 0000000000000000
ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
ffffc90006fb7cb0: 0000000000000000 0000000000000000
ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
About the Fixes:
Why did it take eight years to be discovered?
The problem requires the following conditions to occur:
1. The device memory should be large enough.
2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
3. The memory in ZONE_DMA32 needs to reach the watermark.
If the memory is not large enough, or if ZONE_DMA32 memory is used
reasonably, this problem is difficult to detect.
Notes:
The problem is most likely to occur with ZONE_DMA32 and ZONE_NORMAL,
but other comparable configurations may also trigger it.
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
---
V1->V2: Adjust code format; add scenario description and reproduction method.
---
Signed-off-by: liuye <liuye@kylinos.cn>
---
include/linux/swap.h | 1 +
mm/vmscan.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index f3e0ac20c2e8..187715eec3cb 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ enum {
};
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
/* Bit flag in swap_map */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 28ba2b06fc7d..0bdfae413b4c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1657,6 +1657,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
+ unsigned long max_nr_skipped = 0;
LIST_HEAD(folios_skipped);
total_scan = 0;
@@ -1671,9 +1672,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
+ (folio_zonenum(folio) > sc->reclaim_idx)) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
--
2.25.1
* Re: [PATCH v2 RESEND] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-11-19 6:08 ` [PATCH v2 RESEND] " liuye
@ 2024-11-30 3:22 ` Andrew Morton
2024-12-05 3:55 ` Hugh Dickins
[not found] ` <1733382994392357.312.seg@mailgw.kylinos.cn>
0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2024-11-30 3:22 UTC (permalink / raw)
To: liuye; +Cc: linux-kernel, linux-mm, Mel Gorman, Hugh Dickins, Yang Shi
On Tue, 19 Nov 2024 14:08:42 +0800 liuye <liuye@kylinos.cn> wrote:
> This fixes the following hard lockup in function isolate_lru_folios
> when memory reclaim.If the LRU mostly contains ineligible folios
> May trigger watchdog.
>
> watchdog: Watchdog detected hard LOCKUP on cpu 173
> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
> Call Trace:
> _raw_spin_lock_irqsave+0x31/0x40
> folio_lruvec_lock_irqsave+0x5f/0x90
> folio_batch_move_lru+0x91/0x150
> lru_add_drain_per_cpu+0x1c/0x40
> process_one_work+0x17d/0x350
> worker_thread+0x27b/0x3a0
> kthread+0xe8/0x120
> ret_from_fork+0x34/0x50
> ret_from_fork_asm+0x1b/0x30
>
> lruvec->lru_lock owner:
>
> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
> [exception RIP: isolate_lru_folios+403]
> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> <NMI exception stack>
> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
> crash>
>
> Scenario:
> User processe are requesting a large amount of memory and keep page active.
> Then a module continuously requests memory from ZONE_DMA32 area.
> Memory reclaim will be triggered due to ZONE_DMA32 watermark alarm reached.
> However pages in the LRU(active_anon) list are mostly from
> the ZONE_NORMAL area.
>
> Reproduce:
> Terminal 1: Construct to continuously increase pages active(anon).
> mkdir /tmp/memory
> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> dd if=/dev/zero of=/tmp/memory/block bs=4M
> tail /tmp/memory/block
>
> Terminal 2:
> vmstat -a 1
> active will increase.
> procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
> r b swpd free inact active si so bi bo
> 1 0 0 1445623076 45898836 83646008 0 0 0
> 1 0 0 1445623076 43450228 86094616 0 0 0
> 1 0 0 1445623076 41003480 88541364 0 0 0
> 1 0 0 1445623076 38557088 90987756 0 0 0
> 1 0 0 1445623076 36109688 93435156 0 0 0
> 1 0 0 1445619552 33663256 95881632 0 0 0
> 1 0 0 1445619804 31217140 98327792 0 0 0
> 1 0 0 1445619804 28769988 100774944 0 0 0
> 1 0 0 1445619804 26322348 103222584 0 0 0
> 1 0 0 1445619804 23875592 105669340 0 0 0
>
> cat /proc/meminfo | head
> Active(anon) increase.
> MemTotal: 1579941036 kB
> MemFree: 1445618500 kB
> MemAvailable: 1453013224 kB
> Buffers: 6516 kB
> Cached: 128653956 kB
> SwapCached: 0 kB
> Active: 118110812 kB
> Inactive: 11436620 kB
> Active(anon): 115345744 kB
> Inactive(anon): 945292 kB
>
> When the Active(anon) is 115345744 kB, insmod module triggers
> the ZONE_DMA32 watermark.
>
> perf record -e vmscan:mm_vmscan_lru_isolate -aR
> perf script
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
> nr_skipped=2 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
> nr_skipped=0 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
> nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
> nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
> nr_skipped=29 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
> nr_skipped=0 nr_taken=0 lru=active_anon
>
> See nr_scanned=28835844.
> 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
>
> If increase Active(anon) to 1000G then insmod module triggers
> the ZONE_DMA32 watermark. hard lockup will occur.
>
> In my device nr_scanned = 0000000003e3e937 when hard lockup.
> Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
>
> [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> ffffc90006fb7c30: 0000000000000020 0000000000000000
> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> ffffc90006fb7c70: 0000000000000000 0000000000000000
> ffffc90006fb7c80: 0000000000000000 0000000000000000
> ffffc90006fb7c90: 0000000000000000 0000000000000000
> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> ffffc90006fb7cb0: 0000000000000000 0000000000000000
> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
>
> About the Fixes:
> Why did it take eight years to be discovered?
>
> The problem requires the following conditions to occur:
> 1. The device memory should be large enough.
> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> 3. The memory in ZONE_DMA32 needs to reach the watermark.
>
> If the memory is not large enough, or if the usage design of ZONE_DMA32
> area memory is reasonable, this problem is difficult to detect.
>
> notes:
> The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL,
> but other suitable scenarios may also trigger the problem.
>
> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
>
Thanks.
This is old code. I agree on b2e18757f2c9 and thanks for digging that
out.
I'll add a cc:stable and shall queue it for testing, pending review
from others (please). It may be that the -stable tree maintainers ask
for a backport of this change into pre-folio-conversion kernels. But
given the obscurity of the workload, I'm not sure this would be worth
doing. Opinions are sought?
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -223,6 +223,7 @@ enum {
> };
>
> #define SWAP_CLUSTER_MAX 32UL
> +#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
> #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
>
> /* Bit flag in swap_map */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 28ba2b06fc7d..0bdfae413b4c 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1657,6 +1657,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
> unsigned long skipped = 0;
> unsigned long scan, total_scan, nr_pages;
> + unsigned long max_nr_skipped = 0;
> LIST_HEAD(folios_skipped);
>
> total_scan = 0;
> @@ -1671,9 +1672,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
> + (folio_zonenum(folio) > sc->reclaim_idx)) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
> }
>
> --
> 2.25.1
* Re: [PATCH v2 RESEND] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-11-30 3:22 ` Andrew Morton
@ 2024-12-05 3:55 ` Hugh Dickins
[not found] ` <1733382994392357.312.seg@mailgw.kylinos.cn>
1 sibling, 0 replies; 13+ messages in thread
From: Hugh Dickins @ 2024-12-05 3:55 UTC (permalink / raw)
To: Andrew Morton
Cc: liuye, linux-kernel, linux-mm, Mel Gorman, Hugh Dickins,
Yang Shi, Minchan Kim, Michal Hocko, Johannes Weiner,
Bharata B Rao, Yu Zhao
On Fri, 29 Nov 2024, Andrew Morton wrote:
> On Tue, 19 Nov 2024 14:08:42 +0800 liuye <liuye@kylinos.cn> wrote:
>
> > This fixes the following hard lockup in function isolate_lru_folios
> > when memory reclaim.If the LRU mostly contains ineligible folios
> > May trigger watchdog.
> >
> > watchdog: Watchdog detected hard LOCKUP on cpu 173
> > RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
> > Call Trace:
> > _raw_spin_lock_irqsave+0x31/0x40
> > folio_lruvec_lock_irqsave+0x5f/0x90
> > folio_batch_move_lru+0x91/0x150
> > lru_add_drain_per_cpu+0x1c/0x40
> > process_one_work+0x17d/0x350
> > worker_thread+0x27b/0x3a0
> > kthread+0xe8/0x120
> > ret_from_fork+0x34/0x50
> > ret_from_fork_asm+0x1b/0x30
> >
> > lruvec->lru_lock owner:
> >
> > PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
> > #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
> > #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
> > #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
> > #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
> > #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
> > [exception RIP: isolate_lru_folios+403]
> > RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
> > RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
> > RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
> > RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
> > R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
> > R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > <NMI exception stack>
> > #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> > #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
> > #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
> > #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
> > #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
> > crash>
> >
> > Scenario:
> > User processe are requesting a large amount of memory and keep page active.
> > Then a module continuously requests memory from ZONE_DMA32 area.
> > Memory reclaim will be triggered due to ZONE_DMA32 watermark alarm reached.
> > However pages in the LRU(active_anon) list are mostly from
> > the ZONE_NORMAL area.
> >
> > Reproduce:
> > Terminal 1: Construct to continuously increase pages active(anon).
> > mkdir /tmp/memory
> > mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> > dd if=/dev/zero of=/tmp/memory/block bs=4M
> > tail /tmp/memory/block
> >
> > Terminal 2:
> > vmstat -a 1
> > active will increase.
> > procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
> > r b swpd free inact active si so bi bo
> > 1 0 0 1445623076 45898836 83646008 0 0 0
> > 1 0 0 1445623076 43450228 86094616 0 0 0
> > 1 0 0 1445623076 41003480 88541364 0 0 0
> > 1 0 0 1445623076 38557088 90987756 0 0 0
> > 1 0 0 1445623076 36109688 93435156 0 0 0
> > 1 0 0 1445619552 33663256 95881632 0 0 0
> > 1 0 0 1445619804 31217140 98327792 0 0 0
> > 1 0 0 1445619804 28769988 100774944 0 0 0
> > 1 0 0 1445619804 26322348 103222584 0 0 0
> > 1 0 0 1445619804 23875592 105669340 0 0 0
> >
> > cat /proc/meminfo | head
> > Active(anon) increase.
> > MemTotal: 1579941036 kB
> > MemFree: 1445618500 kB
> > MemAvailable: 1453013224 kB
> > Buffers: 6516 kB
> > Cached: 128653956 kB
> > SwapCached: 0 kB
> > Active: 118110812 kB
> > Inactive: 11436620 kB
> > Active(anon): 115345744 kB
> > Inactive(anon): 945292 kB
> >
> > When Active(anon) reaches 115345744 kB, insmod'ing the module triggers
> > the ZONE_DMA32 watermark.
> >
> > perf record -e vmscan:mm_vmscan_lru_isolate -aR
> > perf script
> > isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
> > nr_skipped=2 nr_taken=0 lru=active_anon
> > isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
> > nr_skipped=0 nr_taken=0 lru=active_anon
> > isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
> > nr_skipped=28835844 nr_taken=0 lru=active_anon
> > isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
> > nr_skipped=28835844 nr_taken=0 lru=active_anon
> > isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
> > nr_skipped=29 nr_taken=0 lru=active_anon
> > isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
> > nr_skipped=0 nr_taken=0 lru=active_anon
> >
> > See nr_scanned=28835844.
> > 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
> >
> > If Active(anon) is increased to 1000G before the module is insmod'ed and
> > the ZONE_DMA32 watermark is hit, a hard lockup will occur.
> >
> > In my device nr_scanned = 0000000003e3e937 when hard lockup.
> > Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
> >
> > [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> > ffffc90006fb7c30: 0000000000000020 0000000000000000
> > ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> > ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> > ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> > ffffc90006fb7c70: 0000000000000000 0000000000000000
> > ffffc90006fb7c80: 0000000000000000 0000000000000000
> > ffffc90006fb7c90: 0000000000000000 0000000000000000
> > ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> > ffffc90006fb7cb0: 0000000000000000 0000000000000000
> > ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
> >
> > About the Fixes:
> > Why did it take eight years to be discovered?
I don't think it took eight years to be discovered: it was long known
as a potential issue, but awkward to solve properly, and most of us have
survived well enough in practice that we've never given the time to it.
> >
> > The problem requires the following conditions to occur:
> > 1. The device memory should be large enough.
> > 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> > 3. The memory in ZONE_DMA32 needs to reach the watermark.
> >
> > If the memory is not large enough, or if the usage design of ZONE_DMA32
> > area memory is reasonable, this problem is difficult to detect.
> >
> > notes:
> > The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL,
> > but other suitable scenarios may also trigger the problem.
> >
> > Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
> >
>
> Thanks.
>
> This is old code. I agree on b2e18757f2c9 and thanks for digging that
> out.
I disagree. Although that commit is the root cause of what led to this
hard lockup problem, I believe there was no such hard lockup in it:
if I thought that this patch were a good fix, I would say
Fixes: 791b48b64232 ("mm: vmscan: scan until it finds eligible pages")
which allowed the previously SWAP_CLUSTER_MAX-limited scan to go
skipping indefinitely while holding spinlock with interrupts disabled;
which this patch here now limits to 32k, but that still seems way too
many to me.
And then after its 32k skips, it gives up and reclaims a few unsuitable
folios instead, just so that it can return a non-0 number to the caller.
Unlikely to find and reclaim the suitable folios that it's looking for:
which, despite its faults, the unpatched code does manage to do.
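To make that concrete, here is a rough user-space toy model (my own sketch,
not kernel code: the list size, the eligibility layout and the cap are
invented to mirror the numbers in the report) of the uncapped versus capped
behaviour:

/* skip_cap_sim.c - toy model of the isolate_lru_folios() skip behaviour */
#include <stdbool.h>
#include <stdio.h>

#define NR_FOLIOS	20000000UL	/* mostly-ineligible LRU, as in the report */
#define NR_ELIGIBLE	32UL		/* the few eligible folios sit at the tail */
#define NR_TO_SCAN	32UL		/* SWAP_CLUSTER_MAX */
#define MAX_SKIPPED_CAP	(32UL << 10)	/* SWAP_CLUSTER_MAX_SKIPPED from the patch */

static bool eligible(unsigned long i)
{
	return i >= NR_FOLIOS - NR_ELIGIBLE;
}

static void scan(bool capped)
{
	unsigned long taken = 0, taken_eligible = 0, skipped = 0, scanned = 0;
	unsigned long i;

	for (i = 0; i < NR_FOLIOS && taken < NR_TO_SCAN; i++) {
		scanned++;
		if (!eligible(i) && (!capped || skipped < MAX_SKIPPED_CAP)) {
			skipped++;	/* would go to folios_skipped, under lru_lock */
			continue;
		}
		taken++;		/* isolated for reclaim */
		if (eligible(i))
			taken_eligible++;
	}
	printf("%-8s scanned=%lu skipped=%lu taken=%lu (of which eligible=%lu)\n",
	       capped ? "capped" : "uncapped", scanned, skipped, taken, taken_eligible);
}

int main(void)
{
	scan(false);	/* uncapped: ~20M skips under lru_lock, but takes the 32 eligible folios */
	scan(true);	/* capped: stops after 32k skips and takes 32 ineligible folios instead */
	return 0;
}

The capped run releases the lock quickly but isolates the wrong folios; the
uncapped run isolates the right ones at the cost of the enormous hold time.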
>
> I'll add a cc:stable and shall queue it for testing, pending review
> from others (please). It may be that the -stable tree maintainers ask
> for a backport of this change into pre-folio-conversion kernels. But
> given the obscurity of the workload, I'm not sure this would be worth
> doing. Opinions are sought?
I think I've been Cc'ed because git blame fingered some nearby isolation
cleanups from me: I'm not the best person to comment, but I would give
this patch a NAK. If we are going to worry about this after seven years
(and with MGLRU approaching), I'd say the issue needs a better approach.
Liuye, please start by reverting 791b48b64232 (which seems to have been
implemented at the wrong level, inviting this hard lockup), and then
studying its commit message and fixing the OOM kills which it was trying
to fix - if they still exist after all the intervening years of tweaks.
Perhaps it's just a matter of adjusting get_scan_count() or shrink_lruvec(),
to be more persistent in the reclaim_idx high-skipping case.
I'd have liked to suggest an actual patch, but that's beyond me.
Thanks,
Hugh
>
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -223,6 +223,7 @@ enum {
> > };
> >
> > #define SWAP_CLUSTER_MAX 32UL
> > +#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
> > #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
> >
> > /* Bit flag in swap_map */
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 28ba2b06fc7d..0bdfae413b4c 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1657,6 +1657,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> > unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
> > unsigned long skipped = 0;
> > unsigned long scan, total_scan, nr_pages;
> > + unsigned long max_nr_skipped = 0;
> > LIST_HEAD(folios_skipped);
> >
> > total_scan = 0;
> > @@ -1671,9 +1672,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> > nr_pages = folio_nr_pages(folio);
> > total_scan += nr_pages;
> >
> > - if (folio_zonenum(folio) > sc->reclaim_idx) {
> > + /* Using max_nr_skipped to prevent hard LOCKUP*/
> > + if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
> > + (folio_zonenum(folio) > sc->reclaim_idx)) {
> > nr_skipped[folio_zonenum(folio)] += nr_pages;
> > move_to = &folios_skipped;
> > + max_nr_skipped++;
> > goto move;
> > }
> >
> > --
> > 2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 RESEND] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
[not found] ` <1733382994392357.312.seg@mailgw.kylinos.cn>
@ 2024-12-11 7:26 ` liuye
0 siblings, 0 replies; 13+ messages in thread
From: liuye @ 2024-12-11 7:26 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton
Cc: linux-kernel, linux-mm, Mel Gorman, Yang Shi, Minchan Kim,
Michal Hocko, Johannes Weiner, Bharata B Rao, Yu Zhao
On 2024/12/5 11:55 AM, Hugh Dickins wrote:
> On Fri, 29 Nov 2024, Andrew Morton wrote:
>> On Tue, 19 Nov 2024 14:08:42 +0800 liuye <liuye@kylinos.cn> wrote:
>>
>>> This fixes the following hard lockup in isolate_lru_folios during
>>> memory reclaim. If the LRU mostly contains ineligible folios, the
>>> hard lockup watchdog may be triggered.
>>>
>>> watchdog: Watchdog detected hard LOCKUP on cpu 173
>>> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
>>> Call Trace:
>>> _raw_spin_lock_irqsave+0x31/0x40
>>> folio_lruvec_lock_irqsave+0x5f/0x90
>>> folio_batch_move_lru+0x91/0x150
>>> lru_add_drain_per_cpu+0x1c/0x40
>>> process_one_work+0x17d/0x350
>>> worker_thread+0x27b/0x3a0
>>> kthread+0xe8/0x120
>>> ret_from_fork+0x34/0x50
>>> ret_from_fork_asm+0x1b/0x30
>>>
>>> lruvec->lru_lock owner:
>>>
>>> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
>>> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
>>> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
>>> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
>>> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
>>> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
>>> [exception RIP: isolate_lru_folios+403]
>>> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
>>> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
>>> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
>>> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
>>> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
>>> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
>>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>>> <NMI exception stack>
>>> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
>>> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
>>> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
>>> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
>>> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
>>> crash>
>>>
>>> Scenario:
>>> User processes request a large amount of memory and keep the pages active.
>>> Then a module continuously requests memory from the ZONE_DMA32 area.
>>> Memory reclaim is triggered because the ZONE_DMA32 watermark is reached.
>>> However, pages in the LRU (active_anon) list are mostly from
>>> the ZONE_NORMAL area.
>>>
>>> Reproduce:
>>> Terminal 1: Construct to continuously increase pages active(anon).
>>> mkdir /tmp/memory
>>> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
>>> dd if=/dev/zero of=/tmp/memory/block bs=4M
>>> tail /tmp/memory/block
>>>
>>> Terminal 2:
>>> vmstat -a 1
>>> active will increase.
>>> procs ---memory--- ---swap-- ---io---- -system-- ---cpu--- ...
>>> r b swpd free inact active si so bi bo
>>> 1 0 0 1445623076 45898836 83646008 0 0 0
>>> 1 0 0 1445623076 43450228 86094616 0 0 0
>>> 1 0 0 1445623076 41003480 88541364 0 0 0
>>> 1 0 0 1445623076 38557088 90987756 0 0 0
>>> 1 0 0 1445623076 36109688 93435156 0 0 0
>>> 1 0 0 1445619552 33663256 95881632 0 0 0
>>> 1 0 0 1445619804 31217140 98327792 0 0 0
>>> 1 0 0 1445619804 28769988 100774944 0 0 0
>>> 1 0 0 1445619804 26322348 103222584 0 0 0
>>> 1 0 0 1445619804 23875592 105669340 0 0 0
>>>
>>> cat /proc/meminfo | head
>>> Active(anon) increase.
>>> MemTotal: 1579941036 kB
>>> MemFree: 1445618500 kB
>>> MemAvailable: 1453013224 kB
>>> Buffers: 6516 kB
>>> Cached: 128653956 kB
>>> SwapCached: 0 kB
>>> Active: 118110812 kB
>>> Inactive: 11436620 kB
>>> Active(anon): 115345744 kB
>>> Inactive(anon): 945292 kB
>>>
>>> When Active(anon) reaches 115345744 kB, insmod'ing the module triggers
>>> the ZONE_DMA32 watermark.
>>>
>>> perf record -e vmscan:mm_vmscan_lru_isolate -aR
>>> perf script
>>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2
>>> nr_skipped=2 nr_taken=0 lru=active_anon
>>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0
>>> nr_skipped=0 nr_taken=0 lru=active_anon
>>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844
>>> nr_skipped=28835844 nr_taken=0 lru=active_anon
>>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844
>>> nr_skipped=28835844 nr_taken=0 lru=active_anon
>>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29
>>> nr_skipped=29 nr_taken=0 lru=active_anon
>>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0
>>> nr_skipped=0 nr_taken=0 lru=active_anon
>>>
>>> See nr_scanned=28835844.
>>> 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
>>>
>>> If Active(anon) is increased to 1000G before the module is insmod'ed and
>>> the ZONE_DMA32 watermark is hit, a hard lockup will occur.
>>>
>>> In my device nr_scanned = 0000000003e3e937 when hard lockup.
>>> Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
>>>
>>> [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
>>> ffffc90006fb7c30: 0000000000000020 0000000000000000
>>> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
>>> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
>>> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
>>> ffffc90006fb7c70: 0000000000000000 0000000000000000
>>> ffffc90006fb7c80: 0000000000000000 0000000000000000
>>> ffffc90006fb7c90: 0000000000000000 0000000000000000
>>> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
>>> ffffc90006fb7cb0: 0000000000000000 0000000000000000
>>> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
>>>
>>> About the Fixes:
>>> Why did it take eight years to be discovered?
>
> I don't think it took eight years to be discovered: it was long known
> as a potential issue, but awkward to solve properly, and most of us have
> survived well enough in practice that we've never given the time to it.
>
Are there any earlier discussions about this? Is there a URL?
>>>
>>> The problem requires the following conditions to occur:
>>> 1. The device memory should be large enough.
>>> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
>>> 3. The memory in ZONE_DMA32 needs to reach the watermark.
>>>
>>> If the memory is not large enough, or if the usage design of ZONE_DMA32
>>> area memory is reasonable, this problem is difficult to detect.
>>>
>>> notes:
>>> The problem is most likely to occur in ZONE_DMA32 and ZONE_NORMAL,
>>> but other suitable scenarios may also trigger the problem.
>>>
>>> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
>>>
>>
>> Thanks.
>>
>> This is old code. I agree on b2e18757f2c9 and thanks for digging that
>> out.
>
> I disagree. Although that commit is the root cause of what led to this
> hard lockup problem, I believe there was no such hard lockup in it:
> if I thought that this patch were a good fix, I would say
>
> Fixes: 791b48b64232 ("mm: vmscan: scan until it finds eligible pages")
>
> which allowed the previously SWAP_CLUSTER_MAX-limited scan to go
> skipping indefinitely while holding spinlock with interrupts disabled;
> which this patch here now limits to 32k, but that still seems way too
> many to me.
>
> And then after its 32k skips, it gives up and reclaims a few unsuitable
> folios instead, just so that it can return a non-0 number to the caller.
> Unlikely to find and reclaim the suitable folios that it's looking for:
> which, despite its faults, the unpatched code does manage to do.
>
This value should not be too large: before b2e18757f2c9 the scan was
effectively limited to 32 (SWAP_CLUSTER_MAX).

#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)

To avoid both lock contention and the lockup, the cap should be neither too
small nor too large, and the time needed to trigger the lockup also varies
with CPU frequency. I am not sure SWAP_CLUSTER_MAX_SKIPPED is the most
appropriate value, but it does work.
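As a rough back-of-envelope check (assuming something on the order of 100ns
to handle one skipped folio while holding lru_lock): 32k skips is roughly a
few milliseconds of hold time per isolate_lru_folios() call, while the
~0x3e3e937 (about 65 million) skips seen in the crash dump correspond to
several seconds, enough for a waiter spinning with interrupts disabled to
trip the hard-lockup watchdog. The cap mainly needs to stay a few orders of
magnitude below that pathological case while not giving up on eligible
folios too early; 32k was chosen as a compromise.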
>>
>> I'll add a cc:stable and shall queue it for testing, pending review
>> from others (please). It may be that the -stable tree maintainers ask
>> for a backport of this change into pre-folio-conversion kernels. But
>> given the obscurity of the workload, I'm not sure this would be worth
>> doing. Opinions are sought?
>
> I think I've been Cc'ed because git blame fingered some nearby isolation
> cleanups from me: I'm not the best person to comment, but I would give
> this patch a NAK. If we are going to worry about this after seven years
> (and with MGLRU approaching), I'd say the issue needs a better approach.
>
> Liuye, please start by reverting 791b48b64232 (which seems to have been
> implemented at the wrong level, inviting this hard lockup), and then
> studying its commit message and fixing the OOM kills which it was trying
> to fix - if they still exist after all the intervening years of tweaks.
>
Memory reclaim skipping a large number of pages from ineligible zones can
cause OOM, and the reclaim mechanism does need further work. But I don't
think that larger rework should be a prerequisite for "mm/vmscan: Fix hard
LOCKUP in function isolate_lru_folios"; I suggest fixing the current issue
first.
Thanks,
Liuye
> Perhaps it's just a matter of adjusting get_scan_count() or shrink_lruvec(),
> to be more persistent in the reclaim_idx high-skipping case.
>
> I'd have liked to suggest an actual patch, but that's beyond me.
>
> Thanks,
> Hugh
>
>>
>>> --- a/include/linux/swap.h
>>> +++ b/include/linux/swap.h
>>> @@ -223,6 +223,7 @@ enum {
>>> };
>>>
>>> #define SWAP_CLUSTER_MAX 32UL
>>> +#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
>>> #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
>>>
>>> /* Bit flag in swap_map */
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 28ba2b06fc7d..0bdfae413b4c 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1657,6 +1657,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>>> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
>>> unsigned long skipped = 0;
>>> unsigned long scan, total_scan, nr_pages;
>>> + unsigned long max_nr_skipped = 0;
>>> LIST_HEAD(folios_skipped);
>>>
>>> total_scan = 0;
>>> @@ -1671,9 +1672,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>>> nr_pages = folio_nr_pages(folio);
>>> total_scan += nr_pages;
>>>
>>> - if (folio_zonenum(folio) > sc->reclaim_idx) {
>>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>>> + if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
>>> + (folio_zonenum(folio) > sc->reclaim_idx)) {
>>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>>> move_to = &folios_skipped;
>>> + max_nr_skipped++;
>>> goto move;
>>> }
>>>
>>> --
>>> 2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-12-11 7:26 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-14 9:18 [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
2024-08-14 21:27 ` Andrew Morton
2024-09-19 2:14 ` [PATCH v2] " liuye
2024-09-20 6:31 ` Bharata B Rao
[not found] ` <1727070383769353.48.seg@mailgw.kylinos.cn>
2024-09-23 6:03 ` liuye
2024-11-19 6:08 ` [PATCH v2 RESEND] " liuye
2024-11-30 3:22 ` Andrew Morton
2024-12-05 3:55 ` Hugh Dickins
[not found] ` <1733382994392357.312.seg@mailgw.kylinos.cn>
2024-12-11 7:26 ` liuye
2024-09-25 0:22 ` [PATCH] " Andrew Morton
2024-09-25 8:37 ` liuye
2024-09-25 9:29 ` Andrew Morton
2024-09-25 9:53 ` liuye