From: liuye <liuye@kylinos.cn>
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, liuye@kylinos.cn
Subject: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
Date: Wed, 14 Aug 2024 17:18:25 +0800 [thread overview]
Message-ID: <20240814091825.27262-1-liuye@kylinos.cn> (raw)
This fixes the following hard lockup in function isolate_lru_folios
when memory reclaim.If the LRU mostly contains ineligible folios
May trigger watchdog.
watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30
lruvec->lru_lock owner:
PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
#0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
#1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
#2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
#3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
#4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
[exception RIP: isolate_lru_folios+403]
RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
#6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
#7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
#8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
#9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: liuye <liuye@kylinos.cn>
---
include/linux/swap.h | 1 +
mm/vmscan.c | 7 +++++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index ba7ea95d1c57..afb3274c90ef 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ enum {
};
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
/* Bit flag in swap_map */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..02a8f86d4883 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1655,6 +1655,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
+ unsigned long max_nr_skipped = 0;
LIST_HEAD(folios_skipped);
total_scan = 0;
@@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx ||
- skip_cma(folio, sc)) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
+ (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
--
2.25.1
next reply other threads:[~2024-08-14 9:19 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-14 9:18 liuye [this message]
2024-08-14 21:27 ` Andrew Morton
2024-09-19 2:14 ` [PATCH v2] " liuye
2024-09-20 6:31 ` Bharata B Rao
[not found] ` <1727070383769353.48.seg@mailgw.kylinos.cn>
2024-09-23 6:03 ` liuye
2024-11-19 6:08 ` [PATCH v2 RESEND] " liuye
2024-11-30 3:22 ` Andrew Morton
2024-12-05 3:55 ` Hugh Dickins
[not found] ` <1733382994392357.312.seg@mailgw.kylinos.cn>
2024-12-11 7:26 ` liuye
2024-09-25 0:22 ` [PATCH] " Andrew Morton
2024-09-25 8:37 ` liuye
2024-09-25 9:29 ` Andrew Morton
2024-09-25 9:53 ` liuye
[not found] <20240815025226.8973-1-liuye@kylinos.cn>
2024-08-23 2:04 ` Re: " liuye
2024-09-03 2:34 ` liuye
2024-09-03 3:03 ` liuye
2024-09-06 1:16 ` liuye
2024-09-11 2:56 ` liuye
2024-12-05 19:17 ` Yu Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240814091825.27262-1-liuye@kylinos.cn \
--to=liuye@kylinos.cn \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox