From: Hui Zhu <hui.zhu@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hui Zhu <zhuhui@kylinos.cn>
Subject: [PATCH 0/2] mm/swap: fix missing locks in swap_reclaim_work()
Date: Fri, 6 Mar 2026 19:50:35 +0800
Message-ID: <cover.1772797581.git.zhuhui@kylinos.cn>
From: Hui Zhu <zhuhui@kylinos.cn>

swap_cluster_alloc_table() assumes that the caller holds the following
locks:

	ci->lock
	percpu_swap_cluster.lock
	si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)

There are five call paths leading to swap_cluster_alloc_table():

	swap_alloc_hibernation_slot->cluster_alloc_swap_entry
	  ->alloc_swap_scan_list->isolate_lock_cluster
	  ->swap_cluster_alloc_table

	swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
	  ->isolate_lock_cluster->swap_cluster_alloc_table

	swap_alloc_hibernation_slot->cluster_alloc_swap_entry
	  ->swap_reclaim_full_clusters->isolate_lock_cluster
	  ->swap_cluster_alloc_table

	swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
	  ->isolate_lock_cluster->swap_cluster_alloc_table

	swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
	  ->swap_cluster_alloc_table

The first four paths acquire the necessary locks before calling
swap_cluster_alloc_table(). The swap_reclaim_work() path does not:
it acquires neither percpu_swap_cluster.lock nor, for
non-SWP_SOLIDSTATE devices, si->global_cluster_lock.

The first patch ensures swap_reclaim_work() correctly acquires
percpu_swap_cluster.lock and si->global_cluster_lock before calling
swap_reclaim_full_clusters(). Without these locks, the preconditions
for swap_cluster_alloc_table() are not met.
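
The locking rule that the first patch restores can be modelled in
userspace. The sketch below is only an illustration and not the patch
itself: struct model_lock, struct model_swap_dev and every model_*
function are hypothetical stand-ins, the solidstate flag stands in for
SWP_SOLIDSTATE, ci->lock is left out for brevity, and the acquisition
order shown is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_lock {
	bool held;
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held);	/* the model does not allow recursive locking */
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct model_swap_dev {
	bool solidstate;			/* models SWP_SOLIDSTATE */
	struct model_lock global_cluster_lock;	/* models si->global_cluster_lock */
};

/* Models percpu_swap_cluster.lock. */
static struct model_lock model_percpu_lock;

/* True when the locks the table allocation expects are held. */
static bool model_alloc_table_preconditions(const struct model_swap_dev *si)
{
	if (!model_percpu_lock.held)
		return false;
	if (!si->solidstate && !si->global_cluster_lock.held)
		return false;
	return true;
}

/* Worker that takes no locks before reaching the allocation. */
static bool model_reclaim_work_unlocked(struct model_swap_dev *si)
{
	return model_alloc_table_preconditions(si);
}

/* Worker that takes both locks first (order is an assumption). */
static bool model_reclaim_work_locked(struct model_swap_dev *si)
{
	bool ok;

	model_lock_acquire(&model_percpu_lock);
	if (!si->solidstate)
		model_lock_acquire(&si->global_cluster_lock);

	ok = model_alloc_table_preconditions(si);

	if (!si->solidstate)
		model_lock_release(&si->global_cluster_lock);
	model_lock_release(&model_percpu_lock);
	return ok;
}
```

For a non-solid-state device, model_reclaim_work_unlocked() reports
that the preconditions fail, while model_reclaim_work_locked() meets
them and releases both locks before returning.
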
The second patch adds lockdep assertions in swap_cluster_alloc_table()
to help catch such locking inconsistencies early.
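
As a rough userspace analogy for what such an assertion buys, the
sketch below warns whenever a callee runs without a lock it expects.
It is not the kernel implementation: struct sketch_lock,
sketch_assert_held(), sketch_alloc_table() and the sketch_warnings
counter are hypothetical names, and the kernel relies on lockdep's own
tracking rather than a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a lock whose held state can be queried. */
struct sketch_lock {
	const char *name;
	bool held;
};

/* Number of warnings emitted so far. */
static int sketch_warnings;

/* Warn, but keep running, when an expected lock is not held. */
static void sketch_assert_held(const struct sketch_lock *l, const char *func)
{
	assert(l != NULL);
	if (!l->held) {
		sketch_warnings++;
		fprintf(stderr, "WARNING: %s called without %s\n",
			func, l->name);
	}
}

/* Models a callee that documents its locking rules as assertions. */
static void sketch_alloc_table(const struct sketch_lock *percpu,
			       const struct sketch_lock *global,
			       bool solidstate)
{
	sketch_assert_held(percpu, __func__);
	if (!solidstate)
		sketch_assert_held(global, __func__);
	/* The table allocation itself would happen here. */
}
```

Calling sketch_alloc_table() with neither lock held on a
non-solid-state device increments sketch_warnings twice; calling it
again with both locks held adds no further warnings.
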

I tried to reproduce this naturally, but the swap_reclaim_work() path
rarely hits the !cluster_table_is_alloced(found) condition. To verify
the fix, I used GDB to force found->table to NULL, which triggered
the following warning due to the missing locks:

[ 554.388797] ------------[ cut here ]------------
[ 554.388932] WARNING: mm/swapfile.c:480 at isolate_lock_cluster+0x199/0x470, CPU#6: kworker/6:2/656
[ 554.388947] Modules linked in:
[ 554.388990] CPU: 6 UID: 0 PID: 656 Comm: kworker/6:2 Not tainted 7.0.0-rc2+ #28 PREEMPT(full)
[ 554.388995] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 554.389013] Workqueue: events swap_reclaim_work
[ 554.389020] RIP: 0010:isolate_lock_cluster+0x199/0x470
[ 554.389025] Code: 02 0f 0b 8b 35 dc 69 57 02 85 f6 74 b0 65 48 8b 05 f4 20 af 02 be ff ff ff ff 48 8d b8 60 98 31 84 e8 2b 0e f5 00 85 c0 75 93 <0f> 0b eb 8f 48 89 df e8 0b 78 f6 00 41 f6 45 10 10 0f 84 0b 01 00
[ 554.389028] RSP: 0018:ffffc9000183bd68 EFLAGS: 00010246
[ 554.389033] RAX: 0000000000000000 RBX: ffff88810a410060 RCX: 0000000000000000
[ 554.389037] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 554.389046] RBP: ffffc9000183bd88 R08: 0000000000000000 R09: 0000000000000000
[ 554.389048] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88811108e878
[ 554.389049] R13: ffff88811108e800 R14: ffff88811108ea90 R15: ffff888101e41e40
[ 554.389051] FS: 0000000000000000(0000) GS:ffff8881b7812000(0000) knlGS:0000000000000000
[ 554.389053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 554.389054] CR2: 000000c000637f80 CR3: 000000010cfd5006 CR4: 0000000000770ef0
[ 554.389065] PKRU: 55555554
[ 554.389067] Call Trace:
[ 554.389068] <TASK>
[ 554.389080] swap_reclaim_full_clusters+0x6b/0x350
[ 554.389083] ? __pfx_swap_reclaim_work+0x10/0x10
[ 554.389090] ? swap_reclaim_full_clusters+0x52/0x350
[ 554.389094] swap_reclaim_work+0x1a/0x30
[ 554.389097] process_one_work+0x223/0x770
[ 554.389106] worker_thread+0x1c6/0x3b0
[ 554.389110] ? __pfx_worker_thread+0x10/0x10
[ 554.389113] kthread+0xfe/0x140
[ 554.389117] ? __pfx_kthread+0x10/0x10
[ 554.389121] ret_from_fork+0x3d4/0x480
[ 554.389125] ? __pfx_kthread+0x10/0x10
[ 554.389129] ret_from_fork_asm+0x1a/0x30
[ 554.389141] </TASK>
[ 554.389142] irq event stamp: 9775
[ 554.389144] hardirqs last enabled at (9781): [<ffffffff8148ca99>] __up_console_sem+0x79/0xa0
[ 554.389150] hardirqs last disabled at (9786): [<ffffffff8148ca7e>] __up_console_sem+0x5e/0xa0
[ 554.389153] softirqs last enabled at (8676): [<ffffffff813b3aff>] __irq_exit_rcu+0x13f/0x160
[ 554.389156] softirqs last disabled at (8615): [<ffffffff813b3aff>] __irq_exit_rcu+0x13f/0x160
[ 554.389159] ---[ end trace 0000000000000000 ]---
[ 554.477105] ------------[ cut here ]------------
[ 554.477253] WARNING: mm/swapfile.c:480 at isolate_lock_cluster+0x199/0x470, CPU#6: kworker/6:2/656
[ 554.477264] Modules linked in:
[ 554.477277] CPU: 6 UID: 0 PID: 656 Comm: kworker/6:2 Tainted: G W 7.0.0-rc2+ #28 PREEMPT(full)
[ 554.477284] Tainted: [W]=WARN
[ 554.477288] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 554.477291] Workqueue: events swap_reclaim_work
[ 554.477294] RIP: 0010:isolate_lock_cluster+0x199/0x470
[ 554.477296] Code: 02 0f 0b 8b 35 dc 69 57 02 85 f6 74 b0 65 48 8b 05 f4 20 af 02 be ff ff ff ff 48 8d b8 60 98 31 84 e8 2b 0e f5 00 85 c0 75 93 <0f> 0b eb 8f 48 89 df e8 0b 78 f6 00 41 f6 45 10 10 0f

Hui Zhu (2):
  mm/swap: fix missing locks in swap_reclaim_work()
  mm/swap: add lockdep for si->global_cluster_lock in
    swap_cluster_alloc_table()

 mm/swapfile.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

--
2.43.0