From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>, Barry Song <baohua@kernel.org>,
Chris Li <chrisl@kernel.org>, Nhat Pham <nphamcs@gmail.com>,
Yosry Ahmed <yosry.ahmed@linux.dev>,
David Hildenbrand <david@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Youngjun Park <youngjun.park@lge.com>,
Hugh Dickins <hughd@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Ying Huang <ying.huang@linux.alibaba.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>
Subject: [PATCH v4 01/19] mm, swap: rename __read_swap_cache_async to swap_cache_alloc_folio
Date: Fri, 05 Dec 2025 03:29:09 +0800
Message-ID: <20251205-swap-table-p2-v4-1-cb7e28a26a40@tencent.com>
In-Reply-To: <20251205-swap-table-p2-v4-0-cb7e28a26a40@tencent.com>
From: Kairui Song <kasong@tencent.com>
__read_swap_cache_async is widely used to allocate a folio and ensure it
is in the swap cache, or to get the existing folio if one is already
there. It is not async, and it does not do any read. Rename it to better
reflect its usage, and to prepare for reworking it as part of the new
swap cache APIs.

Also add some comments for the function. Worth noting that the
skip_if_exists argument is a long-existing workaround that will be
dropped soon.
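For illustration, a minimal sketch of the typical caller pattern after
the rename, modeled on the read_swap_cache_async() hunk below (error
handling and readahead trimmed; the caller is assumed to hold a
reference to the swap device):

    bool page_allocated;
    struct folio *folio;

    /* Get the cached folio, or allocate one and add it to the swap cache */
    folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
                                   &page_allocated, false);
    if (!folio)
            return NULL;    /* -ENOMEM, or slot no longer swapped out */
    if (page_allocated)
            swap_read_folio(folio, NULL);   /* new folio: start the IO */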
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
---
mm/swap.h | 6 +++---
mm/swap_state.c | 46 +++++++++++++++++++++++++++++++++-------------
mm/swapfile.c | 2 +-
mm/zswap.c | 4 ++--
4 files changed, 39 insertions(+), 19 deletions(-)
diff --git a/mm/swap.h b/mm/swap.h
index d034c13d8dd2..0fff92e42cfe 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -249,6 +249,9 @@ struct folio *swap_cache_get_folio(swp_entry_t entry);
void *swap_cache_get_shadow(swp_entry_t entry);
void swap_cache_add_folio(struct folio *folio, swp_entry_t entry, void **shadow);
void swap_cache_del_folio(struct folio *folio);
+struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_flags,
+ struct mempolicy *mpol, pgoff_t ilx,
+ bool *alloced, bool skip_if_exists);
/* Below helpers require the caller to lock and pass in the swap cluster. */
void __swap_cache_del_folio(struct swap_cluster_info *ci,
struct folio *folio, swp_entry_t entry, void *shadow);
@@ -261,9 +264,6 @@ void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr);
struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
struct vm_area_struct *vma, unsigned long addr,
struct swap_iocb **plug);
-struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_flags,
- struct mempolicy *mpol, pgoff_t ilx, bool *new_page_allocated,
- bool skip_if_exists);
struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
struct mempolicy *mpol, pgoff_t ilx);
struct folio *swapin_readahead(swp_entry_t entry, gfp_t flag,
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 5f97c6ae70a2..08252eaef32f 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -402,9 +402,29 @@ void swap_update_readahead(struct folio *folio, struct vm_area_struct *vma,
}
}
-struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
- struct mempolicy *mpol, pgoff_t ilx, bool *new_page_allocated,
- bool skip_if_exists)
+/**
+ * swap_cache_alloc_folio - Allocate a folio for a swapped out slot in the swap cache.
+ * @entry: the swapped out swap entry to be bound to the folio.
+ * @gfp_mask: memory allocation flags
+ * @mpol: NUMA memory allocation policy to be applied
+ * @ilx: NUMA interleave index, for use only when MPOL_INTERLEAVE
+ * @new_page_allocated: set to true if a new folio was allocated, false otherwise
+ * @skip_if_exists: if the slot is in a partially cached state, return NULL.
+ * This is a workaround that will be removed shortly.
+ *
+ * Allocate a folio in the swap cache for one swap slot, typically before
+ * doing IO (e.g. swap in or zswap writeback). The swap slot indicated by
+ * @entry must have a non-zero swap count (swapped out).
+ * Currently only supports order 0.
+ *
+ * Context: Caller must protect the swap device with a reference count or locks.
+ * Return: Returns the existing folio if @entry is cached already. Returns
+ * NULL if it failed due to -ENOMEM or @entry has a swap count < 1.
+ */
+struct folio *swap_cache_alloc_folio(swp_entry_t entry, gfp_t gfp_mask,
+ struct mempolicy *mpol, pgoff_t ilx,
+ bool *new_page_allocated,
+ bool skip_if_exists)
{
struct swap_info_struct *si = __swap_entry_to_info(entry);
struct folio *folio;
@@ -452,12 +472,12 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
goto put_and_return;
/*
- * Protect against a recursive call to __read_swap_cache_async()
+ * Protect against a recursive call to swap_cache_alloc_folio()
* on the same entry waiting forever here because SWAP_HAS_CACHE
* is set but the folio is not the swap cache yet. This can
* happen today if mem_cgroup_swapin_charge_folio() below
* triggers reclaim through zswap, which may call
- * __read_swap_cache_async() in the writeback path.
+ * swap_cache_alloc_folio() in the writeback path.
*/
if (skip_if_exists)
goto put_and_return;
@@ -466,7 +486,7 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
* We might race against __swap_cache_del_folio(), and
* stumble across a swap_map entry whose SWAP_HAS_CACHE
* has not yet been cleared. Or race against another
- * __read_swap_cache_async(), which has set SWAP_HAS_CACHE
+ * swap_cache_alloc_folio(), which has set SWAP_HAS_CACHE
* in swap_map, but not yet added its folio to swap cache.
*/
schedule_timeout_uninterruptible(1);
@@ -525,7 +545,7 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
return NULL;
mpol = get_vma_policy(vma, addr, 0, &ilx);
- folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
+ folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
&page_allocated, false);
mpol_cond_put(mpol);
@@ -643,9 +663,9 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
blk_start_plug(&plug);
for (offset = start_offset; offset <= end_offset ; offset++) {
/* Ok, do the async read-ahead now */
- folio = __read_swap_cache_async(
- swp_entry(swp_type(entry), offset),
- gfp_mask, mpol, ilx, &page_allocated, false);
+ folio = swap_cache_alloc_folio(
+ swp_entry(swp_type(entry), offset), gfp_mask, mpol, ilx,
+ &page_allocated, false);
if (!folio)
continue;
if (page_allocated) {
@@ -662,7 +682,7 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
lru_add_drain(); /* Push any new pages onto the LRU now */
skip:
/* The page was likely read above, so no need for plugging here */
- folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
+ folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
&page_allocated, false);
if (unlikely(page_allocated))
swap_read_folio(folio, NULL);
@@ -767,7 +787,7 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
if (!si)
continue;
}
- folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
+ folio = swap_cache_alloc_folio(entry, gfp_mask, mpol, ilx,
&page_allocated, false);
if (si)
put_swap_device(si);
@@ -789,7 +809,7 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
lru_add_drain();
skip:
/* The folio was likely read above, so no need for plugging here */
- folio = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
+ folio = swap_cache_alloc_folio(targ_entry, gfp_mask, mpol, targ_ilx,
&page_allocated, false);
if (unlikely(page_allocated))
swap_read_folio(folio, NULL);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 46d2008e4b99..e5284067a442 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1574,7 +1574,7 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
* CPU1 CPU2
* do_swap_page()
* ... swapoff+swapon
- * __read_swap_cache_async()
+ * swap_cache_alloc_folio()
* swapcache_prepare()
* __swap_duplicate()
* // check swap_map
diff --git a/mm/zswap.c b/mm/zswap.c
index 5d0f8b13a958..a7a2443912f4 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1014,8 +1014,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
return -EEXIST;
mpol = get_task_policy(current);
- folio = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
- NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
+ folio = swap_cache_alloc_folio(swpentry, GFP_KERNEL, mpol,
+ NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
put_swap_device(si);
if (!folio)
return -ENOMEM;
--
2.52.0