From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Kairui Song <ryncsn@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Hugh Dickins <hughd@google.com>, Chris Li <chrisl@kernel.org>,
Barry Song <baohua@kernel.org>, Baoquan He <bhe@redhat.com>,
Nhat Pham <nphamcs@gmail.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Ying Huang <ying.huang@linux.alibaba.com>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@redhat.com>,
Yosry Ahmed <yosryahmed@google.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>,
kernel test robot <oliver.sang@intel.com>
Subject: [PATCH v3 13/15] mm, swap: remove contention workaround for swap cache
Date: Thu, 11 Sep 2025 00:08:31 +0800 [thread overview]
Message-ID: <20250910160833.3464-14-ryncsn@gmail.com> (raw)
In-Reply-To: <20250910160833.3464-1-ryncsn@gmail.com>
From: Kairui Song <kasong@tencent.com>
Swap cluster setup will try to shuffle the clusters on initialization.
It was helpful to avoid contention for the swap cache space. The cluster
size (2M) was much smaller than each swap cache space (64M), so
shuffling the cluster means the allocator will try to allocate swap
slots that are in different swap cache spaces for each CPU, reducing the
chance of two CPUs using the same swap cache space, and hence reducing
the contention.
Now, swap cache is managed by swap clusters, this shuffle is pointless.
Just remove it, and clean up related macros.
This also improves the HDD swap performance as shuffling IO is a bad
idea for HDD, and now the shuffling is gone. Test have shown a ~40%
performance gain for HDD [1]:
Doing sequential swap in of 8G data using 8 processes with usemem,
average of 3 test runs:
Before: 1270.91 KB/s per process
After: 1849.54 KB/s per process
Link: https://lore.kernel.org/linux-mm/CAMgjq7AdauQ8=X0zeih2r21QoV=-WWj1hyBxLWRzq74n-C=-Ng@mail.gmail.com/ [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202504241621.f27743ec-lkp@intel.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
---
mm/swap.h | 4 ----
mm/swapfile.c | 32 ++++++++------------------------
mm/zswap.c | 7 +++++--
3 files changed, 13 insertions(+), 30 deletions(-)
diff --git a/mm/swap.h b/mm/swap.h
index adcd85fa8538..fe5c20922082 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -198,10 +198,6 @@ int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug);
void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug);
/* linux/mm/swap_state.c */
-/* One swap address space for each 64M swap space */
-#define SWAP_ADDRESS_SPACE_SHIFT 14
-#define SWAP_ADDRESS_SPACE_PAGES (1 << SWAP_ADDRESS_SPACE_SHIFT)
-#define SWAP_ADDRESS_SPACE_MASK (SWAP_ADDRESS_SPACE_PAGES - 1)
extern struct address_space swap_space __ro_after_init;
static inline struct address_space *swap_address_space(swp_entry_t entry)
{
diff --git a/mm/swapfile.c b/mm/swapfile.c
index a81ada4a361d..89659928465e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3203,21 +3203,14 @@ static int setup_swap_map(struct swap_info_struct *si,
return 0;
}
-#define SWAP_CLUSTER_INFO_COLS \
- DIV_ROUND_UP(L1_CACHE_BYTES, sizeof(struct swap_cluster_info))
-#define SWAP_CLUSTER_SPACE_COLS \
- DIV_ROUND_UP(SWAP_ADDRESS_SPACE_PAGES, SWAPFILE_CLUSTER)
-#define SWAP_CLUSTER_COLS \
- max_t(unsigned int, SWAP_CLUSTER_INFO_COLS, SWAP_CLUSTER_SPACE_COLS)
-
static struct swap_cluster_info *setup_clusters(struct swap_info_struct *si,
union swap_header *swap_header,
unsigned long maxpages)
{
unsigned long nr_clusters = DIV_ROUND_UP(maxpages, SWAPFILE_CLUSTER);
struct swap_cluster_info *cluster_info;
- unsigned long i, j, idx;
int err = -ENOMEM;
+ unsigned long i;
cluster_info = kvcalloc(nr_clusters, sizeof(*cluster_info), GFP_KERNEL);
if (!cluster_info)
@@ -3266,22 +3259,13 @@ static struct swap_cluster_info *setup_clusters(struct swap_info_struct *si,
INIT_LIST_HEAD(&si->frag_clusters[i]);
}
- /*
- * Reduce false cache line sharing between cluster_info and
- * sharing same address space.
- */
- for (j = 0; j < SWAP_CLUSTER_COLS; j++) {
- for (i = 0; i < DIV_ROUND_UP(nr_clusters, SWAP_CLUSTER_COLS); i++) {
- struct swap_cluster_info *ci;
- idx = i * SWAP_CLUSTER_COLS + j;
- ci = cluster_info + idx;
- if (idx >= nr_clusters)
- continue;
- if (ci->count) {
- ci->flags = CLUSTER_FLAG_NONFULL;
- list_add_tail(&ci->list, &si->nonfull_clusters[0]);
- continue;
- }
+ for (i = 0; i < nr_clusters; i++) {
+ struct swap_cluster_info *ci = &cluster_info[i];
+
+ if (ci->count) {
+ ci->flags = CLUSTER_FLAG_NONFULL;
+ list_add_tail(&ci->list, &si->nonfull_clusters[0]);
+ } else {
ci->flags = CLUSTER_FLAG_FREE;
list_add_tail(&ci->list, &si->free_clusters);
}
diff --git a/mm/zswap.c b/mm/zswap.c
index 3dda4310099e..cba7077fda40 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -225,10 +225,13 @@ static bool zswap_has_pool;
* helpers and fwd declarations
**********************************/
+/* One swap address space for each 64M swap space */
+#define ZSWAP_ADDRESS_SPACE_SHIFT 14
+#define ZSWAP_ADDRESS_SPACE_PAGES (1 << ZSWAP_ADDRESS_SPACE_SHIFT)
static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
{
return &zswap_trees[swp_type(swp)][swp_offset(swp)
- >> SWAP_ADDRESS_SPACE_SHIFT];
+ >> ZSWAP_ADDRESS_SPACE_SHIFT];
}
#define zswap_pool_debug(msg, p) \
@@ -1674,7 +1677,7 @@ int zswap_swapon(int type, unsigned long nr_pages)
struct xarray *trees, *tree;
unsigned int nr, i;
- nr = DIV_ROUND_UP(nr_pages, SWAP_ADDRESS_SPACE_PAGES);
+ nr = DIV_ROUND_UP(nr_pages, ZSWAP_ADDRESS_SPACE_PAGES);
trees = kvcalloc(nr, sizeof(*tree), GFP_KERNEL);
if (!trees) {
pr_err("alloc failed, zswap disabled for swap type %d\n", type);
--
2.51.0
next prev parent reply other threads:[~2025-09-10 16:09 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-10 16:08 [PATCH v3 00/15] mm, swap: introduce swap table as swap cache (phase I) Kairui Song
2025-09-10 16:08 ` [PATCH v3 01/15] docs/mm: add document for swap table Kairui Song
2025-09-10 16:13 ` Kairui Song
2025-09-10 16:37 ` Chris Li
2025-09-10 16:08 ` [PATCH v3 02/15] mm, swap: use unified helper for swap cache look up Kairui Song
2025-09-10 16:08 ` [PATCH v3 03/15] mm, swap: fix swap cache index error when retrying reclaim Kairui Song
2025-09-10 16:08 ` [PATCH v3 04/15] mm, swap: check page poison flag after locking it Kairui Song
2025-09-12 16:01 ` David Hildenbrand
2025-09-12 17:17 ` Nhat Pham
2025-09-10 16:08 ` [PATCH v3 05/15] mm, swap: always lock and check the swap cache folio before use Kairui Song
2025-09-12 16:04 ` David Hildenbrand
2025-09-15 15:07 ` Chris Li
2025-09-15 16:34 ` Kairui Song
2025-09-15 23:09 ` Nhat Pham
2025-09-10 16:08 ` [PATCH v3 06/15] mm, swap: rename and move some swap cluster definition and helpers Kairui Song
2025-09-10 16:08 ` [PATCH v3 07/15] mm, swap: tidy up swap device and cluster info helpers Kairui Song
2025-09-10 16:08 ` [PATCH v3 08/15] mm, swap: cleanup swap cache API and add kerneldoc Kairui Song
2025-09-10 16:08 ` [PATCH v3 09/15] mm/shmem, swap: remove redundant error handling for replacing folio Kairui Song
2025-09-12 8:22 ` Baolin Wang
2025-09-12 12:36 ` Kairui Song
2025-09-13 3:26 ` Baolin Wang
2025-09-15 15:13 ` Chris Li
2025-09-10 16:08 ` [PATCH v3 10/15] mm, swap: wrap swap cache replacement with a helper Kairui Song
2025-09-12 16:06 ` David Hildenbrand
2025-09-15 15:09 ` Chris Li
2025-09-10 16:08 ` [PATCH v3 11/15] mm, swap: use the swap table for the swap cache and switch API Kairui Song
2025-09-11 2:27 ` Lance Yang
2025-09-11 2:33 ` Lance Yang
2025-09-11 2:48 ` Kairui Song
2025-09-11 2:54 ` Lance Yang
2025-09-12 9:30 ` Lorenzo Stoakes
2025-09-12 9:42 ` Kairui Song
2025-09-12 9:47 ` Lorenzo Stoakes
2025-09-12 9:21 ` Lorenzo Stoakes
2025-09-10 16:08 ` [PATCH v3 12/15] mm, swap: mark swap address space ro and add context debug check Kairui Song
2025-09-10 16:08 ` Kairui Song [this message]
2025-09-10 16:08 ` [PATCH v3 14/15] mm, swap: implement dynamic allocation of swap table Kairui Song
2025-09-15 15:05 ` Chris Mason
2025-09-15 16:24 ` Kairui Song
2025-09-15 16:54 ` Chris Mason
2025-09-15 17:14 ` Chris Li
2025-09-15 18:03 ` Chris Li
2025-09-10 16:08 ` [PATCH v3 15/15] mm, swap: use a single page for swap table when the size fits Kairui Song
2025-09-10 22:40 ` [PATCH v3 00/15] mm, swap: introduce swap table as swap cache (phase I) Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250910160833.3464-14-ryncsn@gmail.com \
--to=ryncsn@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=bhe@redhat.com \
--cc=chrisl@kernel.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kasong@tencent.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=nphamcs@gmail.com \
--cc=oliver.sang@intel.com \
--cc=shikemeng@huaweicloud.com \
--cc=willy@infradead.org \
--cc=ying.huang@linux.alibaba.com \
--cc=yosryahmed@google.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox