* [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code
@ 2025-03-18 15:06 Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 1/8] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

v1->v2:
-Collect Reviewed-by tags from Tim
-Drop the patch to factor out __swap_entry_free()
-Improve changelogs and add more comments
-Avoid unneeded lock re-acquire

Hi All,
This series contains some cleanups and improvements made while studying
the swapfile code. Here is a summary of the changes:
1. Function naming improvements.
-Use "put" instead of "free" to name functions which only do the actual
free when the count drops to zero.
-Use "entry" to name functions that free a single swap slot. Use
"entries" to name functions that may free multiple swap slots within one
cluster. Use the "_nr" suffix to name functions that may free multiple
swap slots spanning multiple clusters.
2. Eliminate the need to set swap slots to the intermediate
SWAP_HAS_CACHE value before doing the actual free by using
swap_entry_range_free().
3. Add helpers swap_entries_put_map() and swap_entries_put_cache() as
general-purpose routines to free swap entries within a single cluster.
They try a batch-remove first and fall back to putting each entry
individually, with the cluster lock acquired/released only once (a
condensed sketch follows below). By using these helpers, we can remove
repeated code, leverage batch-remove in more cases and avoid
acquiring/releasing the cluster lock for each single swap entry.
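
The batch-then-fallback pattern both helpers follow, condensed from the
swap_entries_put_map() added in patch 6/8 (a schematic sketch; error
handling and the re-marking of cached slots are omitted):

	ci = lock_cluster(si, offset);
	if (swap_is_last_map(si, offset, nr, &has_cache)) {
		/* batch path: every slot holds only its last reference */
		if (!has_cache)
			swap_entries_free(si, ci, entry, nr);
	} else {
		/* fallback: put each entry, with the lock taken only once */
		for (i = 0; i < nr; i++, entry.val++)
			swap_entry_put_locked(si, ci, entry, 1);
	}
	unlock_cluster(ci);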

Kemeng Shi (8):
  mm: swap: rename __swap_[entry/entries]_free[_locked] to
    swap_[entry/entries]_put[_locked]
  mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in
    swap_entry_range_free()
  mm: swap: use swap_entries_free() to free swap entry in
    swap_entry_put_locked()
  mm: swap: use swap_entries_free() drop last ref count in
    swap_entries_put_nr()
  mm: swap: drop last SWAP_MAP_SHMEM flag in batch in
    swap_entries_put_nr()
  mm: swap: free each cluster individually in swap_entries_put_map_nr()
  mm: swap: factor out helper to drop cache of entries within a single
    cluster
  mm: swap: replace cluster_swap_free_nr() with
    swap_entries_put_[map/cache]()

 mm/swapfile.c | 157 ++++++++++++++++++++++++--------------------------
 1 file changed, 75 insertions(+), 82 deletions(-)

-- 
2.30.0




* [PATCH v2 1/8] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked]
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

In __swap_entry_free[_locked] and __swap_entries_free, we decrease the
count first and only free the swap entry if the count drops to zero. This
behavior is more akin to a put() operation rather than a free() operation.
Therefore, rename these functions to use "put" instead of "free".
Additionally, add a "_nr" suffix to swap_entries_put to indicate the input
range may span swap clusters.
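
In other words, the renamed functions follow the usual put pattern
(schematic illustration only; decrement_ref() and do_free() are
hypothetical stand-ins for the real helpers):

	/*
	 * put: drop one reference, do the actual free work only
	 * once the count reaches zero.
	 */
	usage = decrement_ref(si, offset);
	if (!usage)
		do_free(si, offset);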

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 628f67974a7c..5a775456e26c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1347,9 +1347,9 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
 	return NULL;
 }
 
-static unsigned char __swap_entry_free_locked(struct swap_info_struct *si,
-					      unsigned long offset,
-					      unsigned char usage)
+static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
+					   unsigned long offset,
+					   unsigned char usage)
 {
 	unsigned char count;
 	unsigned char has_cache;
@@ -1453,15 +1453,15 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	return NULL;
 }
 
-static unsigned char __swap_entry_free(struct swap_info_struct *si,
-				       swp_entry_t entry)
+static unsigned char swap_entry_put(struct swap_info_struct *si,
+				    swp_entry_t entry)
 {
 	struct swap_cluster_info *ci;
 	unsigned long offset = swp_offset(entry);
 	unsigned char usage;
 
 	ci = lock_cluster(si, offset);
-	usage = __swap_entry_free_locked(si, offset, 1);
+	usage = swap_entry_put_locked(si, offset, 1);
 	if (!usage)
 		swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
 	unlock_cluster(ci);
@@ -1469,8 +1469,8 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
 	return usage;
 }
 
-static bool __swap_entries_free(struct swap_info_struct *si,
-		swp_entry_t entry, int nr)
+static bool swap_entries_put_nr(struct swap_info_struct *si,
+				swp_entry_t entry, int nr)
 {
 	unsigned long offset = swp_offset(entry);
 	unsigned int type = swp_type(entry);
@@ -1501,7 +1501,7 @@ static bool __swap_entries_free(struct swap_info_struct *si,
 fallback:
 	for (i = 0; i < nr; i++) {
 		if (data_race(si->swap_map[offset + i])) {
-			count = __swap_entry_free(si, swp_entry(type, offset + i));
+			count = swap_entry_put(si, swp_entry(type, offset + i));
 			if (count == SWAP_HAS_CACHE)
 				has_cache = true;
 		} else {
@@ -1552,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 
 	ci = lock_cluster(si, offset);
 	do {
-		if (!__swap_entry_free_locked(si, offset, usage))
+		if (!swap_entry_put_locked(si, offset, usage))
 			swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
 	} while (++offset < end);
 	unlock_cluster(ci);
@@ -1599,7 +1599,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 		swap_entry_range_free(si, ci, entry, size);
 	else {
 		for (int i = 0; i < size; i++, entry.val++) {
-			if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE))
+			if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
 				swap_entry_range_free(si, ci, entry, 1);
 		}
 	}
@@ -1798,7 +1798,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
 	/*
 	 * First free all entries in the range.
 	 */
-	any_only_cache = __swap_entries_free(si, entry, nr);
+	any_only_cache = swap_entries_put_nr(si, entry, nr);
 
 	/*
 	 * Short-circuit the below loop if none of the entries had their
@@ -1811,7 +1811,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
 	 * Now go back over the range trying to reclaim the swap cache. This is
 	 * more efficient for large folios because we will only try to reclaim
 	 * the swap once per folio in the common case. If we do
-	 * __swap_entry_free() and __try_to_reclaim_swap() in the same loop, the
+	 * swap_entry_put() and __try_to_reclaim_swap() in the same loop, the
 	 * latter will get a reference and lock the folio for every individual
 	 * page but will only succeed once the swap slot for every subpage is
 	 * zero.
@@ -3758,7 +3758,7 @@ int add_swap_count_continuation(swp_entry_t entry, gfp_t gfp_mask)
  * into, carry if so, or else fail until a new continuation page is allocated;
  * when the original swap_map count is decremented from 0 with continuation,
  * borrow from the continuation and report whether it still holds more.
- * Called while __swap_duplicate() or caller of __swap_entry_free_locked()
+ * Called while __swap_duplicate() or caller of swap_entry_put_locked()
  * holds cluster lock.
  */
 static bool swap_count_continued(struct swap_info_struct *si,
-- 
2.30.0




* [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 1/8] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-19  5:18   ` Kairui Song
  2025-03-18 15:06 ` [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

As all callers of swap_entry_range_free() have already ensured that the
slots to be freed are marked as SWAP_HAS_CACHE while holding the cluster
lock, the VM_BUG_ON check can be safely removed. After this, the function
swap_entry_range_free() can drop any kind of last flag, so rename it to
swap_entries_free() and update its comment accordingly.

This is a preparation for using swap_entries_free() to drop the last ref
count and the SWAP_MAP_SHMEM flag.
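
After the rename, swap_entries_free() may see any of the last-flag values
rather than only SWAP_HAS_CACHE. Schematically (matching the values named
in the updated comment below):

	/*
	 * Possible "last flag" values a slot may hold when it reaches
	 * swap_entries_free():
	 *   1              - the last map reference
	 *   SWAP_HAS_CACHE - only the swap cache still holds the slot
	 *   SWAP_MAP_SHMEM - the last map reference, held by shmem
	 */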

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5a775456e26c..0aa7ce82c013 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -52,9 +52,9 @@
 static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
 				 unsigned char);
 static void free_swap_count_continuations(struct swap_info_struct *);
-static void swap_entry_range_free(struct swap_info_struct *si,
-				  struct swap_cluster_info *ci,
-				  swp_entry_t entry, unsigned int nr_pages);
+static void swap_entries_free(struct swap_info_struct *si,
+			      struct swap_cluster_info *ci,
+			      swp_entry_t entry, unsigned int nr_pages);
 static void swap_range_alloc(struct swap_info_struct *si,
 			     unsigned int nr_entries);
 static bool folio_swapcache_freeable(struct folio *folio);
@@ -1463,7 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
 	ci = lock_cluster(si, offset);
 	usage = swap_entry_put_locked(si, offset, 1);
 	if (!usage)
-		swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
+		swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
 	unlock_cluster(ci);
 
 	return usage;
@@ -1493,7 +1493,7 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 	for (i = 0; i < nr; i++)
 		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
 	if (!has_cache)
-		swap_entry_range_free(si, ci, entry, nr);
+		swap_entries_free(si, ci, entry, nr);
 	unlock_cluster(ci);
 
 	return has_cache;
@@ -1512,12 +1512,12 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 }
 
 /*
- * Drop the last HAS_CACHE flag of swap entries, caller have to
- * ensure all entries belong to the same cgroup.
+ * Drop the last flag (1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries;
+ * the caller has to ensure all entries belong to the same cgroup and cluster.
  */
-static void swap_entry_range_free(struct swap_info_struct *si,
-				  struct swap_cluster_info *ci,
-				  swp_entry_t entry, unsigned int nr_pages)
+static void swap_entries_free(struct swap_info_struct *si,
+			      struct swap_cluster_info *ci,
+			      swp_entry_t entry, unsigned int nr_pages)
 {
 	unsigned long offset = swp_offset(entry);
 	unsigned char *map = si->swap_map + offset;
@@ -1530,7 +1530,6 @@ static void swap_entry_range_free(struct swap_info_struct *si,
 
 	ci->count -= nr_pages;
 	do {
-		VM_BUG_ON(*map != SWAP_HAS_CACHE);
 		*map = 0;
 	} while (++map < map_end);
 
@@ -1553,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 	ci = lock_cluster(si, offset);
 	do {
 		if (!swap_entry_put_locked(si, offset, usage))
-			swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
+			swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
 	} while (++offset < end);
 	unlock_cluster(ci);
 }
@@ -1596,11 +1595,11 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 
 	ci = lock_cluster(si, offset);
 	if (swap_only_has_cache(si, offset, size))
-		swap_entry_range_free(si, ci, entry, size);
+		swap_entries_free(si, ci, entry, size);
 	else {
 		for (int i = 0; i < size; i++, entry.val++) {
 			if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
-				swap_entry_range_free(si, ci, entry, 1);
+				swap_entries_free(si, ci, entry, 1);
 		}
 	}
 	unlock_cluster(ci);
-- 
2.30.0




* [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 1/8] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 18:08   ` Kairui Song
  2025-03-18 15:06 ` [PATCH v2 4/8] mm: swap: use swap_entries_free() drop last ref count in swap_entries_put_nr() Kemeng Shi
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

In swap_entry_put_locked(), we set the slot to SWAP_HAS_CACHE before
using swap_entries_free() to do the actual swap entry freeing. This
introduces an unnecessary intermediate state.
By using swap_entries_free() in swap_entry_put_locked(), we can
eliminate the need to set the slot to SWAP_HAS_CACHE.
This change makes the behavior of swap_entry_put_locked() more
consistent with other put() operations, which do the actual free work
after the last reference is put.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0aa7ce82c013..40e41e514813 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1348,9 +1348,11 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
 }
 
 static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
-					   unsigned long offset,
+					   struct swap_cluster_info *ci,
+					   swp_entry_t entry,
 					   unsigned char usage)
 {
+	unsigned long offset = swp_offset(entry);
 	unsigned char count;
 	unsigned char has_cache;
 
@@ -1382,7 +1384,7 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
 	if (usage)
 		WRITE_ONCE(si->swap_map[offset], usage);
 	else
-		WRITE_ONCE(si->swap_map[offset], SWAP_HAS_CACHE);
+		swap_entries_free(si, ci, entry, 1);
 
 	return usage;
 }
@@ -1461,9 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
 	unsigned char usage;
 
 	ci = lock_cluster(si, offset);
-	usage = swap_entry_put_locked(si, offset, 1);
-	if (!usage)
-		swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
+	usage = swap_entry_put_locked(si, ci, entry, 1);
 	unlock_cluster(ci);
 
 	return usage;
@@ -1551,8 +1551,8 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 
 	ci = lock_cluster(si, offset);
 	do {
-		if (!swap_entry_put_locked(si, offset, usage))
-			swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
+		swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
+				      usage);
 	} while (++offset < end);
 	unlock_cluster(ci);
 }
@@ -1596,12 +1596,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	ci = lock_cluster(si, offset);
 	if (swap_only_has_cache(si, offset, size))
 		swap_entries_free(si, ci, entry, size);
-	else {
-		for (int i = 0; i < size; i++, entry.val++) {
-			if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
-				swap_entries_free(si, ci, entry, 1);
-		}
-	}
+	else
+		for (int i = 0; i < size; i++, entry.val++)
+			swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
 	unlock_cluster(ci);
 }
 
-- 
2.30.0




* [PATCH v2 4/8] mm: swap: use swap_entries_free() drop last ref count in swap_entries_put_nr()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (2 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 5/8] mm: swap: drop last SWAP_MAP_SHMEM flag in batch " Kemeng Shi
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

Use swap_entries_free() to directly free swap entries when the swap
entries are no longer cached or referenced, without needing to set them
to the intermediate SWAP_HAS_CACHE state.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 40e41e514813..2a08d9e89f90 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1490,10 +1490,11 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 		unlock_cluster(ci);
 		goto fallback;
 	}
-	for (i = 0; i < nr; i++)
-		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
 	if (!has_cache)
 		swap_entries_free(si, ci, entry, nr);
+	else
+		for (i = 0; i < nr; i++)
+			WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
 	unlock_cluster(ci);
 
 	return has_cache;
-- 
2.30.0




* [PATCH v2 5/8] mm: swap: drop last SWAP_MAP_SHMEM flag in batch in swap_entries_put_nr()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (3 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 4/8] mm: swap: use swap_entries_free() drop last ref count in swap_entries_put_nr() Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 6/8] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

SWAP_MAP_SHMEM indicates the last map from shmem. Therefore we can drop
SWAP_MAP_SHMEM in batch, in the same way the last ref count is dropped
in batch.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2a08d9e89f90..134b061ef1ae 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -192,7 +192,7 @@ static bool swap_is_last_map(struct swap_info_struct *si,
 	unsigned char *map_end = map + nr_pages;
 	unsigned char count = *map;
 
-	if (swap_count(count) != 1)
+	if (swap_count(count) != 1 && swap_count(count) != SWAP_MAP_SHMEM)
 		return false;
 
 	while (++map < map_end) {
@@ -1479,7 +1479,10 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 	unsigned char count;
 	int i;
 
-	if (nr <= 1 || swap_count(data_race(si->swap_map[offset])) != 1)
+	if (nr <= 1)
+		goto fallback;
+	count = swap_count(data_race(si->swap_map[offset]));
+	if (count != 1 && count != SWAP_MAP_SHMEM)
 		goto fallback;
 	/* cross into another cluster */
 	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
-- 
2.30.0




* [PATCH v2 6/8] mm: swap: free each cluster individually in swap_entries_put_map_nr()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (4 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 5/8] mm: swap: drop last SWAP_MAP_SHMEM flag in batch " Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 7/8] mm: swap: factor out helper to drop cache of entries within a single cluster Kemeng Shi
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

1. Factor out a general swap_entries_put_map() helper to drop entries
belonging to one cluster. If the entries are the last map, free the
entries in batch, otherwise put the entries with the cluster lock
acquired and released only once.
2. Iterate over each cluster and call swap_entries_put_map() for it in
swap_entries_put_nr(), to leverage batch-remove for the last map within
one cluster and to reduce lock acquires/releases in the fallback case (a
worked example of the per-cluster split follows this list).
3. As swap_entries_put_nr() won't handle SWAP_HAS_CACHE drops, rename it
to swap_entries_put_map_nr().
4. As we no longer drop each entry individually with swap_entry_put(),
the comment in free_swap_and_cache_nr() explaining why reclaim is done in
a separate loop is stale: swap_entries_put_map_nr() is a general routine
to drop references, and the reclaim work should only be done in
free_swap_and_cache_nr(). Remove the stale comment accordingly.
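
A worked example of the per-cluster split in swap_entries_put_map_nr(),
assuming SWAPFILE_CLUSTER is 256 (illustrative values only):

	/*
	 * entry at offset 254, nr = 5:
	 * 1st loop: cluster_rest = 256 - 254 % 256 = 2,
	 *           cluster_nr = min(5, 2) = 2 -> put 2 entries
	 * 2nd loop: cluster_rest = 256, cluster_nr = min(3, 256) = 3
	 *           -> put the remaining 3 entries
	 * Each swap_entries_put_map() call thus stays within one cluster.
	 */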

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 70 +++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 38 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 134b061ef1ae..370509eb2f1f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1455,25 +1455,10 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	return NULL;
 }
 
-static unsigned char swap_entry_put(struct swap_info_struct *si,
-				    swp_entry_t entry)
-{
-	struct swap_cluster_info *ci;
-	unsigned long offset = swp_offset(entry);
-	unsigned char usage;
-
-	ci = lock_cluster(si, offset);
-	usage = swap_entry_put_locked(si, ci, entry, 1);
-	unlock_cluster(ci);
-
-	return usage;
-}
-
-static bool swap_entries_put_nr(struct swap_info_struct *si,
-				swp_entry_t entry, int nr)
+static bool swap_entries_put_map(struct swap_info_struct *si,
+				 swp_entry_t entry, int nr)
 {
 	unsigned long offset = swp_offset(entry);
-	unsigned int type = swp_type(entry);
 	struct swap_cluster_info *ci;
 	bool has_cache = false;
 	unsigned char count;
@@ -1484,14 +1469,10 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 	count = swap_count(data_race(si->swap_map[offset]));
 	if (count != 1 && count != SWAP_MAP_SHMEM)
 		goto fallback;
-	/* cross into another cluster */
-	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
-		goto fallback;
 
 	ci = lock_cluster(si, offset);
 	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
-		unlock_cluster(ci);
-		goto fallback;
+		goto locked_fallback;
 	}
 	if (!has_cache)
 		swap_entries_free(si, ci, entry, nr);
@@ -1503,15 +1484,34 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
 	return has_cache;
 
 fallback:
-	for (i = 0; i < nr; i++) {
-		if (data_race(si->swap_map[offset + i])) {
-			count = swap_entry_put(si, swp_entry(type, offset + i));
-			if (count == SWAP_HAS_CACHE)
-				has_cache = true;
-		} else {
-			WARN_ON_ONCE(1);
-		}
+	ci = lock_cluster(si, offset);
+locked_fallback:
+	for (i = 0; i < nr; i++, entry.val++) {
+		count = swap_entry_put_locked(si, ci, entry, 1);
+		if (count == SWAP_HAS_CACHE)
+			has_cache = true;
+	}
+	unlock_cluster(ci);
+	return has_cache;
+
+}
+
+static bool swap_entries_put_map_nr(struct swap_info_struct *si,
+				    swp_entry_t entry, int nr)
+{
+	int cluster_nr, cluster_rest;
+	unsigned long offset = swp_offset(entry);
+	bool has_cache = false;
+
+	cluster_rest = SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER;
+	while (nr) {
+		cluster_nr = min(nr, cluster_rest);
+		has_cache |= swap_entries_put_map(si, entry, cluster_nr);
+		cluster_rest = SWAPFILE_CLUSTER;
+		nr -= cluster_nr;
+		entry.val += cluster_nr;
 	}
+
 	return has_cache;
 }
 
@@ -1798,7 +1798,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
 	/*
 	 * First free all entries in the range.
 	 */
-	any_only_cache = swap_entries_put_nr(si, entry, nr);
+	any_only_cache = swap_entries_put_map_nr(si, entry, nr);
 
 	/*
 	 * Short-circuit the below loop if none of the entries had their
@@ -1808,13 +1808,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
 		goto out;
 
 	/*
-	 * Now go back over the range trying to reclaim the swap cache. This is
-	 * more efficient for large folios because we will only try to reclaim
-	 * the swap once per folio in the common case. If we do
-	 * swap_entry_put() and __try_to_reclaim_swap() in the same loop, the
-	 * latter will get a reference and lock the folio for every individual
-	 * page but will only succeed once the swap slot for every subpage is
-	 * zero.
+	 * Now go back over the range trying to reclaim the swap cache.
 	 */
 	for (offset = start_offset; offset < end_offset; offset += nr) {
 		nr = 1;
-- 
2.30.0




* [PATCH v2 7/8] mm: swap: factor out helper to drop cache of entries within a single cluster
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (5 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 6/8] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 15:06 ` [PATCH v2 8/8] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() Kemeng Shi
  2025-03-18 17:03 ` [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Tim Chen
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

Factor out a helper, swap_entries_put_cache(), from put_swap_folio() to
serve as a general-purpose routine for dropping the cache flag of entries
within a single cluster.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 mm/swapfile.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 370509eb2f1f..919daaa3922a 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1455,6 +1455,21 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	return NULL;
 }
 
+static void swap_entries_put_cache(struct swap_info_struct *si,
+				   swp_entry_t entry, int nr)
+{
+	unsigned long offset = swp_offset(entry);
+	struct swap_cluster_info *ci;
+
+	ci = lock_cluster(si, offset);
+	if (swap_only_has_cache(si, offset, nr))
+		swap_entries_free(si, ci, entry, nr);
+	else
+		for (int i = 0; i < nr; i++, entry.val++)
+			swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
+	unlock_cluster(ci);
+}
+
 static bool swap_entries_put_map(struct swap_info_struct *si,
 				 swp_entry_t entry, int nr)
 {
@@ -1588,8 +1603,6 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
  */
 void put_swap_folio(struct folio *folio, swp_entry_t entry)
 {
-	unsigned long offset = swp_offset(entry);
-	struct swap_cluster_info *ci;
 	struct swap_info_struct *si;
 	int size = 1 << swap_entry_order(folio_order(folio));
 
@@ -1597,13 +1610,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	if (!si)
 		return;
 
-	ci = lock_cluster(si, offset);
-	if (swap_only_has_cache(si, offset, size))
-		swap_entries_free(si, ci, entry, size);
-	else
-		for (int i = 0; i < size; i++, entry.val++)
-			swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
-	unlock_cluster(ci);
+	swap_entries_put_cache(si, entry, size);
 }
 
 int __swap_count(swp_entry_t entry)
-- 
2.30.0




* [PATCH v2 8/8] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]()
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (6 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 7/8] mm: swap: factor out helper to drop cache of entries within a single cluster Kemeng Shi
@ 2025-03-18 15:06 ` Kemeng Shi
  2025-03-18 17:03 ` [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Tim Chen
  8 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-18 15:06 UTC (permalink / raw)
  To: akpm; +Cc: tim.c.chen, ryncsn, linux-mm, linux-kernel

Replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() to
remove repeated code and leverage batch-remove for entries with the last
flag. After removing cluster_swap_free_nr(), only functions with the
"_nr" suffix can free entries spanning multiple clusters. Add a
corresponding description in a comment above swap_entries_put_map_nr(),
as it is the first function with the "_nr" suffix and has a non-suffix
variant, swap_entries_put_map().

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 919daaa3922a..7f3e9ce998f9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1511,6 +1511,11 @@ static bool swap_entries_put_map(struct swap_info_struct *si,
 
 }
 
+/*
+ * Only functions with the "_nr" suffix are able to free entries spanning
+ * multiple clusters, so ensure the range is within a single cluster
+ * when freeing entries with functions without the "_nr" suffix.
+ */
 static bool swap_entries_put_map_nr(struct swap_info_struct *si,
 				    swp_entry_t entry, int nr)
 {
@@ -1561,21 +1566,6 @@ static void swap_entries_free(struct swap_info_struct *si,
 		partial_free_cluster(si, ci);
 }
 
-static void cluster_swap_free_nr(struct swap_info_struct *si,
-		unsigned long offset, int nr_pages,
-		unsigned char usage)
-{
-	struct swap_cluster_info *ci;
-	unsigned long end = offset + nr_pages;
-
-	ci = lock_cluster(si, offset);
-	do {
-		swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
-				      usage);
-	} while (++offset < end);
-	unlock_cluster(ci);
-}
-
 /*
  * Caller has made sure that the swap device corresponding to entry
  * is still around or has not been recycled.
@@ -1592,7 +1582,7 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
 
 	while (nr_pages) {
 		nr = min_t(int, nr_pages, SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-		cluster_swap_free_nr(sis, offset, nr, 1);
+		swap_entries_put_map(sis, swp_entry(sis->type, offset), nr);
 		offset += nr;
 		nr_pages -= nr;
 	}
@@ -3615,11 +3605,13 @@ int swapcache_prepare(swp_entry_t entry, int nr)
 	return __swap_duplicate(entry, SWAP_HAS_CACHE, nr);
 }
 
+/*
+ * Caller should ensure entries belong to the same folio so
+ * the entries won't cross the cluster boundary.
+ */
 void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
 {
-	unsigned long offset = swp_offset(entry);
-
-	cluster_swap_free_nr(si, offset, nr, SWAP_HAS_CACHE);
+	swap_entries_put_cache(si, entry, nr);
 }
 
 struct swap_info_struct *swp_swap_info(swp_entry_t entry)
-- 
2.30.0




* Re: [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code
  2025-03-18 15:06 [PATCH v2 0/8] Minor cleanups and improvements to swap freeing code Kemeng Shi
                   ` (7 preceding siblings ...)
  2025-03-18 15:06 ` [PATCH v2 8/8] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() Kemeng Shi
@ 2025-03-18 17:03 ` Tim Chen
  8 siblings, 0 replies; 14+ messages in thread
From: Tim Chen @ 2025-03-18 17:03 UTC (permalink / raw)
  To: Kemeng Shi, akpm; +Cc: ryncsn, linux-mm, linux-kernel

On Tue, 2025-03-18 at 23:06 +0800, Kemeng Shi wrote:
> v1->v2:
> -Collect Reviewed-by tags from Tim
> -Drop the patch to factor out __swap_entry_free()
> -Improve changelogs and add more comments
> -Avoid unneeded lock re-acquire
> 

The updated patch series looks good to me. You can add

Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>

for the series.


Tim



* Re: [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked()
  2025-03-18 15:06 ` [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
@ 2025-03-18 18:08   ` Kairui Song
  2025-03-20  1:05     ` Kemeng Shi
  0 siblings, 1 reply; 14+ messages in thread
From: Kairui Song @ 2025-03-18 18:08 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, tim.c.chen, linux-mm, linux-kernel

On Tue, Mar 18, 2025 at 2:10 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> In swap_entry_put_locked(), we set the slot to SWAP_HAS_CACHE before
> using swap_entries_free() to do the actual swap entry freeing. This
> introduces an unnecessary intermediate state.
> By using swap_entries_free() in swap_entry_put_locked(), we can
> eliminate the need to set the slot to SWAP_HAS_CACHE.
> This change makes the behavior of swap_entry_put_locked() more
> consistent with other put() operations, which do the actual free work
> after the last reference is put.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  mm/swapfile.c | 23 ++++++++++-------------
>  1 file changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 0aa7ce82c013..40e41e514813 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1348,9 +1348,11 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
>  }
>
>  static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> -                                          unsigned long offset,
> +                                          struct swap_cluster_info *ci,
> +                                          swp_entry_t entry,
>                                            unsigned char usage)
>  {
> +       unsigned long offset = swp_offset(entry);
>         unsigned char count;
>         unsigned char has_cache;
>
> @@ -1382,7 +1384,7 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
>         if (usage)
>                 WRITE_ONCE(si->swap_map[offset], usage);
>         else
> -               WRITE_ONCE(si->swap_map[offset], SWAP_HAS_CACHE);
> +               swap_entries_free(si, ci, entry, 1);
>
>         return usage;
>  }
> @@ -1461,9 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
>         unsigned char usage;
>
>         ci = lock_cluster(si, offset);
> -       usage = swap_entry_put_locked(si, offset, 1);
> -       if (!usage)
> -               swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
> +       usage = swap_entry_put_locked(si, ci, entry, 1);
>         unlock_cluster(ci);
>
>         return usage;
> @@ -1551,8 +1551,8 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>
>         ci = lock_cluster(si, offset);
>         do {
> -               if (!swap_entry_put_locked(si, offset, usage))
> -                       swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
> +               swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
> +                                     usage);
>         } while (++offset < end);
>         unlock_cluster(ci);
>  }
> @@ -1596,12 +1596,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>         ci = lock_cluster(si, offset);
>         if (swap_only_has_cache(si, offset, size))
>                 swap_entries_free(si, ci, entry, size);
> -       else {
> -               for (int i = 0; i < size; i++, entry.val++) {
> -                       if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
> -                               swap_entries_free(si, ci, entry, 1);
> -               }
> -       }
> +       else
> +               for (int i = 0; i < size; i++, entry.val++)
> +                       swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);

I'd prefer you keep the brackets here for readability, and maybe
add brackets to the whole if statement; just a tiny nitpick, so still:

Reviewed-by: Kairui Song <kasong@tencent.com>

>         unlock_cluster(ci);
>  }

>
> --
> 2.30.0
>



* Re: [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free()
  2025-03-18 15:06 ` [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
@ 2025-03-19  5:18   ` Kairui Song
  2025-03-20  1:10     ` Kemeng Shi
  0 siblings, 1 reply; 14+ messages in thread
From: Kairui Song @ 2025-03-19  5:18 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, tim.c.chen, linux-mm, linux-kernel

On Tue, Mar 18, 2025 at 2:10 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> As all callers of swap_entry_range_free() have already ensured that the
> slots to be freed are marked as SWAP_HAS_CACHE while holding the cluster
> lock, the VM_BUG_ON check can be safely removed. After this, the function
> swap_entry_range_free() can drop any kind of last flag, so rename it to
> swap_entries_free() and update its comment accordingly.
>
> This is a preparation for using swap_entries_free() to drop the last ref
> count and the SWAP_MAP_SHMEM flag.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  mm/swapfile.c | 27 +++++++++++++--------------
>  1 file changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5a775456e26c..0aa7ce82c013 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -52,9 +52,9 @@
>  static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
>                                  unsigned char);
>  static void free_swap_count_continuations(struct swap_info_struct *);
> -static void swap_entry_range_free(struct swap_info_struct *si,
> -                                 struct swap_cluster_info *ci,
> -                                 swp_entry_t entry, unsigned int nr_pages);
> +static void swap_entries_free(struct swap_info_struct *si,
> +                             struct swap_cluster_info *ci,
> +                             swp_entry_t entry, unsigned int nr_pages);
>  static void swap_range_alloc(struct swap_info_struct *si,
>                              unsigned int nr_entries);
>  static bool folio_swapcache_freeable(struct folio *folio);
> @@ -1463,7 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
>         ci = lock_cluster(si, offset);
>         usage = swap_entry_put_locked(si, offset, 1);
>         if (!usage)
> -               swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> +               swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>         unlock_cluster(ci);
>
>         return usage;
> @@ -1493,7 +1493,7 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
>         for (i = 0; i < nr; i++)
>                 WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
>         if (!has_cache)
> -               swap_entry_range_free(si, ci, entry, nr);
> +               swap_entries_free(si, ci, entry, nr);
>         unlock_cluster(ci);
>
>         return has_cache;
> @@ -1512,12 +1512,12 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
>  }
>
>  /*
> - * Drop the last HAS_CACHE flag of swap entries, caller have to
> - * ensure all entries belong to the same cgroup.
> + * Drop the last flag (1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries;
> + * the caller has to ensure all entries belong to the same cgroup and cluster.
>   */
> -static void swap_entry_range_free(struct swap_info_struct *si,
> -                                 struct swap_cluster_info *ci,
> -                                 swp_entry_t entry, unsigned int nr_pages)
> +static void swap_entries_free(struct swap_info_struct *si,
> +                             struct swap_cluster_info *ci,
> +                             swp_entry_t entry, unsigned int nr_pages)
>  {
>         unsigned long offset = swp_offset(entry);
>         unsigned char *map = si->swap_map + offset;
> @@ -1530,7 +1530,6 @@ static void swap_entry_range_free(struct swap_info_struct *si,
>
>         ci->count -= nr_pages;
>         do {
> -               VM_BUG_ON(*map != SWAP_HAS_CACHE);

Hi Kemeng

Instead of just dropping this check, maybe it's better to change it to
something like VM_BUG_ON(!*map); to catch a potential swap double free?
I've found this check very helpful for debugging, especially for
catching concurrency problems.

Or more strictly: VM_BUG_ON(*map != SWAP_HAS_CACHE && *map != 1 &&
*map != SWAP_MAP_SHMEM); you may introduce a helper to check whether an
entry is the "last map" like this and use it elsewhere too.
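
For illustration, such a helper might look like this (a sketch of the
suggestion only, not code from this series):

	static inline bool swap_count_is_last(unsigned char count)
	{
		return count == 1 || count == SWAP_HAS_CACHE ||
		       count == SWAP_MAP_SHMEM;
	}

	/* then, in swap_entries_free(): */
	VM_BUG_ON(!swap_count_is_last(*map));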


>                 *map = 0;
>         } while (++map < map_end);
>
> @@ -1553,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>         ci = lock_cluster(si, offset);
>         do {
>                 if (!swap_entry_put_locked(si, offset, usage))
> -                       swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> +                       swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>         } while (++offset < end);
>         unlock_cluster(ci);
>  }
> @@ -1596,11 +1595,11 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>
>         ci = lock_cluster(si, offset);
>         if (swap_only_has_cache(si, offset, size))
> -               swap_entry_range_free(si, ci, entry, size);
> +               swap_entries_free(si, ci, entry, size);
>         else {
>                 for (int i = 0; i < size; i++, entry.val++) {
>                         if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
> -                               swap_entry_range_free(si, ci, entry, 1);
> +                               swap_entries_free(si, ci, entry, 1);
>                 }
>         }
>         unlock_cluster(ci);
> --
> 2.30.0
>



* Re: [PATCH v2 3/8] mm: swap: use swap_entries_free() to free swap entry in swap_entry_put_locked()
  2025-03-18 18:08   ` Kairui Song
@ 2025-03-20  1:05     ` Kemeng Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-20  1:05 UTC (permalink / raw)
  To: Kairui Song; +Cc: akpm, tim.c.chen, linux-mm, linux-kernel



on 3/19/2025 2:08 AM, Kairui Song wrote:
> On Tue, Mar 18, 2025 at 2:10 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>
>> In swap_entry_put_locked(), we set the slot to SWAP_HAS_CACHE before
>> using swap_entries_free() to do the actual swap entry freeing. This
>> introduces an unnecessary intermediate state.
>> By using swap_entries_free() in swap_entry_put_locked(), we can
>> eliminate the need to set the slot to SWAP_HAS_CACHE.
>> This change makes the behavior of swap_entry_put_locked() more
>> consistent with other put() operations, which do the actual free work
>> after the last reference is put.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> ---
>>  mm/swapfile.c | 23 ++++++++++-------------
>>  1 file changed, 10 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 0aa7ce82c013..40e41e514813 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1348,9 +1348,11 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
>>  }
>>
>>  static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
>> -                                          unsigned long offset,
>> +                                          struct swap_cluster_info *ci,
>> +                                          swp_entry_t entry,
>>                                            unsigned char usage)
>>  {
>> +       unsigned long offset = swp_offset(entry);
>>         unsigned char count;
>>         unsigned char has_cache;
>>
>> @@ -1382,7 +1384,7 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
>>         if (usage)
>>                 WRITE_ONCE(si->swap_map[offset], usage);
>>         else
>> -               WRITE_ONCE(si->swap_map[offset], SWAP_HAS_CACHE);
>> +               swap_entries_free(si, ci, entry, 1);
>>
>>         return usage;
>>  }
>> @@ -1461,9 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
>>         unsigned char usage;
>>
>>         ci = lock_cluster(si, offset);
>> -       usage = swap_entry_put_locked(si, offset, 1);
>> -       if (!usage)
>> -               swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>> +       usage = swap_entry_put_locked(si, ci, entry, 1);
>>         unlock_cluster(ci);
>>
>>         return usage;
>> @@ -1551,8 +1551,8 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>>
>>         ci = lock_cluster(si, offset);
>>         do {
>> -               if (!swap_entry_put_locked(si, offset, usage))
>> -                       swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>> +               swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
>> +                                     usage);
>>         } while (++offset < end);
>>         unlock_cluster(ci);
>>  }
>> @@ -1596,12 +1596,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>>         ci = lock_cluster(si, offset);
>>         if (swap_only_has_cache(si, offset, size))
>>                 swap_entries_free(si, ci, entry, size);
>> -       else {
>> -               for (int i = 0; i < size; i++, entry.val++) {
>> -                       if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
>> -                               swap_entries_free(si, ci, entry, 1);
>> -               }
>> -       }
>> +       else
>> +               for (int i = 0; i < size; i++, entry.val++)
>> +                       swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
> 
> I'd prefer you keep the brackets here for readability, and maybe
> add brackets to the whole if statement; just a tiny nitpick, so still:
Thanks for the review. Both ways are acceptable to me. I will keep the
brackets in the next version.

Thanks,
Kemeng
> 
> Reviewed-by: Kairui Song <kasong@tencent.com>
> 
>>         unlock_cluster(ci);
>>  }
> 
>>
>> --
>> 2.30.0
>>
> 




* Re: [PATCH v2 2/8] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free()
  2025-03-19  5:18   ` Kairui Song
@ 2025-03-20  1:10     ` Kemeng Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-03-20  1:10 UTC (permalink / raw)
  To: Kairui Song; +Cc: akpm, tim.c.chen, linux-mm, linux-kernel



on 3/19/2025 1:18 PM, Kairui Song wrote:
> On Tue, Mar 18, 2025 at 2:10 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>
>> As all callers of swap_entry_range_free() have already ensured that the
>> slots to be freed are marked as SWAP_HAS_CACHE while holding the cluster
>> lock, the VM_BUG_ON check can be safely removed. After this, the function
>> swap_entry_range_free() can drop any kind of last flag, so rename it to
>> swap_entries_free() and update its comment accordingly.
>>
>> This is a preparation for using swap_entries_free() to drop the last ref
>> count and the SWAP_MAP_SHMEM flag.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>> ---
>>  mm/swapfile.c | 27 +++++++++++++--------------
>>  1 file changed, 13 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 5a775456e26c..0aa7ce82c013 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -52,9 +52,9 @@
>>  static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
>>                                  unsigned char);
>>  static void free_swap_count_continuations(struct swap_info_struct *);
>> -static void swap_entry_range_free(struct swap_info_struct *si,
>> -                                 struct swap_cluster_info *ci,
>> -                                 swp_entry_t entry, unsigned int nr_pages);
>> +static void swap_entries_free(struct swap_info_struct *si,
>> +                             struct swap_cluster_info *ci,
>> +                             swp_entry_t entry, unsigned int nr_pages);
>>  static void swap_range_alloc(struct swap_info_struct *si,
>>                              unsigned int nr_entries);
>>  static bool folio_swapcache_freeable(struct folio *folio);
>> @@ -1463,7 +1463,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
>>         ci = lock_cluster(si, offset);
>>         usage = swap_entry_put_locked(si, offset, 1);
>>         if (!usage)
>> -               swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
>> +               swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>>         unlock_cluster(ci);
>>
>>         return usage;
>> @@ -1493,7 +1493,7 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
>>         for (i = 0; i < nr; i++)
>>                 WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
>>         if (!has_cache)
>> -               swap_entry_range_free(si, ci, entry, nr);
>> +               swap_entries_free(si, ci, entry, nr);
>>         unlock_cluster(ci);
>>
>>         return has_cache;
>> @@ -1512,12 +1512,12 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
>>  }
>>
>>  /*
>> - * Drop the last HAS_CACHE flag of swap entries, caller have to
>> - * ensure all entries belong to the same cgroup.
>> + * Drop the last flag (1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries;
>> + * the caller has to ensure all entries belong to the same cgroup and cluster.
>>   */
>> -static void swap_entry_range_free(struct swap_info_struct *si,
>> -                                 struct swap_cluster_info *ci,
>> -                                 swp_entry_t entry, unsigned int nr_pages)
>> +static void swap_entries_free(struct swap_info_struct *si,
>> +                             struct swap_cluster_info *ci,
>> +                             swp_entry_t entry, unsigned int nr_pages)
>>  {
>>         unsigned long offset = swp_offset(entry);
>>         unsigned char *map = si->swap_map + offset;
>> @@ -1530,7 +1530,6 @@ static void swap_entry_range_free(struct swap_info_struct *si,
>>
>>         ci->count -= nr_pages;
>>         do {
>> -               VM_BUG_ON(*map != SWAP_HAS_CACHE);
> 
> Hi Kemeng
> 
> Instead of just dropping this check, maybe it's better to change it to
> something like VM_BUG_ON(!*map); to catch a potential swap double free?
> I've found this check very helpful for debugging, especially for
> catching concurrency problems.
Sure, will keep VM_BUG_ON as it's useful.
> 
> Or more strictly: VM_BUG_ON(*map != SWAP_HAS_CACHE && *map != 1 &&
> *map != SWAP_MAP_SHMEM); you may introduce a helper to check whether an
> entry is the "last map" like this and use it elsewhere too.
I'd add VM_BUG_ON in this way to catch unexpected freeing.

Thanks,
Kemeng
> 
> 
>>                 *map = 0;
>>         } while (++map < map_end);
>>
>> @@ -1553,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>>         ci = lock_cluster(si, offset);
>>         do {
>>                 if (!swap_entry_put_locked(si, offset, usage))
>> -                       swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
>> +                       swap_entries_free(si, ci, swp_entry(si->type, offset), 1);
>>         } while (++offset < end);
>>         unlock_cluster(ci);
>>  }
>> @@ -1596,11 +1595,11 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>>
>>         ci = lock_cluster(si, offset);
>>         if (swap_only_has_cache(si, offset, size))
>> -               swap_entry_range_free(si, ci, entry, size);
>> +               swap_entries_free(si, ci, entry, size);
>>         else {
>>                 for (int i = 0; i < size; i++, entry.val++) {
>>                         if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
>> -                               swap_entry_range_free(si, ci, entry, 1);
>> +                               swap_entries_free(si, ci, entry, 1);
>>                 }
>>         }
>>         unlock_cluster(ci);
>> --
>> 2.30.0
>>
> 



