* Re: [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper
2025-03-13 21:05 ` [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper Kemeng Shi
@ 2025-03-13 17:42 ` Kairui Song
2025-03-14 7:32 ` Kemeng Shi
0 siblings, 1 reply; 25+ messages in thread
From: Kairui Song @ 2025-03-13 17:42 UTC (permalink / raw)
To: Kemeng Shi; +Cc: akpm, linux-mm, linux-kernel
On Thu, Mar 13, 2025 at 8:09 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> Factor out the actual swap entry freeing logic into a new helper,
> __swap_entries_free().
> This allows us to further simplify other swap entry freeing code by
> leveraging the __swap_entries_free() helper.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/swapfile.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5a775456e26c..7c886f9dd6f9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1347,6 +1347,25 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
> return NULL;
> }
>
> +static inline void __swap_entries_free(struct swap_info_struct *si,
> + struct swap_cluster_info *ci,
> + swp_entry_t entry, unsigned int nr_pages)
> +{
> + unsigned long offset = swp_offset(entry);
> +
> + VM_BUG_ON(cluster_is_empty(ci));
> + VM_BUG_ON(ci->count < nr_pages);
> +
> + ci->count -= nr_pages;
> + mem_cgroup_uncharge_swap(entry, nr_pages);
> + swap_range_free(si, offset, nr_pages);
> +
> + if (!ci->count)
> + free_cluster(si, ci);
> + else
> + partial_free_cluster(si, ci);
> +}
> +
> static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> unsigned long offset,
> unsigned char usage)
> @@ -1525,22 +1544,13 @@ static void swap_entry_range_free(struct swap_info_struct *si,
>
> /* It should never free entries across different clusters */
> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
> - VM_BUG_ON(cluster_is_empty(ci));
> - VM_BUG_ON(ci->count < nr_pages);
>
> - ci->count -= nr_pages;
> do {
> VM_BUG_ON(*map != SWAP_HAS_CACHE);
> *map = 0;
> } while (++map < map_end);
>
> - mem_cgroup_uncharge_swap(entry, nr_pages);
> - swap_range_free(si, offset, nr_pages);
> -
> - if (!ci->count)
> - free_cluster(si, ci);
> - else
> - partial_free_cluster(si, ci);
> + __swap_entries_free(si, ci, entry, nr_pages);
> }
>
> static void cluster_swap_free_nr(struct swap_info_struct *si,
> --
> 2.30.0
>
>
Hi Kemeng,
This patch is a bit too trivial to be a standalone one; you can easily
fold it into the later one. Also, you may want to carry the
VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1)); check
as well.
But that is basically just a rename of swap_entry_range_free(); you may
just remove the HAS_CACHE check as you rename it; that way the changes
should be cleaner.
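If I read that right, the end state would look roughly like the below (a
sketch assembled from this patch plus the later HAS_CACHE-check removal,
not a tested diff):

static void swap_entries_free(struct swap_info_struct *si,
                              struct swap_cluster_info *ci,
                              swp_entry_t entry, unsigned int nr_pages)
{
        unsigned long offset = swp_offset(entry);
        unsigned char *map = si->swap_map + offset;
        unsigned char *map_end = map + nr_pages;

        /* It should never free entries across different clusters */
        VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
        VM_BUG_ON(cluster_is_empty(ci));
        VM_BUG_ON(ci->count < nr_pages);

        ci->count -= nr_pages;
        do {
                *map = 0;       /* drop whatever last flag is left */
        } while (++map < map_end);

        mem_cgroup_uncharge_swap(entry, nr_pages);
        swap_range_free(si, offset, nr_pages);

        if (!ci->count)
                free_cluster(si, ci);
        else
                partial_free_cluster(si, ci);
}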
* [PATCH 0/9] Minor cleanups and improvements to swap freeing code
@ 2025-03-13 21:05 Kemeng Shi
2025-03-13 21:05 ` [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
` (9 more replies)
0 siblings, 10 replies; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
Hi All,
This series contains some cleanups and improvements made while learning
the swapfile code. Here is a summary of the changes:
1. Function naming improvements.
-Use "put" instead of "free" to name functions which only do the actual
free when the count drops to zero.
-Use "entry" to name functions that free only one swap slot. Use
"entries" to name functions that may free multiple swap slots within one
cluster. Use the "_nr" suffix to name functions that may free multiple
swap slots spanning multiple clusters (see the naming sketch below).
2. Eliminate the need to set a swap slot to the intermediate
SWAP_HAS_CACHE value before doing the actual free, by using
swap_entry_range_free().
3. Add helpers swap_entries_put_map() and swap_entries_put_cache() as
general-purpose routines to free swap entries within a single cluster;
they try batch-remove first and fall back to putting each entry
individually, with the cluster lock acquired/released only once. By using
these helpers, we can remove repeated code, leverage batch-remove in more
cases, and avoid acquiring/releasing the cluster lock for each single
swap entry.
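To make the convention in point 1 concrete, here is the naming sketch
(roles as described above; see the individual patches for the exact
signatures):

/*
 * swap_entry_put_locked()   - put one entry, cluster lock already held
 * swap_entries_put_map()    - put map refs of entries within one cluster
 * swap_entries_put_cache()  - drop cache flag of entries within one cluster
 * swap_entries_put_map_nr() - put map refs, range may span clusters
 * swap_entries_free()       - free entries whose last flag was dropped
 */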
Kemeng Shi (9):
mm: swap: rename __swap_[entry/entries]_free[_locked] to
swap_[entry/entries]_put[_locked]
mm: swap: factor out the actual swap entry freeing logic to new helper
mm: swap: use __swap_entry_free() to free swap entry in
swap_entry_put_locked()
mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in
swap_entry_range_free()
mm: swap: use swap_entries_free() drop last 1 flag in
swap_entries_put_nr()
mm: swap: drop last SWAP_MAP_SHMEM flag in batch in
swap_entries_put_nr()
mm: swap: free each cluster individually in swap_entries_put_map_nr()
mm: swap: factor out helper to drop cache of entries within a single
cluster
mm: swap: replace cluster_swap_free_nr() with
swap_entries_put_[map/cache]()
mm/swapfile.c | 173 +++++++++++++++++++++++++-------------------------
1 file changed, 86 insertions(+), 87 deletions(-)
--
2.30.0
* [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked]
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 20:37 ` Tim Chen
2025-03-13 21:05 ` [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper Kemeng Shi
` (8 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
In __swap_entry_free[_locked] and __swap_entries_free, we decrease the
count first and only free the swap entry if the count drops to zero. This
behavior is more akin to a put() operation than a free() operation.
Therefore, rename these functions with "put" instead of "free".
Additionally, add the "_nr" suffix to swap_entries_put to indicate that
the input range may span swap clusters.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 628f67974a7c..5a775456e26c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1347,9 +1347,9 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
return NULL;
}
-static unsigned char __swap_entry_free_locked(struct swap_info_struct *si,
- unsigned long offset,
- unsigned char usage)
+static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
+ unsigned long offset,
+ unsigned char usage)
{
unsigned char count;
unsigned char has_cache;
@@ -1453,15 +1453,15 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
return NULL;
}
-static unsigned char __swap_entry_free(struct swap_info_struct *si,
- swp_entry_t entry)
+static unsigned char swap_entry_put(struct swap_info_struct *si,
+ swp_entry_t entry)
{
struct swap_cluster_info *ci;
unsigned long offset = swp_offset(entry);
unsigned char usage;
ci = lock_cluster(si, offset);
- usage = __swap_entry_free_locked(si, offset, 1);
+ usage = swap_entry_put_locked(si, offset, 1);
if (!usage)
swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
unlock_cluster(ci);
@@ -1469,8 +1469,8 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
return usage;
}
-static bool __swap_entries_free(struct swap_info_struct *si,
- swp_entry_t entry, int nr)
+static bool swap_entries_put_nr(struct swap_info_struct *si,
+ swp_entry_t entry, int nr)
{
unsigned long offset = swp_offset(entry);
unsigned int type = swp_type(entry);
@@ -1501,7 +1501,7 @@ static bool __swap_entries_free(struct swap_info_struct *si,
fallback:
for (i = 0; i < nr; i++) {
if (data_race(si->swap_map[offset + i])) {
- count = __swap_entry_free(si, swp_entry(type, offset + i));
+ count = swap_entry_put(si, swp_entry(type, offset + i));
if (count == SWAP_HAS_CACHE)
has_cache = true;
} else {
@@ -1552,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
ci = lock_cluster(si, offset);
do {
- if (!__swap_entry_free_locked(si, offset, usage))
+ if (!swap_entry_put_locked(si, offset, usage))
swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
} while (++offset < end);
unlock_cluster(ci);
@@ -1599,7 +1599,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
swap_entry_range_free(si, ci, entry, size);
else {
for (int i = 0; i < size; i++, entry.val++) {
- if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE))
+ if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
swap_entry_range_free(si, ci, entry, 1);
}
}
@@ -1798,7 +1798,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
/*
* First free all entries in the range.
*/
- any_only_cache = __swap_entries_free(si, entry, nr);
+ any_only_cache = swap_entries_put_nr(si, entry, nr);
/*
* Short-circuit the below loop if none of the entries had their
@@ -1811,7 +1811,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
* Now go back over the range trying to reclaim the swap cache. This is
* more efficient for large folios because we will only try to reclaim
* the swap once per folio in the common case. If we do
- * __swap_entry_free() and __try_to_reclaim_swap() in the same loop, the
+ * swap_entry_put() and __try_to_reclaim_swap() in the same loop, the
* latter will get a reference and lock the folio for every individual
* page but will only succeed once the swap slot for every subpage is
* zero.
@@ -3758,7 +3758,7 @@ int add_swap_count_continuation(swp_entry_t entry, gfp_t gfp_mask)
* into, carry if so, or else fail until a new continuation page is allocated;
* when the original swap_map count is decremented from 0 with continuation,
* borrow from the continuation and report whether it still holds more.
- * Called while __swap_duplicate() or caller of __swap_entry_free_locked()
+ * Called while __swap_duplicate() or caller of swap_entry_put_locked()
* holds cluster lock.
*/
static bool swap_count_continued(struct swap_info_struct *si,
--
2.30.0
* [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
2025-03-13 21:05 ` [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-13 17:42 ` Kairui Song
2025-03-13 21:05 ` [PATCH 3/9] mm: swap: use __swap_entry_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
` (7 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
Factor out the actual swap entry freeing logic into a new helper,
__swap_entries_free().
This allows us to further simplify other swap entry freeing code by
leveraging the __swap_entries_free() helper.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5a775456e26c..7c886f9dd6f9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1347,6 +1347,25 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
return NULL;
}
+static inline void __swap_entries_free(struct swap_info_struct *si,
+ struct swap_cluster_info *ci,
+ swp_entry_t entry, unsigned int nr_pages)
+{
+ unsigned long offset = swp_offset(entry);
+
+ VM_BUG_ON(cluster_is_empty(ci));
+ VM_BUG_ON(ci->count < nr_pages);
+
+ ci->count -= nr_pages;
+ mem_cgroup_uncharge_swap(entry, nr_pages);
+ swap_range_free(si, offset, nr_pages);
+
+ if (!ci->count)
+ free_cluster(si, ci);
+ else
+ partial_free_cluster(si, ci);
+}
+
static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
unsigned long offset,
unsigned char usage)
@@ -1525,22 +1544,13 @@ static void swap_entry_range_free(struct swap_info_struct *si,
/* It should never free entries across different clusters */
VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
- VM_BUG_ON(cluster_is_empty(ci));
- VM_BUG_ON(ci->count < nr_pages);
- ci->count -= nr_pages;
do {
VM_BUG_ON(*map != SWAP_HAS_CACHE);
*map = 0;
} while (++map < map_end);
- mem_cgroup_uncharge_swap(entry, nr_pages);
- swap_range_free(si, offset, nr_pages);
-
- if (!ci->count)
- free_cluster(si, ci);
- else
- partial_free_cluster(si, ci);
+ __swap_entries_free(si, ci, entry, nr_pages);
}
static void cluster_swap_free_nr(struct swap_info_struct *si,
--
2.30.0
* [PATCH 3/9] mm: swap: use __swap_entry_free() to free swap entry in swap_entry_put_locked()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
2025-03-13 21:05 ` [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
2025-03-13 21:05 ` [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 20:59 ` Tim Chen
2025-03-13 21:05 ` [PATCH 4/9] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
` (6 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
In swap_entry_put_locked(), we set the slot to SWAP_HAS_CACHE before
using swap_entry_range_free() to do the actual swap entry freeing. This
introduces an unnecessary intermediate state.
By using __swap_entry_free() in swap_entry_put_locked(), we can eliminate
the need to set the slot to SWAP_HAS_CACHE.
This change makes the behavior of swap_entry_put_locked() more
consistent with other put() operations, which do the actual free work
after putting the last reference.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 28 ++++++++++++----------------
1 file changed, 12 insertions(+), 16 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7c886f9dd6f9..ba37b9bff586 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1367,9 +1367,11 @@ static inline void __swap_entries_free(struct swap_info_struct *si,
}
static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
- unsigned long offset,
+ struct swap_cluster_info *ci,
+ swp_entry_t entry,
unsigned char usage)
{
+ unsigned long offset = swp_offset(entry);
unsigned char count;
unsigned char has_cache;
@@ -1398,10 +1400,9 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
}
usage = count | has_cache;
- if (usage)
- WRITE_ONCE(si->swap_map[offset], usage);
- else
- WRITE_ONCE(si->swap_map[offset], SWAP_HAS_CACHE);
+ WRITE_ONCE(si->swap_map[offset], usage);
+ if (!usage)
+ __swap_entries_free(si, ci, entry, 1);
return usage;
}
@@ -1480,9 +1481,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
unsigned char usage;
ci = lock_cluster(si, offset);
- usage = swap_entry_put_locked(si, offset, 1);
- if (!usage)
- swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
+ usage = swap_entry_put_locked(si, ci, entry, 1);
unlock_cluster(ci);
return usage;
@@ -1562,8 +1561,8 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
ci = lock_cluster(si, offset);
do {
- if (!swap_entry_put_locked(si, offset, usage))
- swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
+ swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
+ usage);
} while (++offset < end);
unlock_cluster(ci);
}
@@ -1607,12 +1606,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
ci = lock_cluster(si, offset);
if (swap_only_has_cache(si, offset, size))
swap_entry_range_free(si, ci, entry, size);
- else {
- for (int i = 0; i < size; i++, entry.val++) {
- if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
- swap_entry_range_free(si, ci, entry, 1);
- }
- }
+ else
+ for (int i = 0; i < size; i++, entry.val++)
+ swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
unlock_cluster(ci);
}
--
2.30.0
* [PATCH 4/9] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (2 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 3/9] mm: swap: use __swap_entry_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 21:09 ` Tim Chen
2025-03-13 21:05 ` [PATCH 5/9] mm: swap: use swap_entries_free() drop last 1 flag in swap_entries_put_nr() Kemeng Shi
` (5 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
As all callers of swap_entry_range_free() have already ensured that the
slots to be freed are marked as SWAP_HAS_CACHE while holding the cluster
lock, the BUG_ON check can be safely removed. After this, the function
swap_entry_range_free() can drop any kind of last flag, so rename it to
swap_entries_free() and update its comment accordingly.
This is a preparation to use swap_entries_free() to drop last 1 and
SWAP_MAP_SHMEM flag.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ba37b9bff586..14b7b37996ff 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -52,9 +52,9 @@
static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
unsigned char);
static void free_swap_count_continuations(struct swap_info_struct *);
-static void swap_entry_range_free(struct swap_info_struct *si,
- struct swap_cluster_info *ci,
- swp_entry_t entry, unsigned int nr_pages);
+static void swap_entries_free(struct swap_info_struct *si,
+ struct swap_cluster_info *ci,
+ swp_entry_t entry, unsigned int nr_pages);
static void swap_range_alloc(struct swap_info_struct *si,
unsigned int nr_entries);
static bool folio_swapcache_freeable(struct folio *folio);
@@ -1511,7 +1511,7 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
for (i = 0; i < nr; i++)
WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
if (!has_cache)
- swap_entry_range_free(si, ci, entry, nr);
+ swap_entries_free(si, ci, entry, nr);
unlock_cluster(ci);
return has_cache;
@@ -1530,12 +1530,12 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
}
/*
- * Drop the last HAS_CACHE flag of swap entries, caller have to
- * ensure all entries belong to the same cgroup.
+ * Drop the last flag(1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries,
+ * caller have to ensure all entries belong to the same cgroup.
*/
-static void swap_entry_range_free(struct swap_info_struct *si,
- struct swap_cluster_info *ci,
- swp_entry_t entry, unsigned int nr_pages)
+static void swap_entries_free(struct swap_info_struct *si,
+ struct swap_cluster_info *ci,
+ swp_entry_t entry, unsigned int nr_pages)
{
unsigned long offset = swp_offset(entry);
unsigned char *map = si->swap_map + offset;
@@ -1545,7 +1545,6 @@ static void swap_entry_range_free(struct swap_info_struct *si,
VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
do {
- VM_BUG_ON(*map != SWAP_HAS_CACHE);
*map = 0;
} while (++map < map_end);
@@ -1605,7 +1604,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
ci = lock_cluster(si, offset);
if (swap_only_has_cache(si, offset, size))
- swap_entry_range_free(si, ci, entry, size);
+ swap_entries_free(si, ci, entry, size);
else
for (int i = 0; i < size; i++, entry.val++)
swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
--
2.30.0
* [PATCH 5/9] mm: swap: use swap_entries_free() drop last 1 flag in swap_entries_put_nr()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (3 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 4/9] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 21:21 ` Tim Chen
2025-03-13 21:05 ` [PATCH 6/9] mm: swap: drop last SWAP_MAP_SHMEM flag in batch " Kemeng Shi
` (4 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
Use swap_entries_free() to drop the last 1 flag, eliminating the need to
set the slot to the intermediate SWAP_HAS_CACHE state.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 14b7b37996ff..0ce0ca08594e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1508,10 +1508,11 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
unlock_cluster(ci);
goto fallback;
}
- for (i = 0; i < nr; i++)
- WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
if (!has_cache)
swap_entries_free(si, ci, entry, nr);
+ else
+ for (i = 0; i < nr; i++)
+ WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
unlock_cluster(ci);
return has_cache;
--
2.30.0
* [PATCH 6/9] mm: swap: drop last SWAP_MAP_SHMEM flag in batch in swap_entries_put_nr()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (4 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 5/9] mm: swap: use swap_entries_free() drop last 1 flag in swap_entries_put_nr() Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 21:34 ` Tim Chen
2025-03-13 21:05 ` [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
` (3 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
A SWAP_MAP_SHMEM count indicates the last map is from shmem. Therefore,
we can drop SWAP_MAP_SHMEM in batch in a similar way to dropping the last
1 flag in batch.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0ce0ca08594e..2d0f5d630211 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -192,7 +192,7 @@ static bool swap_is_last_map(struct swap_info_struct *si,
unsigned char *map_end = map + nr_pages;
unsigned char count = *map;
- if (swap_count(count) != 1)
+ if (swap_count(count) != 1 && swap_count(count) != SWAP_MAP_SHMEM)
return false;
while (++map < map_end) {
@@ -1497,7 +1497,10 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
unsigned char count;
int i;
- if (nr <= 1 || swap_count(data_race(si->swap_map[offset])) != 1)
+ if (nr <= 1)
+ goto fallback;
+ count = swap_count(data_race(si->swap_map[offset]));
+ if (count != 1 && count != SWAP_MAP_SHMEM)
goto fallback;
/* cross into another cluster */
if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
--
2.30.0
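Side note for readers: the equality check above works even when a slot
also holds a swap cache reference, because swap_count() strips only the
SWAP_HAS_CACHE bit. A standalone sketch of the bit arithmetic follows;
the flag values are quoted from include/linux/swap.h from memory and
should be verified against your tree:

#include <assert.h>

#define SWAP_HAS_CACHE  0x40    /* as in include/linux/swap.h */
#define SWAP_MAP_SHMEM  0xbf    /* as in include/linux/swap.h */

/* mirrors swap_count() in mm/swapfile.c */
static unsigned char swap_count(unsigned char ent)
{
        return ent & ~SWAP_HAS_CACHE;
}

int main(void)
{
        /* a shmem-owned slot, with and without the cache bit set */
        assert(swap_count(SWAP_MAP_SHMEM) == SWAP_MAP_SHMEM);
        assert(swap_count(SWAP_MAP_SHMEM | SWAP_HAS_CACHE) == SWAP_MAP_SHMEM);
        return 0;
}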
* [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (5 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 6/9] mm: swap: drop last SWAP_MAP_SHMEM flag in batch " Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-14 20:53 ` Tim Chen
2025-03-17 17:30 ` Tim Chen
2025-03-13 21:05 ` [PATCH 8/9] mm: swap: factor out helper to drop cache of entries within a single cluster Kemeng Shi
` (2 subsequent siblings)
9 siblings, 2 replies; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
1. Factor out a general swap_entries_put_map() helper to drop entries
belonging to one cluster. If the entries are the last map, free them in
batch; otherwise, put the entries with the cluster lock acquired and
released only once.
2. Iterate and call swap_entries_put_map() for each cluster in
swap_entries_put_nr() to leverage batch-remove for a last map belonging
to one cluster and to reduce lock acquire/release in the fallback case.
3. As swap_entries_put_nr() won't handle the SWAP_HAS_CACHE drop, rename
it to swap_entries_put_map_nr().
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 58 +++++++++++++++++++++++++--------------------------
1 file changed, 29 insertions(+), 29 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2d0f5d630211..ebac9ff74ba7 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1473,25 +1473,10 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
return NULL;
}
-static unsigned char swap_entry_put(struct swap_info_struct *si,
- swp_entry_t entry)
+static bool swap_entries_put_map(struct swap_info_struct *si,
+ swp_entry_t entry, int nr)
{
- struct swap_cluster_info *ci;
unsigned long offset = swp_offset(entry);
- unsigned char usage;
-
- ci = lock_cluster(si, offset);
- usage = swap_entry_put_locked(si, ci, entry, 1);
- unlock_cluster(ci);
-
- return usage;
-}
-
-static bool swap_entries_put_nr(struct swap_info_struct *si,
- swp_entry_t entry, int nr)
-{
- unsigned long offset = swp_offset(entry);
- unsigned int type = swp_type(entry);
struct swap_cluster_info *ci;
bool has_cache = false;
unsigned char count;
@@ -1502,9 +1487,6 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
count = swap_count(data_race(si->swap_map[offset]));
if (count != 1 && count != SWAP_MAP_SHMEM)
goto fallback;
- /* cross into another cluster */
- if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
- goto fallback;
ci = lock_cluster(si, offset);
if (!swap_is_last_map(si, offset, nr, &has_cache)) {
@@ -1521,15 +1503,33 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
return has_cache;
fallback:
- for (i = 0; i < nr; i++) {
- if (data_race(si->swap_map[offset + i])) {
- count = swap_entry_put(si, swp_entry(type, offset + i));
- if (count == SWAP_HAS_CACHE)
- has_cache = true;
- } else {
- WARN_ON_ONCE(1);
- }
+ ci = lock_cluster(si, offset);
+ for (i = 0; i < nr; i++, entry.val++) {
+ count = swap_entry_put_locked(si, ci, entry, 1);
+ if (count == SWAP_HAS_CACHE)
+ has_cache = true;
}
+ unlock_cluster(ci);
+ return has_cache;
+
+}
+
+static bool swap_entries_put_map_nr(struct swap_info_struct *si,
+ swp_entry_t entry, int nr)
+{
+ int cluster_nr, cluster_rest;
+ unsigned long offset = swp_offset(entry);
+ bool has_cache = false;
+
+ cluster_rest = SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER;
+ while (nr) {
+ cluster_nr = min(nr, cluster_rest);
+ has_cache |= swap_entries_put_map(si, entry, cluster_nr);
+ cluster_rest = SWAPFILE_CLUSTER;
+ nr -= cluster_nr;
+ entry.val += cluster_nr;
+ }
+
return has_cache;
}
@@ -1807,7 +1807,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
/*
* First free all entries in the range.
*/
- any_only_cache = swap_entries_put_nr(si, entry, nr);
+ any_only_cache = swap_entries_put_map_nr(si, entry, nr);
/*
* Short-circuit the below loop if none of the entries had their
--
2.30.0
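As a quick walk-through of the new splitting loop in
swap_entries_put_map_nr(): when the range starts near the end of one
cluster, the first iteration covers only the slots up to the cluster
boundary and later iterations take whole clusters. A standalone sketch of
the same arithmetic (the SWAPFILE_CLUSTER value and the offsets here are
made up purely for illustration):

#include <stdio.h>

#define SWAPFILE_CLUSTER 512    /* illustrative value only */

int main(void)
{
        unsigned long offset = 510;     /* hypothetical slot near a boundary */
        int nr = 5;
        int cluster_nr, cluster_rest;

        cluster_rest = SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER;
        while (nr) {
                cluster_nr = nr < cluster_rest ? nr : cluster_rest;
                /* each chunk stays within one cluster, as
                 * swap_entries_put_map() requires */
                printf("put %d entries at offset %lu\n", cluster_nr, offset);
                cluster_rest = SWAPFILE_CLUSTER;
                nr -= cluster_nr;
                offset += cluster_nr;
        }
        /* prints: 2 entries at offset 510, then 3 entries at offset 512 */
        return 0;
}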
* [PATCH 8/9] mm: swap: factor out helper to drop cache of entries within a single cluster
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (6 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-15 0:24 ` Tim Chen
2025-03-13 21:05 ` [PATCH 9/9] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() Kemeng Shi
2025-03-14 20:27 ` [PATCH 0/9] Minor cleanups and improvements to swap freeing code Tim Chen
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
Factor out the helper swap_entries_put_cache() from put_swap_folio() to
serve as a general-purpose routine for dropping the cache flag of entries
within a single cluster.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ebac9ff74ba7..343b34eb2a81 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1473,6 +1473,21 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
return NULL;
}
+static void swap_entries_put_cache(struct swap_info_struct *si,
+ swp_entry_t entry, int nr)
+{
+ unsigned long offset = swp_offset(entry);
+ struct swap_cluster_info *ci;
+
+ ci = lock_cluster(si, offset);
+ if (swap_only_has_cache(si, offset, nr))
+ swap_entries_free(si, ci, entry, nr);
+ else
+ for (int i = 0; i < nr; i++, entry.val++)
+ swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
+ unlock_cluster(ci);
+}
+
static bool swap_entries_put_map(struct swap_info_struct *si,
swp_entry_t entry, int nr)
{
@@ -1597,8 +1612,6 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
*/
void put_swap_folio(struct folio *folio, swp_entry_t entry)
{
- unsigned long offset = swp_offset(entry);
- struct swap_cluster_info *ci;
struct swap_info_struct *si;
int size = 1 << swap_entry_order(folio_order(folio));
@@ -1606,13 +1619,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
if (!si)
return;
- ci = lock_cluster(si, offset);
- if (swap_only_has_cache(si, offset, size))
- swap_entries_free(si, ci, entry, size);
- else
- for (int i = 0; i < size; i++, entry.val++)
- swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
- unlock_cluster(ci);
+ swap_entries_put_cache(si, entry, size);
}
int __swap_count(swp_entry_t entry)
--
2.30.0
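For reference, put_swap_folio() after this patch reduces to the
following (reconstructed from the hunk above; the _swap_info_get() line
is elided context in the diff and thus an assumption):

void put_swap_folio(struct folio *folio, swp_entry_t entry)
{
        struct swap_info_struct *si;
        int size = 1 << swap_entry_order(folio_order(folio));

        si = _swap_info_get(entry);
        if (!si)
                return;

        swap_entries_put_cache(si, entry, size);
}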
* [PATCH 9/9] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]()
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (7 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 8/9] mm: swap: factor out helper to drop cache of entries within a single cluster Kemeng Shi
@ 2025-03-13 21:05 ` Kemeng Shi
2025-03-17 18:23 ` Tim Chen
2025-03-14 20:27 ` [PATCH 0/9] Minor cleanups and improvements to swap freeing code Tim Chen
9 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-13 21:05 UTC (permalink / raw)
To: akpm; +Cc: kasong, linux-mm, linux-kernel
Replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() to
remove repeated code and leverage batch-remove for entries with the last
flag.
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
mm/swapfile.c | 21 ++-------------------
1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 343b34eb2a81..c27cf09d84a6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1570,21 +1570,6 @@ static void swap_entries_free(struct swap_info_struct *si,
__swap_entries_free(si, ci, entry, nr_pages);
}
-static void cluster_swap_free_nr(struct swap_info_struct *si,
- unsigned long offset, int nr_pages,
- unsigned char usage)
-{
- struct swap_cluster_info *ci;
- unsigned long end = offset + nr_pages;
-
- ci = lock_cluster(si, offset);
- do {
- swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
- usage);
- } while (++offset < end);
- unlock_cluster(ci);
-}
-
/*
* Caller has made sure that the swap device corresponding to entry
* is still around or has not been recycled.
@@ -1601,7 +1586,7 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
while (nr_pages) {
nr = min_t(int, nr_pages, SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
- cluster_swap_free_nr(sis, offset, nr, 1);
+ swap_entries_put_map(sis, swp_entry(sis->type, offset), nr);
offset += nr;
nr_pages -= nr;
}
@@ -3632,9 +3617,7 @@ int swapcache_prepare(swp_entry_t entry, int nr)
void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
{
- unsigned long offset = swp_offset(entry);
-
- cluster_swap_free_nr(si, offset, nr, SWAP_HAS_CACHE);
+ swap_entries_put_cache(si, entry, nr);
}
struct swap_info_struct *swp_swap_info(swp_entry_t entry)
--
2.30.0
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper
2025-03-13 17:42 ` Kairui Song
@ 2025-03-14 7:32 ` Kemeng Shi
2025-03-14 7:47 ` Kairui Song
0 siblings, 1 reply; 25+ messages in thread
From: Kemeng Shi @ 2025-03-14 7:32 UTC (permalink / raw)
To: Kairui Song; +Cc: akpm, linux-mm, linux-kernel
on 3/14/2025 1:42 AM, Kairui Song wrote:
> On Thu, Mar 13, 2025 at 8:09 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>
>> Factor out the actual swap entry freeing logic into a new helper,
>> __swap_entries_free().
>> This allows us to further simplify other swap entry freeing code by
>> leveraging the __swap_entries_free() helper.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>> mm/swapfile.c | 30 ++++++++++++++++++++----------
>> 1 file changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 5a775456e26c..7c886f9dd6f9 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1347,6 +1347,25 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
>> return NULL;
>> }
>>
>> +static inline void __swap_entries_free(struct swap_info_struct *si,
>> + struct swap_cluster_info *ci,
>> + swp_entry_t entry, unsigned int nr_pages)
>> +{
>> + unsigned long offset = swp_offset(entry);
>> +
>> + VM_BUG_ON(cluster_is_empty(ci));
>> + VM_BUG_ON(ci->count < nr_pages);
>> +
>> + ci->count -= nr_pages;
>> + mem_cgroup_uncharge_swap(entry, nr_pages);
>> + swap_range_free(si, offset, nr_pages);
>> +
>> + if (!ci->count)
>> + free_cluster(si, ci);
>> + else
>> + partial_free_cluster(si, ci);
>> +}
>> +
>> static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
>> unsigned long offset,
>> unsigned char usage)
>> @@ -1525,22 +1544,13 @@ static void swap_entry_range_free(struct swap_info_struct *si,
>>
>> /* It should never free entries across different clusters */
>> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
>> - VM_BUG_ON(cluster_is_empty(ci));
>> - VM_BUG_ON(ci->count < nr_pages);
>>
>> - ci->count -= nr_pages;
>> do {
>> VM_BUG_ON(*map != SWAP_HAS_CACHE);
>> *map = 0;
>> } while (++map < map_end);
>>
>> - mem_cgroup_uncharge_swap(entry, nr_pages);
>> - swap_range_free(si, offset, nr_pages);
>> -
>> - if (!ci->count)
>> - free_cluster(si, ci);
>> - else
>> - partial_free_cluster(si, ci);
>> + __swap_entries_free(si, ci, entry, nr_pages);
>> }
>>
>> static void cluster_swap_free_nr(struct swap_info_struct *si,
>> --
>> 2.30.0
>>
>>
>
> Hi Kemeng,
Hello Kairui,
Thanks for the feedback.
>
> This patch is a bit too trivial to be a standalone one; you can easily
> fold it into the later one. Also, you may want to carry the
> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1)); check
> as well.
Sure, I will do this in the next version.
>
> But that is basically just a rename of swap_entry_range_free(); you may
> just remove the HAS_CACHE check as you rename it; that way the changes
> should be cleaner.
>
Sorry, I don't quite follow. Are you suggesting that I should fold this
patch into the later one which removes the HAS_CACHE check and renames
swap_entry_range_free()?
Thanks,
Kemeng
* Re: [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper
2025-03-14 7:32 ` Kemeng Shi
@ 2025-03-14 7:47 ` Kairui Song
2025-03-14 8:39 ` Kemeng Shi
0 siblings, 1 reply; 25+ messages in thread
From: Kairui Song @ 2025-03-14 7:47 UTC (permalink / raw)
To: Kemeng Shi; +Cc: akpm, linux-mm, linux-kernel
On Fri, Mar 14, 2025 at 3:32 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> on 3/14/2025 1:42 AM, Kairui Song wrote:
> > On Thu, Mar 13, 2025 at 8:09 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> >>
> >> Factor out the actual swap entry freeing logic into a new helper,
> >> __swap_entries_free().
> >> This allows us to further simplify other swap entry freeing code by
> >> leveraging the __swap_entries_free() helper.
> >>
> >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> >> ---
> >> mm/swapfile.c | 30 ++++++++++++++++++++----------
> >> 1 file changed, 20 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/mm/swapfile.c b/mm/swapfile.c
> >> index 5a775456e26c..7c886f9dd6f9 100644
> >> --- a/mm/swapfile.c
> >> +++ b/mm/swapfile.c
> >> @@ -1347,6 +1347,25 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
> >> return NULL;
> >> }
> >>
> >> +static inline void __swap_entries_free(struct swap_info_struct *si,
> >> + struct swap_cluster_info *ci,
> >> + swp_entry_t entry, unsigned int nr_pages)
> >> +{
> >> + unsigned long offset = swp_offset(entry);
> >> +
> >> + VM_BUG_ON(cluster_is_empty(ci));
> >> + VM_BUG_ON(ci->count < nr_pages);
> >> +
> >> + ci->count -= nr_pages;
> >> + mem_cgroup_uncharge_swap(entry, nr_pages);
> >> + swap_range_free(si, offset, nr_pages);
> >> +
> >> + if (!ci->count)
> >> + free_cluster(si, ci);
> >> + else
> >> + partial_free_cluster(si, ci);
> >> +}
> >> +
> >> static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> >> unsigned long offset,
> >> unsigned char usage)
> >> @@ -1525,22 +1544,13 @@ static void swap_entry_range_free(struct swap_info_struct *si,
> >>
> >> /* It should never free entries across different clusters */
> >> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
> >> - VM_BUG_ON(cluster_is_empty(ci));
> >> - VM_BUG_ON(ci->count < nr_pages);
> >>
> >> - ci->count -= nr_pages;
> >> do {
> >> VM_BUG_ON(*map != SWAP_HAS_CACHE);
> >> *map = 0;
> >> } while (++map < map_end);
> >>
> >> - mem_cgroup_uncharge_swap(entry, nr_pages);
> >> - swap_range_free(si, offset, nr_pages);
> >> -
> >> - if (!ci->count)
> >> - free_cluster(si, ci);
> >> - else
> >> - partial_free_cluster(si, ci);
> >> + __swap_entries_free(si, ci, entry, nr_pages);
> >> }
> >>
> >> static void cluster_swap_free_nr(struct swap_info_struct *si,
> >> --
> >> 2.30.0
> >>
> >>
> >
> > Hi Kemeng,
> Hello Kairui,
>
> Thanks for the feedback.
> >
> > This patch is a bit too trivial to be a standalone one; you can easily
> > fold it into the later one. Also, you may want to carry the
> > VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1)); check
> > as well.
> Sure, I will do this in the next version.
> >
> > But that is basically just a rename of swap_entry_range_free(); you may
> > just remove the HAS_CACHE check as you rename it; that way the changes
> > should be cleaner.
> >
> Sorry, I don't quite follow. Are you suggesting that I should fold this
> patch into the later one which removes the HAS_CACHE check and renames
> swap_entry_range_free()?
Hi,
Just some of my nitpicks :)
After you move these parts out of swap_entry_put_locked, there is
almost nothing left in swap_entry_put_locked except an "open coded
memset". And in your next patch (also after the whole series), all
callers of __swap_entries_free will have to call an "open coded
memset" anyway, so these changes seem redundant and could be improved.
BTW your next patch has a typo in the commit message:
s/__swap_entriy_free/__swap_entries_free/g.
>
> Thanks,
> Kemeng
>
* Re: [PATCH 2/9] mm: swap: factor out the actual swap entry freeing logic to new helper
2025-03-14 7:47 ` Kairui Song
@ 2025-03-14 8:39 ` Kemeng Shi
0 siblings, 0 replies; 25+ messages in thread
From: Kemeng Shi @ 2025-03-14 8:39 UTC (permalink / raw)
To: Kairui Song; +Cc: akpm, linux-mm, linux-kernel
on 3/14/2025 3:47 PM, Kairui Song wrote:
> On Fri, Mar 14, 2025 at 3:32 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>
>> on 3/14/2025 1:42 AM, Kairui Song wrote:
>>> On Thu, Mar 13, 2025 at 8:09 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>>>
>>>> Factor out the actual swap entry freeing logic into a new helper,
>>>> __swap_entries_free().
>>>> This allows us to further simplify other swap entry freeing code by
>>>> leveraging the __swap_entries_free() helper.
>>>>
>>>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>>>> ---
>>>> mm/swapfile.c | 30 ++++++++++++++++++++----------
>>>> 1 file changed, 20 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>>>> index 5a775456e26c..7c886f9dd6f9 100644
>>>> --- a/mm/swapfile.c
>>>> +++ b/mm/swapfile.c
>>>> @@ -1347,6 +1347,25 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
>>>> return NULL;
>>>> }
>>>>
>>>> +static inline void __swap_entries_free(struct swap_info_struct *si,
>>>> + struct swap_cluster_info *ci,
>>>> + swp_entry_t entry, unsigned int nr_pages)
>>>> +{
>>>> + unsigned long offset = swp_offset(entry);
>>>> +
>>>> + VM_BUG_ON(cluster_is_empty(ci));
>>>> + VM_BUG_ON(ci->count < nr_pages);
>>>> +
>>>> + ci->count -= nr_pages;
>>>> + mem_cgroup_uncharge_swap(entry, nr_pages);
>>>> + swap_range_free(si, offset, nr_pages);
>>>> +
>>>> + if (!ci->count)
>>>> + free_cluster(si, ci);
>>>> + else
>>>> + partial_free_cluster(si, ci);
>>>> +}
>>>> +
>>>> static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
>>>> unsigned long offset,
>>>> unsigned char usage)
>>>> @@ -1525,22 +1544,13 @@ static void swap_entry_range_free(struct swap_info_struct *si,
>>>>
>>>> /* It should never free entries across different clusters */
>>>> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
>>>> - VM_BUG_ON(cluster_is_empty(ci));
>>>> - VM_BUG_ON(ci->count < nr_pages);
>>>>
>>>> - ci->count -= nr_pages;
>>>> do {
>>>> VM_BUG_ON(*map != SWAP_HAS_CACHE);
>>>> *map = 0;
>>>> } while (++map < map_end);
>>>>
>>>> - mem_cgroup_uncharge_swap(entry, nr_pages);
>>>> - swap_range_free(si, offset, nr_pages);
>>>> -
>>>> - if (!ci->count)
>>>> - free_cluster(si, ci);
>>>> - else
>>>> - partial_free_cluster(si, ci);
>>>> + __swap_entries_free(si, ci, entry, nr_pages);
>>>> }
>>>>
>>>> static void cluster_swap_free_nr(struct swap_info_struct *si,
>>>> --
>>>> 2.30.0
>>>>
>>>>
>>>
>>> Hi Kemeng,
>> Hello Kairui,
>>
>> Thanks for the feedback.
>>>
>>> This patch is a bit too trivial to be a standalone one; you can easily
>>> fold it into the later one. Also, you may want to carry the
>>> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1)); check
>>> as well.
>> Sure, I will do this in the next version.
>>>
>>> But that is basically just a rename of swap_entry_range_free(); you may
>>> just remove the HAS_CACHE check as you rename it; that way the changes
>>> should be cleaner.
>>>
>> Sorry, I don't quite follow. Are you suggesting that I should fold this
>> patch into the later one which removes the HAS_CACHE check and renames
>> swap_entry_range_free()?
>
> Hi,
>
> Just some of my nitpicks :)
>
> After you move these parts out of swap_entry_put_locked, there is
> almost nothing left in swap_entry_put_locked except an "open coded
> memset". And in your next patch (also after the whole series), all
> callers of __swap_entries_free will have to call an "open coded
> memset" anyway, so these changes seem redundant and could be improved.
Right, we can simply use swap_entries_free() instead of
__swap_entries_free() in swap_entry_put_locked() after the whole series,
so this change seems unnecessary. Will improve this in the next version.
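That is, the tail of swap_entry_put_locked() would become something like
(a sketch of the plan above, untested):

        usage = count | has_cache;
        WRITE_ONCE(si->swap_map[offset], usage);
        if (!usage)
                swap_entries_free(si, ci, entry, 1);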
>
> BTW your next patch has a typo in the commit message:
> s/__swap_entriy_free/__swap_entries_free/g.
Will fix this in the next version.
Thanks,
Kemeng
>
>>
>> Thanks,
>> Kemeng
>>
>
* Re: [PATCH 0/9] Minor cleanups and improvements to swap freeing code
2025-03-13 21:05 [PATCH 0/9] Minor cleanups and improvements to swap freeing code Kemeng Shi
` (8 preceding siblings ...)
2025-03-13 21:05 ` [PATCH 9/9] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() Kemeng Shi
@ 2025-03-14 20:27 ` Tim Chen
2025-03-18 1:29 ` Kemeng Shi
9 siblings, 1 reply; 25+ messages in thread
From: Tim Chen @ 2025-03-14 20:27 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> Hi All,
> This series contains some cleanups and improvements made while learning
> the swapfile code. Here is a summary of the changes:
Nice work.
> 1. Function naming improvements.
> -Use "put" instead of "free" to name functions which only do the actual
> free when the count drops to zero.
> -Use "entry" to name functions that free only one swap slot. Use
> "entries" to name functions that may free multiple swap slots within one
> cluster. Use the "_nr" suffix to name functions that may free multiple
> swap slots spanning multiple clusters.
It would be nice to add some comments in the code noting that functions
with _nr may cross cluster boundaries while those without stay within a
cluster.
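For instance, something like this near the helpers (one possible wording,
not taken from the series):

/*
 * Helpers without a _nr suffix operate on entries within a single
 * cluster; the _nr variants accept ranges that may cross cluster
 * boundaries and split the work per cluster internally.
 */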
> 2. Eliminate the need to set a swap slot to the intermediate
> SWAP_HAS_CACHE value before doing the actual free, by using
> swap_entry_range_free().
> 3. Add helpers swap_entries_put_map() and swap_entries_put_cache() as
> general-purpose routines to free swap entries within a single cluster;
> they try batch-remove first and fall back to putting each entry
> individually, with the cluster lock acquired/released only once. By using
> these helpers, we can remove repeated code, leverage batch-remove in more
> cases, and avoid acquiring/releasing the cluster lock for each single
> swap entry.
Wonder if the batching shows up in any swap performance improvement?
Tim
>
> Kemeng Shi (9):
> mm: swap: rename __swap_[entry/entries]_free[_locked] to
> swap_[entry/entries]_put[_locked]
> mm: swap: factor out the actual swap entry freeing logic to new helper
> mm: swap: use __swap_entry_free() to free swap entry in
> swap_entry_put_locked()
> mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in
> swap_entry_range_free()
> mm: swap: use swap_entries_free() drop last 1 flag in
> swap_entries_put_nr()
> mm: swap: drop last SWAP_MAP_SHMEM flag in batch in
> swap_entries_put_nr()
> mm: swap: free each cluster individually in swap_entries_put_map_nr()
> mm: swap: factor out helper to drop cache of entries within a single
> cluster
> mm: swap: replace cluster_swap_free_nr() with
> swap_entries_put_[map/cache]()
>
> mm/swapfile.c | 173 +++++++++++++++++++++++++-------------------------
> 1 file changed, 86 insertions(+), 87 deletions(-)
>
* Re: [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked]
2025-03-13 21:05 ` [PATCH 1/9] mm: swap: rename __swap_[entry/entries]_free[_locked] to swap_[entry/entries]_put[_locked] Kemeng Shi
@ 2025-03-14 20:37 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 20:37 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> In __swap_entry_free[_locked] and __swap_entries_free, we decrease the
> count first and only free the swap entry if the count drops to zero. This
> behavior is more akin to a put() operation than a free() operation.
> Therefore, rename these functions with "put" instead of "free".
> Additionally, add the "_nr" suffix to swap_entries_put to indicate that
> the input range may span swap clusters.
>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/swapfile.c | 28 ++++++++++++++--------------
> 1 file changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 628f67974a7c..5a775456e26c 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1347,9 +1347,9 @@ static struct swap_info_struct *_swap_info_get(swp_entry_t entry)
> return NULL;
> }
>
> -static unsigned char __swap_entry_free_locked(struct swap_info_struct *si,
> - unsigned long offset,
> - unsigned char usage)
> +static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> + unsigned long offset,
> + unsigned char usage)
> {
> unsigned char count;
> unsigned char has_cache;
> @@ -1453,15 +1453,15 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
> return NULL;
> }
>
> -static unsigned char __swap_entry_free(struct swap_info_struct *si,
> - swp_entry_t entry)
> +static unsigned char swap_entry_put(struct swap_info_struct *si,
> + swp_entry_t entry)
> {
> struct swap_cluster_info *ci;
> unsigned long offset = swp_offset(entry);
> unsigned char usage;
>
> ci = lock_cluster(si, offset);
> - usage = __swap_entry_free_locked(si, offset, 1);
> + usage = swap_entry_put_locked(si, offset, 1);
> if (!usage)
> swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> unlock_cluster(ci);
> @@ -1469,8 +1469,8 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
> return usage;
> }
>
> -static bool __swap_entries_free(struct swap_info_struct *si,
> - swp_entry_t entry, int nr)
> +static bool swap_entries_put_nr(struct swap_info_struct *si,
> + swp_entry_t entry, int nr)
> {
> unsigned long offset = swp_offset(entry);
> unsigned int type = swp_type(entry);
> @@ -1501,7 +1501,7 @@ static bool __swap_entries_free(struct swap_info_struct *si,
> fallback:
> for (i = 0; i < nr; i++) {
> if (data_race(si->swap_map[offset + i])) {
> - count = __swap_entry_free(si, swp_entry(type, offset + i));
> + count = swap_entry_put(si, swp_entry(type, offset + i));
> if (count == SWAP_HAS_CACHE)
> has_cache = true;
> } else {
> @@ -1552,7 +1552,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>
> ci = lock_cluster(si, offset);
> do {
> - if (!__swap_entry_free_locked(si, offset, usage))
> + if (!swap_entry_put_locked(si, offset, usage))
> swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> } while (++offset < end);
> unlock_cluster(ci);
> @@ -1599,7 +1599,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
> swap_entry_range_free(si, ci, entry, size);
> else {
> for (int i = 0; i < size; i++, entry.val++) {
> - if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE))
> + if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
> swap_entry_range_free(si, ci, entry, 1);
> }
> }
> @@ -1798,7 +1798,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
> /*
> * First free all entries in the range.
> */
> - any_only_cache = __swap_entries_free(si, entry, nr);
> + any_only_cache = swap_entries_put_nr(si, entry, nr);
>
> /*
> * Short-circuit the below loop if none of the entries had their
> @@ -1811,7 +1811,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
> * Now go back over the range trying to reclaim the swap cache. This is
> * more efficient for large folios because we will only try to reclaim
> * the swap once per folio in the common case. If we do
> - * __swap_entry_free() and __try_to_reclaim_swap() in the same loop, the
> + * swap_entry_put() and __try_to_reclaim_swap() in the same loop, the
> * latter will get a reference and lock the folio for every individual
> * page but will only succeed once the swap slot for every subpage is
> * zero.
> @@ -3758,7 +3758,7 @@ int add_swap_count_continuation(swp_entry_t entry, gfp_t gfp_mask)
> * into, carry if so, or else fail until a new continuation page is allocated;
> * when the original swap_map count is decremented from 0 with continuation,
> * borrow from the continuation and report whether it still holds more.
> - * Called while __swap_duplicate() or caller of __swap_entry_free_locked()
> + * Called while __swap_duplicate() or caller of swap_entry_put_locked()
> * holds cluster lock.
> */
> static bool swap_count_continued(struct swap_info_struct *si,
* Re: [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr()
2025-03-13 21:05 ` [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
@ 2025-03-14 20:53 ` Tim Chen
2025-03-17 17:30 ` Tim Chen
1 sibling, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 20:53 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> 1. Factor out a general swap_entries_put_map() helper to drop entries
> belonging to one cluster. If the entries are the last map, free them in
> batch; otherwise, put the entries with the cluster lock acquired and
> released only once.
> 2. Iterate and call swap_entries_put_map() for each cluster in
> swap_entries_put_nr() to leverage batch-remove for a last map belonging
> to one cluster and to reduce lock acquire/release in the fallback case.
> 3. As swap_entries_put_nr() won't handle the SWAP_HAS_CACHE drop, rename
> it to swap_entries_put_map_nr().
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/swapfile.c | 58 +++++++++++++++++++++++++--------------------------
> 1 file changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 2d0f5d630211..ebac9ff74ba7 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1473,25 +1473,10 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
> return NULL;
> }
>
> -static unsigned char swap_entry_put(struct swap_info_struct *si,
> - swp_entry_t entry)
Need to update the comment in free_swap_and_cache_nr() now that swap_entry_put() goes away:
* the swap once per folio in the common case. If we do
* swap_entry_put() and __try_to_reclaim_swap() in the same loop, the
* latter will get a reference and lock the folio for every individual ...
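One possible rewording once swap_entry_put() is gone (a sketch; picking
swap_entries_put_map_nr() as the new referent is an assumption, the
actual fix is up to the next revision):

 * the swap once per folio in the common case. If we do
 * swap_entries_put_map_nr() and __try_to_reclaim_swap() in the same
 * loop, the latter will get a reference and lock the folio for every
 * individual page ...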
> +static bool swap_entries_put_map(struct swap_info_struct *si,
> + swp_entry_t entry, int nr)
> {
> - struct swap_cluster_info *ci;
> unsigned long offset = swp_offset(entry);
> - unsigned char usage;
> -
> - ci = lock_cluster(si, offset);
> - usage = swap_entry_put_locked(si, ci, entry, 1);
> - unlock_cluster(ci);
> -
> - return usage;
> -}
> -
* Re: [PATCH 3/9] mm: swap: use __swap_entry_free() to free swap entry in swap_entry_put_locked()
2025-03-13 21:05 ` [PATCH 3/9] mm: swap: use __swap_entry_free() to free swap entry in swap_entry_put_locked() Kemeng Shi
@ 2025-03-14 20:59 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 20:59 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> In swap_entry_put_locked(), we set the slot to SWAP_HAS_CACHE before
> using swap_entry_range_free() to do the actual swap entry freeing. This
> introduces an unnecessary intermediate state.
> By using __swap_entry_free() in swap_entry_put_locked(), we can eliminate
> the need to set the slot to SWAP_HAS_CACHE.
> This change makes the behavior of swap_entry_put_locked() more
> consistent with other put() operations, which do the actual free work
> after putting the last reference.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> mm/swapfile.c | 28 ++++++++++++----------------
> 1 file changed, 12 insertions(+), 16 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 7c886f9dd6f9..ba37b9bff586 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1367,9 +1367,11 @@ static inline void __swap_entries_free(struct swap_info_struct *si,
> }
>
> static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> - unsigned long offset,
> + struct swap_cluster_info *ci,
> + swp_entry_t entry,
> unsigned char usage)
> {
> + unsigned long offset = swp_offset(entry);
> unsigned char count;
> unsigned char has_cache;
>
> @@ -1398,10 +1400,9 @@ static unsigned char swap_entry_put_locked(struct swap_info_struct *si,
> }
>
> usage = count | has_cache;
> - if (usage)
> - WRITE_ONCE(si->swap_map[offset], usage);
> - else
> - WRITE_ONCE(si->swap_map[offset], SWAP_HAS_CACHE);
> + WRITE_ONCE(si->swap_map[offset], usage);
> + if (!usage)
> + __swap_entries_free(si, ci, entry, 1);
>
> return usage;
> }
> @@ -1480,9 +1481,7 @@ static unsigned char swap_entry_put(struct swap_info_struct *si,
> unsigned char usage;
>
> ci = lock_cluster(si, offset);
> - usage = swap_entry_put_locked(si, offset, 1);
> - if (!usage)
> - swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> + usage = swap_entry_put_locked(si, ci, entry, 1);
> unlock_cluster(ci);
>
> return usage;
> @@ -1562,8 +1561,8 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>
> ci = lock_cluster(si, offset);
> do {
> - if (!swap_entry_put_locked(si, offset, usage))
> - swap_entry_range_free(si, ci, swp_entry(si->type, offset), 1);
> + swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
> + usage);
> } while (++offset < end);
> unlock_cluster(ci);
> }
> @@ -1607,12 +1606,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
> ci = lock_cluster(si, offset);
> if (swap_only_has_cache(si, offset, size))
> swap_entry_range_free(si, ci, entry, size);
> - else {
> - for (int i = 0; i < size; i++, entry.val++) {
> - if (!swap_entry_put_locked(si, offset + i, SWAP_HAS_CACHE))
> - swap_entry_range_free(si, ci, entry, 1);
> - }
> - }
> + else
> + for (int i = 0; i < size; i++, entry.val++)
> + swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
> unlock_cluster(ci);
> }
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/9] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free()
2025-03-13 21:05 ` [PATCH 4/9] mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in swap_entry_range_free() Kemeng Shi
@ 2025-03-14 21:09 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 21:09 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> As all callers of swap_entry_range_free() have already ensured that the
> slots to be freed are marked as SWAP_HAS_CACHE while holding the cluster
> lock, the VM_BUG_ON check can be safely removed. After this, the function
> swap_entry_range_free() can drop any kind of last flag, so rename it to
> swap_entries_free() and update its comment accordingly.
>
> This is a preparation for using swap_entries_free() to drop the last 1 and
Probably clearer to say "drop the last ref count" instead of "drop last 1".
> SWAP_MAP_SHMEM flag.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/swapfile.c | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index ba37b9bff586..14b7b37996ff 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -52,9 +52,9 @@
> static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
> unsigned char);
> static void free_swap_count_continuations(struct swap_info_struct *);
> -static void swap_entry_range_free(struct swap_info_struct *si,
> - struct swap_cluster_info *ci,
> - swp_entry_t entry, unsigned int nr_pages);
> +static void swap_entries_free(struct swap_info_struct *si,
> + struct swap_cluster_info *ci,
> + swp_entry_t entry, unsigned int nr_pages);
> static void swap_range_alloc(struct swap_info_struct *si,
> unsigned int nr_entries);
> static bool folio_swapcache_freeable(struct folio *folio);
> @@ -1511,7 +1511,7 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
> for (i = 0; i < nr; i++)
> WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
> if (!has_cache)
> - swap_entry_range_free(si, ci, entry, nr);
> + swap_entries_free(si, ci, entry, nr);
> unlock_cluster(ci);
>
> return has_cache;
> @@ -1530,12 +1530,12 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
> }
>
> /*
> - * Drop the last HAS_CACHE flag of swap entries, caller have to
> - * ensure all entries belong to the same cgroup.
> + * Drop the last flag(1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries,
> + * caller have to ensure all entries belong to the same cgroup.
It would be nice to modify the above comment to say:
all entries belong to the same cgroup and cluster.
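i.e. something like (also fixing the "caller have to" grammar):
/*
 * Drop the last flag (1, SWAP_HAS_CACHE or SWAP_MAP_SHMEM) of swap entries;
 * the caller has to ensure all entries belong to the same cgroup and cluster.
 */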
Otherwise,
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
>
> -static void swap_entry_range_free(struct swap_info_struct *si,
> - struct swap_cluster_info *ci,
> - swp_entry_t entry, unsigned int nr_pages)
> +static void swap_entries_free(struct swap_info_struct *si,
> + struct swap_cluster_info *ci,
> + swp_entry_t entry, unsigned int nr_pages)
> {
> unsigned long offset = swp_offset(entry);
> unsigned char *map = si->swap_map + offset;
> @@ -1545,7 +1545,6 @@ static void swap_entry_range_free(struct swap_info_struct *si,
> VM_BUG_ON(ci != offset_to_cluster(si, offset + nr_pages - 1));
>
> do {
> - VM_BUG_ON(*map != SWAP_HAS_CACHE);
> *map = 0;
> } while (++map < map_end);
>
> @@ -1605,7 +1604,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>
> ci = lock_cluster(si, offset);
> if (swap_only_has_cache(si, offset, size))
> - swap_entry_range_free(si, ci, entry, size);
> + swap_entries_free(si, ci, entry, size);
> else
> for (int i = 0; i < size; i++, entry.val++)
> swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/9] mm: swap: use swap_entries_free() drop last 1 flag in swap_entries_put_nr()
2025-03-13 21:05 ` [PATCH 5/9] mm: swap: use swap_entries_free() drop last 1 flag in swap_entries_put_nr() Kemeng Shi
@ 2025-03-14 21:21 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 21:21 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> Use swap_entries_free() to drop the last 1 flag, eliminating the need to
> set the slot to the intermediate SWAP_HAS_CACHE state.
>
Perhaps clearer:
Use swap_entries_free() to directly free swap entries when the entries
are no longer cached or referenced, without needing to set the entries
to the intermediate SWAP_HAS_CACHE state.
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> mm/swapfile.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 14b7b37996ff..0ce0ca08594e 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1508,10 +1508,11 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
> unlock_cluster(ci);
> goto fallback;
> }
> - for (i = 0; i < nr; i++)
> - WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
> if (!has_cache)
> swap_entries_free(si, ci, entry, nr);
> + else
> + for (i = 0; i < nr; i++)
> + WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
> unlock_cluster(ci);
>
> return has_cache;
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 6/9] mm: swap: drop last SWAP_MAP_SHMEM flag in batch in swap_entries_put_nr()
2025-03-13 21:05 ` [PATCH 6/9] mm: swap: drop last SWAP_MAP_SHMEM flag in batch " Kemeng Shi
@ 2025-03-14 21:34 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-14 21:34 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> The SWAP_MAP_SHMEM count indicates the last map from shmem. Therefore we
> can drop SWAP_MAP_SHMEM in batch, in the same way the last 1 flag is
> dropped in batch.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> mm/swapfile.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 0ce0ca08594e..2d0f5d630211 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -192,7 +192,7 @@ static bool swap_is_last_map(struct swap_info_struct *si,
> unsigned char *map_end = map + nr_pages;
> unsigned char count = *map;
>
> - if (swap_count(count) != 1)
> + if (swap_count(count) != 1 && swap_count(count) != SWAP_MAP_SHMEM)
> return false;
>
> while (++map < map_end) {
> @@ -1497,7 +1497,10 @@ static bool swap_entries_put_nr(struct swap_info_struct *si,
> unsigned char count;
> int i;
>
> - if (nr <= 1 || swap_count(data_race(si->swap_map[offset])) != 1)
> + if (nr <= 1)
> + goto fallback;
> + count = swap_count(data_race(si->swap_map[offset]));
> + if (count != 1 && count != SWAP_MAP_SHMEM)
> goto fallback;
> /* cross into another cluster */
> if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 8/9] mm: swap: factor out helper to drop cache of entries within a single cluster
2025-03-13 21:05 ` [PATCH 8/9] mm: swap: factor out helper to drop cache of entries within a single cluster Kemeng Shi
@ 2025-03-15 0:24 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-15 0:24 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> Factor out helper swap_entries_put_cache() from put_swap_folio() to serve
> as a general-purpose routine for dropping the cache flag of entries within
> a single cluster.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> mm/swapfile.c | 25 ++++++++++++++++---------
> 1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index ebac9ff74ba7..343b34eb2a81 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1473,6 +1473,21 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
> return NULL;
> }
>
> +static void swap_entries_put_cache(struct swap_info_struct *si,
> + swp_entry_t entry, int nr)
> +{
> + unsigned long offset = swp_offset(entry);
> + struct swap_cluster_info *ci;
> +
> + ci = lock_cluster(si, offset);
> + if (swap_only_has_cache(si, offset, nr))
> + swap_entries_free(si, ci, entry, nr);
> + else
> + for (int i = 0; i < nr; i++, entry.val++)
> + swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
> + unlock_cluster(ci);
> +}
> +
> static bool swap_entries_put_map(struct swap_info_struct *si,
> swp_entry_t entry, int nr)
> {
> @@ -1597,8 +1612,6 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
> */
> void put_swap_folio(struct folio *folio, swp_entry_t entry)
> {
> - unsigned long offset = swp_offset(entry);
> - struct swap_cluster_info *ci;
> struct swap_info_struct *si;
> int size = 1 << swap_entry_order(folio_order(folio));
>
> @@ -1606,13 +1619,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
> if (!si)
> return;
>
> - ci = lock_cluster(si, offset);
> - if (swap_only_has_cache(si, offset, size))
> - swap_entries_free(si, ci, entry, size);
> - else
> - for (int i = 0; i < size; i++, entry.val++)
> - swap_entry_put_locked(si, ci, entry, SWAP_HAS_CACHE);
> - unlock_cluster(ci);
> + swap_entries_put_cache(si, entry, size);
> }
>
> int __swap_count(swp_entry_t entry)
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr()
2025-03-13 21:05 ` [PATCH 7/9] mm: swap: free each cluster individually in swap_entries_put_map_nr() Kemeng Shi
2025-03-14 20:53 ` Tim Chen
@ 2025-03-17 17:30 ` Tim Chen
1 sibling, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-17 17:30 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
>
> fallback:
> - for (i = 0; i < nr; i++) {
> - if (data_race(si->swap_map[offset + i])) {
> - count = swap_entry_put(si, swp_entry(type, offset + i));
> - if (count == SWAP_HAS_CACHE)
> - has_cache = true;
> - } else {
> - WARN_ON_ONCE(1);
> - }
> + ci = lock_cluster(si, offset);
Relocking the cluster is unnecessary overhead; we could avoid releasing
the ci lock before jumping to the fallback label when it is already held.
Maybe something like:
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9cd93a03b25c..41aa841b86d6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1504,8 +1504,7 @@ static bool swap_entries_put_map(struct swap_info_struct *si,
 
 	ci = lock_cluster(si, offset);
 	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
-		unlock_cluster(ci);
-		goto fallback;
+		goto locked_fallback;
 	}
 	if (!has_cache)
 		swap_entries_free(si, ci, entry, nr);
@@ -1518,6 +1517,7 @@ static bool swap_entries_put_map(struct swap_info_struct *si,
 
 fallback:
 	ci = lock_cluster(si, offset);
+locked_fallback:
 	for (i = 0; i < nr; i++, entry.val++) {
 		count = swap_entry_put_locked(si, ci, entry, 1);
 		if (count == SWAP_HAS_CACHE)
Tim
> + for (i = 0; i < nr; i++, entry.val++) {
> + count = swap_entry_put_locked(si, ci, entry, 1);
> + if (count == SWAP_HAS_CACHE)
> + has_cache = true;
> }
> + unlock_cluster(ci);
> + return has_cache;
> +
> +}
> +
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 9/9] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]()
2025-03-13 21:05 ` [PATCH 9/9] mm: swap: replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() Kemeng Shi
@ 2025-03-17 18:23 ` Tim Chen
0 siblings, 0 replies; 25+ messages in thread
From: Tim Chen @ 2025-03-17 18:23 UTC (permalink / raw)
To: Kemeng Shi, akpm; +Cc: kasong, linux-mm, linux-kernel
On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
> Replace cluster_swap_free_nr() with swap_entries_put_[map/cache]() to
> remove repeated code and leverage batch removal for entries holding the
> last flag.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
> mm/swapfile.c | 21 ++-------------------
> 1 file changed, 2 insertions(+), 19 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 343b34eb2a81..c27cf09d84a6 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1570,21 +1570,6 @@ static void swap_entries_free(struct swap_info_struct *si,
> __swap_entries_free(si, ci, entry, nr_pages);
> }
>
> -static void cluster_swap_free_nr(struct swap_info_struct *si,
> - unsigned long offset, int nr_pages,
> - unsigned char usage)
> -{
> - struct swap_cluster_info *ci;
> - unsigned long end = offset + nr_pages;
> -
> - ci = lock_cluster(si, offset);
> - do {
> - swap_entry_put_locked(si, ci, swp_entry(si->type, offset),
> - usage);
> - } while (++offset < end);
> - unlock_cluster(ci);
> -}
> -
> /*
> * Caller has made sure that the swap device corresponding to entry
> * is still around or has not been recycled.
> @@ -1601,7 +1586,7 @@ void swap_free_nr(swp_entry_t entry, int nr_pages)
>
> while (nr_pages) {
> nr = min_t(int, nr_pages, SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
> - cluster_swap_free_nr(sis, offset, nr, 1);
> + swap_entries_put_map(sis, swp_entry(sis->type, offset), nr);
> offset += nr;
> nr_pages -= nr;
> }
> @@ -3632,9 +3617,7 @@ int swapcache_prepare(swp_entry_t entry, int nr)
>
> void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
> {
> - unsigned long offset = swp_offset(entry);
> -
> - cluster_swap_free_nr(si, offset, nr, SWAP_HAS_CACHE);
> + swap_entries_put_cache(si, entry, nr);
swap_entries_put_cache() assumes nr does not cross a cluster boundary,
as we only lock the cluster associated with the first entry.
Current callers of swapcache_clear(), like do_swap_page() and
shmem_swapin_folio(), call swapcache_clear() only for pages within a
single folio, so the pages fall within one cluster and we are okay.
Perhaps we should document with a comment that callers of
swapcache_clear() must use the function only for pages within a folio,
so the pages don't cross clusters, for the benefit of future users.
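For example, something along these lines (wording is only a suggestion):
/*
 * Caller must ensure the swap entries come from a single folio, so that
 * [entry, entry + nr) does not cross a cluster boundary.
 */
void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)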
Otherwise the patch looks good.
Tim
> }
>
> struct swap_info_struct *swp_swap_info(swp_entry_t entry)
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/9] Minor cleanups and improvements to swap freeing code
2025-03-14 20:27 ` [PATCH 0/9] Minor cleanups and improvements to swap freeing code Tim Chen
@ 2025-03-18 1:29 ` Kemeng Shi
0 siblings, 0 replies; 25+ messages in thread
From: Kemeng Shi @ 2025-03-18 1:29 UTC (permalink / raw)
To: Tim Chen, akpm; +Cc: kasong, linux-mm, linux-kernel
On 3/15/2025 4:27 AM, Tim Chen wrote:
> On Fri, 2025-03-14 at 05:05 +0800, Kemeng Shi wrote:
>> Hi All,
>> This series contains some cleanups and improvements made while learning
>> the swapfile code. Here is a summary of the changes:
>
Hello,
> Nice work.
>> 1. Function naming improvements.
>> -Use "put" instead of "free" to name functions which only do the actual
>> free when the count drops to zero.
>> -Use "entry" to name functions that free only one swap slot. Use
>> "entries" to name functions that may free multiple swap slots within one
>> cluster. Use the "_nr" suffix to name functions that may free multiple
>> swap slots spanning multiple clusters.
>
> It would be nice to add some comments in the code noting that functions
> with _nr may cross cluster boundaries while those without stay within a
> cluster.
Thanks for reviewing and for all the advice on this series. Will improve
in the next version.
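For instance, a comment along these lines near the helpers (wording is
only a sketch):
/*
 * Helpers with a _nr suffix may operate on entries spanning multiple
 * clusters; the other put/free helpers stay within a single cluster.
 */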
>
>> 2. Eliminate the need to set a swap slot to the intermediate
>> SWAP_HAS_CACHE value before the actual free by using
>> swap_entry_range_free().
>> 3. Add helpers swap_entries_put_map() and swap_entries_put_cache() as
>> general-purpose routines to free swap entries within a single cluster,
>> which try batch removal first and fall back to putting each entry
>> individually with the cluster lock acquired/released only once. By using
>> these helpers, we can remove repeated code, leverage batch removal in
>> more cases and avoid acquiring/releasing the cluster lock for each
>> single swap entry.
>
> Wonder if the batching shows up in any swap performance improvement
I have a simple test which goes roughly as follows:
#define SIZE (1024 * 1024 * 1024)
#define HP_SIZE (2 * 1024 * 1024)
p = mmap(..., SIZE, ...);
madvise(p, SIZE, MADV_HUGEPAGE);
memset(p, 0x11, SIZE); /* allocate pages */
madvise(p, SIZE, MADV_PAGEOUT); /* swap out */
gettimeofday(&tv_b, NULL);
for (j = 0; j < SIZE; j += HP_SIZE) {
	((char *)p)[j] = 0; /* swap in and free swap entry */
}
The time for swapping in and freeing the swap entries shows no
significant change. Since this is more of a cleanup series, no further
testing has been conducted yet. I would appreciate any additional test
cases that could benefit from batching.
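For reference, a self-contained version of the sketch above might look
like this (the mmap flags, error handling and timing printout are
assumptions filled in here; the original elides them):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/time.h>

#define SIZE (1024UL * 1024 * 1024)
#define HP_SIZE (2UL * 1024 * 1024)

int main(void)
{
	struct timeval tv_b, tv_e;
	unsigned long j;
	char *p;

	p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	madvise(p, SIZE, MADV_HUGEPAGE);	/* ask for THP */
	memset(p, 0x11, SIZE);			/* allocate pages */
	madvise(p, SIZE, MADV_PAGEOUT);		/* swap everything out */

	gettimeofday(&tv_b, NULL);
	for (j = 0; j < SIZE; j += HP_SIZE)
		p[j] = 0;	/* swap in and free swap entries */
	gettimeofday(&tv_e, NULL);

	printf("swap-in + free: %ld us\n",
	       (tv_e.tv_sec - tv_b.tv_sec) * 1000000L +
	       (tv_e.tv_usec - tv_b.tv_usec));
	return 0;
}

Each one-byte store per HP_SIZE region faults the backing memory back
in, swapping it in and freeing the corresponding swap entries.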
Thanks,
Kemeng
>
> Tim
>
>>
>> Kemeng Shi (9):
>> mm: swap: rename __swap_[entry/entries]_free[_locked] to
>> swap_[entry/entries]_put[_locked]
>> mm: swap: factor out the actual swap entry freeing logic to new helper
>> mm: swap: use __swap_entry_free() to free swap entry in
>> swap_entry_put_locked()
>> mm: swap: remove unneeded VM_BUG_ON(*map != SWAP_HAS_CACHE) in
>> swap_entry_range_free()
>> mm: swap: use swap_entries_free() drop last 1 flag in
>> swap_entries_put_nr()
>> mm: swap: drop last SWAP_MAP_SHMEM flag in batch in
>> swap_entries_put_nr()
>> mm: swap: free each cluster individually in swap_entries_put_map_nr()
>> mm: swap: factor out helper to drop cache of entries within a single
>> cluster
>> mm: swap: replace cluster_swap_free_nr() with
>> swap_entries_put_[map/cache]()
>>
>> mm/swapfile.c | 173 +++++++++++++++++++++++++-------------------------
>> 1 file changed, 86 insertions(+), 87 deletions(-)
>>
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread