[PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c
@ 2025-02-22 16:08 Kemeng Shi
  2025-02-22 16:08 ` [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters() Kemeng Shi
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

This series includes several random fixes and cleanups to the mm/swapfile.c
code. The issues were discovered during code review and can only manifest
under extremely rare and race-condition scenarios. Further details about
each fix can be found in the respective patches. Thanks for your
attention!

Kemeng Shi (6):
  mm: swap: avoid losting cluster in swap_reclaim_full_clusters()
  mm: swap: use correct step in loop to wait all clusters in
    wait_for_allocation()
  mm, swap: avoid BUG_ON in relocate_cluster()
  mm, swap: remove setting SWAP_MAP_BAD for discard cluster
  mm, swap: correct comment in swap_usage_sub()
  mm: swap: remove stale comment of swap_reclaim_full_clusters()

 mm/swapfile.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters()
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-22 17:19   ` Kairui Song
  2025-02-22 16:08 ` [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() Kemeng Shi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

If no swap cache is reclaimed, cluster taken off from full_clusters list
will not be put in any list and may not be reused. Do relocate_cluster
for such cluster to fix the issue.

Fixes: 3b644773eefda ("mm, swap: reduce contention on device lock")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 34baefb000b5..e5f58ab86329 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -861,6 +861,10 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
 			offset++;
 		}
 
+		/* in case no swap cache is reclaimed */
+		if (ci->flags == CLUSTER_FLAG_NONE)
+			relocate_cluster(si, ci);
+
 		unlock_cluster(ci);
 		if (to_scan <= 0)
 			break;
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
  2025-02-22 16:08 ` [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters() Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-22 17:32   ` Kairui Song
  2025-02-22 16:08 ` [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster() Kemeng Shi
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

Use correct step in loop to wait all clusters in wait_for_allocation().
If we miss some cluster in wait_for_allocation(), use after free may
occurs as following:
shmem_writepage                  swapoff
 folio_alloc_swap
  get_swap_pages
   scan_swap_map_slots
    cluster_alloc_swap_entry
     alloc_swap_scan_cluster
      cluster_alloc_range
       /* SWP_WRITEOK is valid */
       if (!(si->flags & SWP_WRITEOK))

                                  ...
                                  del_from_avail_list(p, true);
                                  ...
                                  /* miss the cluster in shmem_writepage */
                                  wait_for_allocation()
                                  ...
                                  try_to_unuse()

       memset(si->swap_map + start, usage, nr_pages);
       swap_range_alloc(si, nr_pages);
       ci->count += nr_pages;
       /* return a valid entry */

                                  ...
                                  exit_swap_address_space(p->type);
                                  ...

 ...
 add_to_swap_cache
  /* dereference swap_address_space(entry) which is NULL */
  xas_lock_irq(&xas);

Fixes: e47bd46eab97e ("mm, swap: hold a reference during scan and cleanup flag usage")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e5f58ab86329..425126c0a07d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2627,7 +2627,6 @@ static void wait_for_allocation(struct swap_info_struct *si)
 	for (offset = 0; offset < end; offset += SWAPFILE_CLUSTER) {
 		ci = lock_cluster(si, offset);
 		unlock_cluster(ci);
-		offset += SWAPFILE_CLUSTER;
 	}
 }
 
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster()
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
  2025-02-22 16:08 ` [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters() Kemeng Shi
  2025-02-22 16:08 ` [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-22 18:43   ` Kairui Song
  2025-02-22 16:08 ` [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster Kemeng Shi
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

If allocation is racy with swapoff, we may call free_cluster for cluster
already in free list and trigger bug on as following:
Allocation                        Swapoff
cluster_alloc_swap_entry
 ...
 /* may get a free cluster with offset */
 offset = xxx;
 if (offset)
  ci = lock_cluster(si, offset);

                                  ...
                                   del_from_avail_list(p, true);
                                    si->flags &= ~SWP_WRITEOK;

  alloc_swap_scan_cluster(si, ci, ...)
   ...
   /* failed to alloc entry from free entry */
   if (!cluster_alloc_range(...))
    break;
   ...
   /* add back a free cluster */
   relocate_cluster(si, ci);
    if (!ci->count)
     free_cluster(si, ci);
      VM_BUG_ON(ci->flags == CLUSTER_FLAG_FREE);

Despite bug_on could be triggered, call free_cluster() for free cluster
only move cluster to tail of list and should be fine.

Check cluster is not free before calling free_cluster() in
relocate_cluster() to avoid bug_on.

Fixes: 3b644773eefda ("mm, swap: reduce contention on device lock")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 425126c0a07d..fc45b9d56639 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -653,7 +653,8 @@ static void relocate_cluster(struct swap_info_struct *si,
 		return;
 
 	if (!ci->count) {
-		free_cluster(si, ci);
+		if (ci->flags != CLUSTER_FLAG_FREE)
+			free_cluster(si, ci);
 	} else if (ci->count != SWAPFILE_CLUSTER) {
 		if (ci->flags != CLUSTER_FLAG_FRAG)
 			move_cluster(si, ci, &si->frag_clusters[ci->order],
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
                   ` (2 preceding siblings ...)
  2025-02-22 16:08 ` [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster() Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-24  1:24   ` Kairui Song
  2025-02-22 16:08 ` [PATCH 5/6] mm, swap: correct comment in swap_usage_sub() Kemeng Shi
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

Before alloc from a cluster, we will aqcuire cluster's lock and make
sure it is usable by cluster_is_usable(), so there is no need to
set SWAP_MAP_BAD for cluster to be discarded.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index fc45b9d56639..c640f77a464a 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -479,15 +479,6 @@ static void move_cluster(struct swap_info_struct *si,
 static void swap_cluster_schedule_discard(struct swap_info_struct *si,
 		struct swap_cluster_info *ci)
 {
-	unsigned int idx = cluster_index(si, ci);
-	/*
-	 * If scan_swap_map_slots() can't find a free cluster, it will check
-	 * si->swap_map directly. To make sure the discarding cluster isn't
-	 * taken by scan_swap_map_slots(), mark the swap entries bad (occupied).
-	 * It will be cleared after discard
-	 */
-	memset(si->swap_map + idx * SWAPFILE_CLUSTER,
-			SWAP_MAP_BAD, SWAPFILE_CLUSTER);
 	VM_BUG_ON(ci->flags == CLUSTER_FLAG_FREE);
 	move_cluster(si, ci, &si->discard_clusters, CLUSTER_FLAG_DISCARD);
 	schedule_work(&si->discard_work);
@@ -571,8 +562,6 @@ static bool swap_do_scheduled_discard(struct swap_info_struct *si)
 		 * return the cluster to allocation list.
 		 */
 		ci->flags = CLUSTER_FLAG_NONE;
-		memset(si->swap_map + idx * SWAPFILE_CLUSTER,
-				0, SWAPFILE_CLUSTER);
 		__free_cluster(si, ci);
 		spin_unlock(&ci->lock);
 		ret = true;
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 5/6] mm, swap: correct comment in swap_usage_sub()
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
                   ` (3 preceding siblings ...)
  2025-02-22 16:08 ` [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-22 16:08 ` [PATCH 6/6] mm: swap: remove stale comment of swap_reclaim_full_clusters() Kemeng Shi
  2025-02-23  1:44 ` [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Andrew Morton
  6 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

We will add si back to plist in swap_usage_sub(), just correct the wrong
comment which says we will remove si from plist in swap_usage_sub().

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index c640f77a464a..6ff57ed23e27 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1098,7 +1098,7 @@ static void swap_usage_sub(struct swap_info_struct *si, unsigned int nr_entries)
 
 	/*
 	 * If device is not full, and SWAP_USAGE_OFFLIST_BIT is set,
-	 * remove it from the plist.
+	 * add it to the plist.
 	 */
 	if (unlikely(val & SWAP_USAGE_OFFLIST_BIT))
 		add_to_avail_list(si, false);
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 6/6] mm: swap: remove stale comment of swap_reclaim_full_clusters()
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
                   ` (4 preceding siblings ...)
  2025-02-22 16:08 ` [PATCH 5/6] mm, swap: correct comment in swap_usage_sub() Kemeng Shi
@ 2025-02-22 16:08 ` Kemeng Shi
  2025-02-23  1:44 ` [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Andrew Morton
  6 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-02-22 16:08 UTC (permalink / raw)
  To: akpm, kasong; +Cc: linux-mm, linux-kernel

swap_reclaim_full_clusters() has no return value now, just remove the
stale comment which says swap_reclaim_full_clusters() wil return a bool
value.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/swapfile.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6ff57ed23e27..dc0dc5a26a88 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -820,7 +820,6 @@ static unsigned int alloc_swap_scan_cluster(struct swap_info_struct *si,
 	return found;
 }
 
-/* Return true if reclaimed a whole cluster */
 static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
 {
 	long to_scan = 1;
-- 
2.30.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters()
  2025-02-22 16:08 ` [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters() Kemeng Shi
@ 2025-02-22 17:19   ` Kairui Song
  2025-02-24  1:17     ` Kemeng Shi
  0 siblings, 1 reply; 14+ messages in thread
From: Kairui Song @ 2025-02-22 17:19 UTC (permalink / raw)
  To: Kemeng Shi, akpm; +Cc: linux-mm, linux-kernel

On Sat, Feb 22, 2025 at 3:12 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> If no swap cache is reclaimed, cluster taken off from full_clusters list
> will not be put in any list and may not be reused. Do relocate_cluster
> for such cluster to fix the issue.
>
> Fixes: 3b644773eefda ("mm, swap: reduce contention on device lock")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/swapfile.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 34baefb000b5..e5f58ab86329 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -861,6 +861,10 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
>                         offset++;
>                 }
>
> +               /* in case no swap cache is reclaimed */
> +               if (ci->flags == CLUSTER_FLAG_NONE)
> +                       relocate_cluster(si, ci);
> +
>                 unlock_cluster(ci);
>                 if (to_scan <= 0)
>                         break;
> --
> 2.30.0

Thanks. A little nick pick, "losting" is not a word, I think you mean "leaking".

And BTW maybe it's better to describe the result of this leak in a bit
more details, "cluster leaking from lists" and "will not be reused"
looked a bit scary at a glance to me. But realizing it's full
clusters, they will be moved back to a list if any slots on them are
freed, so the worst result is inefficiently reclaiming of HAS_CACHE
slots, we didn't really lose these clusters.

We do need to fix it though. So other than the commit summary and
message nitpick:

Reviewed-by: Kairui Song <kasong@tencent.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation()
  2025-02-22 16:08 ` [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() Kemeng Shi
@ 2025-02-22 17:32   ` Kairui Song
  0 siblings, 0 replies; 14+ messages in thread
From: Kairui Song @ 2025-02-22 17:32 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, linux-mm, linux-kernel

On Sat, Feb 22, 2025 at 3:12 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> Use correct step in loop to wait all clusters in wait_for_allocation().
> If we miss some cluster in wait_for_allocation(), use after free may
> occurs as following:
> shmem_writepage                  swapoff
>  folio_alloc_swap
>   get_swap_pages
>    scan_swap_map_slots
>     cluster_alloc_swap_entry
>      alloc_swap_scan_cluster
>       cluster_alloc_range
>        /* SWP_WRITEOK is valid */
>        if (!(si->flags & SWP_WRITEOK))
>
>                                   ...
>                                   del_from_avail_list(p, true);
>                                   ...
>                                   /* miss the cluster in shmem_writepage */
>                                   wait_for_allocation()
>                                   ...
>                                   try_to_unuse()
>
>        memset(si->swap_map + start, usage, nr_pages);
>        swap_range_alloc(si, nr_pages);
>        ci->count += nr_pages;
>        /* return a valid entry */
>
>                                   ...
>                                   exit_swap_address_space(p->type);
>                                   ...
>
>  ...
>  add_to_swap_cache
>   /* dereference swap_address_space(entry) which is NULL */
>   xas_lock_irq(&xas);
>
> Fixes: e47bd46eab97e ("mm, swap: hold a reference during scan and cleanup flag usage")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/swapfile.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index e5f58ab86329..425126c0a07d 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2627,7 +2627,6 @@ static void wait_for_allocation(struct swap_info_struct *si)
>         for (offset = 0; offset < end; offset += SWAPFILE_CLUSTER) {
>                 ci = lock_cluster(si, offset);
>                 unlock_cluster(ci);
> -               offset += SWAPFILE_CLUSTER;
>         }
>  }
>

Thanks, good catch.

Reviewed-by: Kairui Song <kasong@tencent.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster()
  2025-02-22 16:08 ` [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster() Kemeng Shi
@ 2025-02-22 18:43   ` Kairui Song
  0 siblings, 0 replies; 14+ messages in thread
From: Kairui Song @ 2025-02-22 18:43 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: akpm, linux-mm, linux-kernel

On Sat, Feb 22, 2025 at 3:12 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>
> If allocation is racy with swapoff, we may call free_cluster for cluster
> already in free list and trigger bug on as following:

Maybe capitalize this "bug on" to BUG_ON to be consistent with the title.

> Allocation                        Swapoff
> cluster_alloc_swap_entry
>  ...
>  /* may get a free cluster with offset */
>  offset = xxx;
>  if (offset)
>   ci = lock_cluster(si, offset);
>
>                                   ...
>                                    del_from_avail_list(p, true);
>                                     si->flags &= ~SWP_WRITEOK;
>
>   alloc_swap_scan_cluster(si, ci, ...)
>    ...
>    /* failed to alloc entry from free entry */
>    if (!cluster_alloc_range(...))
>     break;
>    ...
>    /* add back a free cluster */
>    relocate_cluster(si, ci);
>     if (!ci->count)
>      free_cluster(si, ci);
>       VM_BUG_ON(ci->flags == CLUSTER_FLAG_FREE);
>
> Despite bug_on could be triggered, call free_cluster() for free cluster
> only move cluster to tail of list and should be fine.
>
> Check cluster is not free before calling free_cluster() in
> relocate_cluster() to avoid bug_on.

Same here.

>
> Fixes: 3b644773eefda ("mm, swap: reduce contention on device lock")
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/swapfile.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 425126c0a07d..fc45b9d56639 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -653,7 +653,8 @@ static void relocate_cluster(struct swap_info_struct *si,
>                 return;
>
>         if (!ci->count) {
> -               free_cluster(si, ci);
> +               if (ci->flags != CLUSTER_FLAG_FREE)
> +                       free_cluster(si, ci);
>         } else if (ci->count != SWAPFILE_CLUSTER) {
>                 if (ci->flags != CLUSTER_FLAG_FRAG)
>                         move_cluster(si, ci, &si->frag_clusters[ci->order],
> --
> 2.30.0

Thanks, other than minor commit message issue:

Reviewed-by: Kairui Song <kasong@tencent.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c
  2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
                   ` (5 preceding siblings ...)
  2025-02-22 16:08 ` [PATCH 6/6] mm: swap: remove stale comment of swap_reclaim_full_clusters() Kemeng Shi
@ 2025-02-23  1:44 ` Andrew Morton
  2025-02-24  1:27   ` Kemeng Shi
  6 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2025-02-23  1:44 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: kasong, linux-mm, linux-kernel

On Sun, 23 Feb 2025 00:08:44 +0800 Kemeng Shi <shikemeng@huaweicloud.com> wrote:

> This series includes several random fixes and cleanups to the mm/swapfile.c
> code. The issues were discovered during code review and can only manifest
> under extremely rare and race-condition scenarios. Further details about
> each fix can be found in the respective patches. Thanks for your
> attention!

Thanks.  I queued the first three patches in mm-hotfixes-stable, as
they address post-6.13 issues which should be addressed during this -rc
cycle.

I queued the other three patches for 6.15-rcX.

I corrected the Fixes: hash in [2/6].

I made some changelog alterations, along the lines that Kairui
mentioned.  Please check the results and send along any alterations
which you feel are desirable.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters()
  2025-02-22 17:19   ` Kairui Song
@ 2025-02-24  1:17     ` Kemeng Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-02-24  1:17 UTC (permalink / raw)
  To: Kairui Song, akpm; +Cc: linux-mm, linux-kernel



on 2/23/2025 1:19 AM, Kairui Song wrote:
> On Sat, Feb 22, 2025 at 3:12 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
>>
>> If no swap cache is reclaimed, cluster taken off from full_clusters list
>> will not be put in any list and may not be reused. Do relocate_cluster
>> for such cluster to fix the issue.
>>
>> Fixes: 3b644773eefda ("mm, swap: reduce contention on device lock")
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>>  mm/swapfile.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 34baefb000b5..e5f58ab86329 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -861,6 +861,10 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
>>                         offset++;
>>                 }
>>
>> +               /* in case no swap cache is reclaimed */
>> +               if (ci->flags == CLUSTER_FLAG_NONE)
>> +                       relocate_cluster(si, ci);
>> +
>>                 unlock_cluster(ci);
>>                 if (to_scan <= 0)
>>                         break;
>> --
>> 2.30.0
> 
> Thanks. A little nick pick, "losting" is not a word, I think you mean "leaking".
> 
> And BTW maybe it's better to describe the result of this leak in a bit
> more details, "cluster leaking from lists" and "will not be reused"
> looked a bit scary at a glance to me. But realizing it's full
> clusters, they will be moved back to a list if any slots on them are
> freed, so the worst result is inefficiently reclaiming of HAS_CACHE
> slots, we didn't really lose these clusters.
> 
> We do need to fix it though. So other than the commit summary and
> message nitpick:
Thanks for feedback, I will improve commit summary and message in next
verstion.
> 
> Reviewed-by: Kairui Song <kasong@tencent.com>
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster
  2025-02-22 16:08 ` [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster Kemeng Shi
@ 2025-02-24  1:24   ` Kairui Song
  0 siblings, 0 replies; 14+ messages in thread
From: Kairui Song @ 2025-02-24  1:24 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: Andrew Morton, linux-mm, LKML

[-- Attachment #1: Type: text/plain, Size: 2037 bytes --]

On Sat, Feb 22, 2025 at 3:12 PM Kemeng Shi <shikemeng@huaweicloud.com>
wrote:
>
> Before alloc from a cluster, we will aqcuire cluster's lock and make
> sure it is usable by cluster_is_usable(), so there is no need to
> set SWAP_MAP_BAD for cluster to be discarded.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/swapfile.c | 11 -----------
>  1 file changed, 11 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index fc45b9d56639..c640f77a464a 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -479,15 +479,6 @@ static void move_cluster(struct swap_info_struct *si,
>  static void swap_cluster_schedule_discard(struct swap_info_struct *si,
>                 struct swap_cluster_info *ci)
>  {
> -       unsigned int idx = cluster_index(si, ci);
> -       /*
> -        * If scan_swap_map_slots() can't find a free cluster, it will
check
> -        * si->swap_map directly. To make sure the discarding cluster
isn't
> -        * taken by scan_swap_map_slots(), mark the swap entries bad
(occupied).
> -        * It will be cleared after discard
> -        */
> -       memset(si->swap_map + idx * SWAPFILE_CLUSTER,
> -                       SWAP_MAP_BAD, SWAPFILE_CLUSTER);
>         VM_BUG_ON(ci->flags == CLUSTER_FLAG_FREE);
>         move_cluster(si, ci, &si->discard_clusters, CLUSTER_FLAG_DISCARD);
>         schedule_work(&si->discard_work);
> @@ -571,8 +562,6 @@ static bool swap_do_scheduled_discard(struct
swap_info_struct *si)
>                  * return the cluster to allocation list.
>                  */
>                 ci->flags = CLUSTER_FLAG_NONE;
> -               memset(si->swap_map + idx * SWAPFILE_CLUSTER,
> -                               0, SWAPFILE_CLUSTER);
>                 __free_cluster(si, ci);
>                 spin_unlock(&ci->lock);
>                 ret = true;

Good, I wanted to do this some time ago, but forgot about this while busy
with other things.

Reviewed-by: Kairui Song <kasong@tencent.com>

[-- Attachment #2: Type: text/html, Size: 2795 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c
  2025-02-23  1:44 ` [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Andrew Morton
@ 2025-02-24  1:27   ` Kemeng Shi
  0 siblings, 0 replies; 14+ messages in thread
From: Kemeng Shi @ 2025-02-24  1:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: kasong, linux-mm, linux-kernel



on 2/23/2025 9:44 AM, Andrew Morton wrote:
> On Sun, 23 Feb 2025 00:08:44 +0800 Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> 
>> This series includes several random fixes and cleanups to the mm/swapfile.c
>> code. The issues were discovered during code review and can only manifest
>> under extremely rare and race-condition scenarios. Further details about
>> each fix can be found in the respective patches. Thanks for your
>> attention!
> 
> Thanks.  I queued the first three patches in mm-hotfixes-stable, as
> they address post-6.13 issues which should be addressed during this -rc
> cycle.
> 
> I queued the other three patches for 6.15-rcX.
> 
> I corrected the Fixes: hash in [2/6].
> 
> I made some changelog alterations, along the lines that Kairui
> mentioned.  Please check the results and send along any alterations
> which you feel are desirable.
> 
Thanks for correcting the changelog. The results look good to me
except patch 1 may still need some changelog improvement. Will
send a v2 of patch 1 soon.



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-02-24  1:27 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-22 16:08 [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Kemeng Shi
2025-02-22 16:08 ` [PATCH 1/6] mm: swap: avoid losting cluster in swap_reclaim_full_clusters() Kemeng Shi
2025-02-22 17:19   ` Kairui Song
2025-02-24  1:17     ` Kemeng Shi
2025-02-22 16:08 ` [PATCH 2/6] mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() Kemeng Shi
2025-02-22 17:32   ` Kairui Song
2025-02-22 16:08 ` [PATCH 3/6] mm, swap: avoid BUG_ON in relocate_cluster() Kemeng Shi
2025-02-22 18:43   ` Kairui Song
2025-02-22 16:08 ` [PATCH 4/6] mm, swap: remove setting SWAP_MAP_BAD for discard cluster Kemeng Shi
2025-02-24  1:24   ` Kairui Song
2025-02-22 16:08 ` [PATCH 5/6] mm, swap: correct comment in swap_usage_sub() Kemeng Shi
2025-02-22 16:08 ` [PATCH 6/6] mm: swap: remove stale comment of swap_reclaim_full_clusters() Kemeng Shi
2025-02-23  1:44 ` [PATCH 0/6] Some random fixes and cleanups to mm/swapfile.c Andrew Morton
2025-02-24  1:27   ` Kemeng Shi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox