* [RFC PATCH 0/2] kernel/power: fix swap device reference handling in hibernation swap path
@ 2026-03-02 16:53 Youngjun Park
From: Youngjun Park @ 2026-03-02 16:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-mm, rafael, lenb, pavel, akpm, chrisl, kasong, shikemeng,
	nphamcs, bhe, baohua, youngjun.park

This series addresses two issues in the hibernation swap path.

First, the swap device reference is grabbed and released on every slot
allocation, which is wasteful when repeated for each page written over
the hibernation swap path.

Second, in the uswsusp path, only the swap type value is retrieved at
lookup time without holding a reference. If swapoff races after the
type is acquired, subsequent slot allocations operate on a stale swap
device.

The fix is to hold the swap device reference from the point the swap
device is looked up, and release it once at each exit path.

  Patch 1: Release the reference immediately after each slot allocation
            as a preparatory step.
  Patch 2: Lift the reference acquisition to the lookup site and place
            put_swap_device_by_type() at all relevant cleanup paths in
            swap.c and user.c.
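
For illustration only, the difference between the two reference schemes
can be sketched as a small userspace model. Everything named toy_* below
is hypothetical and is not kernel code: a plain counter stands in for the
percpu_ref "users" field, and ref_ops merely counts get/put calls so the
two schemes can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct swap_info_struct. */
struct toy_swap_dev {
	long users;              /* models the "users" reference count */
	long ref_ops;            /* number of get/put calls performed */
	bool writeok;            /* models SWP_WRITEOK */
	unsigned long next_slot; /* last slot handed out */
};

static bool toy_get(struct toy_swap_dev *dev)
{
	if (!dev->writeok)
		return false;
	dev->users++;
	dev->ref_ops++;
	return true;
}

static void toy_put(struct toy_swap_dev *dev)
{
	assert(dev->users > 0); /* catch unbalanced puts in the model */
	dev->users--;
	dev->ref_ops++;
}

/* Per-slot scheme: take and drop a reference around every allocation. */
static unsigned long toy_alloc_slot_per_slot_ref(struct toy_swap_dev *dev)
{
	unsigned long slot;

	if (!toy_get(dev))
		return 0;
	slot = ++dev->next_slot;
	toy_put(dev);
	return slot;
}

/* Held scheme: the caller already holds the reference taken at lookup. */
static unsigned long toy_alloc_slot_held(struct toy_swap_dev *dev)
{
	assert(dev->users > 0);
	return dev->writeok ? ++dev->next_slot : 0;
}

/* Allocate nr_pages slots; return how many get/put calls were needed. */
static long toy_write_image(struct toy_swap_dev *dev, int nr_pages,
			    bool hold_across)
{
	int i;

	dev->ref_ops = 0;
	if (hold_across) {
		if (!toy_get(dev))      /* reference taken once, at lookup */
			return -1;
		for (i = 0; i < nr_pages; i++)
			toy_alloc_slot_held(dev);
		toy_put(dev);           /* released once, at the exit path */
	} else {
		for (i = 0; i < nr_pages; i++)
			toy_alloc_slot_per_slot_ref(dev);
	}
	return dev->ref_ops;
}
```

In this model the number of reference operations grows with the image
size under the per-slot scheme and stays constant when the reference is
held across the whole operation. It says nothing about the actual cost
of percpu_ref operations in the kernel.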

This series is based on mm-new.

I'm sending this as RFC because my familiarity with the kernel/power
and snapshot paths is limited. I believe the approach is reasonable,
but I'd appreciate any feedback before moving forward with proper
testing and a formal submission.

Thanks,
Youngjun Park

Youngjun Park (2):
  mm/swap: release swap reference on each hibernation slot allocation
  kernel/power: hold swap device reference across hibernation swap
    operation

 include/linux/swap.h |  1 +
 kernel/power/swap.c  | 12 +++++++---
 kernel/power/user.c  |  9 +++++++-
 mm/swapfile.c        | 55 ++++++++++++++++++++++----------------------
 4 files changed, 45 insertions(+), 32 deletions(-)

-- 
2.34.1




* [RFC PATCH 1/2] mm/swap: release swap reference on each hibernation slot allocation
@ 2026-03-02 16:53 ` Youngjun Park
From: Youngjun Park @ 2026-03-02 16:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-mm, rafael, lenb, pavel, akpm, chrisl, kasong, shikemeng,
	nphamcs, bhe, baohua, youngjun.park

Currently, only the swap type value is retrieved at lookup time without
holding a reference. If swapoff races after the type is acquired, the
type value becomes invalid and subsequent slot allocations operate on
a stale swap device.

Additionally, grabbing and releasing the reference on every slot
allocation is inefficient. The proper approach is to hold the reference
from the swap device lookup and release it once when it is no longer
needed.
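
The race can be sketched with a small userspace model, offered only as
an illustration. All toy_* names are hypothetical. One simplification is
an assumption of the model: toy_swapoff() returns false while a
reference is held, whereas the real swapoff is understood to wait for
outstanding references instead of failing.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SWAP 4

/* Hypothetical stand-in for the swap_info[] table. */
struct toy_swap_info {
	bool active; /* device is currently swapon'ed */
	long users;  /* outstanding references */
};

static struct toy_swap_info toy_swap[TOY_MAX_SWAP];

/* Lookup that returns only the type: nothing pins the device. */
static int toy_lookup_type_only(void)
{
	for (int type = 0; type < TOY_MAX_SWAP; type++)
		if (toy_swap[type].active)
			return type;
	return -1;
}

/* Lookup that also takes a reference before returning the type. */
static int toy_lookup_and_hold(void)
{
	int type = toy_lookup_type_only();

	if (type >= 0)
		toy_swap[type].users++;
	return type;
}

static void toy_put_by_type(int type)
{
	assert(type >= 0 && type < TOY_MAX_SWAP);
	assert(toy_swap[type].users > 0);
	toy_swap[type].users--;
}

/* Modelled swapoff: only completes once no reference is outstanding. */
static bool toy_swapoff(int type)
{
	if (toy_swap[type].users > 0)
		return false;
	toy_swap[type].active = false;
	return true;
}

/* A slot allocation is only meaningful while the device is active. */
static bool toy_alloc_slot(int type)
{
	return toy_swap[type].active;
}
```

With toy_lookup_type_only(), a toy_swapoff() issued between lookup and
allocation succeeds and the later toy_alloc_slot() sees a device that is
gone. With toy_lookup_and_hold(), the device stays active until
toy_put_by_type() is called.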

This is a preparatory change. A subsequent commit will lift the
reference acquisition to the lookup site and replace the per-slot
acquire/release with a single reference held across the entire
hibernation swap operation.

Signed-off-by: Youngjun Park <youngjun.park@lge.com>
---
 include/linux/swap.h |  1 +
 mm/swapfile.c        | 55 ++++++++++++++++++++++----------------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7a09df6977a5..37bf7cf21594 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -442,6 +442,7 @@ extern bool swap_entry_swapped(struct swap_info_struct *si, swp_entry_t entry);
 extern int swp_swapcount(swp_entry_t entry);
 struct backing_dev_info;
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
+extern void put_swap_device_by_type(int type);
 sector_t swap_folio_sector(struct folio *folio);
 
 /*
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 915bc93964db..f505dd1f7571 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1860,6 +1860,10 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
 	return NULL;
 }
 
+void put_swap_device_by_type(int type)
+{
+	percpu_ref_put(&swap_info[type]->users);
+}
 /*
  * Free a set of swap slots after their swap count dropped to zero, or will be
  * zero after putting the last ref (saves one __swap_cluster_put_entry call).
@@ -2085,30 +2089,28 @@ swp_entry_t swap_alloc_hibernation_slot(int type)
 		goto fail;
 
 	/* This is called for allocating swap entry, not cache */
-	if (get_swap_device_info(si)) {
-		if (si->flags & SWP_WRITEOK) {
-			/*
-			 * Try the local cluster first if it matches the device. If
-			 * not, try grab a new cluster and override local cluster.
-			 */
-			local_lock(&percpu_swap_cluster.lock);
-			pcp_si = this_cpu_read(percpu_swap_cluster.si[0]);
-			pcp_offset = this_cpu_read(percpu_swap_cluster.offset[0]);
-			if (pcp_si == si && pcp_offset) {
-				ci = swap_cluster_lock(si, pcp_offset);
-				if (cluster_is_usable(ci, 0))
-					offset = alloc_swap_scan_cluster(si, ci, NULL, pcp_offset);
-				else
-					swap_cluster_unlock(ci);
-			}
-			if (!offset)
-				offset = cluster_alloc_swap_entry(si, NULL);
-			local_unlock(&percpu_swap_cluster.lock);
-			if (offset)
-				entry = swp_entry(si->type, offset);
+	if (si->flags & SWP_WRITEOK) {
+		/*
+		 * Try the local cluster first if it matches the device. If
+		 * not, try grab a new cluster and override local cluster.
+		 */
+		local_lock(&percpu_swap_cluster.lock);
+		pcp_si = this_cpu_read(percpu_swap_cluster.si[0]);
+		pcp_offset = this_cpu_read(percpu_swap_cluster.offset[0]);
+		if (pcp_si == si && pcp_offset) {
+			ci = swap_cluster_lock(si, pcp_offset);
+			if (cluster_is_usable(ci, 0))
+				offset = alloc_swap_scan_cluster(si, ci, NULL, pcp_offset);
+			else
+				swap_cluster_unlock(ci);
 		}
-		put_swap_device(si);
+		if (!offset)
+			offset = cluster_alloc_swap_entry(si, NULL);
+		local_unlock(&percpu_swap_cluster.lock);
+		if (offset)
+			entry = swp_entry(si->type, offset);
 	}
+
 fail:
 	return entry;
 }
@@ -2116,14 +2118,10 @@ swp_entry_t swap_alloc_hibernation_slot(int type)
 /* Free a slot allocated by swap_alloc_hibernation_slot */
 void swap_free_hibernation_slot(swp_entry_t entry)
 {
-	struct swap_info_struct *si;
+	struct swap_info_struct *si = __swap_entry_to_info(entry);
 	struct swap_cluster_info *ci;
 	pgoff_t offset = swp_offset(entry);
 
-	si = get_swap_device(entry);
-	if (WARN_ON(!si))
-		return;
-
 	ci = swap_cluster_lock(si, offset);
 	__swap_cluster_put_entry(ci, offset % SWAPFILE_CLUSTER);
 	__swap_cluster_free_entries(si, ci, offset % SWAPFILE_CLUSTER, 1);
@@ -2131,7 +2129,6 @@ void swap_free_hibernation_slot(swp_entry_t entry)
 
 	/* In theory readahead might add it to the swap cache by accident */
 	__try_to_reclaim_swap(si, offset, TTRS_ANYWAY);
-	put_swap_device(si);
 }
 
 /*
@@ -2160,6 +2157,7 @@ int swap_type_of(dev_t device, sector_t offset)
 			struct swap_extent *se = first_se(sis);
 
 			if (se->start_block == offset) {
+				get_swap_device_info(sis);
 				spin_unlock(&swap_lock);
 				return type;
 			}
@@ -2180,6 +2178,7 @@ int find_first_swap(dev_t *device)
 		if (!(sis->flags & SWP_WRITEOK))
 			continue;
 		*device = sis->bdev->bd_dev;
+		get_swap_device_info(sis);
 		spin_unlock(&swap_lock);
 		return type;
 	}
-- 
2.34.1




* [RFC PATCH 2/2] kernel/power: hold swap device reference across hibernation swap operation
@ 2026-03-02 16:53 ` Youngjun Park
From: Youngjun Park @ 2026-03-02 16:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-mm, rafael, lenb, pavel, akpm, chrisl, kasong, shikemeng,
	nphamcs, bhe, baohua, youngjun.park

Acquire the swap device reference at the point the swap device is looked
up and release it at each exit path, rather than grabbing and dropping it
on every slot allocation.

This also fixes a race where only the swap type value was retrieved at
lookup time without holding a reference. If swapoff raced after the type
was acquired, subsequent operations would reference a stale swap device.

put_swap_device_by_type() is now placed at all relevant cleanup paths in
both swap.c and user.c to ensure the reference is properly released when
the hibernation swap operation completes or fails.
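
As a rough illustration of the pairing rule on the user.c side, the
following userspace sketch models a snapshot handle whose swap field is
-1 when no device is selected. All toy_* names are hypothetical, and the
lookup is assumed to return the type with a reference already held, as
swap_type_of() does after patch 1.

```c
#include <assert.h>

#define TOY_NR_SWAP 2

static long toy_users[TOY_NR_SWAP]; /* per-type reference counts */

/* Hypothetical lookup: returns the type with a reference held, or -1. */
static int toy_swap_type_of(int dev_id)
{
	if (dev_id < 0 || dev_id >= TOY_NR_SWAP)
		return -1;
	toy_users[dev_id]++;
	return dev_id;
}

static void toy_put_swap_device_by_type(int type)
{
	assert(type >= 0 && type < TOY_NR_SWAP);
	assert(toy_users[type] > 0);
	toy_users[type]--;
}

/* Stand-in for struct snapshot_data; swap < 0 means "no device held". */
struct toy_snapshot_data {
	int swap;
};

static void toy_snapshot_open(struct toy_snapshot_data *data)
{
	data->swap = -1;
}

/* Selecting a new swap area first drops the reference on the old one. */
static int toy_set_swap_area(struct toy_snapshot_data *data, int dev_id)
{
	if (data->swap >= 0)
		toy_put_swap_device_by_type(data->swap);
	data->swap = toy_swap_type_of(dev_id);
	return data->swap < 0 ? -1 : 0;
}

/* Release drops whatever reference is still held, exactly once. */
static void toy_snapshot_release(struct toy_snapshot_data *data)
{
	if (data->swap >= 0) {
		toy_put_swap_device_by_type(data->swap);
		data->swap = -1;
	}
}
```

The "swap >= 0" guard is what keeps every successful lookup matched by
exactly one put in this model, including when the swap area is changed
or when a lookup fails and leaves the field negative.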

Signed-off-by: Youngjun Park <youngjun.park@lge.com>
---
 kernel/power/swap.c | 12 +++++++++---
 kernel/power/user.c |  9 ++++++++-
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 2e64869bb5a0..c230b0fa5a5f 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -350,9 +350,10 @@ static int swsusp_swap_check(void)
 
 	hib_resume_bdev_file = bdev_file_open_by_dev(swsusp_resume_device,
 			BLK_OPEN_WRITE, NULL, NULL);
-	if (IS_ERR(hib_resume_bdev_file))
+	if (IS_ERR(hib_resume_bdev_file)) {
+		put_swap_device_by_type(root_swap);
 		return PTR_ERR(hib_resume_bdev_file);
-
+	}
 	return 0;
 }
 
@@ -418,6 +419,7 @@ static int get_swap_writer(struct swap_map_handle *handle)
 err_rel:
 	release_swap_writer(handle);
 err_close:
+	put_swap_device_by_type(root_swap);
 	swsusp_close();
 	return ret;
 }
@@ -480,8 +482,11 @@ static int swap_writer_finish(struct swap_map_handle *handle,
 		flush_swap_writer(handle);
 	}
 
-	if (error)
+	if (error) {
 		free_all_swap_pages(root_swap);
+		put_swap_device_by_type(root_swap);
+	}
+
 	release_swap_writer(handle);
 	swsusp_close();
 
@@ -1647,6 +1652,7 @@ int swsusp_unmark(void)
 	 * We just returned from suspend, we don't need the image any more.
 	 */
 	free_all_swap_pages(root_swap);
+	put_swap_device_by_type(root_swap);
 
 	return error;
 }
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 4401cfe26e5c..9cb6c24d49ea 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -90,8 +90,11 @@ static int snapshot_open(struct inode *inode, struct file *filp)
 			data->free_bitmaps = !error;
 		}
 	}
-	if (error)
+	if (error) {
 		hibernate_release();
+		if (data->swap >= 0)
+			put_swap_device_by_type(data->swap);
+	}
 
 	data->frozen = false;
 	data->ready = false;
@@ -115,6 +118,8 @@ static int snapshot_release(struct inode *inode, struct file *filp)
 	data = filp->private_data;
 	data->dev = 0;
 	free_all_swap_pages(data->swap);
+	if (data->swap >= 0)
+		put_swap_device_by_type(data->swap);
 	if (data->frozen) {
 		pm_restore_gfp_mask();
 		free_basic_memory_bitmaps();
@@ -235,6 +240,8 @@ static int snapshot_set_swap_area(struct snapshot_data *data,
 		offset = swap_area.offset;
 	}
 
+	if (data->swap >= 0)
+		put_swap_device_by_type(data->swap);
 	/*
 	 * User space encodes device types as two-byte values,
 	 * so we need to recode them
-- 
2.34.1


