linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Chris Li <chrisl@kernel.org>
To: YoungJun Park <youngjun.park@lge.com>
Cc: rafael@kernel.org, akpm@linux-foundation.org, kasong@tencent.com,
	 pavel@kernel.org, shikemeng@huaweicloud.com, nphamcs@gmail.com,
	 bhe@redhat.com, baohua@kernel.org, usama.arif@linux.dev,
	 linux-pm@vger.kernel.org, linux-mm@kvack.org,
	hyungjun.cho@lge.com
Subject: Re: [RFC PATCH v2] mm/swap, PM: hibernate: hold swap device reference across swap operation
Date: Wed, 11 Mar 2026 00:31:37 -0700	[thread overview]
Message-ID: <CACePvbW_t+dqFVZfde6aXxbREaX4iiSxd4knFGKQyDkbxprH6A@mail.gmail.com> (raw)
In-Reply-To: <aa551UFgiq+gUl/T@yjaykim-PowerEdge-T330>

On Mon, Mar 9, 2026 at 12:42 AM YoungJun Park <youngjun.park@lge.com> wrote:
>
> On Sun, Mar 08, 2026 at 11:43:20PM -0700, Chris Li wrote:
>
> > Agree. That place needs fixing. We will make two patches.
> >
> > Patch 1. Fix the swap off  racing between lookup and first allocation
> > on suspend.
> > swap_type_of() is very tricky for the device swap because of the
> > conditional lookup of the si->start_block matching the offset or not.
> > That make this patch very complex.
> >
> > One idea to brainstorm:
> >
> > So we can get the reference count on during snapshot_open(), after
> > checking "root_swap" still points to valid swsusp_resume_device.
> > Then we release the reference count on "root_swap" during snapshot_release().
> >
> > That might side step the complexity of  swap_type_of() doing the
> > si->start_block checking.
> >
> > It should fix the bug you described here more simply.
>
> While that approach would be great as a minimal fix, I think we still
> cannot avoid the following situation.
>
> Until the first swap offset is allocated, we cannot guarantee that swapoff
> won't happen. To be safe, I think it is difficult to prevent swapoff
> without holding the swap_lock.

Grab the swap device reference at the beginning of `snapshot_open`,
before any swap_offset allocation, until `snapshot_close`. That should
prevent the swapoff? The swapoff must wait until the reference is
dropped at snapshot_close().
I assume the swap entry allocation happens between snapshot_open() and
snapshot_close().

> So, to stick to the minimal fix principle and only address the currently
> possible bug in uswsusp, we could consider:
>
> 1) Creating a separate function to grab the reference for uswsusp, and
>    put it in snapshot_close().
Ack.

> 2) Adding a parameter to swap_type_of() to decide whether to acquire the
>    reference or not, and put it in swsusp_close()

In my mind, shouldn't the first point 1) be enough? Not sure 2) is needed.

Chris




>
> On all strategies, we do not grab the
> reference when taking an in-kernel snapshot, and do not add alloc/free
> get/put.
>
> > > My proposal is to grab the reference at the lookup point to close this
> > > initial race.
> >
> > That is my suggested patch 1.
> >
> > > If we do that, I believe we can remove the per-slot
> > > get/put calls entirely, as the initial reference is sufficient to keep the
> >
> > I suggest that as the patch 2. It is an optimization to eliminate the
> > get/put pairs. It is optional. without it is fine in terms of
> > correctness. Might not worth the trouble for patch 2.
>
> Yes, I agree. I will split the patch into two as you suggested and think
> about it further.
>
> > > device alive until the operation completes.
> > >
> > > Regarding the reference release strategy in this patch:
> > >
> > > 1. uswsusp: The reference is released when the snapshot device file
> > >    is closed(snapshot_release) and error paths.
> > > 2. not uswsusp`: I only added reference release in the error paths.
> >
> > That part makes this patch complex and harder to review. Need to
> > carefully check whether we take the reference count or not.
> >
> > >
> > > About 2.. I conclude that on a successful resume, the system state reverts to
>
> > > the snapshot point, making an explicit release unnecessary. However,
> > > I am not 100% certain if this holds true for the swap reference
> > > context.
> >
> > That is the part I try to avoid: the very fragmented error condition
> > for reference counting.
> > Hopefully, with patch 1 idea we don't need that complexity.
>
> I agree with you.
> But, I believe it can be a safe modification that can be sufficiently
> verified through review.
>
> I would love to hear the thoughts of the hibernation maintainers and other
> reviewers on this. Although there are some complex parts, I think this
> modification has clear benefits.
>
> Thanks
>
> Best regards,
> Youngjun Park
>


      reply	other threads:[~2026-03-11  7:31 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-06  2:46 Youngjun Park
2026-03-06  6:55 ` Chris Li
2026-03-06  8:02   ` YoungJun Park
2026-03-09  6:43     ` Chris Li
2026-03-09  7:42       ` YoungJun Park
2026-03-11  7:31         ` Chris Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACePvbW_t+dqFVZfde6aXxbREaX4iiSxd4knFGKQyDkbxprH6A@mail.gmail.com \
    --to=chrisl@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=bhe@redhat.com \
    --cc=hyungjun.cho@lge.com \
    --cc=kasong@tencent.com \
    --cc=linux-mm@kvack.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=nphamcs@gmail.com \
    --cc=pavel@kernel.org \
    --cc=rafael@kernel.org \
    --cc=shikemeng@huaweicloud.com \
    --cc=usama.arif@linux.dev \
    --cc=youngjun.park@lge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox