From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Carsten Grohmann <carstengrohmann@gmx.de>,
linux-kernel@vger.kernel.org,
"open list:SUSPEND TO RAM" <linux-pm@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm, swap: simplify checking if a folio is swapped
Date: Sun, 15 Feb 2026 18:41:05 +0800
Message-ID: <CAMgjq7C2K5mAVcieX5rUUk9Qu3o7OgHkFqBU86AsvxNQG0uoUw@mail.gmail.com>
In-Reply-To: <20260215103815.87329-1-ryncsn@gmail.com>
On Sun, Feb 15, 2026 at 6:38 PM Kairui Song <ryncsn@gmail.com> wrote:
>
> From: Kairui Song <kasong@tencent.com>
>
> Clean up and simplify how we check if a folio is swapped. The helper
> already requires the folio to be locked and in the swap cache, which
> is enough to keep the swap cluster from being freed, so there is no
> need to take any other lock to avoid a UAF.
>
> Besides, swap operations are now mostly folio based after the recent
> cleanups, and the only place a folio can have any of its swap slots'
> count increased from 0 to 1 is folio_dup_swap, which also requires the
> folio lock. So while we are holding the folio lock here, a folio can't
> change its swap status from not swapped (all swap slots have a count
> of 0) to swapped (any slot has a count larger than 0).
>
> So this helper cannot return a false negative if we simply depend on
> the folio lock to stabilize the cluster.
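>
> For illustration only, the invariant boils down to something like the
> following simplified sketch (not the exact code in this patch; the
> signature and body shown here are just for demonstration):
>
>     /*
>      * folio_dup_swap() is the only path that raises a slot's swap
>      * count from 0 to 1, and it runs with the folio locked:
>      */
>     int folio_dup_swap(struct folio *folio, struct page *subpage)
>     {
>             VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
>             /* ... raise the swap count of the covered slot(s) ... */
>     }
>
>     /* So any other path holding the folio lock knows no slot covered
>      * by this folio can go from count == 0 to count > 0 under it. */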
>
> We only use this helper to determine if we can and should release the
> swap cache, so false positives are completely harmless, and they could
> already happen before this change: depending on the timing, a racing
> thread could release the swap count right after the ci lock was dropped
> and before this helper returned. In any case, the worst that can happen
> is that we leave behind a clean swap cache, which will still be
> reclaimed just fine under memory pressure.
>
> So, in conclusion, we can make the check much simpler and lockless.
> Also, rename it to folio_maybe_swapped to reflect the new design.
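>
> The caller side stays the same as before (taken from folio_free_swap()
> in the diff below), only the locking inside the check changes:
>
>     if (!folio_swapcache_freeable(folio))
>             return false;
>     /* folio_maybe_swapped() may return a false positive, which is
>      * harmless: a clean swap cache is still reclaimed under pressure. */
>     if (folio_maybe_swapped(folio))
>             return false;
>
>     swap_cache_del_folio(folio);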
>
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
> mm/swap.h | 5 ++--
> mm/swapfile.c | 82 ++++++++++++++++++++++++++++-----------------------
> 2 files changed, 48 insertions(+), 39 deletions(-)
>
> diff --git a/mm/swap.h b/mm/swap.h
> index 9fc5fecdcfdf..3ee761ee8348 100644
> --- a/mm/swap.h
> +++ b/mm/swap.h
> @@ -195,12 +195,13 @@ extern int swap_retry_table_alloc(swp_entry_t entry, gfp_t gfp);
> *
> * folio_alloc_swap(): the entry point for a folio to be swapped
> * out. It allocates swap slots and pins the slots with swap cache.
> - * The slots start with a swap count of zero.
> + * The slots start with a swap count of zero. The slots are pinned
> + * by the swap cache reference, which doesn't contribute to the swap count.
> *
> * folio_dup_swap(): increases the swap count of a folio, usually
> * during it gets unmapped and a swap entry is installed to replace
> * it (e.g., swap entry in page table). A swap slot with swap
> - * count == 0 should only be increasd by this helper.
> + * count == 0 can only be increased by this helper.
> *
> * folio_put_swap(): does the opposite thing of folio_dup_swap().
> */
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9628015fd8cf..cb18960a6089 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1743,7 +1743,11 @@ int folio_alloc_swap(struct folio *folio)
> * @subpage: if not NULL, only increase the swap count of this subpage.
> *
> * Typically called when the folio is unmapped and have its swap entry to
> - * take its palce.
> + * take its place: Swap entries allocated to a folio have count == 0 and
> + * are pinned by swap cache. The swap cache pin doesn't increase the swap
> + * count. This helper sets the initial count to 1 and increases the count
> + * as the folio is unmapped and swap entries referencing the slots are
> + * generated to replace the folio.
> *
> * Context: Caller must ensure the folio is locked and in the swap cache.
> * NOTE: The caller also has to ensure there is no raced call to
> @@ -1944,49 +1948,44 @@ int swp_swapcount(swp_entry_t entry)
> return count < 0 ? 0 : count;
> }
>
> -static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
> - swp_entry_t entry, int order)
> +/*
> + * folio_maybe_swapped - Test if a folio covers any swap slot with count > 0.
> + *
> + * Check if a folio is swapped. Holding the folio lock ensures the folio won't
> + * go from not-swapped to swapped because the initial swap count increment can
> + * only be done by folio_dup_swap, which also locks the folio. But a concurrent
> + * decrease of swap count is possible through swap_put_entries_direct, so this
> + * may return a false positive.
> + *
> + * Context: Caller must ensure the folio is locked and in the swap cache.
> + */
> +static bool folio_maybe_swapped(struct folio *folio)
> {
> + swp_entry_t entry = folio->swap;
> struct swap_cluster_info *ci;
> - unsigned int nr_pages = 1 << order;
> - unsigned long roffset = swp_offset(entry);
> - unsigned long offset = round_down(roffset, nr_pages);
> - unsigned int ci_off;
> - int i;
> + unsigned int ci_off, ci_end;
> bool ret = false;
>
> - ci = swap_cluster_lock(si, offset);
> - if (nr_pages == 1) {
> - ci_off = roffset % SWAPFILE_CLUSTER;
> - if (swp_tb_get_count(__swap_table_get(ci, ci_off)))
> - ret = true;
> - goto unlock_out;
> - }
> - for (i = 0; i < nr_pages; i++) {
> - ci_off = (offset + i) % SWAPFILE_CLUSTER;
> - if (swp_tb_get_count(__swap_table_get(ci, ci_off))) {
> - ret = true;
> - break;
> - }
> - }
> -unlock_out:
> - swap_cluster_unlock(ci);
> - return ret;
> -}
> -
> -static bool folio_swapped(struct folio *folio)
> -{
> - swp_entry_t entry = folio->swap;
> - struct swap_info_struct *si;
> -
> VM_WARN_ON_ONCE_FOLIO(!folio_test_locked(folio), folio);
> VM_WARN_ON_ONCE_FOLIO(!folio_test_swapcache(folio), folio);
>
> - si = __swap_entry_to_info(entry);
> - if (!IS_ENABLED(CONFIG_THP_SWAP) || likely(!folio_test_large(folio)))
> - return swap_entry_swapped(si, entry);
> + ci = __swap_entry_to_cluster(entry);
> + ci_off = swp_cluster_offset(entry);
> + ci_end = ci_off + folio_nr_pages(folio);
> + /*
> + * Extra locking is not needed: the folio lock ensures its swap
> + * entries won't be released and the backing data won't go away.
> + */
> + rcu_read_lock();
> + do {
> + if (__swp_tb_get_count(__swap_table_get(ci, ci_off))) {
> + ret = true;
> + break;
> + }
> + } while (++ci_off < ci_end);
> + rcu_read_unlock();
>
> - return swap_page_trans_huge_swapped(si, entry, folio_order(folio));
> + return ret;
> }
>
> static bool folio_swapcache_freeable(struct folio *folio)
> @@ -2032,7 +2031,7 @@ bool folio_free_swap(struct folio *folio)
> {
> if (!folio_swapcache_freeable(folio))
> return false;
> - if (folio_swapped(folio))
> + if (folio_maybe_swapped(folio))
> return false;
>
> swap_cache_del_folio(folio);
> @@ -3710,6 +3709,8 @@ void si_swapinfo(struct sysinfo *val)
> *
> * Context: Caller must ensure there is no race condition on the reference
> * owner. e.g., locking the PTL of a PTE containing the entry being increased.
> + * Also, the swap entry must already have a count >= 1; otherwise,
> + * folio_dup_swap should be used.
> */
> int swap_dup_entry_direct(swp_entry_t entry)
> {
> @@ -3721,6 +3722,13 @@ int swap_dup_entry_direct(swp_entry_t entry)
> return -EINVAL;
> }
>
> + /*
> + * The caller must be increasing the swap count from a direct
> + * reference of the swap slot (e.g. a swap entry in page table).
> + * So the swap count must be >= 1.
> + */
> + VM_WARN_ON_ONCE(!swap_entry_swapped(si, entry));
> +
> return swap_dup_entries_cluster(si, swp_offset(entry), 1);
> }
>
> --
> 2.52.0
>
Very sorry about this :/, something went wrong with my local branch,
so this is the wrong patch.
Please ignore this series, thanks!