From: Kairui Song <ryncsn@gmail.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>, Chris Li <chrisl@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosry.ahmed@linux.dev>,
Chengming Zhou <chengming.zhou@linux.dev>,
Youngjun Park <youngjun.park@lge.com>,
LKML <linux-kernel@vger.kernel.org>,
stable@vger.kernel.org
Subject: Re: [PATCH] Revert "mm, swap: avoid redundant swap device pinning"
Date: Mon, 10 Nov 2025 13:33:05 +0800
Message-ID: <CAMgjq7DnaD-bH1efF9c1X0XAvZaMufzBUGxxeRrRAJBzBe59+g@mail.gmail.com>
In-Reply-To: <2025111053-saddlebag-maybe-0edc@gregkh>
On Mon, Nov 10, 2025 at 09:01, Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Mon, Nov 10, 2025 at 02:06:03AM +0800, Kairui Song via B4 Relay wrote:
> > From: Kairui Song <kasong@tencent.com>
> >
> > This reverts commit 78524b05f1a3e16a5d00cc9c6259c41a9d6003ce.
> >
> > While reviewing recent leaf entry changes, I noticed that commit
> > 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning") isn't
> > correct. It's true that almost all callers of __read_swap_cache_async are
> > already holding a swap entry reference, so the repeated swap device
> > pinning isn't needed on the same swap device, but it is possible that
> > VMA readahead (swap_vma_readahead()) may encounter swap entries from a
> > different swap device when there are multiple swap devices, and call
> > __read_swap_cache_async without holding a reference to that swap device.
> >
> > So it is possible to cause a UAF if swapoff of device A raced with
> > swapin on device B, and VMA readahead tries to read swap entries from
> > device A. It's not easy to trigger but in theory possible to cause real
> > issues. And besides, that commit made swap more vulnerable to issues
> > like corrupted page tables.
> >
> > Just revert it. __read_swap_cache_async isn't that sensitive to
> > performance after all, as it's mostly used for SSD/HDD swap devices with
> > readahead. SYNCHRONOUS_IO devices may fall back onto it for swap count >
> > 1 entries, but very soon we will have a new helper and routine for
> > such devices, so they will never touch this helper or have redundant
> > swap device reference overhead.
> >
> > Fixes: 78524b05f1a3 ("mm, swap: avoid redundant swap device pinning")
> > Signed-off-by: Kairui Song <kasong@tencent.com>
> > ---
> > mm/swap_state.c | 14 ++++++--------
> > mm/zswap.c | 8 +-------
> > 2 files changed, 7 insertions(+), 15 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index 3f85a1c4cfd9..0c25675de977 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -406,13 +406,17 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > struct mempolicy *mpol, pgoff_t ilx, bool *new_page_allocated,
> > bool skip_if_exists)
> > {
> > - struct swap_info_struct *si = __swap_entry_to_info(entry);
> > + struct swap_info_struct *si;
> > struct folio *folio;
> > struct folio *new_folio = NULL;
> > struct folio *result = NULL;
> > void *shadow = NULL;
> >
> > *new_page_allocated = false;
> > + si = get_swap_device(entry);
> > + if (!si)
> > + return NULL;
> > +
> > for (;;) {
> > int err;
> >
> > @@ -499,6 +503,7 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > put_swap_folio(new_folio, entry);
> > folio_unlock(new_folio);
> > put_and_return:
> > + put_swap_device(si);
> > if (!(*new_page_allocated) && new_folio)
> > folio_put(new_folio);
> > return result;
> > @@ -518,16 +523,11 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> > struct vm_area_struct *vma, unsigned long addr,
> > struct swap_iocb **plug)
> > {
> > - struct swap_info_struct *si;
> > bool page_allocated;
> > struct mempolicy *mpol;
> > pgoff_t ilx;
> > struct folio *folio;
> >
> > - si = get_swap_device(entry);
> > - if (!si)
> > - return NULL;
> > -
> > mpol = get_vma_policy(vma, addr, 0, &ilx);
> > folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
> > &page_allocated, false);
> > @@ -535,8 +535,6 @@ struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> >
> > if (page_allocated)
> > swap_read_folio(folio, plug);
> > -
> > - put_swap_device(si);
> > return folio;
> > }
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 5d0f8b13a958..aefe71fd160c 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1005,18 +1005,12 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
> > struct folio *folio;
> > struct mempolicy *mpol;
> > bool folio_was_allocated;
> > - struct swap_info_struct *si;
> > int ret = 0;
> >
> > /* try to allocate swap cache folio */
> > - si = get_swap_device(swpentry);
> > - if (!si)
> > - return -EEXIST;
> > -
> > mpol = get_task_policy(current);
> > folio = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
> > - NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
> > - put_swap_device(si);
> > + NO_INTERLEAVE_INDEX, &folio_was_allocated, true);
> > if (!folio)
> > return -ENOMEM;
> >
> >
> > ---
> > base-commit: 02dafa01ec9a00c3758c1c6478d82fe601f5f1ba
> > change-id: 20251109-revert-78524b05f1a3-04a1295bef8a
> >
> > Best regards,
> > --
> > Kairui Song <kasong@tencent.com>
> >
> >
> >
>
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree. Please read:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>
Thanks for the info, and my bad. I was trying out new tools to send
patches, so the Cc tags were missing. I will fix that. This patch is
meant to be merged into mainline first.
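
For anyone following the thread, the pin/unpin pattern that the quoted
revert moves back into __read_swap_cache_async can be illustrated with a
small userspace model. This is only a sketch under stated assumptions:
every toy_* name is hypothetical, the model is single-threaded, and it
does not reproduce the kernel's real reference counting or locking. It
only shows why each read should pin the device its entry actually
belongs to, rather than rely on a pin the caller holds on another device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
struct toy_swap_device {
	bool enabled;	/* cleared once a (modelled) swapoff has started */
	int pins;	/* number of users currently pinning the device */
};

#define TOY_MAX_DEVICES 2
static struct toy_swap_device toy_devices[TOY_MAX_DEVICES];

/* Pin a device; returns NULL if the index is bad or the device is going away. */
static struct toy_swap_device *toy_get_device(int type)
{
	struct toy_swap_device *dev;

	if (type < 0 || type >= TOY_MAX_DEVICES)
		return NULL;
	dev = &toy_devices[type];
	if (!dev->enabled)
		return NULL;
	dev->pins++;
	return dev;
}

/* Drop a pin taken by toy_get_device(). */
static void toy_put_device(struct toy_swap_device *dev)
{
	assert(dev->pins > 0);
	dev->pins--;
}

/*
 * Modelled swapoff: stop handing out new pins, then report whether the
 * device could be torn down right now (true only if nobody pins it).
 */
static bool toy_swapoff(int type)
{
	struct toy_swap_device *dev = &toy_devices[type];

	dev->enabled = false;
	return dev->pins == 0;
}

/*
 * Modelled read of one swap entry: pin the device that owns the entry,
 * use it, then unpin. Returns 0 on success, -1 if the device is gone.
 */
static int toy_read_entry(int type)
{
	struct toy_swap_device *dev = toy_get_device(type);

	if (!dev)
		return -1;
	/* ... per-device data would be accessed here, under the pin ... */
	toy_put_device(dev);
	return 0;
}
```

In this model, a caller that holds a pin on device 1 and then reads an
entry from device 0 still goes through toy_get_device(0), so a read
after toy_swapoff(0) fails cleanly instead of touching a device that
may already be torn down.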