linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Chris Li <chrisl@kernel.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	kasong@tencent.com, baohua@kernel.org, nphamcs@gmail.com,
	shikemeng@huaweicloud.com
Subject: Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
Date: Fri, 3 Oct 2025 10:38:47 +0800	[thread overview]
Message-ID: <aN83NwGpMCFvrOEy@MiWiFi-R3L-srv> (raw)
In-Reply-To: <CACePvbXeFxbtH3c0Wvn56m+wsK0wqxP86ge5Ud73OvqnfDOwMw@mail.gmail.com>

On 10/02/25 at 08:59am, Chris Li wrote:
> On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Now, swap_active_head is only used to find a present swap device when
> > trying to swapoff it. In fact, swap_info[] is a short array which is
> > 32 at maximum, and usually the unused one can be reused, so the
> > searching for target mostly only iterates the foremost several used
> > slots. And swapoff is a rarely used operation, efficiency is not so
> > important. Then it's unnecessary to get a plist to make it.
> >
> > Here go by iterating swap_info[] to find the swap device instead of
> > iterating swap_active_head.
> >
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/swapfile.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 5d71c748a2fe..18b52cc20749 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> >         struct inode *inode;
> >         struct filename *pathname;
> >         int err, found = 0;
> > +       unsigned int type;
> >
> >         if (!capable(CAP_SYS_ADMIN))
> >                 return -EPERM;
> > @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> >
> >         mapping = victim->f_mapping;
> >         spin_lock(&swap_lock);
> > -       plist_for_each_entry(p, &swap_active_head, list) {
> > +       for (type = 0; type < nr_swapfiles; type++) {
> > +               p = swap_info[type];
> 
> After some careful thinking of the swapoff behavior, I now consider
> this patch buggy.
> 
> On behavior change is that, previously the swapoff is dependent on the
> swap_active_list and only swapon device is on that list.
> 
> Now you change to enumerate every device at swap off and depend on the
> p->flags & SWP_WRITEOK to filter out the device. Which will go through
> all 32 swap devices regardless.
> 
> >                 if (p->flags & SWP_WRITEOK) {
> >                         if (p->swap_file->f_mapping == mapping) {
> 
> Considering the following race:
> 
> CPU1:
> swapoff( "/dev/sda1")
> spin_lock(&swap_lock);
> 
> for (type = 0; type < nr_swapfiles; type++) {
>      if (p->flags & SWP_WRITEOK) {
>          // found the device.
> }
> ....
> spin_lock(&p->lock);
> del_from_avail_list(p, true);
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> plist_del(&p->list, &swap_active_head);
> atomic_long_sub(p->pages, &nr_swap_pages);
> total_swap_pages -= p->pages;
> spin_unlock(&p->lock);
> spin_unlock(&swap_lock);
> 
> // at this point the p->flags still have SWP_WRITEOK.
> // swap_lock and p->lock are both unlocked.

No, SWP_WRITEOK is cleared in del_from_avail_list(). At this point,
there's no more SWP_WRITEOK in p->flags when both swap_lock and p->lock
are lifted.

> 
> Now the second CPU swapoff race begins
> 
>                                                        CPU2:
>                                                        swapoff( "/dev/sda1")
> 
> spin_lock(&swap_lock); // can take this lock
> 
>                                                        for (type = 0;
> type < nr_swapfiles; type++) {
>                                                        if (p->flags &
> SWP_WRITEOK) {
>                                                               // also
> found the device. because p->flags are not clear to 0 yet.
>                                                        }
>                                                        ....
> 
> spin_lock(&p->lock); // can also take this lock
> 
> del_from_avail_list(p, true);
> 
> plist_del(&p->list, &swap_active_head); <=== blow up here
>                                                        // remove p not
> the in list.
> 
> atomic_long_sub(p->pages, &nr_swap_pages);
>                                                        // double subtracting ..
> 
> total_swap_pages -= p->pages;
>                                                        // double
> subtracting as well.
>                                                        spin_unlock(&p->lock);
>                                                        spin_unlock(&swap_lock);
> 
> Please confirm or deny if that race is possible.
> If it is, consider it a NACK to this patch.
> 
> Please consider that even trivial patches can have subtle behavior
> change and it has non zero cost to reviewers. It can introduce subtle
> bugs unintentionally as well. In this case, I feel that the
> gain(removing a list, total less than 1K) is too small,  not a very
> good return on investment.
> 
> Chris
> 



  reply	other threads:[~2025-10-03  2:55 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-01  4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
2025-10-01  4:34 ` [PATCH 1/3] mm/swapfile.c: remove __has_usable_swap() Baoquan He
2025-10-01  4:34 ` [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device Baoquan He
2025-10-02 15:59   ` Chris Li
2025-10-03  2:38     ` Baoquan He [this message]
2025-10-03  4:50       ` Chris Li
2025-10-03  5:29         ` Baoquan He
2025-10-01  4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
2025-10-02  8:33   ` Chris Li
2025-10-02 13:42     ` Baoquan He
2025-10-09  3:26   ` Andrew Morton
2025-10-09  7:47     ` Baoquan He
2025-10-09 17:09       ` Chris Li
2025-10-10  2:56         ` YoungJun Park
2025-10-10  1:28       ` Andrew Morton
2025-10-10  2:14         ` Baoquan He
2025-10-10  2:34           ` Chris Li
2025-10-10  2:33         ` Chris Li
2025-10-10  2:52         ` Chris Li
2025-10-02  6:04 ` [PATCH 0/3] mm/swap: remove plist swap_active_head Chris Li
2025-10-02 13:09   ` Baoquan He
2025-10-02 16:23     ` Chris Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aN83NwGpMCFvrOEy@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=chrisl@kernel.org \
    --cc=kasong@tencent.com \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=shikemeng@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox