linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Chris Li <chrisl@kernel.org>
To: Baoquan He <bhe@redhat.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	kasong@tencent.com,  baohua@kernel.org, nphamcs@gmail.com,
	shikemeng@huaweicloud.com
Subject: Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
Date: Thu, 2 Oct 2025 21:50:51 -0700	[thread overview]
Message-ID: <CACePvbWgKCjW0fHT03QUXWsj9sV0+1Co9_=t9OWQA5-ApEJyRg@mail.gmail.com> (raw)
In-Reply-To: <aN83NwGpMCFvrOEy@MiWiFi-R3L-srv>

On Thu, Oct 2, 2025 at 7:39 PM Baoquan He <bhe@redhat.com> wrote:
>
> On 10/02/25 at 08:59am, Chris Li wrote:
> > On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> > >
> > > Now, swap_active_head is only used to find a present swap device when
> > > trying to swapoff it. In fact, swap_info[] is a short array which is
> > > 32 at maximum, and usually the unused one can be reused, so the
> > > searching for target mostly only iterates the foremost several used
> > > slots. And swapoff is a rarely used operation, efficiency is not so
> > > important. Then it's unnecessary to get a plist to make it.
> > >
> > > Here go by iterating swap_info[] to find the swap device instead of
> > > iterating swap_active_head.
> > >
> > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > ---
> > >  mm/swapfile.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 5d71c748a2fe..18b52cc20749 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > >         struct inode *inode;
> > >         struct filename *pathname;
> > >         int err, found = 0;
> > > +       unsigned int type;
> > >
> > >         if (!capable(CAP_SYS_ADMIN))
> > >                 return -EPERM;
> > > @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > >
> > >         mapping = victim->f_mapping;
> > >         spin_lock(&swap_lock);
> > > -       plist_for_each_entry(p, &swap_active_head, list) {
> > > +       for (type = 0; type < nr_swapfiles; type++) {
> > > +               p = swap_info[type];
> >
> > After some careful thinking of the swapoff behavior, I now consider
> > this patch buggy.
> >
> > On behavior change is that, previously the swapoff is dependent on the
> > swap_active_list and only swapon device is on that list.
> >
> > Now you change to enumerate every device at swap off and depend on the
> > p->flags & SWP_WRITEOK to filter out the device. Which will go through
> > all 32 swap devices regardless.
> >
> > >                 if (p->flags & SWP_WRITEOK) {
> > >                         if (p->swap_file->f_mapping == mapping) {
> >
> > Considering the following race:
> >
> > CPU1:
> > swapoff( "/dev/sda1")
> > spin_lock(&swap_lock);
> >
> > for (type = 0; type < nr_swapfiles; type++) {
> >      if (p->flags & SWP_WRITEOK) {
> >          // found the device.
> > }
> > ....
> > spin_lock(&p->lock);
> > del_from_avail_list(p, true);
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > plist_del(&p->list, &swap_active_head);
> > atomic_long_sub(p->pages, &nr_swap_pages);
> > total_swap_pages -= p->pages;
> > spin_unlock(&p->lock);
> > spin_unlock(&swap_lock);
> >
> > // at this point the p->flags still have SWP_WRITEOK.
> > // swap_lock and p->lock are both unlocked.
>
> No, SWP_WRITEOK is cleared in del_from_avail_list(). At this point,
> there's no more SWP_WRITEOK in p->flags when both swap_lock and p->lock
> are lifted.

Ah, you are right. I am wrong. I missed the "si->flags &=
~SWP_WRITEOK;" inside the del_from_avail_list().

I read at the end of swapoff it set the si->flags = 0; thinking the
SWP_WRITEOK was clear there. It is cleared earlier.

That should be fine. Sorry about the misread.

>
> >
> > Now the second CPU swapoff race begins
> >
> >                                                        CPU2:
> >                                                        swapoff( "/dev/sda1")
> >
> > spin_lock(&swap_lock); // can take this lock
> >
> >                                                        for (type = 0;
> > type < nr_swapfiles; type++) {
> >                                                        if (p->flags &
> > SWP_WRITEOK) {
> >                                                               // also
> > found the device. because p->flags are not clear to 0 yet.
> >                                                        }
> >                                                        ....
> >
> > spin_lock(&p->lock); // can also take this lock
> >
> > del_from_avail_list(p, true);
> >
> > plist_del(&p->list, &swap_active_head); <=== blow up here
> >                                                        // remove p not
> > the in list.
> >
> > atomic_long_sub(p->pages, &nr_swap_pages);
> >                                                        // double subtracting ..
> >
> > total_swap_pages -= p->pages;
> >                                                        // double
> > subtracting as well.
> >                                                        spin_unlock(&p->lock);
> >                                                        spin_unlock(&swap_lock);
> >
> > Please confirm or deny if that race is possible.
> > If it is, consider it a NACK to this patch.

I withdraw the NACK. Sorry about that.

Chris


  reply	other threads:[~2025-10-03  4:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-01  4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
2025-10-01  4:34 ` [PATCH 1/3] mm/swapfile.c: remove __has_usable_swap() Baoquan He
2025-10-01  4:34 ` [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device Baoquan He
2025-10-02 15:59   ` Chris Li
2025-10-03  2:38     ` Baoquan He
2025-10-03  4:50       ` Chris Li [this message]
2025-10-03  5:29         ` Baoquan He
2025-10-01  4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
2025-10-02  8:33   ` Chris Li
2025-10-02 13:42     ` Baoquan He
2025-10-09  3:26   ` Andrew Morton
2025-10-09  7:47     ` Baoquan He
2025-10-09 17:09       ` Chris Li
2025-10-10  2:56         ` YoungJun Park
2025-10-10  1:28       ` Andrew Morton
2025-10-10  2:14         ` Baoquan He
2025-10-10  2:34           ` Chris Li
2025-10-10  2:33         ` Chris Li
2025-10-10  2:52         ` Chris Li
2025-10-02  6:04 ` [PATCH 0/3] mm/swap: remove plist swap_active_head Chris Li
2025-10-02 13:09   ` Baoquan He
2025-10-02 16:23     ` Chris Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACePvbWgKCjW0fHT03QUXWsj9sV0+1Co9_=t9OWQA5-ApEJyRg@mail.gmail.com' \
    --to=chrisl@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=bhe@redhat.com \
    --cc=kasong@tencent.com \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=shikemeng@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox