linux-mm.kvack.org archive mirror
From: YoungJun Park <youngjun.park@lge.com>
To: Baoquan He <bhe@redhat.com>
Cc: akpm@linux-foundation.org, chrisl@kernel.org, kasong@tencent.com,
	shikemeng@huaweicloud.com, nphamcs@gmail.com, baohua@kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm/swapfile: fix list iteration in swap_sync_discard
Date: Thu, 27 Nov 2025 11:54:49 +0900
Message-ID: <aSe9eckUCGP9eTS+@yjaykim-PowerEdge-T330>
In-Reply-To: <aSe0VrozSSD0xeGl@MiWiFi-R3L-srv>

On Thu, Nov 27, 2025 at 10:15:50AM +0800, Baoquan He wrote:
> On 11/26/25 at 01:30am, Youngjun Park wrote:
> > swap_sync_discard() has an issue where if the next device becomes full
> > and is removed from the plist during iteration, the operation fails
> > even when other swap devices with pending discard entries remain
> > available.
> > 
> > Fix by checking plist_node_empty(&next->list) and restarting iteration
> > when the next node is removed during discard operations.
> > 
> > Additionally, switch from swap_avail_lock/swap_avail_head to swap_lock/
> > swap_active_head. This means the iteration is only affected by swapoff
> > operations rather than frequent availability changes, reducing
> > exceptional condition checks and lock contention.
> > 
> > Fixes: 686ea517f471 ("mm, swap: do not perform synchronous discard during allocation")
> > Suggested-by: Kairui Song <kasong@tencent.com>
> > Signed-off-by: Youngjun Park <youngjun.park@lge.com>
> > ---
> >  mm/swapfile.c | 18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index d12332423a06..998271aa09c3 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1387,21 +1387,25 @@ static bool swap_sync_discard(void)
> >  	bool ret = false;
> >  	struct swap_info_struct *si, *next;
> >  
> > -	spin_lock(&swap_avail_lock);
> > -	plist_for_each_entry_safe(si, next, &swap_avail_head, avail_list) {
> > -		spin_unlock(&swap_avail_lock);
> > +	spin_lock(&swap_lock);
> > +start_over:
> > +	plist_for_each_entry_safe(si, next, &swap_active_head, list) {
> > +		spin_unlock(&swap_lock);
> >  		if (get_swap_device_info(si)) {
> >  			if (si->flags & SWP_PAGE_DISCARD)
> >  				ret = swap_do_scheduled_discard(si);
> >  			put_swap_device(si);
> >  		}
> >  		if (ret)
> > -			return true;
> > -		spin_lock(&swap_avail_lock);
> > +			return ret;
> > +
> > +		spin_lock(&swap_lock);
> > +		if (plist_node_empty(&next->list))
> > +			goto start_over;
> 
> If there are many si with the same priority, or there are several si

Are you thinking of the requeue that happens while iterating over
`swap_avail_head`? A requeue does not make the node empty, though.
Also, since we are now iterating over `swap_active_head`,
it seems like that would not happen.
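
To illustrate the distinction, here is a minimal userspace sketch, not
kernel code. `struct node` and the `node_*` helpers are hypothetical
stand-ins for the kernel list primitives, and priorities are ignored
entirely. The assumption being illustrated is that an "empty" check on a
node only fires after the node has been unlinked and re-initialised, while
a requeue leaves it linked:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a circular doubly linked list node. */
struct node {
	struct node *prev, *next;
};

static void node_init(struct node *n)
{
	n->prev = n->next = n;
}

/* A node that points at itself is not on any list. */
static bool node_empty(const struct node *n)
{
	return n->next == n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	assert(node_empty(n));	/* only unlinked nodes may be added */
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Removal: unlink and re-initialise, so node_empty() becomes true. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Requeue: unlink and immediately re-add at the tail; the node stays linked. */
static void node_requeue(struct node *n, struct node *head)
{
	node_del_init(n);
	node_add_tail(n, head);
}
```

After `node_requeue()` the node has moved but `node_empty()` is still
false; only `node_del_init()` makes it true.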

> spread in different memcg when swap.tier is available, are we going to
> keep looping here to start over and over again possibly?

I don't think a loop can happen here for that reason.
However, a loop could possibly happen between swap_alloc_slow and
swap_sync_discard.

If `swap.tier` is applied, I think you are referring to the situation where
discards succeed on `si`s that do not belong to the current tier, and the
next iteration then goes through the available list again for the swap
devices in the same tier. As you mentioned, needless looping could occur in
that case (and if discards accumulate very quickly, it might even lead to an
infinite loop). If `swap.tier` is applied, this part may also need to be
modified.

> The old code is supposed to go through the plist to do one round of discarding? 

After your review, I thought more about it. If swapon/swapoff happens
continuously while `swap_lock` is released, it seems that we could keep
hitting the `plist_node_empty` check.
However, I think this case is very unlikely, so it should not be a problem.
Actually, swap_alloc_slow already works that way.
What do you think?

In the old code, if a swapoff occurs and swap usage becomes zero, so that the
device is removed from the `avail_list`, the function ends up doing just one
round of discarding. If we don't like the idea of looping due to continuous
swap on/off, we could consider adding a retry count or removing the
`plist_node_empty` check.
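
The retry-count idea could look roughly like the userspace model below. This
is only a sketch under stated assumptions, not a proposed patch:
`struct dev`, `MAX_RESTARTS`, `sync_discard_bounded()` and `churn_hook()` are
all hypothetical names, an array with an `on_list` flag stands in for the
plist, and the hook simulates whatever changes the list while the lock is
dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RESTARTS 3	/* hypothetical cap, not taken from the patch */

struct dev {
	bool on_list;	/* stand-in for !plist_node_empty(&si->list) */
	int pending;	/* stand-in for queued discard work */
};

/* Simulates list changes made by others while the lock is dropped. */
typedef void (*unlocked_hook)(struct dev *devs, size_t n, size_t cur);

static size_t next_on_list(const struct dev *devs, size_t n, size_t from)
{
	while (from < n && !devs[from].on_list)
		from++;
	return from;	/* == n when nothing is left */
}

static bool sync_discard_bounded(struct dev *devs, size_t n,
				 unlocked_hook hook, int *restarts)
{
	size_t cur, next;

	assert(devs && restarts);
	*restarts = 0;
start_over:
	for (cur = next_on_list(devs, n, 0); cur < n; cur = next) {
		/* Sample the successor while the (imaginary) lock is held. */
		next = next_on_list(devs, n, cur + 1);

		/* Lock dropped: the list may change under us. */
		if (hook)
			hook(devs, n, cur);
		if (devs[cur].on_list && devs[cur].pending > 0) {
			devs[cur].pending = 0;	/* "discard" succeeded */
			return true;
		}

		/* Lock re-taken: restart only if the saved successor vanished. */
		if (next < n && !devs[next].on_list) {
			if (++*restarts > MAX_RESTARTS)
				return false;	/* give up instead of looping */
			goto start_over;
		}
	}
	return false;
}

/* Worst case: always remove the successor, re-adding everything else. */
static void churn_hook(struct dev *devs, size_t n, size_t cur)
{
	size_t victim = next_on_list(devs, n, cur + 1);
	size_t i;

	for (i = 0; i < n; i++)
		devs[i].on_list = (i != victim);
}
```

With no hook the walk is a single round. With `churn_hook` the saved
successor disappears on every pass, and the cap is what ends the walk.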

> Not sure if I got the code wrong, or the chance it very tiny.
> Thanks
> Baoquan

I answered based on my understanding, but please correct me if I misunderstood your point.

Thanks for the review.
Youngjun Park


Thread overview: 13+ messages
2025-11-25 16:30 [PATCH 0/2] mm/swapfile: fix and cleanup swap list iterations Youngjun Park
2025-11-25 16:30 ` [PATCH 1/2] mm/swapfile: fix list iteration in swap_sync_discard Youngjun Park
2025-11-26 18:23   ` Kairui Song
2025-11-27  2:22     ` YoungJun Park
2025-11-27  2:15   ` Baoquan He
2025-11-27  2:54     ` YoungJun Park [this message]
2025-11-27  5:42     ` YoungJun Park
2025-11-27  8:06       ` Baoquan He
2025-11-27  9:34         ` YoungJun Park
2025-11-27 10:32           ` Baoquan He
2025-11-27 10:44             ` YoungJun Park
2025-11-27 10:50               ` Baoquan He
2025-11-25 16:30 ` [PATCH 2/2] mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate Youngjun Park
