* [PATCH 1/3] mm/swapfile.c: remove __has_usable_swap()
2025-10-01 4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
@ 2025-10-01 4:34 ` Baoquan He
2025-10-01 4:34 ` [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device Baoquan He
` (2 subsequent siblings)
3 siblings, 0 replies; 22+ messages in thread
From: Baoquan He @ 2025-10-01 4:34 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, chrisl, kasong, baohua, nphamcs, shikemeng, Baoquan He
Global total_swap_pages can indicate if usable swap deivice is present,
use it instead of __has_usable_swap().
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/swapfile.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2bd8bd76ea28..5d71c748a2fe 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3836,11 +3836,6 @@ static void free_swap_count_continuations(struct swap_info_struct *si)
}
#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-static bool __has_usable_swap(void)
-{
- return !plist_head_empty(&swap_active_head);
-}
-
void __folio_throttle_swaprate(struct folio *folio, gfp_t gfp)
{
struct swap_info_struct *si, *next;
@@ -3848,7 +3843,7 @@ void __folio_throttle_swaprate(struct folio *folio, gfp_t gfp)
if (!(gfp & __GFP_IO))
return;
- if (!__has_usable_swap())
+ if (!total_swap_pages)
return;
if (!blk_cgroup_congested())
--
2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread* [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
2025-10-01 4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
2025-10-01 4:34 ` [PATCH 1/3] mm/swapfile.c: remove __has_usable_swap() Baoquan He
@ 2025-10-01 4:34 ` Baoquan He
2025-10-02 15:59 ` Chris Li
2025-10-01 4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
2025-10-02 6:04 ` [PATCH 0/3] mm/swap: remove plist swap_active_head Chris Li
3 siblings, 1 reply; 22+ messages in thread
From: Baoquan He @ 2025-10-01 4:34 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, chrisl, kasong, baohua, nphamcs, shikemeng, Baoquan He
Now, swap_active_head is only used to find a present swap device when
trying to swapoff it. In fact, swap_info[] is a short array which is
32 at maximum, and usually the unused one can be reused, so the
searching for target mostly only iterates the foremost several used
slots. And swapoff is a rarely used operation, efficiency is not so
important. Then it's unnecessary to get a plist to make it.
Here go by iterating swap_info[] to find the swap device instead of
iterating swap_active_head.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/swapfile.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5d71c748a2fe..18b52cc20749 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
struct inode *inode;
struct filename *pathname;
int err, found = 0;
+ unsigned int type;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
@@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
mapping = victim->f_mapping;
spin_lock(&swap_lock);
- plist_for_each_entry(p, &swap_active_head, list) {
+ for (type = 0; type < nr_swapfiles; type++) {
+ p = swap_info[type];
if (p->flags & SWP_WRITEOK) {
if (p->swap_file->f_mapping == mapping) {
found = 1;
--
2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
2025-10-01 4:34 ` [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device Baoquan He
@ 2025-10-02 15:59 ` Chris Li
2025-10-03 2:38 ` Baoquan He
0 siblings, 1 reply; 22+ messages in thread
From: Chris Li @ 2025-10-02 15:59 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
>
> Now, swap_active_head is only used to find a present swap device when
> trying to swapoff it. In fact, swap_info[] is a short array which is
> 32 at maximum, and usually the unused one can be reused, so the
> searching for target mostly only iterates the foremost several used
> slots. And swapoff is a rarely used operation, efficiency is not so
> important. Then it's unnecessary to get a plist to make it.
>
> Here go by iterating swap_info[] to find the swap device instead of
> iterating swap_active_head.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
> mm/swapfile.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5d71c748a2fe..18b52cc20749 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> struct inode *inode;
> struct filename *pathname;
> int err, found = 0;
> + unsigned int type;
>
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
> @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
>
> mapping = victim->f_mapping;
> spin_lock(&swap_lock);
> - plist_for_each_entry(p, &swap_active_head, list) {
> + for (type = 0; type < nr_swapfiles; type++) {
> + p = swap_info[type];
After some careful thinking of the swapoff behavior, I now consider
this patch buggy.
On behavior change is that, previously the swapoff is dependent on the
swap_active_list and only swapon device is on that list.
Now you change to enumerate every device at swap off and depend on the
p->flags & SWP_WRITEOK to filter out the device. Which will go through
all 32 swap devices regardless.
> if (p->flags & SWP_WRITEOK) {
> if (p->swap_file->f_mapping == mapping) {
Considering the following race:
CPU1:
swapoff( "/dev/sda1")
spin_lock(&swap_lock);
for (type = 0; type < nr_swapfiles; type++) {
if (p->flags & SWP_WRITEOK) {
// found the device.
}
....
spin_lock(&p->lock);
del_from_avail_list(p, true);
plist_del(&p->list, &swap_active_head);
atomic_long_sub(p->pages, &nr_swap_pages);
total_swap_pages -= p->pages;
spin_unlock(&p->lock);
spin_unlock(&swap_lock);
// at this point the p->flags still have SWP_WRITEOK.
// swap_lock and p->lock are both unlocked.
Now the second CPU swapoff race begins
CPU2:
swapoff( "/dev/sda1")
spin_lock(&swap_lock); // can take this lock
for (type = 0;
type < nr_swapfiles; type++) {
if (p->flags &
SWP_WRITEOK) {
// also
found the device. because p->flags are not clear to 0 yet.
}
....
spin_lock(&p->lock); // can also take this lock
del_from_avail_list(p, true);
plist_del(&p->list, &swap_active_head); <=== blow up here
// remove p not
the in list.
atomic_long_sub(p->pages, &nr_swap_pages);
// double subtracting ..
total_swap_pages -= p->pages;
// double
subtracting as well.
spin_unlock(&p->lock);
spin_unlock(&swap_lock);
Please confirm or deny if that race is possible.
If it is, consider it a NACK to this patch.
Please consider that even trivial patches can have subtle behavior
change and it has non zero cost to reviewers. It can introduce subtle
bugs unintentionally as well. In this case, I feel that the
gain(removing a list, total less than 1K) is too small, not a very
good return on investment.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
2025-10-02 15:59 ` Chris Li
@ 2025-10-03 2:38 ` Baoquan He
2025-10-03 4:50 ` Chris Li
0 siblings, 1 reply; 22+ messages in thread
From: Baoquan He @ 2025-10-03 2:38 UTC (permalink / raw)
To: Chris Li; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On 10/02/25 at 08:59am, Chris Li wrote:
> On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > Now, swap_active_head is only used to find a present swap device when
> > trying to swapoff it. In fact, swap_info[] is a short array which is
> > 32 at maximum, and usually the unused one can be reused, so the
> > searching for target mostly only iterates the foremost several used
> > slots. And swapoff is a rarely used operation, efficiency is not so
> > important. Then it's unnecessary to get a plist to make it.
> >
> > Here go by iterating swap_info[] to find the swap device instead of
> > iterating swap_active_head.
> >
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> > mm/swapfile.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 5d71c748a2fe..18b52cc20749 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > struct inode *inode;
> > struct filename *pathname;
> > int err, found = 0;
> > + unsigned int type;
> >
> > if (!capable(CAP_SYS_ADMIN))
> > return -EPERM;
> > @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> >
> > mapping = victim->f_mapping;
> > spin_lock(&swap_lock);
> > - plist_for_each_entry(p, &swap_active_head, list) {
> > + for (type = 0; type < nr_swapfiles; type++) {
> > + p = swap_info[type];
>
> After some careful thinking of the swapoff behavior, I now consider
> this patch buggy.
>
> On behavior change is that, previously the swapoff is dependent on the
> swap_active_list and only swapon device is on that list.
>
> Now you change to enumerate every device at swap off and depend on the
> p->flags & SWP_WRITEOK to filter out the device. Which will go through
> all 32 swap devices regardless.
>
> > if (p->flags & SWP_WRITEOK) {
> > if (p->swap_file->f_mapping == mapping) {
>
> Considering the following race:
>
> CPU1:
> swapoff( "/dev/sda1")
> spin_lock(&swap_lock);
>
> for (type = 0; type < nr_swapfiles; type++) {
> if (p->flags & SWP_WRITEOK) {
> // found the device.
> }
> ....
> spin_lock(&p->lock);
> del_from_avail_list(p, true);
~~~~~~~~~~~~~~~~~~~~~~~~~~~
> plist_del(&p->list, &swap_active_head);
> atomic_long_sub(p->pages, &nr_swap_pages);
> total_swap_pages -= p->pages;
> spin_unlock(&p->lock);
> spin_unlock(&swap_lock);
>
> // at this point the p->flags still have SWP_WRITEOK.
> // swap_lock and p->lock are both unlocked.
No, SWP_WRITEOK is cleared in del_from_avail_list(). At this point,
there's no more SWP_WRITEOK in p->flags when both swap_lock and p->lock
are lifted.
>
> Now the second CPU swapoff race begins
>
> CPU2:
> swapoff( "/dev/sda1")
>
> spin_lock(&swap_lock); // can take this lock
>
> for (type = 0;
> type < nr_swapfiles; type++) {
> if (p->flags &
> SWP_WRITEOK) {
> // also
> found the device. because p->flags are not clear to 0 yet.
> }
> ....
>
> spin_lock(&p->lock); // can also take this lock
>
> del_from_avail_list(p, true);
>
> plist_del(&p->list, &swap_active_head); <=== blow up here
> // remove p not
> the in list.
>
> atomic_long_sub(p->pages, &nr_swap_pages);
> // double subtracting ..
>
> total_swap_pages -= p->pages;
> // double
> subtracting as well.
> spin_unlock(&p->lock);
> spin_unlock(&swap_lock);
>
> Please confirm or deny if that race is possible.
> If it is, consider it a NACK to this patch.
>
> Please consider that even trivial patches can have subtle behavior
> change and it has non zero cost to reviewers. It can introduce subtle
> bugs unintentionally as well. In this case, I feel that the
> gain(removing a list, total less than 1K) is too small, not a very
> good return on investment.
>
> Chris
>
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
2025-10-03 2:38 ` Baoquan He
@ 2025-10-03 4:50 ` Chris Li
2025-10-03 5:29 ` Baoquan He
0 siblings, 1 reply; 22+ messages in thread
From: Chris Li @ 2025-10-03 4:50 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On Thu, Oct 2, 2025 at 7:39 PM Baoquan He <bhe@redhat.com> wrote:
>
> On 10/02/25 at 08:59am, Chris Li wrote:
> > On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> > >
> > > Now, swap_active_head is only used to find a present swap device when
> > > trying to swapoff it. In fact, swap_info[] is a short array which is
> > > 32 at maximum, and usually the unused one can be reused, so the
> > > searching for target mostly only iterates the foremost several used
> > > slots. And swapoff is a rarely used operation, efficiency is not so
> > > important. Then it's unnecessary to get a plist to make it.
> > >
> > > Here go by iterating swap_info[] to find the swap device instead of
> > > iterating swap_active_head.
> > >
> > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > ---
> > > mm/swapfile.c | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 5d71c748a2fe..18b52cc20749 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > > struct inode *inode;
> > > struct filename *pathname;
> > > int err, found = 0;
> > > + unsigned int type;
> > >
> > > if (!capable(CAP_SYS_ADMIN))
> > > return -EPERM;
> > > @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > >
> > > mapping = victim->f_mapping;
> > > spin_lock(&swap_lock);
> > > - plist_for_each_entry(p, &swap_active_head, list) {
> > > + for (type = 0; type < nr_swapfiles; type++) {
> > > + p = swap_info[type];
> >
> > After some careful thinking of the swapoff behavior, I now consider
> > this patch buggy.
> >
> > On behavior change is that, previously the swapoff is dependent on the
> > swap_active_list and only swapon device is on that list.
> >
> > Now you change to enumerate every device at swap off and depend on the
> > p->flags & SWP_WRITEOK to filter out the device. Which will go through
> > all 32 swap devices regardless.
> >
> > > if (p->flags & SWP_WRITEOK) {
> > > if (p->swap_file->f_mapping == mapping) {
> >
> > Considering the following race:
> >
> > CPU1:
> > swapoff( "/dev/sda1")
> > spin_lock(&swap_lock);
> >
> > for (type = 0; type < nr_swapfiles; type++) {
> > if (p->flags & SWP_WRITEOK) {
> > // found the device.
> > }
> > ....
> > spin_lock(&p->lock);
> > del_from_avail_list(p, true);
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > plist_del(&p->list, &swap_active_head);
> > atomic_long_sub(p->pages, &nr_swap_pages);
> > total_swap_pages -= p->pages;
> > spin_unlock(&p->lock);
> > spin_unlock(&swap_lock);
> >
> > // at this point the p->flags still have SWP_WRITEOK.
> > // swap_lock and p->lock are both unlocked.
>
> No, SWP_WRITEOK is cleared in del_from_avail_list(). At this point,
> there's no more SWP_WRITEOK in p->flags when both swap_lock and p->lock
> are lifted.
Ah, you are right. I am wrong. I missed the "si->flags &=
~SWP_WRITEOK;" inside the del_from_avail_list().
I read at the end of swapoff it set the si->flags = 0; thinking the
SWP_WRITEOK was clear there. It is cleared earlier.
That should be fine. Sorry about the misread.
>
> >
> > Now the second CPU swapoff race begins
> >
> > CPU2:
> > swapoff( "/dev/sda1")
> >
> > spin_lock(&swap_lock); // can take this lock
> >
> > for (type = 0;
> > type < nr_swapfiles; type++) {
> > if (p->flags &
> > SWP_WRITEOK) {
> > // also
> > found the device. because p->flags are not clear to 0 yet.
> > }
> > ....
> >
> > spin_lock(&p->lock); // can also take this lock
> >
> > del_from_avail_list(p, true);
> >
> > plist_del(&p->list, &swap_active_head); <=== blow up here
> > // remove p not
> > the in list.
> >
> > atomic_long_sub(p->pages, &nr_swap_pages);
> > // double subtracting ..
> >
> > total_swap_pages -= p->pages;
> > // double
> > subtracting as well.
> > spin_unlock(&p->lock);
> > spin_unlock(&swap_lock);
> >
> > Please confirm or deny if that race is possible.
> > If it is, consider it a NACK to this patch.
I withdraw the NACK. Sorry about that.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device
2025-10-03 4:50 ` Chris Li
@ 2025-10-03 5:29 ` Baoquan He
0 siblings, 0 replies; 22+ messages in thread
From: Baoquan He @ 2025-10-03 5:29 UTC (permalink / raw)
To: Chris Li; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On 10/02/25 at 09:50pm, Chris Li wrote:
> On Thu, Oct 2, 2025 at 7:39 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > On 10/02/25 at 08:59am, Chris Li wrote:
> > > On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> > > >
> > > > Now, swap_active_head is only used to find a present swap device when
> > > > trying to swapoff it. In fact, swap_info[] is a short array which is
> > > > 32 at maximum, and usually the unused one can be reused, so the
> > > > searching for target mostly only iterates the foremost several used
> > > > slots. And swapoff is a rarely used operation, efficiency is not so
> > > > important. Then it's unnecessary to get a plist to make it.
> > > >
> > > > Here go by iterating swap_info[] to find the swap device instead of
> > > > iterating swap_active_head.
> > > >
> > > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > > ---
> > > > mm/swapfile.c | 4 +++-
> > > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > > index 5d71c748a2fe..18b52cc20749 100644
> > > > --- a/mm/swapfile.c
> > > > +++ b/mm/swapfile.c
> > > > @@ -2641,6 +2641,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > > > struct inode *inode;
> > > > struct filename *pathname;
> > > > int err, found = 0;
> > > > + unsigned int type;
> > > >
> > > > if (!capable(CAP_SYS_ADMIN))
> > > > return -EPERM;
> > > > @@ -2658,7 +2659,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> > > >
> > > > mapping = victim->f_mapping;
> > > > spin_lock(&swap_lock);
> > > > - plist_for_each_entry(p, &swap_active_head, list) {
> > > > + for (type = 0; type < nr_swapfiles; type++) {
> > > > + p = swap_info[type];
> > >
> > > After some careful thinking of the swapoff behavior, I now consider
> > > this patch buggy.
> > >
> > > On behavior change is that, previously the swapoff is dependent on the
> > > swap_active_list and only swapon device is on that list.
> > >
> > > Now you change to enumerate every device at swap off and depend on the
> > > p->flags & SWP_WRITEOK to filter out the device. Which will go through
> > > all 32 swap devices regardless.
> > >
> > > > if (p->flags & SWP_WRITEOK) {
> > > > if (p->swap_file->f_mapping == mapping) {
> > >
> > > Considering the following race:
> > >
> > > CPU1:
> > > swapoff( "/dev/sda1")
> > > spin_lock(&swap_lock);
> > >
> > > for (type = 0; type < nr_swapfiles; type++) {
> > > if (p->flags & SWP_WRITEOK) {
> > > // found the device.
> > > }
> > > ....
> > > spin_lock(&p->lock);
> > > del_from_avail_list(p, true);
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > plist_del(&p->list, &swap_active_head);
> > > atomic_long_sub(p->pages, &nr_swap_pages);
> > > total_swap_pages -= p->pages;
> > > spin_unlock(&p->lock);
> > > spin_unlock(&swap_lock);
> > >
> > > // at this point the p->flags still have SWP_WRITEOK.
> > > // swap_lock and p->lock are both unlocked.
> >
> > No, SWP_WRITEOK is cleared in del_from_avail_list(). At this point,
> > there's no more SWP_WRITEOK in p->flags when both swap_lock and p->lock
> > are lifted.
>
> Ah, you are right. I am wrong. I missed the "si->flags &=
> ~SWP_WRITEOK;" inside the del_from_avail_list().
>
> I read at the end of swapoff it set the si->flags = 0; thinking the
> SWP_WRITEOK was clear there. It is cleared earlier.
>
> That should be fine. Sorry about the misread.
It's OK, thanks for careful reviewing and checking.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-01 4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
2025-10-01 4:34 ` [PATCH 1/3] mm/swapfile.c: remove __has_usable_swap() Baoquan He
2025-10-01 4:34 ` [PATCH 2/3] mm/swapfile.c: use swap_info[] to find the swap device Baoquan He
@ 2025-10-01 4:34 ` Baoquan He
2025-10-02 8:33 ` Chris Li
2025-10-09 3:26 ` Andrew Morton
2025-10-02 6:04 ` [PATCH 0/3] mm/swap: remove plist swap_active_head Chris Li
3 siblings, 2 replies; 22+ messages in thread
From: Baoquan He @ 2025-10-01 4:34 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, chrisl, kasong, baohua, nphamcs, shikemeng, Baoquan He
There's no user of swap_active_head, remove it now.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
include/linux/swap.h | 1 -
mm/swapfile.c | 20 ++++----------------
2 files changed, 4 insertions(+), 17 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 5b7a39b20f58..dfc0cc9fc166 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -296,7 +296,6 @@ struct swap_info_struct {
struct percpu_ref users; /* indicate and keep swap device valid. */
unsigned long flags; /* SWP_USED etc: see above */
signed short prio; /* swap priority of this type */
- struct plist_node list; /* entry in swap_active_head */
signed char type; /* strange name for an index */
unsigned int max; /* extent of the swap_map */
unsigned char *swap_map; /* vmalloc'ed array of usage counts */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 18b52cc20749..80b34dc86a95 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -85,17 +85,10 @@ static const char Bad_offset[] = "Bad swap offset entry ";
static const char Unused_offset[] = "Unused swap offset entry ";
/*
- * all active swap_info_structs
- * protected with swap_lock, and ordered by priority.
- */
-static PLIST_HEAD(swap_active_head);
-
-/*
- * all available (active, not full) swap_info_structs
- * protected with swap_avail_lock, ordered by priority.
- * This is used by folio_alloc_swap() instead of swap_active_head
- * because swap_active_head includes all swap_info_structs,
- * but folio_alloc_swap() doesn't need to look at full ones.
+ * All available (active, not full) swap_info_structs protected with
+ * swap_avail_lock, ordered by priority.
+ * This is used by folio_alloc_swap() because folio_alloc_swap()
+ * doesn't need to look at full ones.
* This uses its own lock instead of swap_lock because when a
* swap_info_struct changes between not-full/full, it needs to
* add/remove itself to/from this list, but the swap_info_struct->lock
@@ -2539,7 +2532,6 @@ static void setup_swap_info(struct swap_info_struct *si, int prio,
* the plist prio is negated because plist ordering is
* low-to-high, while swap ordering is high-to-low
*/
- si->list.prio = -si->prio;
si->avail_list.prio = -si->prio;
si->swap_map = swap_map;
si->cluster_info = cluster_info;
@@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
assert_spin_locked(&swap_lock);
- plist_add(&si->list, &swap_active_head);
-
/* Add back to available list */
add_to_avail_list(si, true);
}
@@ -2682,7 +2672,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
}
spin_lock(&p->lock);
del_from_avail_list(p, true);
- plist_del(&p->list, &swap_active_head);
atomic_long_sub(p->pages, &nr_swap_pages);
total_swap_pages -= p->pages;
spin_unlock(&p->lock);
@@ -2958,7 +2947,6 @@ static struct swap_info_struct *alloc_swap_info(void)
*/
}
p->swap_extent_root = RB_ROOT;
- plist_node_init(&p->list, 0);
plist_node_init(&p->avail_list, 0);
p->flags = SWP_USED;
spin_unlock(&swap_lock);
--
2.41.0
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-01 4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
@ 2025-10-02 8:33 ` Chris Li
2025-10-02 13:42 ` Baoquan He
2025-10-09 3:26 ` Andrew Morton
1 sibling, 1 reply; 22+ messages in thread
From: Chris Li @ 2025-10-02 8:33 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
>
> There's no user of swap_active_head, remove it now.
Removing swap_active_head for 32 swap devices is less than 1K memory
saving? Only the power off code uses it now so the reduction of
complexity is very limited as well. I consider this series payout is
pretty small but has considerable review overhead.
Another consideration is that swap.tiers is possible to use the
swap_active_head to find swap devices belonging to a tier, because the
swap tier is a range of priority. We can revisit it after the initial
implementation for swap.tiers.
Let's hold off the removal of swap_active_head a bit.
Chris
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
> include/linux/swap.h | 1 -
> mm/swapfile.c | 20 ++++----------------
> 2 files changed, 4 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 5b7a39b20f58..dfc0cc9fc166 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -296,7 +296,6 @@ struct swap_info_struct {
> struct percpu_ref users; /* indicate and keep swap device valid. */
> unsigned long flags; /* SWP_USED etc: see above */
> signed short prio; /* swap priority of this type */
> - struct plist_node list; /* entry in swap_active_head */
> signed char type; /* strange name for an index */
> unsigned int max; /* extent of the swap_map */
> unsigned char *swap_map; /* vmalloc'ed array of usage counts */
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 18b52cc20749..80b34dc86a95 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -85,17 +85,10 @@ static const char Bad_offset[] = "Bad swap offset entry ";
> static const char Unused_offset[] = "Unused swap offset entry ";
>
> /*
> - * all active swap_info_structs
> - * protected with swap_lock, and ordered by priority.
> - */
> -static PLIST_HEAD(swap_active_head);
> -
> -/*
> - * all available (active, not full) swap_info_structs
> - * protected with swap_avail_lock, ordered by priority.
> - * This is used by folio_alloc_swap() instead of swap_active_head
> - * because swap_active_head includes all swap_info_structs,
> - * but folio_alloc_swap() doesn't need to look at full ones.
> + * All available (active, not full) swap_info_structs protected with
> + * swap_avail_lock, ordered by priority.
> + * This is used by folio_alloc_swap() because folio_alloc_swap()
> + * doesn't need to look at full ones.
> * This uses its own lock instead of swap_lock because when a
> * swap_info_struct changes between not-full/full, it needs to
> * add/remove itself to/from this list, but the swap_info_struct->lock
> @@ -2539,7 +2532,6 @@ static void setup_swap_info(struct swap_info_struct *si, int prio,
> * the plist prio is negated because plist ordering is
> * low-to-high, while swap ordering is high-to-low
> */
> - si->list.prio = -si->prio;
> si->avail_list.prio = -si->prio;
> si->swap_map = swap_map;
> si->cluster_info = cluster_info;
> @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
>
> assert_spin_locked(&swap_lock);
>
> - plist_add(&si->list, &swap_active_head);
> -
> /* Add back to available list */
> add_to_avail_list(si, true);
> }
> @@ -2682,7 +2672,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
> }
> spin_lock(&p->lock);
> del_from_avail_list(p, true);
> - plist_del(&p->list, &swap_active_head);
> atomic_long_sub(p->pages, &nr_swap_pages);
> total_swap_pages -= p->pages;
> spin_unlock(&p->lock);
> @@ -2958,7 +2947,6 @@ static struct swap_info_struct *alloc_swap_info(void)
> */
> }
> p->swap_extent_root = RB_ROOT;
> - plist_node_init(&p->list, 0);
> plist_node_init(&p->avail_list, 0);
> p->flags = SWP_USED;
> spin_unlock(&swap_lock);
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-02 8:33 ` Chris Li
@ 2025-10-02 13:42 ` Baoquan He
0 siblings, 0 replies; 22+ messages in thread
From: Baoquan He @ 2025-10-02 13:42 UTC (permalink / raw)
To: Chris Li; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On 10/02/25 at 01:33am, Chris Li wrote:
> On Tue, Sep 30, 2025 at 9:35 PM Baoquan He <bhe@redhat.com> wrote:
> >
> > There's no user of swap_active_head, remove it now.
>
> Removing swap_active_head for 32 swap devices is less than 1K memory
> saving? Only the power off code uses it now so the reduction of
> complexity is very limited as well. I consider this series payout is
> pretty small but has considerable review overhead.
>
> Another consideration is that swap.tiers is possible to use the
> swap_active_head to find swap devices belonging to a tier, because the
> swap tier is a range of priority. We can revisit it after the initial
> implementation for swap.tiers.
>
> Let's hold off the removal of swap_active_head a bit.
OK, if it has a potential user, agree we can hold it off for now. Thanks
for careful reviewing.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-01 4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
2025-10-02 8:33 ` Chris Li
@ 2025-10-09 3:26 ` Andrew Morton
2025-10-09 7:47 ` Baoquan He
1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2025-10-09 3:26 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, chrisl, kasong, baohua, nphamcs, shikemeng
On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> There's no user of swap_active_head, remove it now.
>
> ...
>
> @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
>
> assert_spin_locked(&swap_lock);
>
> - plist_add(&si->list, &swap_active_head);
> -
There's a large comment in _enable_swap_info() which needs to be
removed or updated?
> /* Add back to available list */
> add_to_avail_list(si, true);
> }
I fixed the rejects and my swapfile.c still has a couple of references
to swap_active_head. Can we please have a redo&resend against mm-new?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-09 3:26 ` Andrew Morton
@ 2025-10-09 7:47 ` Baoquan He
2025-10-09 17:09 ` Chris Li
2025-10-10 1:28 ` Andrew Morton
0 siblings, 2 replies; 22+ messages in thread
From: Baoquan He @ 2025-10-09 7:47 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, chrisl, kasong, baohua, nphamcs, shikemeng
On 10/08/25 at 08:26pm, Andrew Morton wrote:
> On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > There's no user of swap_active_head, remove it now.
> >
> > ...
> >
> > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> >
> > assert_spin_locked(&swap_lock);
> >
> > - plist_add(&si->list, &swap_active_head);
> > -
>
> There's a large comment in _enable_swap_info() which needs to be
> removed or updated?
This patchset depends on below patchset:
[PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
That large comment in _enable_swap_info() has been removed in
"[PATCH v3 0/2] mm/swapfile.c: select the swap device with default
priority round robin"
>
> > /* Add back to available list */
> > add_to_avail_list(si, true);
> > }
>
> I fixed the rejects and my swapfile.c still has a couple of references
> to swap_active_head. Can we please have a redo&resend against mm-new?
not sure if it's caused by the dependency on "[PATCH v3 0/2] mm/swapfile.c:
select the swap device with default priority round robin". I will rebase
it on mm-new to see.
By the way, Chris worried the plist swap_active_head could be reused by
the ongoing swap-tier work and suggested to hold off this patchset till
swap-tier work is clear. Now seems it won't impact swap-tier, I will
send a v2 against mm-new.
Thanks a lot.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-09 7:47 ` Baoquan He
@ 2025-10-09 17:09 ` Chris Li
2025-10-10 2:56 ` YoungJun Park
2025-10-10 1:28 ` Andrew Morton
1 sibling, 1 reply; 22+ messages in thread
From: Chris Li @ 2025-10-09 17:09 UTC (permalink / raw)
To: Baoquan He
Cc: Andrew Morton, linux-mm, kasong, baohua, nphamcs, shikemeng,
YoungJun Park
Hi Baoquan,
On Thu, Oct 9, 2025 at 12:47 AM Baoquan He <bhe@redhat.com> wrote:
>
> By the way, Chris worried the plist swap_active_head could be reused by
> the ongoing swap-tier work and suggested to hold off this patchset till
> swap-tier work is clear. Now seems it won't impact swap-tier, I will
> send a v2 against mm-new.
Sorry I haven't made it clear earlier. I withdraw the NACK because
that is my misunderstanding of the code behavior. It does not mean I
clear the swap tiers will not use swap_active_head yet, I actually
haven't done that investigation.
I still suggest deferring this series which deletes the
"swap_active_head". Wait until YoungPark's swap tiers to be sent out
on the list. Let's wait for about 1 month or so, we should have a much
better understanding if the swap tiers series needs to use the
"swap_active_head". We are not in a hurry to remove it, right? The
earliest this can go into the official tree is 6.19 anyway. We still
have some time.
YoungPark, do you know if you want to use "swap_active_head" in your
pending patch series? Even better send your pending RFC patches to the
list and Baoquan can collaborate on it.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-09 17:09 ` Chris Li
@ 2025-10-10 2:56 ` YoungJun Park
0 siblings, 0 replies; 22+ messages in thread
From: YoungJun Park @ 2025-10-10 2:56 UTC (permalink / raw)
To: Chris Li
Cc: Baoquan He, Andrew Morton, linux-mm, kasong, baohua, nphamcs, shikemeng
On Thu, Oct 09, 2025 at 10:09:56AM -0700, Chris Li wrote:
> Hi Baoquan,
>
> On Thu, Oct 9, 2025 at 12:47 AM Baoquan He <bhe@redhat.com> wrote:
> >
> > By the way, Chris worried the plist swap_active_head could be reused by
> > the ongoing swap-tier work and suggested to hold off this patchset till
> > swap-tier work is clear. Now seems it won't impact swap-tier, I will
> > send a v2 against mm-new.
>
> Sorry I haven't made it clear earlier. I withdraw the NACK because
> that is my misunderstanding of the code behavior. It does not mean I
> clear the swap tiers will not use swap_active_head yet, I actually
> haven't done that investigation.
>
> I still suggest deferring this series which deletes the
> "swap_active_head". Wait until YoungPark's swap tiers to be sent out
Thanks Chris.
"swap_active_head" is used on my current working patch.
> on the list. Let's wait for about 1 month or so, we should have a much
Yes, I am on the construct writing swap tier RFC.
it takes a few weeks to share on LKML.
> better understanding if the swap tiers series needs to use the
> "swap_active_head". We are not in a hurry to remove it, right? The
> earliest this can go into the official tree is 6.19 anyway. We still
> have some time.
> YoungPark, do you know if you want to use "swap_active_head" in your
On current version of my patch, I use "swap_active_head"
on swap_tier assignment logic when traversing current swapfile.
It is efficient to use "swap_active_head" on traversing the swapfile
in priority order.
I think about more on this part.
> pending patch series? Even better send your pending RFC patches to the
> list and Baoquan can collaborate on it.
As mentioned above, share it a few weeks later.
Thanks,
Youngjun Park
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-09 7:47 ` Baoquan He
2025-10-09 17:09 ` Chris Li
@ 2025-10-10 1:28 ` Andrew Morton
2025-10-10 2:14 ` Baoquan He
` (2 more replies)
1 sibling, 3 replies; 22+ messages in thread
From: Andrew Morton @ 2025-10-10 1:28 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, chrisl, kasong, baohua, nphamcs, shikemeng
On Thu, 9 Oct 2025 15:47:06 +0800 Baoquan He <bhe@redhat.com> wrote:
> On 10/08/25 at 08:26pm, Andrew Morton wrote:
> > On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> >
> > > There's no user of swap_active_head, remove it now.
> > >
> > > ...
> > >
> > > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> > >
> > > assert_spin_locked(&swap_lock);
> > >
> > > - plist_add(&si->list, &swap_active_head);
> > > -
> >
> > There's a large comment in _enable_swap_info() which needs to be
> > removed or updated?
>
> This patchset depends on below patchset:
> [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
Oh, OK, sorry, missed that.
(When I'm working on the post-merge-window patch pile (700+ emails) I
go in reverse-time order, so highlighting dependencies like this helps!)
> By the way, Chris worried the plist swap_active_head could be reused by
> the ongoing swap-tier work and suggested to hold off this patchset till
> swap-tier work is clear. Now seems it won't impact swap-tier, I will
> send a v2 against mm-new.
OK.
Is Baoquan's "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
default priority round robin" series still considered good to merge?
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-10 1:28 ` Andrew Morton
@ 2025-10-10 2:14 ` Baoquan He
2025-10-10 2:34 ` Chris Li
2025-10-10 2:33 ` Chris Li
2025-10-10 2:52 ` Chris Li
2 siblings, 1 reply; 22+ messages in thread
From: Baoquan He @ 2025-10-10 2:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, chrisl, kasong, baohua, nphamcs, shikemeng
On 10/09/25 at 06:28pm, Andrew Morton wrote:
> On Thu, 9 Oct 2025 15:47:06 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > On 10/08/25 at 08:26pm, Andrew Morton wrote:
> > > On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> > >
> > > > There's no user of swap_active_head, remove it now.
> > > >
> > > > ...
> > > >
> > > > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> > > >
> > > > assert_spin_locked(&swap_lock);
> > > >
> > > > - plist_add(&si->list, &swap_active_head);
> > > > -
> > >
> > > There's a large comment in _enable_swap_info() which needs to be
> > > removed or updated?
> >
> > This patchset depends on below patchset:
> > [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
>
> Oh, OK, sorry, missed that.
>
> (When I'm working on the post-merge-window patch pile (700+ emails) I
> go in reverse-time order, so highlighting dependencies like this helps!)
I see, will mention the dependency at the beginning of cover letter to
highlight this.
>
> > By the way, Chris worried the plist swap_active_head could be reused by
> > the ongoing swap-tier work and suggested to hold off this patchset till
> > swap-tier work is clear. Now seems it won't impact swap-tier, I will
> > send a v2 against mm-new.
>
> OK.
>
> Is Baoquan's "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
> default priority round robin" series still considered good to merge?
Up to now no objection received to the patchset and got ACK from Chris
for "[PATCH v3 0/2] mm/swapfile.c: select the swap device with default
priority round robin".
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-10 2:14 ` Baoquan He
@ 2025-10-10 2:34 ` Chris Li
0 siblings, 0 replies; 22+ messages in thread
From: Chris Li @ 2025-10-10 2:34 UTC (permalink / raw)
To: Baoquan He; +Cc: Andrew Morton, linux-mm, kasong, baohua, nphamcs, shikemeng
On Thu, Oct 9, 2025 at 7:15 PM Baoquan He <bhe@redhat.com> wrote:
>
> On 10/09/25 at 06:28pm, Andrew Morton wrote:
> > On Thu, 9 Oct 2025 15:47:06 +0800 Baoquan He <bhe@redhat.com> wrote:
> >
> > > On 10/08/25 at 08:26pm, Andrew Morton wrote:
> > > > On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> > > >
> > > > > There's no user of swap_active_head, remove it now.
> > > > >
> > > > > ...
> > > > >
> > > > > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> > > > >
> > > > > assert_spin_locked(&swap_lock);
> > > > >
> > > > > - plist_add(&si->list, &swap_active_head);
> > > > > -
> > > >
> > > > There's a large comment in _enable_swap_info() which needs to be
> > > > removed or updated?
> > >
> > > This patchset depends on below patchset:
> > > [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
> >
> > Oh, OK, sorry, missed that.
> >
> > (When I'm working on the post-merge-window patch pile (700+ emails) I
> > go in reverse-time order, so highlighting dependencies like this helps!)
>
> I see, will mention the dependency at the beginning of cover letter to
> highlight this.
Right. Please hold off this series for swap tiers for a month or so.
> > > By the way, Chris worried the plist swap_active_head could be reused by
> > > the ongoing swap-tier work and suggested to hold off this patchset till
> > > swap-tier work is clear. Now seems it won't impact swap-tier, I will
> > > send a v2 against mm-new.
> >
> > OK.
> >
> > Is Baoquan's "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
> > default priority round robin" series still considered good to merge?
>
> Up to now no objection received to the patchset and got ACK from Chris
> for "[PATCH v3 0/2] mm/swapfile.c: select the swap device with default
> priority round robin".
Ack that series is good to merge.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-10 1:28 ` Andrew Morton
2025-10-10 2:14 ` Baoquan He
@ 2025-10-10 2:33 ` Chris Li
2025-10-10 2:52 ` Chris Li
2 siblings, 0 replies; 22+ messages in thread
From: Chris Li @ 2025-10-10 2:33 UTC (permalink / raw)
To: Andrew Morton; +Cc: Baoquan He, linux-mm, kasong, baohua, nphamcs, shikemeng
On Thu, Oct 9, 2025 at 6:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 9 Oct 2025 15:47:06 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > On 10/08/25 at 08:26pm, Andrew Morton wrote:
> > > On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> > >
> > > > There's no user of swap_active_head, remove it now.
> > > >
> > > > ...
> > > >
> > > > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> > > >
> > > > assert_spin_locked(&swap_lock);
> > > >
> > > > - plist_add(&si->list, &swap_active_head);
> > > > -
> > >
> > > There's a large comment in _enable_swap_info() which needs to be
> > > removed or updated?
> >
> > This patchset depends on below patchset:
> > [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
>
> Oh, OK, sorry, missed that.
Yes, that round robin series is consider good to merge.
>
> (When I'm working on the post-merge-window patch pile (700+ emails) I
> go in reverse-time order, so highlighting dependencies like this helps!)
Understand, that is why I emphasize to Baoquan to highlight the
dependency patch series as well.
> > By the way, Chris worried the plist swap_active_head could be reused by
> > the ongoing swap-tier work and suggested to hold off this patchset till
> > swap-tier work is clear. Now seems it won't impact swap-tier, I will
> > send a v2 against mm-new.
I will wish this swap_active_head patch series to hold off for a
month or so to make sure swap tiers won't use it.
>
> OK.
>
> Is Baoquan's "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
> default priority round robin" series still considered good to merge?
That series is still considered good to merge. It has 18% performance
gain by switching the complex per node id swap file priority to simple
round robin. Let me reply to that patch email thread.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] mm/swap: remove unneeded swap_active_head
2025-10-10 1:28 ` Andrew Morton
2025-10-10 2:14 ` Baoquan He
2025-10-10 2:33 ` Chris Li
@ 2025-10-10 2:52 ` Chris Li
2 siblings, 0 replies; 22+ messages in thread
From: Chris Li @ 2025-10-10 2:52 UTC (permalink / raw)
To: Andrew Morton
Cc: Baoquan He, linux-mm, kasong, baohua, nphamcs, shikemeng, YoungJun Park
Hi Andrew,
Please drop/ignore this patch series that removes the
swap_active_head. I just received the patch from YoungJun. Just as I
worried, YoungJun is indeed using the swap_active_head in his swap
tiers patch series. Let's not remove it.
I copy/paste from YoungJun's patches here. It has confirmed usage of that list.
+
+ p = plist_first_entry(&swap_active_head, struct swap_info_struct, list);
+ for_each_active_tier(tier) {
+ plist_for_each_entry_continue(p, &swap_active_head, list) {
+ if (p->tier_off_mask != DEFAULT_OFF_MASK)
Chris
On Thu, Oct 9, 2025 at 6:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 9 Oct 2025 15:47:06 +0800 Baoquan He <bhe@redhat.com> wrote:
>
> > On 10/08/25 at 08:26pm, Andrew Morton wrote:
> > > On Wed, 1 Oct 2025 12:34:36 +0800 Baoquan He <bhe@redhat.com> wrote:
> > >
> > > > There's no user of swap_active_head, remove it now.
> > > >
> > > > ...
> > > >
> > > > @@ -2553,8 +2545,6 @@ static void _enable_swap_info(struct swap_info_struct *si)
> > > >
> > > > assert_spin_locked(&swap_lock);
> > > >
> > > > - plist_add(&si->list, &swap_active_head);
> > > > -
> > >
> > > There's a large comment in _enable_swap_info() which needs to be
> > > removed or updated?
> >
> > This patchset depends on below patchset:
> > [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
>
> Oh, OK, sorry, missed that.
>
> (When I'm working on the post-merge-window patch pile (700+ emails) I
> go in reverse-time order, so highlighting dependencies like this helps!)
>
> > By the way, Chris worried the plist swap_active_head could be reused by
> > the ongoing swap-tier work and suggested to hold off this patchset till
> > swap-tier work is clear. Now seems it won't impact swap-tier, I will
> > send a v2 against mm-new.
>
> OK.
>
> Is Baoquan's "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
> default priority round robin" series still considered good to merge?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/3] mm/swap: remove plist swap_active_head
2025-10-01 4:34 [PATCH 0/3] mm/swap: remove plist swap_active_head Baoquan He
` (2 preceding siblings ...)
2025-10-01 4:34 ` [PATCH 3/3] mm/swap: remove unneeded swap_active_head Baoquan He
@ 2025-10-02 6:04 ` Chris Li
2025-10-02 13:09 ` Baoquan He
3 siblings, 1 reply; 22+ messages in thread
From: Chris Li @ 2025-10-02 6:04 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On Tue, Sep 30, 2025 at 9:34 PM Baoquan He <bhe@redhat.com> wrote:
>
Can you please confirm or deny this series is depend on the "[PATCH v3
0/2] mm/swapfile.c: select the swap device with default priority round
robin" series.
If it is depend on that series, you need to declare that in the
beginning, so Andrew will not try to apply it without the depended
series.
> In mm/swapfile.c, there are two plist variables, swap_active_head and
> swap_avail_head. swap_avail_head contains all available (active, not full)
> swap_info_structs orderred by priority. While swap_active_head contains
Nit: spelling : orderred -> ordered
> all active swap_info_structs (active, nor full and full) orderred by
Same here, ordered.
> priority. Earlier, it serves three purposes:
>
> 1) When swapoff one swap device in the middle, swap devices of priority
> lower than the swapped off swap device will be promoted up value one,
> e.g, I swapped off zram1 of priority '-3', then zram2 promoted to
> have priority '-3', zram3 has priority '-4'. This is done via plist
> swap_active_head.
> - This code has been taken off in
> - [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
See above, if there is patch dependency, declare it early.
> # swapon
> NAME TYPE SIZE USED PRIO
> /dev/zram0 partition 16G 0B -2
> /dev/zram1 partition 16G 0B -3
> /dev/zram2 partition 16G 0B -4
> /dev/zram3 partition 16G 0B -5
I am a bit confused. If the "[PATCH v3 0/2] mm/swapfile.c: select the
swap device with default priority round robin" series already change
the default priority to -1. How does it get the -2 -3 -4 -5? You seems
describe the behavior before that revert series. Can you please
clarify.
> # swapoff /dev/zram1
> # swapon
> NAME TYPE SIZE USED PRIO
> /dev/zram0 partition 16G 0B -2
> /dev/zram2 partition 16G 0B -3
> /dev/zram3 partition 16G 0B -4
Is that the behavior before your reverting node id for swapfile series?
Assume it has patch dependency on the earlier series. Wouldn't better
you describe the behavior after the round robin series and how you can
do better?
> 2) Find a swap device in swap_active_head when swap off.
> - This can be done through iterating swap_info[] instead. Change is
> done in patch 2.
Can you please clarify what is the benefit of remove plist swap_active_head?
e.g. Does it simply the code or make swap code perform better.
> 3) Judge if there's any active swap device via __has_usable_swap().
> - This can be done by checking total_swap_pages instead. Change is
> done in patch 1.
>
> Among them, the purpose 1) is the most important, while it has been
> taken off in below patchset. So this patchset removing swap_active_head
> sits on top of it.
> - [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
> - https://lore.kernel.org/all/20250930063311.14126-1-bhe@redhat.com/T/#u
It seems you mean to say this patch series is depend on that PATCH v3 series.
Chris
>
> Baoquan He (3):
> mm/swapfile.c: remove __has_usable_swap()
> mm/swapfile.c: use swap_info[] to find the swap device
> mm/swap: remove unneeded swap_active_head
>
> include/linux/swap.h | 1 -
> mm/swapfile.c | 31 ++++++++-----------------------
> 2 files changed, 8 insertions(+), 24 deletions(-)
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [PATCH 0/3] mm/swap: remove plist swap_active_head
2025-10-02 6:04 ` [PATCH 0/3] mm/swap: remove plist swap_active_head Chris Li
@ 2025-10-02 13:09 ` Baoquan He
2025-10-02 16:23 ` Chris Li
0 siblings, 1 reply; 22+ messages in thread
From: Baoquan He @ 2025-10-02 13:09 UTC (permalink / raw)
To: Chris Li; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On 10/01/25 at 11:04pm, Chris Li wrote:
> On Tue, Sep 30, 2025 at 9:34 PM Baoquan He <bhe@redhat.com> wrote:
> >
>
> Can you please confirm or deny this series is depend on the "[PATCH v3
> 0/2] mm/swapfile.c: select the swap device with default priority round
> robin" series.
Thanks a lot for reviewing this carefully. This series depends on
patchset [PATCH v3 0/2] mm/swapfile.c: select the swap device with default
priority round robin" series.
>
> If it is depend on that series, you need to declare that in the
> beginning, so Andrew will not try to apply it without the depended
> series.
Sorry, I should have told that clearly at the beginning.
>
> > In mm/swapfile.c, there are two plist variables, swap_active_head and
> > swap_avail_head. swap_avail_head contains all available (active, not full)
> > swap_info_structs orderred by priority. While swap_active_head contains
>
> Nit: spelling : orderred -> ordered
Thanks.
>
> > all active swap_info_structs (active, nor full and full) orderred by
>
> Same here, ordered.
>
> > priority. Earlier, it serves three purposes:
> >
> > 1) When swapoff one swap device in the middle, swap devices of priority
> > lower than the swapped off swap device will be promoted up value one,
> > e.g, I swapped off zram1 of priority '-3', then zram2 promoted to
> > have priority '-3', zram3 has priority '-4'. This is done via plist
> > swap_active_head.
> > - This code has been taken off in
> > - [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
>
> See above, if there is patch dependency, declare it early.
Agree, will do in future posting.
>
> > # swapon
> > NAME TYPE SIZE USED PRIO
> > /dev/zram0 partition 16G 0B -2
> > /dev/zram1 partition 16G 0B -3
> > /dev/zram2 partition 16G 0B -4
> > /dev/zram3 partition 16G 0B -5
>
> I am a bit confused. If the "[PATCH v3 0/2] mm/swapfile.c: select the
> swap device with default priority round robin" series already change
> the default priority to -1. How does it get the -2 -3 -4 -5? You seems
> describe the behavior before that revert series. Can you please
> clarify.
In the given example, It demonstrates every device will get +1 priority
promotion after the swapped off device.
I just intended to tell the purposes of swap_active_head in the current
kernel, since my patches haven't been merged. I also had a feeling this
description is messy before posting, diddn't expect it could cause these
confusion.
I should have mentioned this series depends on "[PATCH v3 0/2] mm/swapfile.c:
select the swap device with default priority round robin" series, then I
don't need to mention this purpose of swap_active_head, that could make
the describing easier.
>
> > # swapoff /dev/zram1
> > # swapon
> > NAME TYPE SIZE USED PRIO
> > /dev/zram0 partition 16G 0B -2
> > /dev/zram2 partition 16G 0B -3
> > /dev/zram3 partition 16G 0B -4
>
> Is that the behavior before your reverting node id for swapfile series?
> Assume it has patch dependency on the earlier series. Wouldn't better
> you describe the behavior after the round robin series and how you can
> do better?
Right, I agree.
>
> > 2) Find a swap device in swap_active_head when swap off.
> > - This can be done through iterating swap_info[] instead. Change is
> > done in patch 2.
>
> Can you please clarify what is the benefit of remove plist swap_active_head?
> e.g. Does it simply the code or make swap code perform better.
After series "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
default priority round robin", plist swap_active_head is only used in
swapoff api to scan it to find a swap device according to the device
name. It can be replaced with scanning swap_info[], and there's no
efficiency degradation. And we can remove field "struct plist_node
list;" in struct swap_info_struct, and the relevant handling code which
is unnecessary.
>
> > 3) Judge if there's any active swap device via __has_usable_swap().
> > - This can be done by checking total_swap_pages instead. Change is
> > done in patch 1.
> >
> > Among them, the purpose 1) is the most important, while it has been
> > taken off in below patchset. So this patchset removing swap_active_head
> > sits on top of it.
> > - [PATCH v3 0/2] mm/swapfile.c: select the swap device with default priority round robin
> > - https://lore.kernel.org/all/20250930063311.14126-1-bhe@redhat.com/T/#u
>
> It seems you mean to say this patch series is depend on that PATCH v3 series.
Yeah, I should have mentioned this at the beginning to make the log
clearer.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 0/3] mm/swap: remove plist swap_active_head
2025-10-02 13:09 ` Baoquan He
@ 2025-10-02 16:23 ` Chris Li
0 siblings, 0 replies; 22+ messages in thread
From: Chris Li @ 2025-10-02 16:23 UTC (permalink / raw)
To: Baoquan He; +Cc: linux-mm, akpm, kasong, baohua, nphamcs, shikemeng
On Thu, Oct 2, 2025 at 6:09 AM Baoquan He <bhe@redhat.com> wrote:
> > Can you please clarify what is the benefit of remove plist swap_active_head?
> > e.g. Does it simply the code or make swap code perform better.
>
> After series "[PATCH v3 0/2] mm/swapfile.c: select the swap device with
> default priority round robin", plist swap_active_head is only used in
> swapoff api to scan it to find a swap device according to the device
> name. It can be replaced with scanning swap_info[], and there's no
> efficiency degradation. And we can remove field "struct plist_node
> list;" in struct swap_info_struct, and the relevant handling code which
> is unnecessary.
So the benefit is removing a list. Thank you for confirming that.
BTW, I think I found a race bug in your patch 2/3, I replied in the
patch email thread.
Let me know if that race is valid or not.
Chris
^ permalink raw reply [flat|nested] 22+ messages in thread