* [PATCH v2 0/2] optimize the logic for handling dirty file folios during reclaim
@ 2025-10-17  7:53 Baolin Wang
  2025-10-17  7:53 ` [PATCH v2 1/2] mm: vmscan: filter out the dirty file folios for node_reclaim() Baolin Wang
  2025-10-17  7:53 ` [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios Baolin Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Baolin Wang @ 2025-10-17  7:53 UTC
  To: akpm, hannes
  Cc: david, mhocko, zhengqi.arch, shakeel.butt, lorenzo.stoakes,
	hughd, willy, baolin.wang, linux-mm, linux-kernel

Since we no longer attempt to write back filesystem folios during reclaim,
some logic for handling dirty file folios in the reclaim process also needs
to be updated. Please check the details in each patch.

Changes from v1:
- Fix the folio_test_reclaim() check.
- Rebase on the mm-new branch.

Baolin Wang (2):
  mm: vmscan: filter out the dirty file folios for node_reclaim()
  mm: vmscan: simplify the logic for activating dirty file folios

 include/linux/mmzone.h |  4 ----
 mm/vmscan.c            | 33 ++++++++-------------------------
 2 files changed, 8 insertions(+), 29 deletions(-)

-- 
2.43.7




* [PATCH v2 1/2] mm: vmscan: filter out the dirty file folios for node_reclaim()
  2025-10-17  7:53 [PATCH v2 0/2] optimize the logic for handling dirty file folios during reclaim Baolin Wang
@ 2025-10-17  7:53 ` Baolin Wang
  2025-10-17  7:53 ` [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios Baolin Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Baolin Wang @ 2025-10-17  7:53 UTC
  To: akpm, hannes
  Cc: david, mhocko, zhengqi.arch, shakeel.butt, lorenzo.stoakes,
	hughd, willy, baolin.wang, linux-mm, linux-kernel

After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
attempt to write back filesystem folios in pageout(), and only tmpfs/shmem
folios and anonymous swapcache folios can be written back. Therefore,
we should also filter out the dirty filesystem folios for node_reclaim()
to avoid unnecessary LRU scans.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/vmscan.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c80fcae7f2a1..65f299e4b8f0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -7601,9 +7601,11 @@ static unsigned long node_pagecache_reclaimable(struct pglist_data *pgdat)
 	else
 		nr_pagecache_reclaimable = node_unmapped_file_pages(pgdat);
 
-	/* If we can't clean pages, remove dirty pages from consideration */
-	if (!(node_reclaim_mode & RECLAIM_WRITE))
-		delta += node_page_state(pgdat, NR_FILE_DIRTY);
+	/*
+	 * Since we can't clean folios through reclaim, remove dirty file
+	 * folios from consideration.
+	 */
+	delta += node_page_state(pgdat, NR_FILE_DIRTY);
 
 	/* Watch for any possible underflows due to delta */
 	if (unlikely(delta > nr_pagecache_reclaimable))
-- 
2.43.7
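
For readers following along, the accounting change in this patch can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions: reclaimable_pagecache() and its plain integer parameters are
hypothetical stand-ins for node_pagecache_reclaimable() and the per-node
counters it reads, and the clamp mirrors the underflow check visible in the
diff context. It is not kernel code.

```c
#include <stdio.h>

/*
 * Userspace model of the arithmetic in node_pagecache_reclaimable() after
 * this patch. The function name and parameters are hypothetical; the kernel
 * reads the corresponding values from per-node statistics.
 */
static unsigned long reclaimable_pagecache(unsigned long nr_pagecache,
					   unsigned long nr_file_dirty)
{
	unsigned long delta = 0;

	/*
	 * Reclaim can no longer clean filesystem folios, so dirty file
	 * folios are excluded regardless of the node_reclaim_mode setting.
	 */
	delta += nr_file_dirty;

	/* Clamp delta so the subtraction below cannot underflow. */
	if (delta > nr_pagecache)
		delta = nr_pagecache;

	return nr_pagecache - delta;
}
```

With these assumptions, 1000 candidate pages of which 300 are dirty leave
700 pages worth scanning, and a dirty count larger than the candidate count
yields 0 rather than wrapping around.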




* [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios
  2025-10-17  7:53 [PATCH v2 0/2] optimize the logic for handling dirty file folios during reclaim Baolin Wang
  2025-10-17  7:53 ` [PATCH v2 1/2] mm: vmscan: filter out the dirty file folios for node_reclaim() Baolin Wang
@ 2025-10-17  7:53 ` Baolin Wang
  2025-10-17 12:02   ` Michal Hocko
  1 sibling, 1 reply; 5+ messages in thread
From: Baolin Wang @ 2025-10-17  7:53 UTC
  To: akpm, hannes
  Cc: david, mhocko, zhengqi.arch, shakeel.butt, lorenzo.stoakes,
	hughd, willy, baolin.wang, linux-mm, linux-kernel

After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
attempt to write back filesystem folios through reclaim.

However, in the shrink_folio_list() function, there still remains some
logic related to writeback control of dirty file folios. The original
logic was that, for direct reclaim, or when folio_test_reclaim() is false,
or the PGDAT_DIRTY flag is not set, the dirty file folios would be directly
activated to avoid being scanned again; otherwise, it would try to write back
the dirty file folios. However, since we can no longer perform writeback on
filesystem folios, the dirty file folios will still be activated.

Additionally, under the original logic, if we continue to try to write back
dirty file folios, we will also check the references flag, sc->may_writepage,
and may_enter_fs(), which may result in dirty file folios being left in the
inactive list. This is unreasonable. Even if these dirty folios are scanned
again, we still cannot clean them.

Therefore, the checks on these dirty file folios appear to be redundant and can
be removed. Dirty file folios should be directly moved to the active list to
avoid being scanned again. Since we set the PG_reclaim flag for the dirty folios,
once the writeback is completed, they will be moved back to the tail of the
inactive list to be retried for quick reclaim.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 include/linux/mmzone.h |  4 ----
 mm/vmscan.c            | 25 +++----------------------
 2 files changed, 3 insertions(+), 26 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7fb7331c5725..4398e027f450 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1060,10 +1060,6 @@ struct zone {
 } ____cacheline_internodealigned_in_smp;
 
 enum pgdat_flags {
-	PGDAT_DIRTY,			/* reclaim scanning has recently found
-					 * many dirty file pages at the tail
-					 * of the LRU.
-					 */
 	PGDAT_WRITEBACK,		/* reclaim scanning has recently found
 					 * many pages under writeback
 					 */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 65f299e4b8f0..c922bad2b8fd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1387,21 +1387,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 
 		mapping = folio_mapping(folio);
 		if (folio_test_dirty(folio)) {
-			/*
-			 * Only kswapd can writeback filesystem folios
-			 * to avoid risk of stack overflow. But avoid
-			 * injecting inefficient single-folio I/O into
-			 * flusher writeback as much as possible: only
-			 * write folios when we've encountered many
-			 * dirty folios, and when we've already scanned
-			 * the rest of the LRU for clean folios and see
-			 * the same dirty folios again (with the reclaim
-			 * flag set).
-			 */
-			if (folio_is_file_lru(folio) &&
-			    (!current_is_kswapd() ||
-			     !folio_test_reclaim(folio) ||
-			     !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
+			if (folio_is_file_lru(folio)) {
 				/*
 				 * Immediately reclaim when written back.
 				 * Similar in principle to folio_deactivate()
@@ -1410,7 +1396,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 				 */
 				node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
 						nr_pages);
-				folio_set_reclaim(folio);
+				if (!folio_test_reclaim(folio))
+					folio_set_reclaim(folio);
 
 				goto activate_locked;
 			}
@@ -6105,11 +6092,6 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 		if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
 			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
-		/* Allow kswapd to start writing pages during reclaim.*/
-		if (sc->nr.unqueued_dirty &&
-			sc->nr.unqueued_dirty == sc->nr.file_taken)
-			set_bit(PGDAT_DIRTY, &pgdat->flags);
-
 		/*
 		 * If kswapd scans pages marked for immediate
 		 * reclaim and under writeback (nr_immediate), it
@@ -6850,7 +6832,6 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
 
 	clear_bit(LRUVEC_NODE_CONGESTED, &lruvec->flags);
 	clear_bit(LRUVEC_CGROUP_CONGESTED, &lruvec->flags);
-	clear_bit(PGDAT_DIRTY, &pgdat->flags);
 	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
 }
 
-- 
2.43.7
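
The before/after behaviour described in the commit message can be sketched
as a small userspace decision model. Everything here is an illustrative
assumption: struct dirty_folio_ctx, enum outcome, old_logic() and
new_logic() are hypothetical names, each boolean stands in for the kernel
check named in its comment, and the model covers only dirty folios on the
file LRU.

```c
#include <stdbool.h>

/* Hypothetical, simplified state consulted for one dirty file-LRU folio. */
struct dirty_folio_ctx {
	bool is_kswapd;		/* current_is_kswapd() */
	bool folio_reclaim;	/* folio_test_reclaim(folio) */
	bool pgdat_dirty;	/* PGDAT_DIRTY set on the node (old code only) */
	bool references_keep;	/* the references check asks to keep the folio */
	bool may_enter_fs;	/* may_enter_fs() allows filesystem entry */
	bool may_writepage;	/* sc->may_writepage */
};

enum outcome { ACTIVATE, KEEP_INACTIVE };

/* Old logic, as described in the commit message. */
static enum outcome old_logic(const struct dirty_folio_ctx *c)
{
	/* Not kswapd, first encounter, or PGDAT_DIRTY unset: activate. */
	if (!c->is_kswapd || !c->folio_reclaim || !c->pgdat_dirty)
		return ACTIVATE;

	/* Writeback path: extra checks could leave the folio inactive. */
	if (c->references_keep || !c->may_enter_fs || !c->may_writepage)
		return KEEP_INACTIVE;

	/* Filesystem folios can no longer be written here; they are activated. */
	return ACTIVATE;
}

/* New logic: every dirty file folio is activated. */
static enum outcome new_logic(const struct dirty_folio_ctx *c)
{
	(void)c;
	return ACTIVATE;
}
```

The only inputs on which the two functions differ are those where kswapd
sees a folio with the reclaim flag already set on a PGDAT_DIRTY node and one
of the three writeback checks fails. Under the old logic such a folio stayed
on the inactive list even though a later scan could not clean it.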




* Re: [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios
  2025-10-17  7:53 ` [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios Baolin Wang
@ 2025-10-17 12:02   ` Michal Hocko
  2025-10-20  7:34     ` Baolin Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2025-10-17 12:02 UTC
  To: Baolin Wang
  Cc: akpm, hannes, david, zhengqi.arch, shakeel.butt, lorenzo.stoakes,
	hughd, willy, linux-mm, linux-kernel

On Fri 17-10-25 15:53:07, Baolin Wang wrote:
> After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
> attempt to write back filesystem folios through reclaim.
> 
> However, in the shrink_folio_list() function, there still remains some
> logic related to writeback control of dirty file folios. The original
> logic was that, for direct reclaim, or when folio_test_reclaim() is false,
> or the PGDAT_DIRTY flag is not set, the dirty file folios would be directly
> activated to avoid being scanned again; otherwise, it would try to write back
> the dirty file folios. However, since we can no longer perform writeback on
> filesystem folios, the dirty file folios will still be activated.
> 
> Additionally, under the original logic, if we continue to try to write back
> dirty file folios, we will also check the references flag, sc->may_writepage,
> and may_enter_fs(), which may result in dirty file folios being left in the
> inactive list. This is unreasonable. Even if these dirty folios are scanned
> again, we still cannot clean them.
> 
> Therefore, the checks on these dirty file folios appear to be redundant and can
> be removed. Dirty file folios should be directly moved to the active list to
> avoid being scanned again. Since we set the PG_reclaim flag for the dirty folios,
> once the writeback is completed, they will be moved back to the tail of the
> inactive list to be retried for quick reclaim.

Is there an actual problem you are trying to address, or is this a code
cleanup? How have you evaluated this change?

> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  include/linux/mmzone.h |  4 ----
>  mm/vmscan.c            | 25 +++----------------------
>  2 files changed, 3 insertions(+), 26 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7fb7331c5725..4398e027f450 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1060,10 +1060,6 @@ struct zone {
>  } ____cacheline_internodealigned_in_smp;
>  
>  enum pgdat_flags {
> -	PGDAT_DIRTY,			/* reclaim scanning has recently found
> -					 * many dirty file pages at the tail
> -					 * of the LRU.
> -					 */
>  	PGDAT_WRITEBACK,		/* reclaim scanning has recently found
>  					 * many pages under writeback
>  					 */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 65f299e4b8f0..c922bad2b8fd 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1387,21 +1387,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  
>  		mapping = folio_mapping(folio);
>  		if (folio_test_dirty(folio)) {
> -			/*
> -			 * Only kswapd can writeback filesystem folios
> -			 * to avoid risk of stack overflow. But avoid
> -			 * injecting inefficient single-folio I/O into
> -			 * flusher writeback as much as possible: only
> -			 * write folios when we've encountered many
> -			 * dirty folios, and when we've already scanned
> -			 * the rest of the LRU for clean folios and see
> -			 * the same dirty folios again (with the reclaim
> -			 * flag set).
> -			 */
> -			if (folio_is_file_lru(folio) &&
> -			    (!current_is_kswapd() ||
> -			     !folio_test_reclaim(folio) ||
> -			     !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
> +			if (folio_is_file_lru(folio)) {
>  				/*
>  				 * Immediately reclaim when written back.
>  				 * Similar in principle to folio_deactivate()
> @@ -1410,7 +1396,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  				 */
>  				node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
>  						nr_pages);
> -				folio_set_reclaim(folio);
> +				if (!folio_test_reclaim(folio))
> +					folio_set_reclaim(folio);
>  
>  				goto activate_locked;
>  			}
> @@ -6105,11 +6092,6 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  		if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
>  			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
>  
> -		/* Allow kswapd to start writing pages during reclaim.*/
> -		if (sc->nr.unqueued_dirty &&
> -			sc->nr.unqueued_dirty == sc->nr.file_taken)
> -			set_bit(PGDAT_DIRTY, &pgdat->flags);
> -
>  		/*
>  		 * If kswapd scans pages marked for immediate
>  		 * reclaim and under writeback (nr_immediate), it
> @@ -6850,7 +6832,6 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
>  
>  	clear_bit(LRUVEC_NODE_CONGESTED, &lruvec->flags);
>  	clear_bit(LRUVEC_CGROUP_CONGESTED, &lruvec->flags);
> -	clear_bit(PGDAT_DIRTY, &pgdat->flags);
>  	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
>  }
>  
> -- 
> 2.43.7
> 

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH v2 2/2] mm: vmscan: simplify the logic for activating dirty file folios
  2025-10-17 12:02   ` Michal Hocko
@ 2025-10-20  7:34     ` Baolin Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Baolin Wang @ 2025-10-20  7:34 UTC
  To: Michal Hocko
  Cc: akpm, hannes, david, zhengqi.arch, shakeel.butt, lorenzo.stoakes,
	hughd, willy, linux-mm, linux-kernel



On 2025/10/17 20:02, Michal Hocko wrote:
> On Fri 17-10-25 15:53:07, Baolin Wang wrote:
>> After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
>> attempt to write back filesystem folios through reclaim.
>>
>> However, in the shrink_folio_list() function, there still remains some
>> logic related to writeback control of dirty file folios. The original
>> logic was that, for direct reclaim, or when folio_test_reclaim() is false,
>> or the PGDAT_DIRTY flag is not set, the dirty file folios would be directly
>> activated to avoid being scanned again; otherwise, it would try to write back
>> the dirty file folios. However, since we can no longer perform writeback on
>> filesystem folios, the dirty file folios will still be activated.
>>
>> Additionally, under the original logic, if we continue to try to write back
>> dirty file folios, we will also check the references flag, sc->may_writepage,
>> and may_enter_fs(), which may result in dirty file folios being left in the
>> inactive list. This is unreasonable. Even if these dirty folios are scanned
>> again, we still cannot clean them.
>>
>> Therefore, the checks on these dirty file folios appear to be redundant and can
>> be removed. Dirty file folios should be directly moved to the active list to
>> avoid being scanned again. Since we set the PG_reclaim flag for the dirty folios,
>> once the writeback is completed, they will be moved back to the tail of the
>> inactive list to be retried for quick reclaim.
> 
> Is there an actual problem you are trying to address, or is this a code
> cleanup? How have you evaluated this change?

This patch is more of a cleanup: dirty file folios are also activated in
pageout(), so there are essentially no significant logical changes.
Moreover, this patch set is a continuation of the previous cleanup
work[1] for dirty file folios, and further cleanup and optimization work
for file folio reclaim is still ongoing.

I ran some evaluations (building the kernel in a memcg to reclaim file
folios, and running thpcompact to reclaim file folios) and did not
observe any obvious changes in reclaim efficiency.

[1] https://lore.kernel.org/all/cover.1758166683.git.baolin.wang@linux.alibaba.com/T/#u


