* Re: [PATCH 1/8] drm/i915/gem: Convert __shmem_writeback() to folios
From: David Hildenbrand @ 2025-01-13 10:05 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe
Cc: Jason A. Donenfeld, Andi Shyti, Chengming Zhou, Christian Brauner,
	Christophe Leroy, Dan Carpenter, David Airlie, Hao Ge, Jani Nikula,
	Johannes Weiner, Joonas Lahtinen, Josef Bacik, Masami Hiramatsu,
	Mathieu Desnoyers, Miklos Szeredi, Nhat Pham, Oscar Salvador,
	Ran Xiaokai, Rodrigo Vivi, Simona Vetter, Steven Rostedt,
	Tvrtko Ursulin, Vlastimil Babka, Yosry Ahmed, Yu Zhao, intel-gfx,
	dri-devel, linux-kernel, linux-fsdevel, linux-mm, linux-trace-kernel

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> Use folios instead of pages.
>
> This is preparation for removing PG_reclaim.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 2/8] drm/i915/gem: Use PG_dropbehind instead of PG_reclaim
From: David Hildenbrand @ 2025-01-13 10:06 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> __shmem_writeback()
>
> It is safe to leave PG_dropbehind on the folio if, for some reason
> (bug?), the folio is not in a writeback state after ->writepage().
> In these cases, the kernel had to clear PG_reclaim as it shared a page
> flag bit with PG_readahead.

I think this is correct.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 5/8] mm/vmscan: Use PG_dropbehind instead of PG_reclaim
From: David Hildenbrand @ 2025-01-13 10:07 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> pageout().
>
> It is safe to leave PG_dropbehind on the folio if, for some reason
> (bug?), the folio is not in a writeback state after ->writepage().
> In these cases, the kernel had to clear PG_reclaim as it shared a page
> flag bit with PG_readahead.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/vmscan.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a099876fa029..d15f80333d6b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -692,19 +692,16 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
>  		if (shmem_mapping(mapping) && folio_test_large(folio))
>  			wbc.list = folio_list;
>
> -		folio_set_reclaim(folio);
> +		folio_set_dropbehind(folio);
> +
>  		res = mapping->a_ops->writepage(&folio->page, &wbc);
>  		if (res < 0)
>  			handle_write_error(mapping, folio, res);
>  		if (res == AOP_WRITEPAGE_ACTIVATE) {
> -			folio_clear_reclaim(folio);
> +			folio_clear_dropbehind(folio);
>  			return PAGE_ACTIVATE;
>  		}
>
> -		if (!folio_test_writeback(folio)) {
> -			/* synchronous write or broken a_ops? */
> -			folio_clear_reclaim(folio);
> -		}
>  		trace_mm_vmscan_write_folio(folio);
>  		node_stat_add_folio(folio, NR_VMSCAN_WRITE);
>  		return PAGE_SUCCESS;

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb
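
[Editorial sketch, not part of the thread.] The commit message above says
the old cleanup existed because PG_reclaim shared a flag bit with
PG_readahead. The following is a toy userspace model of that aliasing, not
kernel code: the bit numbers, the alias_* names and the struct are all
invented for illustration, and the real flag helpers are considerably more
involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code; bit numbers are invented. */
enum alias_bit {
	ALIAS_WRITEBACK  = 0,
	ALIAS_READAHEAD  = 1,
	ALIAS_RECLAIM    = ALIAS_READAHEAD,	/* old scheme: one bit, two meanings */
	ALIAS_DROPBEHIND = 2,			/* new scheme: a bit of its own */
	ALIAS_NR_BITS
};

struct alias_folio {
	unsigned long flags;
};

static void alias_set(struct alias_folio *f, int bit)
{
	assert(bit >= 0 && bit < ALIAS_NR_BITS);
	f->flags |= 1UL << bit;
}

static void alias_clear(struct alias_folio *f, int bit)
{
	assert(bit >= 0 && bit < ALIAS_NR_BITS);
	f->flags &= ~(1UL << bit);
}

static bool alias_test(const struct alias_folio *f, int bit)
{
	return (f->flags & (1UL << bit)) != 0;
}

/*
 * Old scheme: mark the folio for reclaim, then pretend ->writepage()
 * completed synchronously so the writeback bit was never set.
 * Returns true if the folio now looks like a readahead folio.
 */
static bool alias_old_looks_like_readahead(bool do_cleanup)
{
	struct alias_folio f = { 0 };

	alias_set(&f, ALIAS_RECLAIM);
	if (do_cleanup && !alias_test(&f, ALIAS_WRITEBACK))
		alias_clear(&f, ALIAS_RECLAIM);
	return alias_test(&f, ALIAS_READAHEAD);
}

/* New scheme: the marker lives in its own bit, so no cleanup is needed. */
static bool alias_new_looks_like_readahead(void)
{
	struct alias_folio f = { 0 };

	alias_set(&f, ALIAS_DROPBEHIND);
	return alias_test(&f, ALIAS_READAHEAD);
}
```

In this model, skipping the cleanup in the old scheme leaves a stale
"readahead" reading, while the separate bit never does. That is the
property the patch relies on when it deletes the
!folio_test_writeback() branch.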

* Re: [PATCH 6/8] mm/vmscan: Use PG_dropbehind instead of PG_reclaim in shrink_folio_list()
From: David Hildenbrand @ 2025-01-13 10:08 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> shrink_folio_list().
>
> It is safe to leave PG_dropbehind on the folio if, for some reason
> (bug?), the folio is not in a writeback state after ->writepage().
> In these cases, the kernel had to clear PG_reclaim as it shared a page
> flag bit with PG_readahead.
>
> Also use PG_dropbehind instead of PG_reclaim to detect I/O congestion.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 7/8] mm/mglru: Check PG_dropcache instead of PG_reclaim in lru_gen_folio_seq()
From: David Hildenbrand @ 2025-01-13 10:09 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> Kernel sets PG_dropcache instead of PG_reclaim everywhere. Check
> PG_dropcache in lru_gen_folio_seq().

Subject and description: PG_dropcache -> PG_dropbehind.

Apart from that, LGTM.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 8/8] mm: Remove PG_reclaim
From: David Hildenbrand @ 2025-01-13 10:11 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> Nobody sets the flag anymore.
>
> Remove PG_reclaim, making PG_readahead the exclusive user of the page
> flag bit.
>

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 8/8] mm: Remove PG_reclaim
From: Matthew Wilcox @ 2025-01-13 15:28 UTC
To: Kirill A. Shutemov

On Mon, Jan 13, 2025 at 11:34:53AM +0200, Kirill A. Shutemov wrote:
> diff --git a/mm/migrate.c b/mm/migrate.c
> index caadbe393aa2..beba72da5e33 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -686,6 +686,8 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
>  		folio_set_young(newfolio);
>  	if (folio_test_idle(folio))
>  		folio_set_idle(newfolio);
> +	if (folio_test_readahead(folio))
> +		folio_set_readahead(newfolio);
>
>  	folio_migrate_refs(newfolio, folio);
>  	/*

Not a problem with this patch ... but aren't we missing a
test_dropbehind / set_dropbehind pair in this function?  Or are we
prohibited from migrating a folio with the dropbehind flag set
somewhere?

> +++ b/mm/swap.c
> @@ -221,22 +221,6 @@ static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
>  	__count_vm_events(PGROTATED, folio_nr_pages(folio));
>  }
>
> -/*
> - * Writeback is about to end against a folio which has been marked for
> - * immediate reclaim.  If it still appears to be reclaimable, move it
> - * to the tail of the inactive list.
> - *
> - * folio_rotate_reclaimable() must disable IRQs, to prevent nasty races.
> - */
> -void folio_rotate_reclaimable(struct folio *folio)
> -{
> -	if (folio_test_locked(folio) || folio_test_dirty(folio) ||
> -	    folio_test_unevictable(folio))
> -		return;
> -
> -	folio_batch_add_and_move(folio, lru_move_tail, true);
> -}

I think this is the last caller of lru_move_tail(), which means we can
get rid of fbatches->lru_move_tail and the local_lock that protects it.
Or did I miss something?

* Re: [PATCH 8/8] mm: Remove PG_reclaim
From: Kirill A. Shutemov @ 2025-01-14 8:30 UTC
To: Matthew Wilcox

On Mon, Jan 13, 2025 at 03:28:43PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 13, 2025 at 11:34:53AM +0200, Kirill A. Shutemov wrote:
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index caadbe393aa2..beba72da5e33 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -686,6 +686,8 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
> >  		folio_set_young(newfolio);
> >  	if (folio_test_idle(folio))
> >  		folio_set_idle(newfolio);
> > +	if (folio_test_readahead(folio))
> > +		folio_set_readahead(newfolio);
> >
> >  	folio_migrate_refs(newfolio, folio);
> >  	/*
>
> Not a problem with this patch ... but aren't we missing a
> test_dropbehind / set_dropbehind pair in this function?  Or are we
> prohibited from migrating a folio with the dropbehind flag set
> somewhere?

Hm. Good catch.

We might want to drop clean dropbehind pages instead of migrating them.

But I am not sure about dirty ones. With slow backing storage it might be
better for the system to migrate them instead of keeping them in the old
place for a potentially long time.

Any opinions?

> > +++ b/mm/swap.c
> > @@ -221,22 +221,6 @@ static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
> >  	__count_vm_events(PGROTATED, folio_nr_pages(folio));
> >  }
> >
> > -/*
> > - * Writeback is about to end against a folio which has been marked for
> > - * immediate reclaim.  If it still appears to be reclaimable, move it
> > - * to the tail of the inactive list.
> > - *
> > - * folio_rotate_reclaimable() must disable IRQs, to prevent nasty races.
> > - */
> > -void folio_rotate_reclaimable(struct folio *folio)
> > -{
> > -	if (folio_test_locked(folio) || folio_test_dirty(folio) ||
> > -	    folio_test_unevictable(folio))
> > -		return;
> > -
> > -	folio_batch_add_and_move(folio, lru_move_tail, true);
> > -}
>
> I think this is the last caller of lru_move_tail(), which means we can
> get rid of fbatches->lru_move_tail and the local_lock that protects it.
> Or did I miss something?

I see lru_move_tail() being used by lru_add_drain_cpu().

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

* Re: [PATCH 8/8] mm: Remove PG_reclaim
From: Yu Zhao @ 2025-01-14 17:01 UTC
To: Kirill A. Shutemov

On Tue, Jan 14, 2025 at 1:30 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Mon, Jan 13, 2025 at 03:28:43PM +0000, Matthew Wilcox wrote:
> > On Mon, Jan 13, 2025 at 11:34:53AM +0200, Kirill A. Shutemov wrote:
> > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > index caadbe393aa2..beba72da5e33 100644
> > > --- a/mm/migrate.c
> > > +++ b/mm/migrate.c
> > > @@ -686,6 +686,8 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
> > >  		folio_set_young(newfolio);
> > >  	if (folio_test_idle(folio))
> > >  		folio_set_idle(newfolio);
> > > +	if (folio_test_readahead(folio))
> > > +		folio_set_readahead(newfolio);
> > >
> > >  	folio_migrate_refs(newfolio, folio);
> > >  	/*
> >
> > Not a problem with this patch ... but aren't we missing a
> > test_dropbehind / set_dropbehind pair in this function?  Or are we
> > prohibited from migrating a folio with the dropbehind flag set
> > somewhere?
>
> Hm. Good catch.
>
> We might want to drop clean dropbehind pages instead of migrating them.
>
> But I am not sure about dirty ones. With slow backing storage it might be
> better for the system to migrate them instead of keeping them in the old
> place for a potentially long time.
>
> Any opinions?
>
> > > +++ b/mm/swap.c
> > > @@ -221,22 +221,6 @@ static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)
> > >  	__count_vm_events(PGROTATED, folio_nr_pages(folio));
> > >  }
> > >
> > > -/*
> > > - * Writeback is about to end against a folio which has been marked for
> > > - * immediate reclaim.  If it still appears to be reclaimable, move it
> > > - * to the tail of the inactive list.
> > > - *
> > > - * folio_rotate_reclaimable() must disable IRQs, to prevent nasty races.
> > > - */
> > > -void folio_rotate_reclaimable(struct folio *folio)
> > > -{
> > > -	if (folio_test_locked(folio) || folio_test_dirty(folio) ||
> > > -	    folio_test_unevictable(folio))
> > > -		return;
> > > -
> > > -	folio_batch_add_and_move(folio, lru_move_tail, true);
> > > -}
> >
> > I think this is the last caller of lru_move_tail(), which means we can
> > get rid of fbatches->lru_move_tail and the local_lock that protects it.
> > Or did I miss something?
>
> I see lru_move_tail() being used by lru_add_drain_cpu().

That can be deleted too, since you've already removed the producer to
fbatches->lru_move_tail.
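
[Editorial sketch, not part of the thread.] Matthew's question about a
missing test_dropbehind / set_dropbehind pair can be pictured with a toy
model of "copy selected flags from the old folio to the new one". This is
userspace C, not kernel code: the mig_* names and flag values are
hypothetical, and whether the kernel should carry the flag across migration
is exactly the open question in the messages above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code; flag values are invented. */
enum {
	MIG_IDLE       = 1 << 0,
	MIG_READAHEAD  = 1 << 1,
	MIG_DROPBEHIND = 1 << 2,
};

struct mig_folio {
	unsigned int flags;
};

/*
 * Copy flags one test/set pair at a time, in the style of the quoted
 * hunk. The carry_dropbehind switch models the pair Matthew asks about.
 */
static void mig_copy_flags(struct mig_folio *newf,
			   const struct mig_folio *oldf,
			   bool carry_dropbehind)
{
	if (oldf->flags & MIG_IDLE)
		newf->flags |= MIG_IDLE;
	if (oldf->flags & MIG_READAHEAD)
		newf->flags |= MIG_READAHEAD;
	if (carry_dropbehind && (oldf->flags & MIG_DROPBEHIND))
		newf->flags |= MIG_DROPBEHIND;
}

/* Does the dropbehind hint survive a simulated migration? */
static bool mig_hint_survives(bool carry_dropbehind)
{
	struct mig_folio oldf = { MIG_READAHEAD | MIG_DROPBEHIND };
	struct mig_folio newf = { 0 };

	mig_copy_flags(&newf, &oldf, carry_dropbehind);
	return (newf.flags & MIG_DROPBEHIND) != 0;
}
```

Without the extra pair the hint is silently lost on the new folio, which is
the gap Matthew points out. Kirill's reply suggests the right answer may
differ for clean and dirty folios, which this sketch does not attempt to
model.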

* Re: [PATCH 0/8] mm: Remove PG_reclaim
From: Matthew Wilcox @ 2025-01-13 13:45 UTC
To: Kirill A. Shutemov

On Mon, Jan 13, 2025 at 11:34:45AM +0200, Kirill A. Shutemov wrote:
> Use PG_dropbehind instead of PG_reclaim and remove PG_reclaim.

I was hoping we'd end up with the name PG_reclaim instead of the name
PG_dropbehind.  PG_reclaim is a better name for this functionality.

* Re: [PATCH 0/8] mm: Remove PG_reclaim
From: Kirill A. Shutemov @ 2025-01-13 14:07 UTC
To: Matthew Wilcox

On Mon, Jan 13, 2025 at 01:45:48PM +0000, Matthew Wilcox wrote:
> On Mon, Jan 13, 2025 at 11:34:45AM +0200, Kirill A. Shutemov wrote:
> > Use PG_dropbehind instead of PG_reclaim and remove PG_reclaim.
>
> I was hoping we'd end up with the name PG_reclaim instead of the name
> PG_dropbehind.  PG_reclaim is a better name for this functionality.

I got burned by re-using the name with MAX_ORDER redefinition.
I guess it is less risky as it is less used, but still...

Anyway, it can be done with a patch on top of the patchset. We must get
rid of current PG_reclaim first.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

* Re: [PATCH 3/8] mm/zswap: Use PG_dropbehind instead of PG_reclaim
From: David Hildenbrand @ 2025-01-13 10:06 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> zswap_writeback_entry().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/zswap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 167ae641379f..c20bad0b0978 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1096,8 +1096,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  	/* folio is up to date */
>  	folio_mark_uptodate(folio);
>
> -	/* move it to the tail of the inactive list after end_writeback */
> -	folio_set_reclaim(folio);
> +	/* free the folio after writeback */
> +	folio_set_dropbehind(folio);

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 3/8] mm/zswap: Use PG_dropbehind instead of PG_reclaim
From: Yosry Ahmed @ 2025-01-13 16:10 UTC
To: Kirill A. Shutemov

On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> zswap_writeback_entry().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Acked-by: Yosry Ahmed <yosryahmed@google.com>

> ---
>  mm/zswap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 167ae641379f..c20bad0b0978 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1096,8 +1096,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
>  	/* folio is up to date */
>  	folio_mark_uptodate(folio);
>
> -	/* move it to the tail of the inactive list after end_writeback */
> -	folio_set_reclaim(folio);
> +	/* free the folio after writeback */
> +	folio_set_dropbehind(folio);
>
>  	/* start writeback */
>  	__swap_writepage(folio, &wbc);
> --
> 2.45.2
>

* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
From: David Hildenbrand @ 2025-01-13 10:07 UTC
To: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe

On 13.01.25 10:34, Kirill A. Shutemov wrote:
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> lru_deactivate_file().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/swap.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index fc8281ef4241..4eb33b4804a8 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
>  	folio_clear_referenced(folio);
>
>  	if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> -		/*
> -		 * Setting the reclaim flag could race with
> -		 * folio_end_writeback() and confuse readahead.  But the
> -		 * race window is _really_ small and it's not a critical
> -		 * problem.
> -		 */
>  		lruvec_add_folio(lruvec, folio);
> -		folio_set_reclaim(folio);
> +		folio_set_dropbehind(folio);
>  	} else {
>  		/*
>  		 * The folio's writeback ended while it was in the batch.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
From: Yosry Ahmed @ 2025-01-13 16:17 UTC
To: Kirill A. Shutemov

On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> The recently introduced PG_dropbehind allows for freeing folios
> immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> to be involved to get the folio freed.
>
> Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> lru_deactivate_file().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/swap.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index fc8281ef4241..4eb33b4804a8 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
>  	folio_clear_referenced(folio);
>
>  	if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> -		/*
> -		 * Setting the reclaim flag could race with
> -		 * folio_end_writeback() and confuse readahead.  But the
> -		 * race window is _really_ small and it's not a critical
> -		 * problem.
> -		 */
>  		lruvec_add_folio(lruvec, folio);
> -		folio_set_reclaim(folio);
> +		folio_set_dropbehind(folio);
>  	} else {
>  		/*
>  		 * The folio's writeback ended while it was in the batch.

Now there's a difference in behavior here depending on whether or not
the folio is under writeback (or will be written back soon). If it is,
we set PG_dropbehind to get it freed right after, but if writeback has
already ended we put it on the tail of the LRU to be freed later.

It's a bit counterintuitive to me that folios with pending writeback
get freed faster than folios that completed their writeback already.
Am I missing something?

* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
From: Kirill A. Shutemov @ 2025-01-14 8:12 UTC
To: Yosry Ahmed

On Mon, Jan 13, 2025 at 08:17:20AM -0800, Yosry Ahmed wrote:
> On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > The recently introduced PG_dropbehind allows for freeing folios
> > immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> > to be involved to get the folio freed.
> >
> > Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> > lru_deactivate_file().
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/swap.c | 8 +-------
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index fc8281ef4241..4eb33b4804a8 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
> >  	folio_clear_referenced(folio);
> >
> >  	if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> > -		/*
> > -		 * Setting the reclaim flag could race with
> > -		 * folio_end_writeback() and confuse readahead.  But the
> > -		 * race window is _really_ small and it's not a critical
> > -		 * problem.
> > -		 */
> >  		lruvec_add_folio(lruvec, folio);
> > -		folio_set_reclaim(folio);
> > +		folio_set_dropbehind(folio);
> >  	} else {
> >  		/*
> >  		 * The folio's writeback ended while it was in the batch.
>
> Now there's a difference in behavior here depending on whether or not
> the folio is under writeback (or will be written back soon). If it is,
> we set PG_dropbehind to get it freed right after, but if writeback has
> already ended we put it on the tail of the LRU to be freed later.
>
> It's a bit counterintuitive to me that folios with pending writeback
> get freed faster than folios that completed their writeback already.
> Am I missing something?

Yeah, it is strange.

I think we can drop the writeback/dirty check. Set PG_dropbehind and put
the page on the tail of LRU unconditionally. The check was required to
avoid confusion with PG_readahead.

Comment above the function is not valid anymore.

But the folio that is still dirty or under writeback will be freed faster,
as we get rid of the folio just after writeback is done, while a clean
page can dangle on the LRU for a while.

I don't think we have any convenient place to free a clean dropbehind page
other than shrink_folio_list(). Or do we?

Looking at shrink_folio_list(), I think we need to bypass page demotion
for PG_dropbehind pages.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
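
[Editorial sketch, not part of the thread.] The two behaviours being
compared here can be written down as a small decision table. This is a toy
userspace model, not kernel code: the deact_* names are hypothetical, and
"lru_tail" only records which branch is taken, not real LRU manipulation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. */
struct deact_result {
	bool dropbehind;	/* folio gets the dropbehind marker */
	bool lru_tail;		/* folio is queued at the tail of the LRU */
};

/* Behaviour of the patched lru_deactivate_file() as described by Yosry. */
static struct deact_result deact_patched(bool dirty_or_writeback)
{
	struct deact_result r = { false, false };

	if (dirty_or_writeback)
		r.dropbehind = true;	/* freed right after writeback ends */
	else
		r.lru_tail = true;	/* waits for reclaim to reach it */
	return r;
}

/* Kirill's suggestion: drop the check, do both unconditionally. */
static struct deact_result deact_unconditional(bool dirty_or_writeback)
{
	struct deact_result r = { true, true };

	(void)dirty_or_writeback;	/* input no longer matters */
	return r;
}
```

The first function shows the asymmetry Yosry finds counterintuitive: only
the folio that still has I/O pending gets the fast path. The second removes
the asymmetry in marking, though as Kirill notes a clean folio would still
wait on the LRU until shrink_folio_list() reaches it.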
* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
  2025-01-14  8:12 ` Kirill A. Shutemov
@ 2025-01-14 18:02 ` Yosry Ahmed
  2025-01-15  4:28 ` Yu Zhao
  0 siblings, 1 reply; 19+ messages in thread
From: Yosry Ahmed @ 2025-01-14 18:02 UTC
To: Kirill A. Shutemov
Cc: Andrew Morton, Matthew Wilcox (Oracle), Jens Axboe,
	Jason A. Donenfeld, Andi Shyti, Chengming Zhou, Christian Brauner,
	Christophe Leroy, Dan Carpenter, David Airlie, David Hildenbrand,
	Hao Ge, Jani Nikula, Johannes Weiner, Joonas Lahtinen, Josef Bacik,
	Masami Hiramatsu, Mathieu Desnoyers, Miklos Szeredi, Nhat Pham,
	Oscar Salvador, Ran Xiaokai, Rodrigo Vivi, Simona Vetter,
	Steven Rostedt, Tvrtko Ursulin, Vlastimil Babka, Yu Zhao, intel-gfx,
	dri-devel, linux-kernel, linux-fsdevel, linux-mm, linux-trace-kernel

On Tue, Jan 14, 2025 at 12:12 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Mon, Jan 13, 2025 at 08:17:20AM -0800, Yosry Ahmed wrote:
> > On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > >
> > > The recently introduced PG_dropbehind allows for freeing folios
> > > immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> > > to be involved to get the folio freed.
> > >
> > > Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> > > lru_deactivate_file().
> > >
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > ---
> > >  mm/swap.c | 8 +-------
> > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > >
> > > diff --git a/mm/swap.c b/mm/swap.c
> > > index fc8281ef4241..4eb33b4804a8 100644
> > > --- a/mm/swap.c
> > > +++ b/mm/swap.c
> > > @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
> > >         folio_clear_referenced(folio);
> > >
> > >         if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> > > -               /*
> > > -                * Setting the reclaim flag could race with
> > > -                * folio_end_writeback() and confuse readahead. But the
> > > -                * race window is _really_ small and it's not a critical
> > > -                * problem.
> > > -                */
> > >                 lruvec_add_folio(lruvec, folio);
> > > -               folio_set_reclaim(folio);
> > > +               folio_set_dropbehind(folio);
> > >         } else {
> > >                 /*
> > >                  * The folio's writeback ended while it was in the batch.
> >
> > Now there's a difference in behavior here depending on whether or not
> > the folio is under writeback (or will be written back soon). If it is,
> > we set PG_dropbehind to get it freed right after, but if writeback has
> > already ended we put it on the tail of the LRU to be freed later.
> >
> > It's a bit counterintuitive to me that folios with pending writeback
> > get freed faster than folios that completed their writeback already.
> > Am I missing something?
>
> Yeah, it is strange.
>
> I think we can drop the writeback/dirty check. Set PG_dropbehind and put
> the page on the tail of LRU unconditionally. The check was required to
> avoid confusion with PG_readahead.
>
> Comment above the function is not valid anymore.

My read is that we don't put dirty/writeback folios at the tail of the
LRU because they cannot be freed immediately and we want to give them
time to be written back before reclaim reaches them. So I don't think
we want to change that and always put the pages at the tail.

>
> But the folio that is still dirty under writeback will be freed faster as
> we get rid of the folio just after writeback is done while clean page can
> dangle on LRU for a while.

Yeah if we reuse PG_dropbehind then we cannot avoid
folio_end_writeback() freeing the folio faster than clean ones.

>
> I don't think we have any convenient place to free clean dropbehind page
> other than shrink_folio_list(). Or do we?

Not sure tbh. FWIW I am not saying it's necessarily a bad thing to
free dirty/writeback folios before clean ones when deactivated, it's
just strange and a behavioral change from today that I wanted to point
out. Perhaps that's the best we can do for now.

>
> Looking at shrink_folio_list(), I think we need to bypass page demotion
> for PG_dropbehind pages.
>
> --
>   Kiryl Shutsemau / Kirill A. Shutemov
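
The ordering asymmetry raised in this exchange can be illustrated with a toy userspace model. This is only a sketch, not kernel code: `struct toy_folio`, `toy_deactivate()`, `toy_end_writeback()` and `toy_reclaim_scan()` are hypothetical names, and the model assumes that a folio marked dropbehind is freed as soon as its writeback completes, while a clean folio is only freed once a reclaim pass reaches the LRU tail.

```c
#include <stdbool.h>

/* Toy model only: these are NOT kernel types or kernel functions. */
enum toy_state { TOY_ON_LRU = 0, TOY_FREED };

struct toy_folio {
	bool dirty;
	bool writeback;
	bool dropbehind;	/* stands in for PG_dropbehind */
	bool at_lru_tail;	/* true if queued at the LRU tail */
	enum toy_state state;
};

/* Mirrors the branch structure of the patched lru_deactivate_file(). */
void toy_deactivate(struct toy_folio *f)
{
	if (f->writeback || f->dirty) {
		f->at_lru_tail = false;	/* re-added at the LRU head */
		f->dropbehind = true;	/* to be freed once writeback ends */
	} else {
		f->at_lru_tail = true;	/* waits for reclaim to reach it */
	}
}

/* Assumption: writeback completion frees a dropbehind folio right away. */
void toy_end_writeback(struct toy_folio *f)
{
	f->writeback = false;
	if (f->dropbehind && !f->dirty)
		f->state = TOY_FREED;
}

/* Assumption: a reclaim pass only frees clean folios sitting at the tail. */
void toy_reclaim_scan(struct toy_folio *f)
{
	if (f->state == TOY_ON_LRU && f->at_lru_tail &&
	    !f->dirty && !f->writeback)
		f->state = TOY_FREED;
}
```

In this model, deactivating one folio under writeback and one clean folio, then completing the writeback, frees the first folio while the clean one stays on the LRU until `toy_reclaim_scan()` runs. That is the "pending writeback gets freed faster" behavior described above.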
* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
  2025-01-14 18:02 ` Yosry Ahmed
@ 2025-01-15  4:28 ` Yu Zhao
  2025-01-15  4:31 ` Yu Zhao
  0 siblings, 1 reply; 19+ messages in thread
From: Yu Zhao @ 2025-01-15  4:28 UTC
To: Yosry Ahmed
Cc: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle),
	Jens Axboe, Jason A. Donenfeld, Andi Shyti, Chengming Zhou,
	Christian Brauner, Christophe Leroy, Dan Carpenter, David Airlie,
	David Hildenbrand, Hao Ge, Jani Nikula, Johannes Weiner,
	Joonas Lahtinen, Josef Bacik, Masami Hiramatsu, Mathieu Desnoyers,
	Miklos Szeredi, Nhat Pham, Oscar Salvador, Ran Xiaokai, Rodrigo Vivi,
	Simona Vetter, Steven Rostedt, Tvrtko Ursulin, Vlastimil Babka,
	intel-gfx, dri-devel, linux-kernel, linux-fsdevel, linux-mm,
	linux-trace-kernel

On Tue, Jan 14, 2025 at 11:03 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Tue, Jan 14, 2025 at 12:12 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > On Mon, Jan 13, 2025 at 08:17:20AM -0800, Yosry Ahmed wrote:
> > > On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
> > > <kirill.shutemov@linux.intel.com> wrote:
> > > >
> > > > The recently introduced PG_dropbehind allows for freeing folios
> > > > immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> > > > to be involved to get the folio freed.
> > > >
> > > > Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> > > > lru_deactivate_file().
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > ---
> > > >  mm/swap.c | 8 +-------
> > > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > > >
> > > > diff --git a/mm/swap.c b/mm/swap.c
> > > > index fc8281ef4241..4eb33b4804a8 100644
> > > > --- a/mm/swap.c
> > > > +++ b/mm/swap.c
> > > > @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
> > > >         folio_clear_referenced(folio);
> > > >
> > > >         if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> > > > -               /*
> > > > -                * Setting the reclaim flag could race with
> > > > -                * folio_end_writeback() and confuse readahead. But the
> > > > -                * race window is _really_ small and it's not a critical
> > > > -                * problem.
> > > > -                */
> > > >                 lruvec_add_folio(lruvec, folio);
> > > > -               folio_set_reclaim(folio);
> > > > +               folio_set_dropbehind(folio);
> > > >         } else {
> > > >                 /*
> > > >                  * The folio's writeback ended while it was in the batch.
> > >
> > > Now there's a difference in behavior here depending on whether or not
> > > the folio is under writeback (or will be written back soon). If it is,
> > > we set PG_dropbehind to get it freed right after, but if writeback has
> > > already ended we put it on the tail of the LRU to be freed later.
> > >
> > > It's a bit counterintuitive to me that folios with pending writeback
> > > get freed faster than folios that completed their writeback already.
> > > Am I missing something?
> >
> > Yeah, it is strange.
> >
> > I think we can drop the writeback/dirty check. Set PG_dropbehind and put
> > the page on the tail of LRU unconditionally. The check was required to
> > avoid confusion with PG_readahead.
> >
> > Comment above the function is not valid anymore.
>
> My read is that we don't put dirty/writeback folios at the tail of the
> LRU because they cannot be freed immediately and we want to give them
> time to be written back before reclaim reaches them. So I don't think
> we want to change that and always put the pages at the tail.
>
> >
> > But the folio that is still dirty under writeback will be freed faster as
> > we get rid of the folio just after writeback is done while clean page can
> > dangle on LRU for a while.
>
> Yeah if we reuse PG_dropbehind then we cannot avoid
> folio_end_writeback() freeing the folio faster than clean ones.
>
> >
> > I don't think we have any convenient place to free clean dropbehind page
> > other than shrink_folio_list(). Or do we?
>
> Not sure tbh. FWIW I am not saying it's necessarily a bad thing to
> free dirty/writeback folios before clean ones when deactivated, it's
> just strange and a behavioral change from today that I wanted to point
> out. Perhaps that's the best we can do for now.
>
> >
> > Looking at shrink_folio_list(), I think we need to bypass page demotion
> > for PG_dropbehind pages.

I agree with Yosry.

I don't think lru_deactivate_file() is still needed -- it was needed
only because when truncation fails to free a dirty/writeback folio,
page reclaim can do that quickly. For other conditions that
mapping_evict_folio() returns 0, there isn't much page reclaim can do,
and those conditions are not deactivate_file_folio() and
lru_deactivate_file()'s intentions.

So the following should be enough, and it's a lot cleaner:

diff --git a/mm/truncate.c b/mm/truncate.c
index e2e115adfbc5..12d2aa608517 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -486,7 +486,7 @@ unsigned long mapping_try_invalidate(struct address_space *mapping,
                         * of interest and try to speed up its reclaim.
                         */
                        if (!ret) {
-                               deactivate_file_folio(folio);
+                               folio_set_dropbehind(folio);
                                /* Likely in the lru cache of a remote CPU */
                                if (nr_failed)
                                        (*nr_failed)++;

Then we can drop deactivate_file_folio() and lru_deactivate_file().
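
The fallback proposed in the diff above can be sketched as a toy userspace model. This is illustrative only and makes no claim about the real mm/truncate.c code: `struct sketch_folio`, `sketch_evict()` and `sketch_try_invalidate()` are hypothetical names, and the eviction rule is deliberately reduced to "clean and not under writeback", which is an assumption and does not cover every case in which mapping_evict_folio() returns 0.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT kernel types or kernel functions. */
struct sketch_folio {
	bool dirty;
	bool writeback;
	bool dropbehind;	/* stands in for PG_dropbehind */
	bool freed;
};

/* Hypothetical stand-in for eviction: only clean, idle folios can go. */
bool sketch_evict(struct sketch_folio *f)
{
	if (f->dirty || f->writeback)
		return false;
	f->freed = true;
	return true;
}

/*
 * Walk the folios and try to evict each one. A folio that cannot be
 * evicted is only marked dropbehind and counted as a failure, instead
 * of being moved around on an LRU list.
 */
unsigned long sketch_try_invalidate(struct sketch_folio *folios, size_t n,
				    unsigned long *nr_failed)
{
	unsigned long count = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_evict(&folios[i])) {
			count++;
			continue;
		}
		folios[i].dropbehind = true;
		if (nr_failed)
			(*nr_failed)++;
	}
	return count;
}
```

With one dirty, one clean and one writeback folio, the model evicts the clean folio and marks the other two dropbehind, leaving their freeing to writeback completion. No LRU rotation helper is involved, which is the simplification the proposal is after.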
* Re: [PATCH 4/8] mm/swap: Use PG_dropbehind instead of PG_reclaim
  2025-01-15  4:28 ` Yu Zhao
@ 2025-01-15  4:31 ` Yu Zhao
  0 siblings, 0 replies; 19+ messages in thread
From: Yu Zhao @ 2025-01-15  4:31 UTC
To: Yosry Ahmed
Cc: Kirill A. Shutemov, Andrew Morton, Matthew Wilcox (Oracle),
	Jens Axboe, Jason A. Donenfeld, Andi Shyti, Chengming Zhou,
	Christian Brauner, Christophe Leroy, Dan Carpenter, David Airlie,
	David Hildenbrand, Hao Ge, Jani Nikula, Johannes Weiner,
	Joonas Lahtinen, Josef Bacik, Masami Hiramatsu, Mathieu Desnoyers,
	Miklos Szeredi, Nhat Pham, Oscar Salvador, Ran Xiaokai, Rodrigo Vivi,
	Simona Vetter, Steven Rostedt, Tvrtko Ursulin, Vlastimil Babka,
	intel-gfx, dri-devel, linux-kernel, linux-fsdevel, linux-mm,
	linux-trace-kernel

On Tue, Jan 14, 2025 at 9:28 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Tue, Jan 14, 2025 at 11:03 AM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > On Tue, Jan 14, 2025 at 12:12 AM Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > >
> > > On Mon, Jan 13, 2025 at 08:17:20AM -0800, Yosry Ahmed wrote:
> > > > On Mon, Jan 13, 2025 at 1:35 AM Kirill A. Shutemov
> > > > <kirill.shutemov@linux.intel.com> wrote:
> > > > >
> > > > > The recently introduced PG_dropbehind allows for freeing folios
> > > > > immediately after writeback. Unlike PG_reclaim, it does not need vmscan
> > > > > to be involved to get the folio freed.
> > > > >
> > > > > Instead of using folio_set_reclaim(), use folio_set_dropbehind() in
> > > > > lru_deactivate_file().
> > > > >
> > > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > ---
> > > > >  mm/swap.c | 8 +-------
> > > > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > > > >
> > > > > diff --git a/mm/swap.c b/mm/swap.c
> > > > > index fc8281ef4241..4eb33b4804a8 100644
> > > > > --- a/mm/swap.c
> > > > > +++ b/mm/swap.c
> > > > > @@ -562,14 +562,8 @@ static void lru_deactivate_file(struct lruvec *lruvec, struct folio *folio)
> > > > >         folio_clear_referenced(folio);
> > > > >
> > > > >         if (folio_test_writeback(folio) || folio_test_dirty(folio)) {
> > > > > -               /*
> > > > > -                * Setting the reclaim flag could race with
> > > > > -                * folio_end_writeback() and confuse readahead. But the
> > > > > -                * race window is _really_ small and it's not a critical
> > > > > -                * problem.
> > > > > -                */
> > > > >                 lruvec_add_folio(lruvec, folio);
> > > > > -               folio_set_reclaim(folio);
> > > > > +               folio_set_dropbehind(folio);
> > > > >         } else {
> > > > >                 /*
> > > > >                  * The folio's writeback ended while it was in the batch.
> > > >
> > > > Now there's a difference in behavior here depending on whether or not
> > > > the folio is under writeback (or will be written back soon). If it is,
> > > > we set PG_dropbehind to get it freed right after, but if writeback has
> > > > already ended we put it on the tail of the LRU to be freed later.
> > > >
> > > > It's a bit counterintuitive to me that folios with pending writeback
> > > > get freed faster than folios that completed their writeback already.
> > > > Am I missing something?
> > >
> > > Yeah, it is strange.
> > >
> > > I think we can drop the writeback/dirty check. Set PG_dropbehind and put
> > > the page on the tail of LRU unconditionally. The check was required to
> > > avoid confusion with PG_readahead.
> > >
> > > Comment above the function is not valid anymore.
> >
> > My read is that we don't put dirty/writeback folios at the tail of the
> > LRU because they cannot be freed immediately and we want to give them
> > time to be written back before reclaim reaches them. So I don't think
> > we want to change that and always put the pages at the tail.
> >
> > >
> > > But the folio that is still dirty under writeback will be freed faster as
> > > we get rid of the folio just after writeback is done while clean page can
> > > dangle on LRU for a while.
> >
> > Yeah if we reuse PG_dropbehind then we cannot avoid
> > folio_end_writeback() freeing the folio faster than clean ones.
> >
> > >
> > > I don't think we have any convenient place to free clean dropbehind page
> > > other than shrink_folio_list(). Or do we?
> >
> > Not sure tbh. FWIW I am not saying it's necessarily a bad thing to
> > free dirty/writeback folios before clean ones when deactivated, it's
> > just strange and a behavioral change from today that I wanted to point
> > out. Perhaps that's the best we can do for now.
> >
> > >
> > > Looking at shrink_folio_list(), I think we need to bypass page demotion
> > > for PG_dropbehind pages.
>
> I agree with Yosry.
>
> I don't think lru_deactivate_file() is still needed -- it was needed
> only because when truncation fails to free a dirty/writeback folio,
> page reclaim can do that quickly. For other conditions that
> mapping_evict_folio() returns 0, there isn't much page reclaim can do,
> and those conditions are not deactivate_file_folio() and
> lru_deactivate_file()'s intentions.
>
> So the following should be enough, and it's a lot cleaner:
>
> diff --git a/mm/truncate.c b/mm/truncate.c
> index e2e115adfbc5..12d2aa608517 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -486,7 +486,7 @@ unsigned long mapping_try_invalidate(struct
> address_space *mapping,
>                          * of interest and try to speed up its reclaim.
>                          */
>                         if (!ret) {
> -                               deactivate_file_folio(folio);
> +                               folio_set_dropbehind(folio);
>                                 /* Likely in the lru cache of a remote CPU */
>                                 if (nr_failed)
>                                         (*nr_failed)++;
>
> Then we can drop deactivate_file_folio() and lru_deactivate_file().

And with the above and list_move_tail() removed, we can also remove
lruvec_add_folio_tail().
Thread overview: 19+ messages
[not found] <20250113093453.1932083-1-kirill.shutemov@linux.intel.com>
[not found] ` <20250113093453.1932083-2-kirill.shutemov@linux.intel.com>
2025-01-13 10:05 ` [PATCH 1/8] drm/i915/gem: Convert __shmem_writeback() to folios David Hildenbrand
[not found] ` <20250113093453.1932083-3-kirill.shutemov@linux.intel.com>
2025-01-13 10:06 ` [PATCH 2/8] drm/i915/gem: Use PG_dropbehind instead of PG_reclaim David Hildenbrand
[not found] ` <20250113093453.1932083-6-kirill.shutemov@linux.intel.com>
2025-01-13 10:07 ` [PATCH 5/8] mm/vmscan: " David Hildenbrand
[not found] ` <20250113093453.1932083-7-kirill.shutemov@linux.intel.com>
2025-01-13 10:08 ` [PATCH 6/8] mm/vmscan: Use PG_dropbehind instead of PG_reclaim in shrink_folio_list() David Hildenbrand
[not found] ` <20250113093453.1932083-8-kirill.shutemov@linux.intel.com>
2025-01-13 10:09 ` [PATCH 7/8] mm/mglru: Check PG_dropcache instead of PG_reclaim in lru_gen_folio_seq() David Hildenbrand
[not found] ` <20250113093453.1932083-9-kirill.shutemov@linux.intel.com>
2025-01-13 10:11 ` [PATCH 8/8] mm: Remove PG_reclaim David Hildenbrand
2025-01-13 15:28 ` Matthew Wilcox
2025-01-14 8:30 ` Kirill A. Shutemov
2025-01-14 17:01 ` Yu Zhao
2025-01-13 13:45 ` [PATCH 0/8] " Matthew Wilcox
2025-01-13 14:07 ` Kirill A. Shutemov
[not found] ` <20250113093453.1932083-4-kirill.shutemov@linux.intel.com>
2025-01-13 10:06 ` [PATCH 3/8] mm/zswap: Use PG_dropbehind instead of PG_reclaim David Hildenbrand
2025-01-13 16:10 ` Yosry Ahmed
[not found] ` <20250113093453.1932083-5-kirill.shutemov@linux.intel.com>
2025-01-13 10:07 ` [PATCH 4/8] mm/swap: " David Hildenbrand
2025-01-13 16:17 ` Yosry Ahmed
2025-01-14 8:12 ` Kirill A. Shutemov
2025-01-14 18:02 ` Yosry Ahmed
2025-01-15 4:28 ` Yu Zhao
2025-01-15 4:31 ` Yu Zhao