* [PATCH 0/3 v2] improve fadvise(POSIX_FADV_WILLNEED) with large folio
@ 2025-12-02 1:30 Jaegeuk Kim
2025-12-02 1:30 ` [PATCH 1/3] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED Jaegeuk Kim
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-02 1:30 UTC (permalink / raw)
To: linux-kernel, linux-f2fs-devel, linux-mm, Matthew Wilcox; +Cc: Jaegeuk Kim
This patch series aims to improve fadvise(POSIX_FADV_WILLNEED). The first patch
fixes the broken logic that did not read the entire range ahead, the second
patch converts the readahead function to adopt large folios, and the third one
bumps up the folio order for high-order page allocation accordingly.
Jaegeuk Kim (3):
mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED
mm/readahead: use page_cache_sync_ra for FADVISE_FAV_WILLNEED
mm/readahead: try to allocate high order pages for
FADVISE_FAV_WILLNEED
mm/readahead.c | 43 +++++++++++++++++++++++++------------------
1 file changed, 25 insertions(+), 18 deletions(-)
--
2.52.0.107.ga0afd4fd5b-goog
^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED
  2025-12-02 1:30 [PATCH 0/3 v2] improve fadvise(POSIX_FADV_WILLNEED) with large folio Jaegeuk Kim
@ 2025-12-02 1:30 ` Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 2/3] mm/readahead: use page_cache_sync_ra for FADVISE_FAV_WILLNEED Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 3/3] mm/readahead: try to allocate high order pages " Jaegeuk Kim
  2 siblings, 0 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-02 1:30 UTC (permalink / raw)
To: linux-kernel, linux-f2fs-devel, linux-mm, Matthew Wilcox; +Cc: Jaegeuk Kim

This patch fixes the broken readahead flow for POSIX_FADV_WILLNEED. The
problem is that, in force_page_cache_ra(), nr_to_read is clamped by the
code below:

	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);

IOW, we cannot read ahead more than max_pages, which is most likely in the
range of 2MB to 16MB. Note that it doesn't make sense to set ra->ra_pages
to the entire file size; instead, let's fix this logic.
Before:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:4294967296, advise:3
page_cache_ra_unbounded: dev=252:16 ino=e index=0 nr_to_read=512 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=512 nr_to_read=512 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=1024 nr_to_read=512 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=1536 nr_to_read=512 lookahead_size=0

After:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:4294967296, advise:3
page_cache_ra_unbounded: dev=252:16 ino=e index=0 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=8192 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=10240 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=12288 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=14336 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=16384 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=18432 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=20480 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=22528 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=24576 nr_to_read=2048 lookahead_size=0
...
page_cache_ra_unbounded: dev=252:16 ino=e index=1042432 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=1044480 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=1046528 nr_to_read=2048 lookahead_size=0

Cc: linux-mm@kvack.org
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 mm/readahead.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 3a4b5d58eeb6..e88425ce06f7 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -311,7 +311,7 @@ EXPORT_SYMBOL_GPL(page_cache_ra_unbounded);
  * behaviour which would occur if page allocations are causing VM writeback.
  * We really don't want to intermingle reads and writes like that.
  */
-static void do_page_cache_ra(struct readahead_control *ractl,
+static int do_page_cache_ra(struct readahead_control *ractl,
 		unsigned long nr_to_read, unsigned long lookahead_size)
 {
 	struct inode *inode = ractl->mapping->host;
@@ -320,45 +320,42 @@ static void do_page_cache_ra(struct readahead_control *ractl,
 	pgoff_t end_index;	/* The last page we want to read */
 
 	if (isize == 0)
-		return;
+		return -EINVAL;
 
 	end_index = (isize - 1) >> PAGE_SHIFT;
 	if (index > end_index)
-		return;
+		return -EINVAL;
 	/* Don't read past the page containing the last byte of the file */
 	if (nr_to_read > end_index - index)
 		nr_to_read = end_index - index + 1;
 
 	page_cache_ra_unbounded(ractl, nr_to_read, lookahead_size);
+	return 0;
 }
 
 /*
- * Chunk the readahead into 2 megabyte units, so that we don't pin too much
- * memory at once.
+ * Chunk the readahead per the block device capacity, and read all nr_to_read.
  */
 void force_page_cache_ra(struct readahead_control *ractl,
 		unsigned long nr_to_read)
 {
 	struct address_space *mapping = ractl->mapping;
-	struct file_ra_state *ra = ractl->ra;
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
-	unsigned long max_pages;
+	unsigned long this_chunk;
 
 	if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
 		return;
 
 	/*
-	 * If the request exceeds the readahead window, allow the read to
-	 * be up to the optimal hardware IO size
+	 * Consider the optimal hardware IO size for readahead chunk.
 	 */
-	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
-	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
+	this_chunk = max_t(unsigned long, bdi->io_pages, ractl->ra->ra_pages);
+
 	while (nr_to_read) {
-		unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_SIZE;
+		this_chunk = min_t(unsigned long, this_chunk, nr_to_read);
 
-		if (this_chunk > nr_to_read)
-			this_chunk = nr_to_read;
-		do_page_cache_ra(ractl, this_chunk, 0);
+		if (do_page_cache_ra(ractl, this_chunk, 0))
+			break;
 		nr_to_read -= this_chunk;
 	}
-- 
2.52.0.107.ga0afd4fd5b-goog
* [PATCH 2/3] mm/readahead: use page_cache_sync_ra for FADVISE_FAV_WILLNEED
  2025-12-02 1:30 [PATCH 0/3 v2] improve fadvise(POSIX_FADV_WILLNEED) with large folio Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 1/3] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED Jaegeuk Kim
@ 2025-12-02 1:30 ` Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 3/3] mm/readahead: try to allocate high order pages " Jaegeuk Kim
  2 siblings, 0 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-02 1:30 UTC (permalink / raw)
To: linux-kernel, linux-f2fs-devel, linux-mm, Matthew Wilcox; +Cc: Jaegeuk Kim

This patch replaces page_cache_ra_unbounded() with page_cache_sync_ra() in
fadvise(FADVISE_FAV_WILLNEED) to support large folios.

Before:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:4294967296, advise:3
page_cache_ra_unbounded: dev=252:16 ino=e index=0 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=8192 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=10240 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=12288 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=14336 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=16384 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=18432 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=20480 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=22528 nr_to_read=2048 lookahead_size=0
page_cache_ra_unbounded: dev=252:16 ino=e index=24576 nr_to_read=2048 lookahead_size=0
...
page_cache_ra_unbounded: dev=252:16 ino=e index=1042432 nr_to_read=2048 lookahead_size=0

Note, this is all order-zero page allocation.

After:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:536870912, advise:3
page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=2048 order=0 size=0 async_size=0 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=0 order=0 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=2048 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=6144 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
...
page_cache_sync_ra: dev=252:16 ino=e index=129024 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=129024 nr_to_read=2048 lookahead_size=0

Cc: linux-mm@kvack.org
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 mm/readahead.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index e88425ce06f7..54c78f8276fe 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -340,6 +340,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
 		unsigned long nr_to_read)
 {
 	struct address_space *mapping = ractl->mapping;
+	struct inode *inode = mapping->host;
 	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
 	unsigned long this_chunk;
 
@@ -352,11 +353,19 @@ void force_page_cache_ra(struct readahead_control *ractl,
 	this_chunk = max_t(unsigned long, bdi->io_pages, ractl->ra->ra_pages);
 
 	while (nr_to_read) {
-		this_chunk = min_t(unsigned long, this_chunk, nr_to_read);
+		unsigned long index = readahead_index(ractl);
+		pgoff_t end_index = (i_size_read(inode) - 1) >> PAGE_SHIFT;
 
-		if (do_page_cache_ra(ractl, this_chunk, 0))
+		if (index > end_index)
 			break;
 
+		if (nr_to_read > end_index - index)
+			nr_to_read = end_index - index + 1;
+
+		this_chunk = min_t(unsigned long, this_chunk, nr_to_read);
+
+		page_cache_sync_ra(ractl, this_chunk);
+
 		nr_to_read -= this_chunk;
 	}
 }
@@ -573,7 +582,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 
 	/* be dumb */
 	if (do_forced_ra) {
-		force_page_cache_ra(ractl, req_count);
+		do_page_cache_ra(ractl, req_count, 0);
 		return;
 	}
 
-- 
2.52.0.107.ga0afd4fd5b-goog
* [PATCH 3/3] mm/readahead: try to allocate high order pages for FADVISE_FAV_WILLNEED
  2025-12-02 1:30 [PATCH 0/3 v2] improve fadvise(POSIX_FADV_WILLNEED) with large folio Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 1/3] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED Jaegeuk Kim
  2025-12-02 1:30 ` [PATCH 2/3] mm/readahead: use page_cache_sync_ra for FADVISE_FAV_WILLNEED Jaegeuk Kim
@ 2025-12-02 1:30 ` Jaegeuk Kim
  2025-12-02 22:56 ` Matthew Wilcox
  2025-12-03 23:25 ` [PATCH 3/3 v2] " Jaegeuk Kim
  2 siblings, 2 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-02 1:30 UTC (permalink / raw)
To: linux-kernel, linux-f2fs-devel, linux-mm, Matthew Wilcox; +Cc: Jaegeuk Kim

This patch assigns the max folio order for readahead. After applying this
patch, readahead starts with high-order page allocations, as shown in the
traces below.

Before:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:536870912, advise:3
page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=2048 order=0 size=0 async_size=0 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=0 order=0 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=2048 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=6144 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
...
page_cache_sync_ra: dev=252:16 ino=e index=129024 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=129024 nr_to_read=2048 lookahead_size=0

After:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:536870912, advise:3
page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=2048 order=0 size=0 async_size=0 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=0 order=9 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=2048 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=2048 order=9 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=4096 order=9 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=6144 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=6144 order=9 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=8192 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
...
page_cache_sync_ra: dev=252:16 ino=e index=129024 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=129024 order=9 size=2048 async_size=1024 ra_pages=2048

Cc: linux-mm@kvack.org
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 mm/readahead.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 54c78f8276fe..cfc63f7d5e81 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -593,7 +593,8 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	 * trivial case: (index - prev_index) == 1
 	 * unaligned reads: (index - prev_index) == 0
 	 */
-	if (!index || req_count > max_pages || index - prev_index <= 1UL) {
+	if (!index || req_count > max_pages || index - prev_index <= 1UL ||
+	    mapping_large_folio_support(ractl->mapping)) {
 		ra->start = index;
 		ra->size = get_init_ra_size(req_count, max_pages);
 		ra->async_size = ra->size > req_count ? ra->size - req_count :
@@ -627,7 +628,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 		ra->size = min(contig_count + req_count, max_pages);
 		ra->async_size = 1;
 readit:
-	ra->order = 0;
+	ra->order = mapping_max_folio_order(ractl->mapping);
 	ractl->_index = ra->start;
 	page_cache_ra_order(ractl, ra);
 }
-- 
2.52.0.107.ga0afd4fd5b-goog
* Re: [PATCH 3/3] mm/readahead: try to allocate high order pages for FADVISE_FAV_WILLNEED
  2025-12-02 1:30 ` [PATCH 3/3] mm/readahead: try to allocate high order pages " Jaegeuk Kim
@ 2025-12-02 22:56 ` Matthew Wilcox
  2025-12-03 19:04 ` Jaegeuk Kim
  2025-12-03 23:25 ` [PATCH 3/3 v2] " Jaegeuk Kim
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2025-12-02 22:56 UTC (permalink / raw)
To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel, linux-mm

On Tue, Dec 02, 2025 at 01:30:13AM +0000, Jaegeuk Kim wrote:
> @@ -627,7 +628,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
> 		ra->size = min(contig_count + req_count, max_pages);
> 		ra->async_size = 1;
>  readit:
> -	ra->order = 0;
> +	ra->order = mapping_max_folio_order(ractl->mapping);
> 	ractl->_index = ra->start;
> 	page_cache_ra_order(ractl, ra);
> }

I suspect this is in the wrong place, but I'm on holiday and not going
to go spelunking through the readahead code looking for the right place.

Also, going directly to max folio order is wrong, we should use the same
approach as the write order code, encapsulated in filemap_get_order().
See 4f6617011910
* Re: [PATCH 3/3] mm/readahead: try to allocate high order pages for FADVISE_FAV_WILLNEED
  2025-12-02 22:56 ` Matthew Wilcox
@ 2025-12-03 19:04 ` Jaegeuk Kim
  0 siblings, 0 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-03 19:04 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-kernel, linux-f2fs-devel, linux-mm

On 12/02, Matthew Wilcox wrote:
> On Tue, Dec 02, 2025 at 01:30:13AM +0000, Jaegeuk Kim wrote:
> > @@ -627,7 +628,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
> > 		ra->size = min(contig_count + req_count, max_pages);
> > 		ra->async_size = 1;
> >  readit:
> > -	ra->order = 0;
> > +	ra->order = mapping_max_folio_order(ractl->mapping);
> > 	ractl->_index = ra->start;
> > 	page_cache_ra_order(ractl, ra);
> > }
>
> I suspect this is in the wrong place, but I'm on holiday and not going
> to go spelunking through the readahead code looking for the right place.
>
> Also, going directly to max folio order is wrong, we should use the same
> approach as the write order code, encapsulated in filemap_get_order().
> See 4f6617011910

It seems the key is page_cache_ra_order() which allocates pages by
ra_alloc_folio() given ra->order.

FWIW, madvise() and fault() readahead takes page_cache_async_ra(), while
fadvise() takes page_cache_sync_ra(). And, the former one has a logic to
bump up the ra->order += 2 by f838ddf8cef5. I think it'd make sense to
match that behavior?
* Re: [PATCH 3/3 v2] mm/readahead: try to allocate high order pages for FADVISE_FAV_WILLNEED
  2025-12-02 1:30 ` [PATCH 3/3] mm/readahead: try to allocate high order pages " Jaegeuk Kim
  2025-12-02 22:56 ` Matthew Wilcox
@ 2025-12-03 23:25 ` Jaegeuk Kim
  1 sibling, 0 replies; 7+ messages in thread
From: Jaegeuk Kim @ 2025-12-03 23:25 UTC (permalink / raw)
To: linux-kernel, linux-f2fs-devel, linux-mm, Matthew Wilcox

This patch bumps up the folio order for readahead. After applying this
patch, readahead starts with high-order page allocations, as shown in the
traces below.

Before:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:536870912, advise:3
page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=2048 order=0 size=0 async_size=0 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=0 order=0 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=2048 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
page_cache_sync_ra: dev=252:16 ino=e index=6144 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
...
page_cache_sync_ra: dev=252:16 ino=e index=129024 req_count=2048 order=0 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_unbounded: dev=252:16 ino=e index=129024 nr_to_read=2048 lookahead_size=0

After:
f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:536870912, advise:3
page_cache_sync_ra: dev=252:16 ino=e index=0 req_count=2048 order=0 size=0 async_size=0 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=0 order=2 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=2048 req_count=2048 order=2 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=2048 order=4 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=4096 req_count=2048 order=4 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=4096 order=6 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=6144 req_count=2048 order=6 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=6144 order=8 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=8192 req_count=2048 order=8 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=8192 order=10 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=10240 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=10240 order=11 size=2048 async_size=1024 ra_pages=2048
...
page_cache_ra_order: dev=252:16 ino=e index=126976 order=11 size=2048 async_size=1024 ra_pages=2048
page_cache_sync_ra: dev=252:16 ino=e index=129024 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=129024 order=11 size=2048 async_size=1024 ra_pages=2048
page_cache_async_ra: dev=252:16 ino=e index=1024 req_count=2048 order=9 size=2048 async_size=1024 ra_pages=2048 mmap_miss=0 prev_pos=-1

For comparison, this is the trace of madvise(MADV_POPULATE_READ), which
bumps up the order by 2.

page_cache_ra_order: dev=252:16 ino=e index=0 order=0 size=2048 async_size=512 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 0, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: MAJOR|RETRY
page_cache_async_ra: dev=252:16 ino=e index=1536 req_count=2048 order=0 size=2048 async_size=512 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=2048 order=2 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 1536, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=2048 req_count=2048 order=2 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=4096 order=4 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 2048, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=4096 req_count=2048 order=4 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=6144 order=6 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 4096, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=6144 req_count=2048 order=6 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=8192 order=8 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 6144, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=8192 req_count=2048 order=8 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=10240 order=10 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 8192, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=10240 req_count=2048 order=9 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
...
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 518144, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=520192 req_count=2048 order=9 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=522240 order=11 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 520192, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY
page_cache_async_ra: dev=252:16 ino=e index=522240 req_count=2048 order=9 size=2048 async_size=2048 ra_pages=2048 mmap_miss=0 prev_pos=-1
page_cache_ra_order: dev=252:16 ino=e index=524288 order=11 size=2048 async_size=2048 ra_pages=2048
f2fs_filemap_fault: dev = (252,16), ino = 14, index = 522240, flags: WRITE|KILLABLE|USER|REMOTE|0x8082000, ret: RETRY

Cc: linux-mm@kvack.org
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
Change log from v1:
 - take the same madvise() behavior, which bumps up ra->order by 2.
 mm/readahead.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 54c78f8276fe..61a469117209 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -593,7 +593,8 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	 * trivial case: (index - prev_index) == 1
 	 * unaligned reads: (index - prev_index) == 0
 	 */
-	if (!index || req_count > max_pages || index - prev_index <= 1UL) {
+	if (!index || req_count > max_pages || index - prev_index <= 1UL ||
+	    mapping_large_folio_support(ractl->mapping)) {
 		ra->start = index;
 		ra->size = get_init_ra_size(req_count, max_pages);
 		ra->async_size = ra->size > req_count ? ra->size - req_count :
@@ -627,7 +628,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 		ra->size = min(contig_count + req_count, max_pages);
 		ra->async_size = 1;
 readit:
-	ra->order = 0;
+	ra->order += 2;
 	ractl->_index = ra->start;
 	page_cache_ra_order(ractl, ra);
 }
-- 
2.52.0.223.gf5cc29aaa4-goog
end of thread, other threads:[~2025-12-03 23:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-02 1:30 [PATCH 0/3 v2] improve fadvise(POSIX_FADV_WILLNEED) with large folio Jaegeuk Kim
2025-12-02 1:30 ` [PATCH 1/3] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED Jaegeuk Kim
2025-12-02 1:30 ` [PATCH 2/3] mm/readahead: use page_cache_sync_ra for FADVISE_FAV_WILLNEED Jaegeuk Kim
2025-12-02 1:30 ` [PATCH 3/3] mm/readahead: try to allocate high order pages " Jaegeuk Kim
2025-12-02 22:56   ` Matthew Wilcox
2025-12-03 19:04     ` Jaegeuk Kim
2025-12-03 23:25   ` [PATCH 3/3 v2] " Jaegeuk Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox