More like me not reading the comments properly, sorry. What I thought I said, was that the problematic code in the call to do_page_cache_ra was reached when the folio alloction returned an error. Sorry for not being clear, and thanks for your patience. /Anders On Tue, 26 Nov 2024, 19:42 Matthew Wilcox, wrote: > On Tue, Nov 26, 2024 at 06:26:13PM +0100, Anders Blomdell wrote: > > On 2024-11-26 17:55, Matthew Wilcox wrote: > > > On Tue, Nov 26, 2024 at 04:28:04PM +0100, Anders Blomdell wrote: > > > > On 2024-11-26 16:06, Jan Kara wrote: > > > > > Hum, checking the history the update of ra->size has been added by > Neil two > > > > > years ago in 9fd472af84ab ("mm: improve cleanup when ->readpages > doesn't > > > > > process all pages"). Neil, the changelog seems as there was some > real > > > > > motivation behind updating of ra->size in read_pages(). What was > it? Now I > > > > > somewhat disagree with reducing ra->size in read_pages() because > it seems > > > > > like a wrong place to do that and if we do need something like > that, > > > > > readahead window sizing logic should rather be changed to take > that into > > > > > account? But it all depends on what was the real rationale behind > reducing > > > > > ra->size in read_pages()... > > > > > > > > My (rather limited) understanding of the patch is that it was > intended to read those pages > > > > that didn't get read because the allocation of a bigger folio > failed, while not redoing what > > > > readpages already did; how it was actually going to accomplish that > is still unclear to me, > > > > but I even don't even quite understand the comment... > > > > > > > > /* > > > > * If there were already pages in the page cache, then we may have > > > > * left some gaps. Let the regular readahead code take care of > this > > > > * situation. > > > > */ > > > > > > > > the reason for an unchanged async_size is also beyond my > understanding. > > > > > > This isn't because we couldn't allocate a folio, this is when we > > > allocated folios, tried to read them and we failed to submit the I/O. > > > This is a pretty rare occurrence under normal conditions. > > > > I beg to differ, the code is reached when there is > > no folio support or ra->size < 4 (not considered in > > this discussion) or falling throug when !err, err > > is set by: > > > > err = ra_alloc_folio(ractl, index, mark, order, gfp); > > if (err) > > break; > > > > isn't the reading done by: > > > > read_pages(ractl); > > > > which does not set err! > > You're misunderstanding. Yes, read_pages() is called when we fail to > allocate a fresh folio; either because there's already one in the > page cache, or because -ENOMEM (or if we raced to install one), but > it's also called when all folios are normally allocated. Here: > > /* > * Now start the IO. We ignore I/O errors - if the folio is not > * uptodate then the caller will launch read_folio again, and > * will then handle the error. > */ > read_pages(ractl); > > So at the point that read_pages() is called, all folios that ractl > describes are present in the page cache, locked and !uptodate. > > After calling aops->readahead() in read_pages(), most filesystems will > have consumed all folios described by ractl. It seems that NFS is > choosing not to submit some folios, so rather than leave them sitting > around in the page cache, Neil decided that we should remove them from > the page cache. >