From: Matthew Wilcox
To: Anders Blomdell
Cc: Jan Kara, Philippe Troin, Andrew Morton, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, NeilBrown
Subject: Re: Regression in NFS probably due to very large amounts of readahead
Date: Tue, 26 Nov 2024 18:42:54 +0000
In-Reply-To: <569d0df0-71d5-4227-aa28-e57cd60bc9f1@gmail.com>

On Tue, Nov 26,
2024 at 06:26:13PM +0100, Anders Blomdell wrote:
> On 2024-11-26 17:55, Matthew Wilcox wrote:
> > On Tue, Nov 26, 2024 at 04:28:04PM +0100, Anders Blomdell wrote:
> > > On 2024-11-26 16:06, Jan Kara wrote:
> > > > Hum, checking the history the update of ra->size has been added by Neil two
> > > > years ago in 9fd472af84ab ("mm: improve cleanup when ->readpages doesn't
> > > > process all pages"). Neil, the changelog seems as there was some real
> > > > motivation behind updating of ra->size in read_pages(). What was it? Now I
> > > > somewhat disagree with reducing ra->size in read_pages() because it seems
> > > > like a wrong place to do that and if we do need something like that,
> > > > readahead window sizing logic should rather be changed to take that into
> > > > account? But it all depends on what was the real rationale behind reducing
> > > > ra->size in read_pages()...
> > >
> > > My (rather limited) understanding of the patch is that it was intended to read those pages
> > > that didn't get read because the allocation of a bigger folio failed, while not redoing what
> > > readpages already did; how it was actually going to accomplish that is still unclear to me,
> > > but I don't even quite understand the comment...
> > >
> > > 	/*
> > > 	 * If there were already pages in the page cache, then we may have
> > > 	 * left some gaps.  Let the regular readahead code take care of this
> > > 	 * situation.
> > > 	 */
> > >
> > > the reason for an unchanged async_size is also beyond my understanding.
> >
> > This isn't because we couldn't allocate a folio, this is when we
> > allocated folios, tried to read them and we failed to submit the I/O.
> > This is a pretty rare occurrence under normal conditions.
>
> I beg to differ, the code is reached when there is
> no folio support or ra->size < 4 (not considered in
> this discussion) or falling through when !err, err
> is set by:
>
> 	err = ra_alloc_folio(ractl, index, mark, order, gfp);
> 	if (err)
> 		break;
>
> isn't the reading done by:
>
> 	read_pages(ractl);
>
> which does not set err!

You're misunderstanding.  Yes, read_pages() is called when we fail to
allocate a fresh folio; either because there's already one in the
page cache, or because -ENOMEM (or if we raced to install one), but
it's also called when all folios are normally allocated.  Here:

	/*
	 * Now start the IO.  We ignore I/O errors - if the folio is not
	 * uptodate then the caller will launch read_folio again, and
	 * will then handle the error.
	 */
	read_pages(ractl);

So at the point that read_pages() is called, all folios that ractl
describes are present in the page cache, locked and !uptodate.

After calling aops->readahead() in read_pages(), most filesystems will
have consumed all folios described by ractl.  It seems that NFS is
choosing not to submit some folios, so rather than leave them sitting
around in the page cache, Neil decided that we should remove them from
the page cache.
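
To make the sequence above concrete, here is a small userspace toy model of it.  This is only a sketch and not the kernel code: every toy_* name is invented for illustration, the "page cache" is a flat array of flags, and the hypothetical filesystem hook simply stops after fs_limit pages.  It models just the two effects mentioned in this thread (leftover pages are removed from the cache, and the readahead size is reduced by the same amount); async_size handling is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_PAGES 64

/* Toy stand-in for the page cache: one "present" flag per page index. */
struct toy_cache {
	bool present[TOY_CACHE_PAGES];
};

/* Toy stand-in for the readahead request handed to the filesystem. */
struct toy_ractl {
	size_t index;    /* first page the filesystem has not consumed yet */
	size_t nr_pages; /* pages still described by this request */
	size_t ra_size;  /* toy readahead window size, in pages */
};

/* Put every page described by r into the cache, as before read_pages(). */
static void toy_fill(struct toy_cache *c, const struct toy_ractl *r)
{
	assert(r->index + r->nr_pages <= TOY_CACHE_PAGES);
	for (size_t i = 0; i < r->nr_pages; i++)
		c->present[r->index + i] = true;
}

/* Count how many pages are currently in the toy cache. */
static size_t toy_count_present(const struct toy_cache *c)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_CACHE_PAGES; i++)
		n += c->present[i];
	return n;
}

/* Hypothetical filesystem hook: submits I/O for at most limit pages. */
static void toy_fs_readahead(struct toy_ractl *r, size_t limit)
{
	size_t n = r->nr_pages < limit ? r->nr_pages : limit;

	r->index += n;
	r->nr_pages -= n;
}

/*
 * Toy read_pages(): let the filesystem consume what it wants, then
 * remove whatever it left behind and shrink the window to match.
 * Returns the number of pages that were dropped.
 */
static size_t toy_read_pages(struct toy_cache *c, struct toy_ractl *r,
			     size_t fs_limit)
{
	size_t dropped = 0;

	toy_fs_readahead(r, fs_limit);

	while (r->nr_pages > 0) {
		c->present[r->index] = false; /* remove from the toy cache */
		r->index++;
		r->nr_pages--;
		if (r->ra_size > 0)
			r->ra_size--;
		dropped++;
	}
	return dropped;
}
```

With an 8-page request and a filesystem that only takes 5 pages, toy_read_pages() drops the remaining 3 pages and leaves ra_size at 5.  A filesystem that consumes everything never enters the cleanup loop, which is the common case described above.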