Message-ID: <6d4cfb7b-b1c4-4307-a090-c5fd0b895a7b@gmail.com>
Date: Wed, 27 Nov 2024 08:55:12 +0100
Subject: Re: Regression in NFS probably due to very large amounts of readahead
To: Matthew Wilcox
Cc: Jan Kara, Philippe Troin, Andrew Morton,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, NeilBrown
References: <49648605-d800-4859-be49-624bbe60519d@gmail.com>
 <3b1d4265b384424688711a9259f98dec44c77848.camel@fifi.org>
 <4bb8bfe1-5de6-4b5d-af90-ab24848c772b@gmail.com>
 <20241126103719.bvd2umwarh26pmb3@quack3>
 <20241126150613.a4b57y2qmolapsuc@quack3>
 <569d0df0-71d5-4227-aa28-e57cd60bc9f1@gmail.com>
From: Anders Blomdell

On 2024-11-26 19:42, Matthew Wilcox wrote:
> On Tue, Nov 26, 2024 at 06:26:13PM +0100, Anders Blomdell wrote:
>> On 2024-11-26 17:55, Matthew Wilcox wrote:
>>> On Tue, Nov 26, 2024 at 04:28:04PM +0100, Anders Blomdell wrote:
>>>> On 2024-11-26 16:06, Jan Kara wrote:
>>>>> Hum, checking the history the update of ra->size has been added by Neil two
>>>>> years ago in
>>>>> 9fd472af84ab ("mm: improve cleanup when ->readpages doesn't
>>>>> process all pages"). Neil, the changelog seems as there was some real
>>>>> motivation behind updating of ra->size in read_pages(). What was it? Now I
>>>>> somewhat disagree with reducing ra->size in read_pages() because it seems
>>>>> like a wrong place to do that and if we do need something like that,
>>>>> readahead window sizing logic should rather be changed to take that into
>>>>> account? But it all depends on what was the real rationale behind reducing
>>>>> ra->size in read_pages()...
>>>>
>>>> My (rather limited) understanding of the patch is that it was intended to read those pages
>>>> that didn't get read because the allocation of a bigger folio failed, while not redoing what
>>>> readpages already did; how it was actually going to accomplish that is still unclear to me,
>>>> but I don't even quite understand the comment...
>>>>
>>>>         /*
>>>>          * If there were already pages in the page cache, then we may have
>>>>          * left some gaps.  Let the regular readahead code take care of this
>>>>          * situation.
>>>>          */
>>>>
>>>> the reason for an unchanged async_size is also beyond my understanding.
>>>
>>> This isn't because we couldn't allocate a folio, this is when we
>>> allocated folios, tried to read them and we failed to submit the I/O.
>>> This is a pretty rare occurrence under normal conditions.
>>
>> I beg to differ, the code is reached when there is
>> no folio support or ra->size < 4 (not considered in
>> this discussion) or falling through when !err, err
>> is set by:
>>
>>         err = ra_alloc_folio(ractl, index, mark, order, gfp);
>>         if (err)
>>                 break;
>>
>> isn't the reading done by:
>>
>>         read_pages(ractl);
>>
>> which does not set err!
>
> You're misunderstanding.
> Yes, read_pages() is called when we fail to
> allocate a fresh folio; either because there's already one in the
> page cache, or because -ENOMEM (or if we raced to install one), but
> it's also called when all folios are normally allocated.  Here:
>
>         /*
>          * Now start the IO.  We ignore I/O errors - if the folio is not
>          * uptodate then the caller will launch read_folio again, and
>          * will then handle the error.
>          */
>         read_pages(ractl);
>
> So at the point that read_pages() is called, all folios that ractl
> describes are present in the page cache, locked and !uptodate.
>
> After calling aops->readahead() in read_pages(), most filesystems will
> have consumed all folios described by ractl.  It seems that NFS is
> choosing not to submit some folios, so rather than leave them sitting
> around in the page cache, Neil decided that we should remove them from
> the page cache.

More like me not reading the comments properly, sorry.

What I thought I said was that the problematic code in the call to
do_page_cache_ra was reached when the folio allocation returned an error.

Sorry for not being clear, and thanks for your patience.

/Anders