Date: Mon, 13 Jan 2025 16:46:50 -0800
From: Andrew Morton
To: Jens Axboe
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org, clm@meta.com, linux-kernel@vger.kernel.org, willy@infradead.org, kirill@shutemov.name, bfoster@redhat.com
Subject: Re: [PATCHSET v8 0/12] Uncached buffered IO
Message-Id: <20250113164650.5dfbc4f77c4b294bb004804c@linux-foundation.org>
In-Reply-To: <3cba2c9e-4136-4199-84a6-ddd6ad302875@kernel.dk>
References: <20241220154831.1086649-1-axboe@kernel.dk>
 <20250107193532.f8518eb71a469b023b6a9220@linux-foundation.org>
 <3cba2c9e-4136-4199-84a6-ddd6ad302875@kernel.dk>

On Mon, 13 Jan 2025 08:34:18 -0700 Jens Axboe wrote:

> >
> > ...
> >
> > Of course, we're doing something here which userspace could itself do:
> > drop the pagecache after reading it (with appropriate chunk sizing) and
> > for writes, sync the written area then invalidate it. Possible
> > added benefits from using separate threads for this.
> >
> > I suggest that diligence requires that we at least justify an in-kernel
> > approach at this time, please.
>
> Conceptually yes. But you'd end up doing extra work to do it. Some of
> that not so expensive, like system calls, and others more so, like LRU
> manipulation. Outside of that, I do think it makes sense to expose as a
> generic thing, rather than require applications needing to kick
> writeback manually, reclaim manually, etc.
>
> > And there's a possible middle-ground implementation where the kernel
> > itself kicks off threads to do the drop-behind just before the read or
> > write syscall returns, which will probably be simpler. Can we please
> > describe why this also isn't acceptable?
>
> That's more of an implementation detail. I didn't test anything like
> that, though we surely could. If it's better, there's no reason why it
> can't just be changed to do that. My gut tells me you want the task/CPU
> that just did the page cache additions to do the pruning too, that should
> be more efficient than having a kworker or similar do it.

Well, gut might be wrong ;)

There may be benefit in using different CPUs to perform the dropbehind,
rather than making the read() caller do this synchronously.

If I understand correctly, the write() dropbehind is performed at
interrupt (write completion) time so that's already async.

> > Also, it seems wrong for a read(RWF_DONTCACHE) to drop cache if it was
> > already present. Because it was presumably present for a reason. Does
> > this implementation already take care of this? To make an application
> > which does read(/etc/passwd, RWF_DONTCACHE) less annoying?
>
> The implementation doesn't drop pages that were already present, only
> pages that got created/added to the page cache for the operation. So
> that part should already work as you expect.
>
> > Also, consuming a new page flag isn't a minor thing. It would be nice
> > to see some justification around this, and some description of how many
> > we have left.
>
> For sure, though various discussions on this already occurred and Kirill
> posted patches for unifying some of this already. It's not something I
> wanted to tackle, as I think that should be left to people more familiar
> with the page/folio flags and their (sometimes odd) interactions.

Matthew & Kirill: are you OK with merging this as-is and then
revisiting the page-flag consumption at a later time?
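
For reference, the userspace alternative raised earlier in this message
(read in chunks and drop each chunk, or write, sync, then invalidate)
might look roughly like the sketch below. It is illustrative only and is
not taken from the patchset: the helper names read_dropbehind() and
write_dropbehind() are made up, the chunk size is left to the caller, and
fdatasync() stands in for a range-limited sync (on Linux,
sync_file_range() could restrict the flush to the written area).

```c
#define _POSIX_C_SOURCE 200809L	/* pread, pwrite, posix_fadvise, fdatasync */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the whole file in `chunk`-sized pieces and ask
 * the kernel to drop each piece from the page cache once it has been
 * consumed.  Returns the number of bytes read, or -1 on error.
 */
ssize_t read_dropbehind(int fd, size_t chunk)
{
	char *buf;
	off_t off = 0;
	ssize_t n;

	assert(chunk > 0);
	buf = malloc(chunk);
	if (!buf)
		return -1;

	while ((n = pread(fd, buf, chunk, off)) > 0) {
		/* ... consume buf[0..n) here ... */

		/* Advisory only: the kernel may keep dirty or mapped pages. */
		(void)posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
		off += n;
	}

	free(buf);
	return n < 0 ? -1 : (ssize_t)off;
}

/*
 * Hypothetical helper: write `len` bytes at `off`, flush them, then ask the
 * kernel to invalidate the now-clean range.  Returns bytes written or -1.
 */
ssize_t write_dropbehind(int fd, const void *data, size_t len, off_t off)
{
	ssize_t n = pwrite(fd, data, len, off);

	if (n <= 0)
		return n;

	/* Assumption: flushing the whole file's data is acceptable here. */
	if (fdatasync(fd) < 0)
		return -1;

	(void)posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
	return n;
}
```

Two things this sketch makes visible. Each chunk costs at least one extra
system call on top of the read or write itself. And, as far as I know,
POSIX_FADV_DONTNEED makes no distinction between pages this caller brought
in and pages that were already cached, so the read(/etc/passwd) case
discussed above would not be handled the way the in-kernel implementation
is described as handling it.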