From: Paulo Alcantara
To: Benjamin Coddington, Amir Goldstein
Cc: Shyam Prasad N, lsf-pc@lists.linux-foundation.org, linux-fsdevel,
 linux-mm@kvack.org, brauner@kernel.org, Matthew Wilcox, David Howells,
 Jeff Layton, Steve French, trondmy@kernel.org, Shyam Prasad N
Subject: Re: [LSF/MM/BPF TOPIC] Predictive readahead of dentries
In-Reply-To: <460E352E-DDFA-4259-A017-CAE51C78EDFC@redhat.com>
References: <460E352E-DDFA-4259-A017-CAE51C78EDFC@redhat.com>
Date: Tue, 14 Jan 2025 12:01:36 -0300

Benjamin Coddington writes:

> On 14 Jan 2025, at 8:24, Amir Goldstein wrote:
>
>> On Tue, Jan 14, 2025 at 4:38 AM Shyam Prasad N wrote:
>>>
>>> The Linux kernel does buffered reads and writes using the page cache
>>> layer, where the filesystem
>>> reads and writes are offloaded to the
>>> VM/MM layer. The VM layer does a predictive readahead of data by
>>> optionally asking the filesystem to read more data asynchronously than
>>> what was requested.
>>>
>>> The VFS layer maintains a dentry cache which gets populated during
>>> access of dentries (either during readdir/getdents or during lookup).
>>> This dentries within a directory actually forms the address space for
>>> the directory, which is read sequentially during getdents. For network
>>> filesystems, the dentries are also looked up during revalidate.
>>>
>>> During sequential getdents, it makes sense to perform a readahead
>>> similar to file reads. Even for revalidations and dentry lookups,
>>> there can be some heuristics that can be maintained to know if the
>>> lookups within the directory are sequential in nature. With this, the
>>> dentry cache can be pre-populated for a directory, even before the
>>> dentries are accessed, thereby boosting the performance. This could
>>> give even more benefits for network filesystems by avoiding costly
>>> round trips to the server.
>>>
>>
>> I believe you are referring to READDIRPLUS, which is quite common
>> for network protocols and also supported by FUSE.
>>
>> Unlike network protocols, FUSE decides by server configuration and
>> heuristics whether to "fuse_use_readdirplus" - specifically in readdirplus_auto
>> mode, FUSE starts with readdirplus, but if nothing calls lookup on the
>> directory inode by the time the next getdents call, it stops with readdirplus.
>>
>> I personally ran into the problem that I would like to control from the
>> application, which knows if it is doing "ls" or "ls -l" whether a specific
>> getdents() will use FUSE readdirplus or not, because in some situations
>> where "ls -l" is not needed that can avoid a lot of unneeded IO.
>
> Indeed, we often have folks wanting dramatically different behavior from
> getdents() in NFS, and every time we've tried to improve our heuristics
> someone else shouts "regression"!

In CIFS, we already preload the dcache with the results of
SMB2_QUERY_DIRECTORY, and I believe NFS does the same thing.

Shyam, what's the problem with the current approach?
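
For readers following the "ls" versus "ls -l" distinction quoted above, here
is a rough userspace illustration of the two access patterns. It is only a
sketch: the function names list_names and list_long are made up for this
example, and it says nothing about how CIFS, NFS or FUSE actually service
these calls. It only shows where the per-entry attribute lookups come from,
which a READDIRPLUS-style reply is meant to make cheap.

```python
import os
import tempfile


def list_names(path):
    # Plain "ls": only the names are needed, so reading the directory is
    # enough and no per-entry stat call is issued here.
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries)


def list_long(path):
    # "ls -l": every entry also needs its attributes. Each entry.stat()
    # below may become a separate lookup/getattr on a network filesystem
    # unless the attributes were already cached when the directory was read.
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)  # lstat()-like behaviour
            sizes[entry.name] = st.st_size
    return sizes


if __name__ == "__main__":
    # Small demonstration on a throwaway local directory.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "a"), "w") as f:
            f.write("abc")
        print(list_names(tmp))
        print(list_long(tmp))
```

An application running only list_names() gains nothing from attributes being
fetched along with the directory entries, while list_long() benefits from
them on every entry, which is the control Amir describes wanting from the
application side.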