From: Shyam Prasad N
Date: Wed, 15 Jan 2025 15:22:23 +0530
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Predictive readahead of dentries
To: Jan Kara
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel, linux-mm@kvack.org, brauner@kernel.org, Matthew Wilcox, David Howells, Jeff Layton, Steve French, trondmy@kernel.org, Shyam Prasad N
In-Reply-To: <6wcmvyeuelngltuiohumo6pffwptgbgofqba453pdi45ahydkn@ern4qy4i2zoa>

Hi Jan,

Thanks for the review.

On Tue, Jan 14, 2025 at 6:09 PM Jan Kara wrote:
>
> Hello!
>
> On Tue 14-01-25 09:08:38, Shyam Prasad N wrote:
> > The Linux kernel does buffered reads and writes using the page cache
> > layer, where the filesystem reads and writes are offloaded to the
> > VM/MM layer.
> > The VM layer does a predictive readahead of data by
> > optionally asking the filesystem to read more data asynchronously than
> > what was requested.
> >
> > The VFS layer maintains a dentry cache which gets populated during
> > access of dentries (either during readdir/getdents or during lookup).
> > This dentries within a directory actually forms the address space for
> > the directory, which is read sequentially during getdents. For network
> > filesystems, the dentries are also looked up during revalidate.
> >
> > During sequential getdents, it makes sense to perform a readahead
> > similar to file reads. Even for revalidations and dentry lookups,
> > there can be some heuristics that can be maintained to know if the
> > lookups within the directory are sequential in nature. With this, the
> > dentry cache can be pre-populated for a directory, even before the
> > dentries are accessed, thereby boosting the performance. This could
> > give even more benefits for network filesystems by avoiding costly
> > round trips to the server.
> >
> > NFS client already does a simplistic form of this readahead by
> > maintaining an address space for the directory inode and storing the
> > dentry records returned by the server in this space. However, this
> > dentry access mechanism is so generic that I feel that this can be a
> > part of the VFS/VM layer, similar to buffered reads of a file. Also,
> > VFS layer is better equipped to store heuristics about dentry access
> > patterns.
>
> Interesting idea. Note that individual filesystems actually do directory
> readahead on their own. They just don't readahead 'struct dentry' but
> rather issue readahead for metadata blocks to get into cache which is what
> takes most time. Readahead makes the most sense for readdir() (or
> getdents() as you call it) calls where the filesystem driver has all the
> information it needs (unlike VFS) for performing efficient readahead. So
> here I'm not sure there's much need for a change.

I agree that the filesystem driver can do this. But the logic for
"advising" how many dentries to read ahead may depend on the workload
rather than on the filesystem itself. Most practical use cases would
readdir the entire directory, but there could be use cases where only
part of a directory is read.

>
> I'm not against some form of readahead for ->lookup calls but we'd have to
> very carefully design the heuristics for detecting some kind of pattern of
> ->lookup calls so that we know which entry is going to be the next one
> looked up and evaluate whether it is actually an overall win or not. So
> for this the discussion would need a more concrete proposal to be useful I
> think.

Acked.
Simplistically, the whole directory could be read once the number of
dentry revalidations or lookups that missed the cache, but were
successfully loaded from the backend, exceeds a certain threshold (I can
see how this threshold could be filesystem specific). There could be
other, more sophisticated implementations.
Let me think this through (and read the other comments) and see if I
can refine it further.

>
>                                                                 Honza
> --
> Jan Kara
> SUSE Labs, CR

-- 
Regards,
Shyam
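
For illustration only, the simplistic threshold heuristic described in
the reply above could be sketched in plain userspace C as follows. This
is a sketch under stated assumptions, not kernel code and not a proposed
implementation: struct dir_ra_state, dir_ra_init() and
dir_ra_note_lookup() are hypothetical names that do not exist in the
VFS, and the threshold is simply passed in by the caller on the
assumption that each filesystem would pick its own value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-directory state (an assumption for illustration;
 * nothing like this exists in the VFS today).
 */
struct dir_ra_state {
    unsigned int miss_count;  /* cache misses that the backend resolved */
    unsigned int threshold;   /* assumed to be chosen per filesystem */
    bool ra_triggered;        /* whole-directory readahead already requested */
};

void dir_ra_init(struct dir_ra_state *st, unsigned int threshold)
{
    assert(st != NULL);
    st->miss_count = 0;
    st->threshold = threshold;
    st->ra_triggered = false;
}

/*
 * Record the outcome of one lookup or revalidation in this directory.
 * Only lookups that missed the cache but were found on the backend are
 * counted. Returns true exactly once: when the count first exceeds the
 * threshold, i.e. when the caller should read the whole directory.
 */
bool dir_ra_note_lookup(struct dir_ra_state *st, bool cache_hit,
                        bool backend_found)
{
    assert(st != NULL);

    if (cache_hit || !backend_found)
        return false;

    st->miss_count++;
    if (!st->ra_triggered && st->miss_count > st->threshold) {
        st->ra_triggered = true;
        return true;
    }
    return false;
}
```

A caller would invoke dir_ra_note_lookup() after each lookup or
revalidation and start the directory readahead when it returns true.
Counting only backend-resolved misses keeps cache hits and negative
lookups from inflating the counter; whether that is the right signal is
exactly the open question raised in the thread.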