Date: Sat, 24 Jun 2023 00:21:42 +0200
Subject: Re: [PATCH v3 1/3] libfs: Add directory operations for stable offsets
To: Chuck Lever, viro@zeniv.linux.org.uk, brauner@kernel.org, hughd@google.com, akpm@linux-foundation.org
Cc: Chuck Lever, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
References: <168605676256.32244.6158641147817585524.stgit@manet.1015granger.net> <168605705924.32244.13384849924097654559.stgit@manet.1015granger.net>
From: Bernd Schubert
In-Reply-To: <168605705924.32244.13384849924097654559.stgit@manet.1015granger.net>

On 6/6/23 15:10, Chuck Lever wrote:
> From: Chuck Lever
>
> Create a vector of directory operations in fs/libfs.c that handles
> directory seeks and readdir via stable offsets instead of the
> current cursor-based mechanism.
>
> For the moment these are unused.
>
> Signed-off-by: Chuck Lever
> ---
>  fs/dcache.c | 1
>  fs/libfs.c | 185 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/dcache.h | 1
>  include/linux/fs.h | 9 ++
>  4 files changed, 196 insertions(+)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 52e6d5fdab6b..9c9a801f3b33 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1813,6 +1813,7 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	dentry->d_sb = sb;
>  	dentry->d_op = NULL;
>  	dentry->d_fsdata = NULL;
> +	dentry->d_offset = 0;
>  	INIT_HLIST_BL_NODE(&dentry->d_hash);
>  	INIT_LIST_HEAD(&dentry->d_lru);
>  	INIT_LIST_HEAD(&dentry->d_subdirs);
> diff --git a/fs/libfs.c b/fs/libfs.c
> index 89cf614a3271..07317bbe1668 100644
> --- a/fs/libfs.c
> +++ b/fs/libfs.c
> @@ -239,6 +239,191 @@ const struct inode_operations simple_dir_inode_operations = {
>  };
>  EXPORT_SYMBOL(simple_dir_inode_operations);
>
> +/**
> + * stable_offset_init - initialize a parent directory
> + * @dir: parent directory to be initialized
> + *
> + */
> +void stable_offset_init(struct inode *dir)
> +{
> +	xa_init_flags(&dir->i_doff_map, XA_FLAGS_ALLOC1);
> +	dir->i_next_offset = 0;
> +}
> +EXPORT_SYMBOL(stable_offset_init);
> +
> +/**
> + * stable_offset_add - Add an entry to a directory's stable offset map
> + * @dir: parent directory being modified
> + * @dentry: new dentry being added
> + *
> + * Returns zero on success. Otherwise, a negative errno value is returned.
> + */
> +int stable_offset_add(struct inode *dir, struct dentry *dentry)
> +{
> +	struct xa_limit limit = XA_LIMIT(2, U32_MAX);
> +	u32 offset = 0;
> +	int ret;
> +
> +	if (dentry->d_offset)
> +		return -EBUSY;
> +
> +	ret = xa_alloc_cyclic(&dir->i_doff_map, &offset, dentry, limit,
> +			      &dir->i_next_offset, GFP_KERNEL);

Please see my question about i_next_offset below, at struct inode.

> +	if (ret < 0)
> +		return ret;
> +
> +	dentry->d_offset = offset;
> +	return 0;
> +}
> +EXPORT_SYMBOL(stable_offset_add);
> +
> +/**
> + * stable_offset_remove - Remove an entry to a directory's stable offset map
> + * @dir: parent directory being modified
> + * @dentry: dentry being removed
> + *
> + */
> +void stable_offset_remove(struct inode *dir, struct dentry *dentry)
> +{
> +	if (!dentry->d_offset)
> +		return;
> +
> +	xa_erase(&dir->i_doff_map, dentry->d_offset);
> +	dentry->d_offset = 0;
> +}
> +EXPORT_SYMBOL(stable_offset_remove);
> +
> +/**
> + * stable_offset_destroy - Release offset map
> + * @dir: parent directory that is about to be destroyed
> + *
> + * During fs teardown (eg. umount), a directory's offset map might still
> + * contain entries. xa_destroy() cleans out anything that remains.
> + */
> +void stable_offset_destroy(struct inode *dir)
> +{
> +	xa_destroy(&dir->i_doff_map);
> +}
> +EXPORT_SYMBOL(stable_offset_destroy);
> +
> +/**
> + * stable_dir_llseek - Advance the read position of a directory descriptor
> + * @file: an open directory whose position is to be updated
> + * @offset: a byte offset
> + * @whence: enumerator describing the starting position for this update
> + *
> + * SEEK_END, SEEK_DATA, and SEEK_HOLE are not supported for directories.
> + *
> + * Returns the updated read position if successful; otherwise a
> + * negative errno is returned and the read position remains unchanged.
> + */
> +static loff_t stable_dir_llseek(struct file *file, loff_t offset, int whence)
> +{
> +	switch (whence) {
> +	case SEEK_CUR:
> +		offset += file->f_pos;
> +		fallthrough;
> +	case SEEK_SET:
> +		if (offset >= 0)
> +			break;
> +		fallthrough;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	return vfs_setpos(file, offset, U32_MAX);
> +}
> +
> +static struct dentry *stable_find_next(struct xa_state *xas)
> +{
> +	struct dentry *child, *found = NULL;
> +
> +	rcu_read_lock();
> +	child = xas_next_entry(xas, U32_MAX);
> +	if (!child)
> +		goto out;
> +	spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
> +	if (simple_positive(child))
> +		found = dget_dlock(child);
> +	spin_unlock(&child->d_lock);
> +out:
> +	rcu_read_unlock();
> +	return found;
> +}
> +
> +static bool stable_dir_emit(struct dir_context *ctx, struct dentry *dentry)
> +{
> +	struct inode *inode = d_inode(dentry);
> +
> +	return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len,
> +			  dentry->d_offset, inode->i_ino,
> +			  fs_umode_to_dtype(inode->i_mode));
> +}
> +
> +static void stable_iterate_dir(struct dentry *dir, struct dir_context *ctx)
> +{
> +	XA_STATE(xas, &((d_inode(dir))->i_doff_map), ctx->pos);
> +	struct dentry *dentry;
> +
> +	while (true) {
> +		spin_lock(&dir->d_lock);
> +		dentry = stable_find_next(&xas);
> +		spin_unlock(&dir->d_lock);
> +		if (!dentry)
> +			break;
> +
> +		if (!stable_dir_emit(ctx, dentry)) {
> +			dput(dentry);
> +			break;
> +		}
> +
> +		dput(dentry);
> +		ctx->pos = xas.xa_index + 1;
> +	}
> +}
> +
> +/**
> + * stable_readdir - Emit entries starting at offset @ctx->pos
> + * @file: an open directory to iterate over
> + * @ctx: directory iteration context
> + *
> + * Caller must hold @file's i_rwsem to prevent insertion or removal of
> + * entries during this call.
> + *
> + * On entry, @ctx->pos contains an offset that represents the first entry
> + * to be read from the directory.
> + *
> + * The operation continues until there are no more entries to read, or
> + * until the ctx->actor indicates there is no more space in the caller's
> + * output buffer.
> + *
> + * On return, @ctx->pos contains an offset that will read the next entry
> + * in this directory when shmem_readdir() is called again with @ctx.
> + *
> + * Return values:
> + * %0 - Complete
> + */
> +static int stable_readdir(struct file *file, struct dir_context *ctx)
> +{
> +	struct dentry *dir = file->f_path.dentry;
> +
> +	lockdep_assert_held(&d_inode(dir)->i_rwsem);
> +
> +	if (!dir_emit_dots(file, ctx))
> +		return 0;
> +
> +	stable_iterate_dir(dir, ctx);
> +	return 0;
> +}
> +
> +const struct file_operations stable_dir_operations = {
> +	.llseek = stable_dir_llseek,
> +	.iterate_shared = stable_readdir,
> +	.read = generic_read_dir,
> +	.fsync = noop_fsync,
> +};
> +EXPORT_SYMBOL(stable_dir_operations);
> +
>  static struct dentry *find_next_child(struct dentry *parent, struct dentry *prev)
>  {
>  	struct dentry *child = NULL;
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> index 6b351e009f59..579ce1800efe 100644
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -96,6 +96,7 @@ struct dentry {
>  	struct super_block *d_sb; /* The root of the dentry tree */
>  	unsigned long d_time; /* used by d_revalidate */
>  	void *d_fsdata; /* fs-specific data */
> +	u32 d_offset; /* directory offset in parent */
>
>  	union {
>  		struct list_head d_lru; /* LRU list */
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 133f0640fb24..3fc2c04ed8ff 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -719,6 +719,10 @@ struct inode {
>  #endif
>
>  	void *i_private; /* fs or device private pointer */
> +
> +	/* simplefs stable directory offset tracking */
> +	struct xarray i_doff_map;
> +	u32 i_next_offset;

Hmm, I grepped through the patches and only found that "i_next_offset"
is initialized to 0 and then passed to xa_alloc_cyclic - does this
really need to be part of struct inode, or could it be a local variable
in stable_offset_add()?

I have only managed to skim the patches so far. Personally, I like v2
better, as it doesn't extend struct inode with fields that only
in-memory file systems can use. What do others think? An alternative
would be to put these fields in struct shmem_inode_info and pass it as
an extra argument to the stable_ functions?

Thanks,
Bernd