From: Bernd Schubert
Date: Mon, 26 Jun 2023 15:18:41 +0200
Subject: Re: [PATCH v3 3/3] shmem: stable directory offsets
To: Chuck Lever, viro@zeniv.linux.org.uk, brauner@kernel.org, hughd@google.com, akpm@linux-foundation.org
Cc: Chuck Lever, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
In-Reply-To: <168605707262.32244.4794425063054676856.stgit@manet.1015granger.net>
References: <168605676256.32244.6158641147817585524.stgit@manet.1015granger.net> <168605707262.32244.4794425063054676856.stgit@manet.1015granger.net>

On 6/6/23 15:11, Chuck Lever wrote:
> From: Chuck Lever
>
> The current cursor-based directory offset mechanism doesn't work
> when a tmpfs filesystem is exported via NFS. This is because NFS
> clients do not open directories. Each server-side READDIR operation
> has to open the directory, read it, then close it. The cursor state
> for that directory, being associated strictly with the opened
> struct file, is thus discarded after each NFS READDIR operation.
>
> Directory offsets are cached not only by NFS clients, but also by
> user space libraries on those clients. Essentially there is no way
> to invalidate those caches when directory offsets have changed on
> an NFS server after the offset-to-dentry mapping changes. Thus the
> whole application stack depends on unchanging directory offsets.
>
> The solution we've come up with is to make the directory offset for
> each file in a tmpfs filesystem stable for the life of the directory
> entry it represents.
>
> shmem_readdir() and shmem_dir_llseek() now use an xarray to map each
> directory offset (an loff_t integer) to the memory address of a
> struct dentry.
>
> Signed-off-by: Chuck Lever
> ---
>  mm/shmem.c | 39 +++++++++++++++++++++++++++++++++++----
>  1 file changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 721f9fd064aa..fd9571056181 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2410,7 +2410,8 @@ static struct inode *shmem_get_inode(struct mnt_idmap *idmap, struct super_block
>  		/* Some things misbehave if size == 0 on a directory */
>  		inode->i_size = 2 * BOGO_DIRENT_SIZE;
>  		inode->i_op = &shmem_dir_inode_operations;
> -		inode->i_fop = &simple_dir_operations;
> +		inode->i_fop = &stable_dir_operations;
> +		stable_offset_init(inode);
>  		break;
>  	case S_IFLNK:
>  		/*
> @@ -2950,6 +2951,10 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
>  	if (error && error != -EOPNOTSUPP)
>  		goto out_iput;
>
> +	error = stable_offset_add(dir, dentry);
> +	if (error)
> +		goto out_iput;
> +
>  	error = 0;

This line can be removed? After the new check above, error is always 0
here.

>  	dir->i_size += BOGO_DIRENT_SIZE;
>  	dir->i_ctime = dir->i_mtime = current_time(dir);
> @@ -3027,6 +3032,10 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
>  			goto out;
>  	}
>
> +	ret = stable_offset_add(dir, dentry);
> +	if (ret)
> +		goto out;
> +

I think this should call shmem_free_inode() before the goto out, to
reverse what shmem_reserve_inode() has done.

>  	dir->i_size += BOGO_DIRENT_SIZE;
>  	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>  	inode_inc_iversion(dir);
> @@ -3045,6 +3054,8 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
>  	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
>  		shmem_free_inode(inode->i_sb);
>
> +	stable_offset_remove(dir, dentry);
> +
>  	dir->i_size -= BOGO_DIRENT_SIZE;
>  	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>  	inode_inc_iversion(dir);
> @@ -3103,24 +3114,37 @@ static int shmem_rename2(struct mnt_idmap *idmap,
>  {
>  	struct inode *inode = d_inode(old_dentry);
>  	int they_are_dirs = S_ISDIR(inode->i_mode);
> +	int error;
>
>  	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
>  		return -EINVAL;
>
> -	if (flags & RENAME_EXCHANGE)
> +	if (flags & RENAME_EXCHANGE) {
> +		stable_offset_remove(old_dir, old_dentry);
> +		stable_offset_remove(new_dir, new_dentry);
> +		error = stable_offset_add(new_dir, old_dentry);
> +		if (error)
> +			return error;
> +		error = stable_offset_add(old_dir, new_dentry);
> +		if (error)
> +			return error;
>  		return simple_rename_exchange(old_dir, old_dentry, new_dir, new_dentry);
> +	}

Hmm, error handling issues? If any of these operations fails, doesn't
everything done so far need to be reversed?

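A rough user-space sketch of the kind of unwinding I have in mind for the
RENAME_EXCHANGE path, untested and not kernel code. struct dir_map,
map_add(), map_remove(), map_find() and exchange_offsets() are all
hypothetical stand-ins for the per-directory xarray and the
stable_offset_*() helpers. The idea is to remember where both entries
were and to put them back at the same offsets if either add fails:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical stand-in for a directory's offset map (not an xarray). */
struct dir_map {
	const void *slot[MAX_SLOTS];	/* entry stored at each offset, NULL if free */
	int fail_next_add;		/* test hook: how many upcoming adds fail */
};

/* Return the offset holding @entry, or -1 if it is not in the map. */
static int map_find(const struct dir_map *dir, const void *entry)
{
	for (int i = 0; i < MAX_SLOTS; i++)
		if (dir->slot[i] == entry)
			return i;
	return -1;
}

/* Store @entry at the lowest free offset; can fail like an allocating insert. */
static int map_add(struct dir_map *dir, const void *entry)
{
	if (dir->fail_next_add > 0) {
		dir->fail_next_add--;
		return -ENOMEM;
	}
	for (int i = 0; i < MAX_SLOTS; i++) {
		if (dir->slot[i] == NULL) {
			dir->slot[i] = entry;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Drop @entry from the map if present. */
static void map_remove(struct dir_map *dir, const void *entry)
{
	int i = map_find(dir, entry);

	if (i >= 0)
		dir->slot[i] = NULL;
}

/* Swap the two entries between directories, undoing everything on failure. */
static int exchange_offsets(struct dir_map *old_dir, const void *old_entry,
			    struct dir_map *new_dir, const void *new_entry)
{
	int old_idx = map_find(old_dir, old_entry);
	int new_idx = map_find(new_dir, new_entry);
	int err;

	if (old_idx < 0 || new_idx < 0)
		return -ENOENT;

	old_dir->slot[old_idx] = NULL;
	new_dir->slot[new_idx] = NULL;

	err = map_add(new_dir, old_entry);
	if (err)
		goto restore;

	err = map_add(old_dir, new_entry);
	if (err) {
		/* Undo the first add before restoring the original state. */
		map_remove(new_dir, old_entry);
		goto restore;
	}
	return 0;

restore:
	/* Put both entries back at the offsets they had before the call. */
	old_dir->slot[old_idx] = old_entry;
	new_dir->slot[new_idx] = new_entry;
	return err;
}
```

Restoring by index keeps the failure path itself from failing and leaves
both offsets unchanged, which seems to be the point of the patch. Whether
the real helpers can re-store an entry at its previous offset without
allocating is an assumption of this sketch.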
>
>  	if (!simple_empty(new_dentry))
>  		return -ENOTEMPTY;
>
>  	if (flags & RENAME_WHITEOUT) {
> -		int error;
> -
>  		error = shmem_whiteout(idmap, old_dir, old_dentry);
>  		if (error)
>  			return error;
>  	}
>
> +	stable_offset_remove(old_dir, old_dentry);
> +	error = stable_offset_add(new_dir, old_dentry);
> +	if (error)
> +		return error;
> +
>  	if (d_really_is_positive(new_dentry)) {
>  		(void) shmem_unlink(new_dir, new_dentry);
>  		if (they_are_dirs) {
> @@ -3185,6 +3209,11 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
>  		folio_unlock(folio);
>  		folio_put(folio);
>  	}
> +
> +	error = stable_offset_add(dir, dentry);
> +	if (error)
> +		goto out_iput;
> +

Error handling: there is a kmemdup() above whose buffer needs to be
freed on this path? I'm not sure about the folio; is it released
automatically with the inode?

>  	dir->i_size += BOGO_DIRENT_SIZE;
>  	dir->i_ctime = dir->i_mtime = current_time(dir);
>  	inode_inc_iversion(dir);
> @@ -3920,6 +3949,8 @@ static void shmem_destroy_inode(struct inode *inode)
>  {
>  	if (S_ISREG(inode->i_mode))
>  		mpol_free_shared_policy(&SHMEM_I(inode)->policy);
> +	if (S_ISDIR(inode->i_mode))
> +		stable_offset_destroy(inode);
>  }
>
>  static void shmem_init_inode(void *foo)
>
>

Thanks,
Bernd