From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BE43D41D4A for ; Wed, 13 Nov 2024 04:30:13 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 88FE26B00C2; Tue, 12 Nov 2024 23:30:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 841006B00C3; Tue, 12 Nov 2024 23:30:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6E22D6B00C4; Tue, 12 Nov 2024 23:30:12 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 4E8786B00C2 for ; Tue, 12 Nov 2024 23:30:12 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id F0146A07AB for ; Wed, 13 Nov 2024 04:30:11 +0000 (UTC) X-FDA: 82779794244.29.078293A Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) by imf25.hostedemail.com (Postfix) with ESMTP id 436BAA000D for ; Wed, 13 Nov 2024 04:29:39 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=linux.org.uk header.s=zeniv-20220401 header.b=XKMv5wom; dmarc=pass (policy=none) header.from=zeniv.linux.org.uk; spf=none (imf25.hostedemail.com: domain of viro@ftp.linux.org.uk has no SPF policy when checking 62.89.141.173) smtp.mailfrom=viro@ftp.linux.org.uk ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731472147; a=rsa-sha256; cv=none; b=AmmQ6aekFJj90dhPbjGM2VCNwl1nAkkzQy2AH1YWBVuftso7PC1lzw+7XljN1lvalS4IVr fWi6cItTOYRyehed37RGffpeFhESDHPqRph6vvMiwzXypd/6YJ2AgRrJClXQU7SothdNKW zgXE/FawxC/TOqNfIBkzlo9KUpvEX+0= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=linux.org.uk header.s=zeniv-20220401 header.b=XKMv5wom; dmarc=pass (policy=none) header.from=zeniv.linux.org.uk; spf=none (imf25.hostedemail.com: domain of viro@ftp.linux.org.uk has no SPF policy when checking 62.89.141.173) smtp.mailfrom=viro@ftp.linux.org.uk ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731472147; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wZh1lNxuuSr30YjUMdsOV4XTgEJagRrgTnzmNehpilo=; b=tOd4sFsNpReGt3QM4QHOKqHPEX9kksoHVz6wyEDE5G+VHf+f3aiDd2lxBE37mRXR8TqIrG DgDihtHxtqcHg/RwYw2wi+sOkfpVFSXNjZuk21n43MzD5CqRt8ykSKGXrHO7bQzvwGmRMW S4wC5XIDMpdDh5HN1z6sJRP9A3FqiWU= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=wZh1lNxuuSr30YjUMdsOV4XTgEJagRrgTnzmNehpilo=; b=XKMv5womPW9nnryoMkpEGH/E0+ rxxM9TGFfKtRN19ZwerG5ECpVtqj1D3/a5YMf5g8FBAHkS17vvpRhNVAsjYRtPGgwgUtG0FBcE7Pg w7HhUBBdaxTzQlWhdAOCwsJfXIKSfSNelx5WZVeaybOEzdxcFmJK+RNsNWDYNI5CnOKHkDx7NK6lv uOAkF7wVrCAO3+MSVBW6jjU7RAp3cxm6ut+uAxvPAPoZffve0MuF66okL3a/mWuJeqVY1brwlbR+A XFC+yZfvbRk0AL884vqXjTh3XY3/wuOwelgq3mhSIp8sEFT+VOPHnrEutmtnmg4Rf24zeX1UH+WYN FKxc42Bw==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.98 #2 (Red Hat Linux)) id 1tB511-0000000EMIh-0wwD; Wed, 13 Nov 2024 04:30:03 +0000 Date: Wed, 13 Nov 2024 04:30:03 +0000 From: Al Viro To: Linus Torvalds Cc: Amir Goldstein , Josef Bacik , kernel-team@fb.com, linux-fsdevel@vger.kernel.org, jack@suse.cz, brauner@kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-mm@kvack.org, linux-ext4@vger.kernel.org Subject: Re: [PATCH v7 05/18] fsnotify: introduce pre-content permission events Message-ID: <20241113043003.GH3387508@ZenIV> References: <141e2cc2dfac8b2f49c1c8d219dd7c20925b2cef.1731433903.git.josef@toxicpanda.com> <20241113001251.GF3387508@ZenIV> <20241113011954.GG3387508@ZenIV> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20241113011954.GG3387508@ZenIV> X-Rspam-User: X-Rspamd-Queue-Id: 436BAA000D X-Rspamd-Server: rspam11 X-Stat-Signature: msxsp9uaugr95izyzcczfnczped3xrtt X-HE-Tag: 1731472179-792795 X-HE-Meta: U2FsdGVkX195u48mFNUVtWXgmWAfXX9fir61zMkSUqTQEg+fI+t0p5zjVc6dhaUoBx/kXZ3gZC92z1imW4BC1cjhbJjpfvHrGE/05vA4LYdF1nZZzQYLzBXiq6Tq3LwTt+BSBIpX/FYGbdxAZWleAXjAT3MqZqOy7Jcij+UC/85DP1xBwPT5hi7rQqbqm9Zpq+75SYDBypOIbwix/IprSs3ZnKKe0J9oMFtGybWSY/9XXKXRJsNqQrlQ+FRatE9J7+VTsjS7VAK9adOUcpmEBf6cx7KXaP1Iv3XKjSMtv4cf49bnJkdQoNtLm700KVCnI9gfC1XAPB7XYefNKQBCeO4g1yjVT+QVKUT9Lhe9LbK9Flf9fMd0h5CRDPa4S1aWqbIKfjMDKMilKpOgNzQAfWT5BPQKWA3uFmXx8NboWNEUL1RiW1+nwKuCxYmXsTDbU7x9zISgmoxCkNo+GI7N9HA1lJiYkzCVD5JO0xDmLyjCDfP9wxHzr1caTcnJXDc9+yzgGxpmq6i8b1zWGK3lqK6Uz0gUSEYzSSiAXKZl7AEwlcdB2DDMfqgpsQfwW2r+/nvf4owpE89bxIbWR0Y77mLOdPXdENt2WwFx5/8uiH1jvdHWvPcxM/ZYfwjVg56caOj49lT0XB0DXHlHPL1sn2XgclyGEJ1IvFAJRKYoabLiEGwEGMkeoOniphk2lgsmV5tr5CelMV/JjzKnCrxL4ZW6BEgyKPrFgPrn3vx02tPkSk+AgH1chzWZAJnEZo2y7BVVAGMuvltWcyzoUU3DUEgTLc3A1gFjgr4LATvEgBUknv0nYgVG9X9NhqMlMb6FQzi8l74TPmQjq9ngof/l/GwRDufVCpiqI0Ll/hdBDh3e98Gcp2gNibtwzZ9ltz6sCPeLlnlWgcSPkEl7JMLsCUaw9nsZEjhJU0Gp4U/mEDLVqEQiN/IjuWuJt2WWfdEFo2sklwT6ZDU5NOEteK6 6PEcKwtO nEvSa+u6Us8aEoXbM6bV3kEem9OZaugQBwAbPtBDz0LnQGv6xtv04eTMBEwHh97Xy9DLT9zDMl+UqcHj346kOmFZhGgdIsOEQvscu4CBnOY/b1SREqjCBt4zMrVeieHQrUKqWJTcaq5jsWniNlaXFW7Ff+3zq1Hvf4RfTU120a/M5f+pOMtiJRRwAr0ZXqQlvOd0hZPTWMjt5bV9cLwYJj1ZNQpDbUZ1du5hKbdPKph+Z576s3MOL9qwZXqdH7MAXOi/1Db6YjDTyK95ZF4alr5E/h30zQ6h5nMWo+Wr5V0u3+5ldnM3TlILHwpcCMyjcZdo5ftz3J76JRHBcP0Q7v6YUHNFJ5melhjKv1i9/72TwucOZOrCb0x9kGtxW0BeyAJZ07ND+PNe30iFXuF6tcQlaJwuPgxczsGo0detNxU24hYq99Drzp8yiMATENArEqvRn X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Nov 13, 2024 at 01:19:54AM +0000, Al Viro wrote: > On Tue, Nov 12, 2024 at 04:38:42PM -0800, Linus Torvalds wrote: > > Looking at that locking code in fadvise() just for the f_mode use does > > make me think this would be a really good cleanup. > > > > I note that our fcntl code seems buggy as-is, because while it does > > use f_lock for assignments (good), it clearly does *not* use them for > > reading. > > > > So it looks like you can actually read inconsistent values. > > > > I get the feeling that f_flags would want WRITE_ONCE/READ_ONCE in > > _addition_ to the f_lock use it has. > > AFAICS, fasync logics is the fishy part - the rest should be sane. > > > The f_mode thing with fadvise() smells like the same bug. Just because > > the modifications are serialized wrt each other doesn't mean that > > readers are then automatically ok. > > Reads are also under ->f_lock in there, AFAICS... > > Another thing in the vicinity is ->f_mode modifications after the calls > of anon_inode_getfile() in several callers - probably ought to switch > those to anon_inode_getfile_fmode(). That had been discussed back in > April when the function got merged, but "convert to using it" followup > series hadn't materialized... While we are at it, there's is a couple of kludges I really hate - mixing __FMODE_NONOTIFY and __FMODE_EXEC with O_... flags. E.g. for __FMODE_NONOTIFY all it takes is switching fanotify from anon_inode_getfd() to anon_inode_getfile_fmode() and adding a dentry_open_nonotify() to be used by fanotify on the other path. That's it - no more weird shit in OPEN_FMODE(), etc. For __FMODE_EXEC it might get trickier (nfs is the main consumer), but I seriously suspect that something like "have path_openat() check op->acc_mode & MAY_EXEC and set FMODE_EXEC in ->f_mode right after struct file allocation" would make a good starting point; yes, it would affect uselib(2), but... I've no idea whether it wouldn't be the right thing to do; would be hard to test. Anyway, untested __FMODE_NONOTIFY side of it: diff --git a/fs/fcntl.c b/fs/fcntl.c index 22dd9dcce7ec..ebd1c82bfb6b 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -1161,10 +1161,10 @@ static int __init fcntl_init(void) * Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY * is defined as O_NONBLOCK on some platforms and not on others. */ - BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ != + BUILD_BUG_ON(20 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32( (VALID_OPEN_FLAGS & ~(O_NONBLOCK | O_NDELAY)) | - __FMODE_EXEC | __FMODE_NONOTIFY)); + __FMODE_EXEC)); fasync_cache = kmem_cache_create("fasync_cache", sizeof(struct fasync_struct), 0, diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index 9644bc72e457..43fbf29ef03a 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -101,8 +101,7 @@ static void __init fanotify_sysctls_init(void) * * Internal and external open flags are stored together in field f_flags of * struct file. Only external open flags shall be allowed in event_f_flags. - * Internal flags like FMODE_NONOTIFY, FMODE_EXEC, FMODE_NOCMTIME shall be - * excluded. + * Internal flags like FMODE_EXEC shall be excluded. */ #define FANOTIFY_INIT_ALL_EVENT_F_BITS ( \ O_ACCMODE | O_APPEND | O_NONBLOCK | \ @@ -262,8 +261,8 @@ static int create_fd(struct fsnotify_group *group, const struct path *path, * we need a new file handle for the userspace program so it can read even if it was * originally opened O_WRONLY. */ - new_file = dentry_open(path, - group->fanotify_data.f_flags | __FMODE_NONOTIFY, + new_file = dentry_open_nonotify(path, + group->fanotify_data.f_flags, current_cred()); if (IS_ERR(new_file)) { /* @@ -1404,6 +1403,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) unsigned int fid_mode = flags & FANOTIFY_FID_BITS; unsigned int class = flags & FANOTIFY_CLASS_BITS; unsigned int internal_flags = 0; + struct file *file; pr_debug("%s: flags=%x event_f_flags=%x\n", __func__, flags, event_f_flags); @@ -1472,7 +1472,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) (!(fid_mode & FAN_REPORT_NAME) || !(fid_mode & FAN_REPORT_FID))) return -EINVAL; - f_flags = O_RDWR | __FMODE_NONOTIFY; + f_flags = O_RDWR; if (flags & FAN_CLOEXEC) f_flags |= O_CLOEXEC; if (flags & FAN_NONBLOCK) @@ -1550,10 +1550,18 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) goto out_destroy_group; } - fd = anon_inode_getfd("[fanotify]", &fanotify_fops, group, f_flags); + fd = get_unused_fd_flags(flags); if (fd < 0) goto out_destroy_group; + file = anon_inode_getfile_fmode("[fanotify]", &fanotify_fops, group, + f_flags, FMODE_NONOTIFY); + if (IS_ERR(file)) { + fd = PTR_ERR(file); + put_unused_fd(fd); + goto out_destroy_group; + } + fd_install(fd, file); return fd; out_destroy_group: diff --git a/fs/open.c b/fs/open.c index acaeb3e25c88..04cb581528ff 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1118,6 +1118,23 @@ struct file *dentry_open(const struct path *path, int flags, } EXPORT_SYMBOL(dentry_open); +struct file *dentry_open_nonotify(const struct path *path, int flags, + const struct cred *cred) +{ + struct file *f = alloc_empty_file(flags, cred); + if (!IS_ERR(f)) { + int error; + + f->f_mode |= FMODE_NONOTIFY; + error = vfs_open(path, f); + if (error) { + fput(f); + f = ERR_PTR(error); + } + } + return f; +} + /** * dentry_create - Create and open a file * @path: path to create @@ -1215,7 +1232,7 @@ inline struct open_how build_open_how(int flags, umode_t mode) inline int build_open_flags(const struct open_how *how, struct open_flags *op) { u64 flags = how->flags; - u64 strip = __FMODE_NONOTIFY | O_CLOEXEC; + u64 strip = O_CLOEXEC; int lookup_flags = 0; int acc_mode = ACC_MODE(flags); diff --git a/include/linux/fs.h b/include/linux/fs.h index e3c603d01337..18888d601550 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2731,6 +2731,8 @@ struct file *dentry_open(const struct path *path, int flags, struct file *dentry_create(const struct path *path, int flags, umode_t mode, const struct cred *cred); struct path *backing_file_user_path(struct file *f); +struct file *dentry_open_nonotify(const struct path *path, int flags, + const struct cred *creds); /* * When mmapping a file on a stackable filesystem (e.g., overlayfs), the file @@ -3620,11 +3622,9 @@ struct ctl_table; int __init list_bdev_fs_names(char *buf, size_t size); #define __FMODE_EXEC ((__force int) FMODE_EXEC) -#define __FMODE_NONOTIFY ((__force int) FMODE_NONOTIFY) #define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE]) -#define OPEN_FMODE(flag) ((__force fmode_t)(((flag + 1) & O_ACCMODE) | \ - (flag & __FMODE_NONOTIFY))) +#define OPEN_FMODE(flag) ((__force fmode_t)(((flag + 1) & O_ACCMODE))) static inline bool is_sxid(umode_t mode) { diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h index 80f37a0d40d7..613475285643 100644 --- a/include/uapi/asm-generic/fcntl.h +++ b/include/uapi/asm-generic/fcntl.h @@ -6,7 +6,6 @@ /* * FMODE_EXEC is 0x20 - * FMODE_NONOTIFY is 0x4000000 * These cannot be used by userspace O_* until internal and external open * flags are split. * -Eric Paris