From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DB00D462AF for ; Wed, 13 Nov 2024 14:36:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 99C556B00DF; Wed, 13 Nov 2024 09:36:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 925B66B00E0; Wed, 13 Nov 2024 09:36:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 79D0E6B00E1; Wed, 13 Nov 2024 09:36:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 4C24F6B00DF for ; Wed, 13 Nov 2024 09:36:47 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id E21BFA0A89 for ; Wed, 13 Nov 2024 14:36:46 +0000 (UTC) X-FDA: 82781320230.17.FF7A619 Received: from mail-qk1-f176.google.com (mail-qk1-f176.google.com [209.85.222.176]) by imf19.hostedemail.com (Postfix) with ESMTP id 2B82F1A0030 for ; Wed, 13 Nov 2024 14:35:51 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=koVFZpXG; spf=pass (imf19.hostedemail.com: domain of amir73il@gmail.com designates 209.85.222.176 as permitted sender) smtp.mailfrom=amir73il@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731508549; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=5TpO1rEDli92wfsqt+A461mrFCCbHcRm9Ir9kcXs4JU=; b=PwE8xiMazOEu0AsZfQGmSfARdenZihfCw9nV7fZIm/s+dDAWaySeuh2v4UMj0BS8IOh6vO PF2j8+Fp8RiQp4FCYjyS4VEGFQk33Me4j24ZL+ZMqqFFxGIG4i0kYrWMPV0eisiO4k3qxC 4Gw3RUYXOiXxnvZ+vB7/pmUOLhQbuVM= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731508549; a=rsa-sha256; cv=none; b=ceYwzawBO5hjejABaD87/SElv2HSy+Li9/zLFBpPPBCLi9i591g+q5faEZRC8Vqc1cxg8y sCxbvjXBANeH/d/lX7t0EVQyeJtRzmVAUR5RPNQtqQ3hp2jvUYaPnntlpfxFO8NU2OVb0i AOohUXqS+w3OX18a2pUsSuBy+HBnI04= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=koVFZpXG; spf=pass (imf19.hostedemail.com: domain of amir73il@gmail.com designates 209.85.222.176 as permitted sender) smtp.mailfrom=amir73il@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-qk1-f176.google.com with SMTP id af79cd13be357-7b14443a71eso524826985a.1 for ; Wed, 13 Nov 2024 06:36:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731508604; x=1732113404; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=5TpO1rEDli92wfsqt+A461mrFCCbHcRm9Ir9kcXs4JU=; b=koVFZpXGP5NNO2fRdOphWX2kUe5WBw7PdpJFLl0Q9+G/ZpFc3xMlQqEVzVQ52Xj3Wq oA2hyuIbdNxXUN6mmghsd9vWLzmiBfJUU3YeIFlDLP2MVayx8HENKEwHjuqMg7YiSxoA 8MlHDXgwnWs/AQ0LcsyLZwadoCIzqon1cnAchByoQcZ1dNsz+YCwF1qQJDhL7wBDZ71O dytK/nEw4Etjg9C6alQHdqUNHRYphCPg8+kVIoD1iANI3BxXrW3KboI0ViC4sftY6u3L V8oIWzbOtzddr8XRGsTneQGX7vCnHo1jbohZn1lBk6r3FE+ef5lueX2P0fvvgb1qHIMl KnoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731508604; x=1732113404; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5TpO1rEDli92wfsqt+A461mrFCCbHcRm9Ir9kcXs4JU=; b=d5Y5dt40QFHi6T0QYtBPwKXJJgwskzNwVKvHvFaSmq/tCN0GuqOXuv7WViUb98d3S3 h25m312D2jV6wn+uBK0vZGdjSZYRBmeC8LqbF0b4WZDhtZc7XWVcYqNlHJOa15/RBQGW 8Sv0RgZteQjre36l05nwWV+8NukkElDKpCdWMsJSmEWOfZkXJXx3ajUkM+wh3VE2AhsK u7uwNuiNpS7LvWijt0clVU01QTxCol/JlsbTw1g7ES9Orcv6s6+6/6YxpMkJLubhC8WW jks0JtpcdjvCeRRk+jVl4Pv6x/Hl9qa9V9JKzLv5949KlFbGhwcl5KWqdJhPXg/bI+8d 183g== X-Forwarded-Encrypted: i=1; AJvYcCWjkTZHXbpQGof1vkWFmVo/JRNseih5LNepdBJxrUWM7gPY0mrvlZf74jXgVbdm2h9rVWfcv3irSA==@kvack.org X-Gm-Message-State: AOJu0YyHrSlh9jYp07vkLTOy7u3ja8OT7E1b5p7Egui7F0UYBEghvr+z bQnC/dARhSDpcPyB43zSX1Dof1fS4R5U63hvO25wf4DvK9ZQXS9gmc3BlkRQRaYnSzDf+gNhIcN ch8u8vIh9x8kQY6VoJoq+WvkYDoE= X-Google-Smtp-Source: AGHT+IHDM6yIlJ+rbWXJBsM1kdPByZi+pkkssJXLfoBLa9nCWP4173NtLKc4WYnD4Kt86lm779n87BvsYqF4dqhC3AQ= X-Received: by 2002:a05:6214:4589:b0:6d3:9359:26ef with SMTP id 6a1803df08f44-6d39e0f6663mr268917446d6.6.1731508604035; Wed, 13 Nov 2024 06:36:44 -0800 (PST) MIME-Version: 1.0 References: <141e2cc2dfac8b2f49c1c8d219dd7c20925b2cef.1731433903.git.josef@toxicpanda.com> <20241113001251.GF3387508@ZenIV> <20241113011954.GG3387508@ZenIV> <20241113043003.GH3387508@ZenIV> In-Reply-To: <20241113043003.GH3387508@ZenIV> From: Amir Goldstein Date: Wed, 13 Nov 2024 15:36:33 +0100 Message-ID: Subject: Re: [PATCH v7 05/18] fsnotify: introduce pre-content permission events To: Al Viro Cc: Linus Torvalds , Josef Bacik , kernel-team@fb.com, linux-fsdevel@vger.kernel.org, jack@suse.cz, brauner@kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-mm@kvack.org, linux-ext4@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: 1oq811788sr1zanuidtqfed4j4uc8eur X-Rspam-User: X-Rspamd-Queue-Id: 2B82F1A0030 X-Rspamd-Server: rspam02 X-HE-Tag: 1731508551-195480 X-HE-Meta: U2FsdGVkX198lyTnlwNJcXAplTHZUic6uuRZS44aPlRG+Yrf7ndqHU2DMVdMPpDjmx+kK8Nn9gpTNRf0kguJ7Pbg5/UjyLKxqAuGPf+D7qoUszIytCfacPwarf6DXvJhUVcvXwPN8VWtERuyyOP7vxfMS4eyo6AeeFQo8i5kb4n+qHFifTq4NXtblxQUuBX/SdjgnLD/ntMvd5Un4RoRvgBLtcZKtqjg12JxQinYKvL+3rWN4ohvYFWyMBIM8DvS/1EG1vCyCjKEnNrdgFUbMoswd6CScJlMelIIrsAczYGz3Ve14mGjfANw2YB6YvNIbLgsUlJM0DGaVhTpeQ+/SPcR9IyW68rRFkoqJpt8+LuhmoL+77jSbaVUohFRVZHMnZvKlloG1SFcUBBFfreJMdl0HB35RZP39NkxcXhLM1SkG3RC3u5XpZ5Wd+nOEGktWdTZwYhkZXTwrRIE2893Q1Tmp+nk6jdhkPh0j9AVExgpPj5sg9EHgzETWJ5bNVxnrzjizTtcXvOLs3kipZ5dXI3tUPTBQT4nkeIA8vSqde1bnpyPPkhtHbWF9Gk1/ML7MwEYARC2wDOxg82shAPlTjCxlD5Wpy7A+U+FviWL59ZJWE1DTlxa3T2tWm5BSYzh8xRnUmq167/m5BWutEq9o8okwYXIRNVQx54ElneeZDzEHdh48B9MJ2KP3lVn+YIoPSP1SpkSpCAtH8EW/+kivf0mhIYnIbheQn/sFdUDmep9Y5Y0wVkS1g0E0B/FhUxFQj7Ks4rI4Idaz+ZtLEXWh20fYkVn3oSHh+fSl7nKEqFyAn5JfoIe2kFwkolifuK1WHO+S7Rpr1xHutA8MSXz7uHTkwWiKZEQs2eD9snvveE5FA4AGCnKCP21lPaRfH25oGKqeXy1+E1JEbghtiV9YHI9CNShvMlKG6UfJI1XjDCPh7CLUqSQQlWyfjUjN6Kd4RjarL+sS/DHaLDDuiQ nZLiAd4v 1MiUrjaA5gac3fgnamIAwEAbu0dATd5V7KPM34/dV4I/FQ1JYX1EpW0mc6JpsnhxRnl1DorvKJcCxabx8v2AcLrKY5GiEbhivuGjtVKAwPEyhVHoKKoA20gE2GEVKB+3sUKxc8nh0UCRXmnEeudnkqTm0dl/gin4D/iSbx5soqvtPm2ihgqiEBYqoy2MoSpegRqJqefuDXX2Q7HW/uni0O86cfaepM/YZbr+RHJi5ToCOJr4+/IppkNrE/m7emKG2+WY/o/QguBT4PWUd2IsDnkt2Tku19eS8iW+AQ92B1ZWp2LImkstW+7Z25asvalpq1DFDPtHZgt562wLo4YlotrfWsa6vQS/6VhyTArzBXl7yhskLUZytYBGkgWe9nOt7AG/duer6YRAysqxJM6Oig9FnBQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Nov 13, 2024 at 5:30=E2=80=AFAM Al Viro w= rote: > > On Wed, Nov 13, 2024 at 01:19:54AM +0000, Al Viro wrote: > > On Tue, Nov 12, 2024 at 04:38:42PM -0800, Linus Torvalds wrote: > > > Looking at that locking code in fadvise() just for the f_mode use doe= s > > > make me think this would be a really good cleanup. > > > > > > I note that our fcntl code seems buggy as-is, because while it does > > > use f_lock for assignments (good), it clearly does *not* use them for > > > reading. > > > > > > So it looks like you can actually read inconsistent values. > > > > > > I get the feeling that f_flags would want WRITE_ONCE/READ_ONCE in > > > _addition_ to the f_lock use it has. > > > > AFAICS, fasync logics is the fishy part - the rest should be sane. > > > > > The f_mode thing with fadvise() smells like the same bug. Just becaus= e > > > the modifications are serialized wrt each other doesn't mean that > > > readers are then automatically ok. > > > > Reads are also under ->f_lock in there, AFAICS... > > > > Another thing in the vicinity is ->f_mode modifications after the calls > > of anon_inode_getfile() in several callers - probably ought to switch > > those to anon_inode_getfile_fmode(). That had been discussed back in > > April when the function got merged, but "convert to using it" followup > > series hadn't materialized... > > While we are at it, there's is a couple of kludges I really hate - > mixing __FMODE_NONOTIFY and __FMODE_EXEC with O_... flags. > > E.g. for __FMODE_NONOTIFY all it takes is switching fanotify from > anon_inode_getfd() to anon_inode_getfile_fmode() and adding > a dentry_open_nonotify() to be used by fanotify on the other path. > That's it - no more weird shit in OPEN_FMODE(), etc. > > For __FMODE_EXEC it might get trickier (nfs is the main consumer), > but I seriously suspect that something like "have path_openat() > check op->acc_mode & MAY_EXEC and set FMODE_EXEC in ->f_mode > right after struct file allocation" would make a good starting > point; yes, it would affect uselib(2), but... I've no idea whether > it wouldn't be the right thing to do; would be hard to test. > > Anyway, untested __FMODE_NONOTIFY side of it: > > diff --git a/fs/fcntl.c b/fs/fcntl.c > index 22dd9dcce7ec..ebd1c82bfb6b 100644 > --- a/fs/fcntl.c > +++ b/fs/fcntl.c > @@ -1161,10 +1161,10 @@ static int __init fcntl_init(void) > * Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY > * is defined as O_NONBLOCK on some platforms and not on others. > */ > - BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ !=3D > + BUILD_BUG_ON(20 - 1 /* for O_RDONLY being 0 */ !=3D > HWEIGHT32( > (VALID_OPEN_FLAGS & ~(O_NONBLOCK | O_NDELAY)) | > - __FMODE_EXEC | __FMODE_NONOTIFY)); > + __FMODE_EXEC)); > > fasync_cache =3D kmem_cache_create("fasync_cache", > sizeof(struct fasync_struct), 0, > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fano= tify_user.c > index 9644bc72e457..43fbf29ef03a 100644 > --- a/fs/notify/fanotify/fanotify_user.c > +++ b/fs/notify/fanotify/fanotify_user.c > @@ -101,8 +101,7 @@ static void __init fanotify_sysctls_init(void) > * > * Internal and external open flags are stored together in field f_flags= of > * struct file. Only external open flags shall be allowed in event_f_fla= gs. > - * Internal flags like FMODE_NONOTIFY, FMODE_EXEC, FMODE_NOCMTIME shall = be > - * excluded. > + * Internal flags like FMODE_EXEC shall be excluded. > */ > #define FANOTIFY_INIT_ALL_EVENT_F_BITS (= \ > O_ACCMODE | O_APPEND | O_NONBLOCK | \ > @@ -262,8 +261,8 @@ static int create_fd(struct fsnotify_group *group, co= nst struct path *path, > * we need a new file handle for the userspace program so it can = read even if it was > * originally opened O_WRONLY. > */ > - new_file =3D dentry_open(path, > - group->fanotify_data.f_flags | __FMODE_NON= OTIFY, > + new_file =3D dentry_open_nonotify(path, > + group->fanotify_data.f_flags, I would make this a bit more generic helper and the comment above is truly clueless: /* - * we need a new file handle for the userspace program so it can read even if it was - * originally opened O_WRONLY. + * We provide an fd for the userspace program, so it could access t= he + * file without generating fanotify events itself. */ - new_file =3D dentry_open(path, - group->fanotify_data.f_flags | __FMODE_NONOT= IFY, - current_cred()); + new_file =3D dentry_open_fmode(path, group->fanotify_data.f_flags, + FMODE_NONOTIFY, current_cred()); > current_cred()); > if (IS_ERR(new_file)) { > /* > @@ -1404,6 +1403,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags,= unsigned int, event_f_flags) > unsigned int fid_mode =3D flags & FANOTIFY_FID_BITS; > unsigned int class =3D flags & FANOTIFY_CLASS_BITS; > unsigned int internal_flags =3D 0; > + struct file *file; > > pr_debug("%s: flags=3D%x event_f_flags=3D%x\n", > __func__, flags, event_f_flags); > @@ -1472,7 +1472,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags,= unsigned int, event_f_flags) > (!(fid_mode & FAN_REPORT_NAME) || !(fid_mode & FAN_REPORT_FID= ))) > return -EINVAL; > > - f_flags =3D O_RDWR | __FMODE_NONOTIFY; > + f_flags =3D O_RDWR; > if (flags & FAN_CLOEXEC) > f_flags |=3D O_CLOEXEC; > if (flags & FAN_NONBLOCK) > @@ -1550,10 +1550,18 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flag= s, unsigned int, event_f_flags) > goto out_destroy_group; > } > > - fd =3D anon_inode_getfd("[fanotify]", &fanotify_fops, group, f_fl= ags); > + fd =3D get_unused_fd_flags(flags); s/flags/f_flags > if (fd < 0) > goto out_destroy_group; > > + file =3D anon_inode_getfile_fmode("[fanotify]", &fanotify_fops, g= roup, > + f_flags, FMODE_NONOTIFY); > + if (IS_ERR(file)) { > + fd =3D PTR_ERR(file); > + put_unused_fd(fd); > + goto out_destroy_group; > + } > + fd_install(fd, file); > return fd; > > out_destroy_group: > diff --git a/fs/open.c b/fs/open.c > index acaeb3e25c88..04cb581528ff 100644 > --- a/fs/open.c > +++ b/fs/open.c > @@ -1118,6 +1118,23 @@ struct file *dentry_open(const struct path *path, = int flags, > } > EXPORT_SYMBOL(dentry_open); > > +struct file *dentry_open_nonotify(const struct path *path, int flags, > + const struct cred *cred) > +{ > + struct file *f =3D alloc_empty_file(flags, cred); > + if (!IS_ERR(f)) { > + int error; > + > + f->f_mode |=3D FMODE_NONOTIFY; > + error =3D vfs_open(path, f); > + if (error) { > + fput(f); > + f =3D ERR_PTR(error); > + } > + } > + return f; > +} > + > /** > * dentry_create - Create and open a file > * @path: path to create > @@ -1215,7 +1232,7 @@ inline struct open_how build_open_how(int flags, um= ode_t mode) > inline int build_open_flags(const struct open_how *how, struct open_flag= s *op) > { > u64 flags =3D how->flags; > - u64 strip =3D __FMODE_NONOTIFY | O_CLOEXEC; > + u64 strip =3D O_CLOEXEC; > int lookup_flags =3D 0; > int acc_mode =3D ACC_MODE(flags); > Get rid of another stale comment: /* - * Strip flags that either shouldn't be set by userspace like - * FMODE_NONOTIFY or that aren't relevant in determining struct - * open_flags like O_CLOEXEC. + * Strip flags that aren't relevant in determining struct open_flag= s. */ With these changed, you can add: Reviewed-by: Amir Goldstein With the f_flags typo fixed, this passed LTP sanity tests, but I am going to test the NONOTIFY functionally some more. Thanks, Amir.