From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32EDFD41D40 for ; Tue, 12 Nov 2024 00:00:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BAB0B6B00CA; Mon, 11 Nov 2024 19:00:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B5A5B6B00D5; Mon, 11 Nov 2024 19:00:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9AC326B00E7; Mon, 11 Nov 2024 19:00:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 7AF066B00CA for ; Mon, 11 Nov 2024 19:00:01 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id A522241B83 for ; Tue, 12 Nov 2024 00:00:00 +0000 (UTC) X-FDA: 82775483112.19.10F83A5 Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com [209.85.218.47]) by imf07.hostedemail.com (Postfix) with ESMTP id 12D4340002 for ; Mon, 11 Nov 2024 23:59:00 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=MtWJCEfw; spf=pass (imf07.hostedemail.com: domain of amir73il@gmail.com designates 209.85.218.47 as permitted sender) smtp.mailfrom=amir73il@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731369406; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=qeb/sGQBQbggfnIh8ADSPE4+Kba4+lOc0hxRhJ4OMyo=; b=6j+0K5zpPos+IssFldkaoCsulDoxa2s12THWxLtVIEopeIAQ15WmCFbAPDQPyDtRCJF8Mn UV16R2uAM+ySsMwMjH9SM4dWlsGApFyenTxHwLoZflqQvMipIcp3bBAJjBbp5n3ThIe52b /4oXn5UEv59yepVrIIBT6/IlNXsEhqg= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=MtWJCEfw; spf=pass (imf07.hostedemail.com: domain of amir73il@gmail.com designates 209.85.218.47 as permitted sender) smtp.mailfrom=amir73il@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731369406; a=rsa-sha256; cv=none; b=MOdt5qiDb7dz40jL9wTk4OCNWZi7eaJ38hdQRohHcRWxoaKQGiiysynYEPAYawcwu/clTW wmQcP3f9Qe9qabMhP1l8KKSYHfFu22/R4w3LHw65TrI1g7h6qy36PfHQ1TsNaHlKHUXNAW FT09zBT9A+oclK89v73+AN4lpKGZrkg= Received: by mail-ej1-f47.google.com with SMTP id a640c23a62f3a-a9ed0ec0e92so707047566b.0 for ; Mon, 11 Nov 2024 15:59:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731369597; x=1731974397; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=qeb/sGQBQbggfnIh8ADSPE4+Kba4+lOc0hxRhJ4OMyo=; b=MtWJCEfwFshWuOs5Wd/cAkgef1PHytG558tm3Ya7GfBpy9KT3s3PEQQO1O+6DdSz+H cqWao72XT9aQnWASQ8KfDi+BYsTScHJJ2Iiq3Q8yT1aKtF9jWazal2nHsBqz7uT2rODN eg8t2xegs94DJ7LL/O1N4fp00nCJ+q0Wrsm2wyVGLrSWCTskbFVEsUtjJX/RJOfEATve C4YAbxRjK7geVCPyT4ux0XFn8JJnkOQCcrsSUkvOaGBIswM+3n7NKLQ5OPwgCu65EbVl trrOygu0jjudiwfuM3ZwbBO9azFEFlHUAsiKt0pjmqo9yL5TqGlG0PmPXwOpTUDyiwLg q1qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731369597; x=1731974397; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qeb/sGQBQbggfnIh8ADSPE4+Kba4+lOc0hxRhJ4OMyo=; b=k0ztS4wxSiidxobMQABInr7h1K3FTkV0UG7tRKLYHQKzh1vJyBos716TLXzAd4MKek 2D0uLHwdr286QDZpJ/GruqoTbIYlzH11b6/Np03WruuwmcbKmlniTckePpzQOM+lFoJE CfWzvl/PIqk8aZJTrxO1ey6VTXJ9ZiuXn3PpmuypSncUyZFMoctJSdPiDv+G2pvIJ0ZX 2h4Dkuu398X0ghh6FncB3uDxKwHZ+rIKGy5/RJRPxMJV7+M1wP5ZTBdghKQAYiuCgUZy jZTL76KSdbhX9eXJOzg7gbxxbWDPKY8HdZhMKIdfb5d1nZOy4FQzUntJKR9YCcVv8SnL aRzw== X-Forwarded-Encrypted: i=1; AJvYcCUI1V190LU4OIJ3nptQszbhBmRaO+a2F+1+Liw6Jhy5sMOk4JzApxWBiOSsggRS70P9o/4lQyNmcQ==@kvack.org X-Gm-Message-State: AOJu0YwwMJzZm+CebjlQ24mELvI9IsbTVOWEVCZD/wJuICx6ZXXt26cH SyNsbtjMbbW3D3sMInyxd2oww5sb0OLGdjwobJiQzbPq5HWJR/aY67rfEEOsGav7FQuL+81k8FT Vus7MkVIc1cFMbbhpfQF8o1k2EUsP5a74x7A= X-Google-Smtp-Source: AGHT+IHHwxY/TJOQ7Nuz4nknIlqtFJHiuYgNFoMDUB8JaLJfrMkEmhf3QjspysraBHI4CEeJBsEvdbsAjRJvE47Uc7Q= X-Received: by 2002:a17:906:6a19:b0:a99:d6cf:a1df with SMTP id a640c23a62f3a-a9eeffee0e2mr1462283866b.46.1731369596890; Mon, 11 Nov 2024 15:59:56 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Amir Goldstein Date: Tue, 12 Nov 2024 00:59:43 +0100 Message-ID: Subject: Re: [PATCH v6 06/17] fsnotify: generate pre-content permission event on open To: Linus Torvalds Cc: Josef Bacik , kernel-team@fb.com, linux-fsdevel@vger.kernel.org, jack@suse.cz, brauner@kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-mm@kvack.org, linux-ext4@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 12D4340002 X-Stat-Signature: 6sjxedgd65ftjxkn7dwuhnnn9j3ektcd X-Rspam-User: X-HE-Tag: 1731369540-876635 X-HE-Meta: U2FsdGVkX19BNlS80vfvzufI3NIWLSVxKLhmYWdm/iNiylRAC/UMji+XyIi8uF/eHcNqjXLEFMeKno77zvzd6ttfTiFiLpXaXQ12MxByi3aWUYaGpWjoRM+AGEbvoj+5xgfClCD4rbZQeZPVWaJSxsPGRNEp5hH9lZDDtzSnt5LRCLqRN/noeoJ8P3M5GxkUMbZTGvsJIXlR0sYOGvUBGxl49rtSJ9m7uT7KvMGtEf1fm5zqEPWrYWmOlf31xeCP4A0rZ+oV9Rc4XHDQjH7ee4RUoBmUrBbPhFM3/eXGOPQ59EQlidjt5m+vbskJpkkZRJVF7JjEzj/k5rIs33pvTkonEGkxhl25sgJ+Pcx3xODzchXGca+JBfGBDrxB1lxS7hCQeExSQmSl4S1Kml3BHvmzAXbK9cDqko/0mf5mQXgUENq5La8+b9R7hymTHFYRVbSvSS0Yb9l8S1Qxox7L8m21LHrfA5xdKwx6lBXC0tF7AeOCKwLbAzj3GuYKRl4Fr/ltxwlYew+RV+zI7A5Cuph/oniEmsGQMZvorjaJKVR3pRFzPjsGGlJttiFTKTPze8933U+CX5alfG0waQlslsNtT2AWIZCYWgyuLv2LupCUuDwIcAp0VGYN5TnSqNWgTMr+yu7Iwwu9YqfojOE1G7Zd4jy64X141Q6AJnQXkF8Zs4zu5fHOg3CG/2p99nPuWvNNd203Ra0RlCLp9nZgJZOjGzuqlGQHoF4Hd112IFop82pd/6wXEr4xi3w0XiMs7aBZJrcA4PhpDGcA/PfdVQLMp0t3di9vqp+Zm2X+/1hQgWi3uBuSvlqA2KZGT/9GHCYE++g+4qHNDAcaVaR+1/68EIp1p8bZt4G4jxcHi8Wrrbt8PtpWryjrkoJmMVnsQ3UsS9sek06k0MdiVwmKAVh5py7FFg8hffAc4MckGIJc+ij1k9/46Qc99zpSR43KCMQrxbkDJkOaV4Oy5Bl WNAg89LM P8FU5s9RpEQAsipgzqVVKUB/YuKZ6qxo2tPmhEbgi+KFbPvfslsMffoFCZ+GLU9xg0+UYT5LuFlWRfFsogOaC41Fg0/0srVrTOGZXgvoenF/6gpm3ODkgAiIktps2UW0URBHvwt0c45BY9X9cprisApGGvRc4x9LKHK3u1winjiysxCekrvw1tIfgtdN41MXm0mfON17eKjwhQDm8/pm8+yoWHCWDLZLx8ADPVE9RrLjOMpZ9RTxBveGQNfZfevreX/FKl5trJZpFdyWJM829FguwGWcdejmXo3J9MQz/bPA/lEIeWZGcNoqQCk3t5rwi2YESAFCDAHTTtTOaRwHga7GYteNS1Wy8R4B6P7LQsr5ZHiUbOLygHAXC0fr2/o5FnPhSSOTVa+ydEMYnBl30Tcw8YJrfbV7gJ35oIp+U0J0iK0uUrSNQnLOy4pJtCrNlzkpI X-Bogosity: Ham, tests=bogofilter, spamicity=0.021695, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Nov 12, 2024 at 12:22=E2=80=AFAM Linus Torvalds wrote: > > On Mon, 11 Nov 2024 at 14:46, Josef Bacik wrote: > > > > Did you see the patch that added the > > fsnotify_file_has_pre_content_watches() thing? > > No, because I had gotten to patch 6/11, and it added this open thing, > and there was no such thing in any of the patches before it. > > It looks like you added FSNOTIFY_PRE_CONTENT_EVENTS in 11/17. > > However, at no point does it look like you actually test it at open > time, so none of this seems to matter. > > As far as I can see, even at the end of the series, you will call the > fsnotify hook at open time even if there are no content watches on the > file. > > So apparently the fsnotify_file_has_pre_content_watches() is not > called when it should be, and when it *is* called, it's also doing > completely the wrong thing. > > Look, for basic operations THAT DON'T CARE, you now added a function > call to fsnotify_file_has_pre_content_watches(), that function call > looks at inode->i_sb->s_iflags (doing two D$ accesses that shouldn't > be done!), and then after that looks at the i_fsnotify_mask. > > THIS IS EXACTLY THE KIND OF GARBAGE I'M TALKING ABOUT. > > This code has been written by somebody who NEVER EVER looked at > profiles. You're following chains of pointers when you never should. > > Look, here's a very basic example of the kind of complete mis-design > I'm talking about: > > - we're doing a basic read() on a file that isn't being watched. > > - we want to maybe do read-ahead > > - the code does > > if (fsnotify_file_has_pre_content_watches(file)) > return fpin; > > to say that "don't do read-ahead". > > Fine, I understand the concept. But keep in mind that the common case > is presumably that there are no content watches. > > And even ignoring the "common case" issue, that's the one you want to > OPTIMIZE for. That's the case that matters for performance, because > clearly if there are content watches, you're going to go into "Go > Slow" mode anyway and not do pre-fetching. So even if content watches > are common on some load, they are clearly not the case you should do > performance optimization for. > > With me so far? > > So if THAT is the case that matters, then dammit, we shouldn't be > calling a function at all. > > And when calling the function, we shouldn't start out with this > completely broken logic: > > struct inode *inode =3D file_inode(file); > __u32 mnt_mask =3D real_mount(file->f_path.mnt)->mnt_fsnotify_mas= k; > > if (!(inode->i_sb->s_iflags & SB_I_ALLOW_HSM)) > return false; > > that does random crap and looks up some "mount mask" and looks up the > superblock flags. > > Why shouldn't we do this? > > BECAUSE NONE OF THIS MATTERS IF THE FILE HASN'T EVEN BEEN MARKED FOR > CONTENT MATCHES! > > See why I'm shouting? You're doing insane things, and you're doing > them for all the cases that DO NOT MATTER. You're doing all of this > for the common case that doesn't want to see that kind of mindless > overhead. > > You literally check for the "do I even care" *last*, when you finally > do that fsnotify_object_watched() check that looks at the inode. But > by then you have already wasted all that time and effort, and > fsnotify_object_watched() is broken anyway, because it's stupidly > designed to require that mnt_mask that isn't needed if you have > properly marked each object individually. > > So what *should* you have? > > You should have had a per-file flag saying "Do I need to even call > this crud at all", and have it in a location where you don't need to > look at anything else. > > And fsnotify already basically has that flag, except it's mis-designed > too. We have FMODE_NONOTIFY, which is the wrong way around (saying > "don't notify", when that should just be the *default*), and the > fsnotify layer uses it only to mark its own internal files so that it > doesn't get called recursively. So that flag that *looks* sane and is > in the right location is actually doing the wrong thing, because it's > dealing with a rare special case, not the important cases that > actually matter. > > So all of this readahead logic - and all of the read and write hooks - > should be behind a simple "oh, this file doesn't have any notification > stuff, so don't bother calling any fsnotify functions". > > So I think the pattern should be > > static inline bool fsnotify_file_has_pre_content_watches(struct file = *file) > { > if (unlikely(file->f_mode & FMODE_NOTIFY)) > return out_of_line_crud(file); > return false; > } > I think that's a good idea for pre-content events, because it's fine to say that if the sb/mount was not watched by a pre-content event listener at the time of file open, then we do not care. The problem is that legacy inotify/fanotify watches can be added after file is open, so that is allegedly why this optimization was not done for fsnotify hooks in the past. Thanks, Amir.