From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92934D3ABF4 for ; Mon, 11 Nov 2024 23:36:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 072F68D0011; Mon, 11 Nov 2024 18:36:27 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F3D268D0001; Mon, 11 Nov 2024 18:36:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DDE118D0011; Mon, 11 Nov 2024 18:36:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id BE34D8D0001 for ; Mon, 11 Nov 2024 18:36:26 -0500 (EST) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 662331611EC for ; Mon, 11 Nov 2024 23:36:26 +0000 (UTC) X-FDA: 82775424606.05.BBB40B9 Received: from mail-vk1-f179.google.com (mail-vk1-f179.google.com [209.85.221.179]) by imf09.hostedemail.com (Postfix) with ESMTP id 32D19140010 for ; Mon, 11 Nov 2024 23:35:56 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=GPmLzMfu; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf09.hostedemail.com: domain of amir73il@gmail.com designates 209.85.221.179 as permitted sender) smtp.mailfrom=amir73il@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731368122; a=rsa-sha256; cv=none; b=fYcy2nEjRvaB5rVV5K2jAusM8a4MhdAV4dnMZchBp+MXDLjH8NfKiEiUqCSei+rLSg+ehO sepIs06SycbRNZBZJdOlOTXnnQTqdSGmVtohtM3PGOhWfYzC4JE3TbM5kXxdaVhqenSOJj /MmDNM1WA4TvldhSskskXFxJ+zjKPwo= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=GPmLzMfu; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf09.hostedemail.com: domain of amir73il@gmail.com designates 209.85.221.179 as permitted sender) smtp.mailfrom=amir73il@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731368122; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=MB1YZjL/g4WPNa5p6ho7uTDJtxrw637uxbsOjPihoL4=; b=oh3hGmreirYxmUhEMLrcP70qd6lRQXiv21c7yInQLERyrQqtERagZVwZ25yLu0KB+V9FsW ANrvnvwdoyiZv5oFSPhJ5bSdQvUFQDSJ4JnScCX2Rq2fucIaWLMhnNMJBTLdcrHrhrHqbl Fh07TG0ynswkeBrCWFkBdKUrId5eTlg= Received: by mail-vk1-f179.google.com with SMTP id 71dfb90a1353d-50d32d82bd8so1535960e0c.1 for ; Mon, 11 Nov 2024 15:36:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731368184; x=1731972984; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=MB1YZjL/g4WPNa5p6ho7uTDJtxrw637uxbsOjPihoL4=; b=GPmLzMfumjOTHw4rgwW9wE/463jqAQNI0rdkOJrPrOK6urg5SC6nfn00iij7EbMTF2 xPVDiIlCIs1dPQbywJMH6x7FpY008nkyqN9W65g9U727k3DS1/XVbgh6Tar0fsjAQ2nB r8jOj65MoBgsJK8w0R+jTBajbpdcPET+MTKjv5xvaISvLXVD9vF0Ir6svbyT/EUSQMvy /KO8h1gEFaTckEWX5XCMuOp7P89juHH1RCrIyEOMEgBEhpQvH62VixWY3KtuiBgNdXDk YIxTIWpHXyXVP1VlzqQeacbNlngE/grzJ5menqzJrY1LIdsC5iQ8kctkh9axU2OoDtaE 6y6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731368184; x=1731972984; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MB1YZjL/g4WPNa5p6ho7uTDJtxrw637uxbsOjPihoL4=; b=o2Yz833LhrJB9PpAGZRq/VjnjntofQNaOK8ca9cQcLoVeUoFwwfFxR2cZLUvTHe4Dn uQSkyDiH1IeT9q7VVyat4Pr3VrdW+P92szDdcjr3DIMWALAIRSQQaJA/P3aWG3O4RoKU MjeCyH59YsQSpI77girfmZ6WaTICLhDicNr0hL8rOgkDOAYwiLrO1b5d6AFNvqeGmylX C7d4ClUY5wl4APwBGv1QDBhNlJC8I5jzCD85V2cHs2qWtgD8n80ko0Z7dWazaPenTeL3 D6jGGKQ2G78WM/BeokrbrlRC4M2MtQs5e+QumG1OPSBS610U8B90+mjrtf37euUlJcFx DLWw== X-Forwarded-Encrypted: i=1; AJvYcCVnjyF9LfSYxhd8L6fjUXVHrrZe3jUDejO0XaRoNi+LrJ22aHstLkLt37KDj8uss7qJh+QHq8gyDA==@kvack.org X-Gm-Message-State: AOJu0YyWVZdTUZrWvaNYnNWGZj0sqQBWiSJkxvUslcc9A302g2CSysla Vj7NGJSJwRX4frschC5XeUY+O0BPgEt/UPjjhzjqR7QJt6Usc4IwR2Ht1cNj92/dWjJLWOWRXmJ Up1YIbQXGX6g4v0JRHDRVJsO4yxs= X-Google-Smtp-Source: AGHT+IHlHqW9AdZyfpZzGtvMz9MbvMwWSRsKkW2rbTmr8IjIKFtuxOJ6ThRYbwePpYz1fu+sC6uCZbkyQIGaylXfmVw= X-Received: by 2002:a05:6122:2089:b0:50f:fe39:a508 with SMTP id 71dfb90a1353d-51401eb04f6mr12944884e0c.11.1731368183658; Mon, 11 Nov 2024 15:36:23 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Amir Goldstein Date: Tue, 12 Nov 2024 00:36:12 +0100 Message-ID: Subject: Re: [PATCH v6 06/17] fsnotify: generate pre-content permission event on open To: Linus Torvalds Cc: Josef Bacik , kernel-team@fb.com, linux-fsdevel@vger.kernel.org, jack@suse.cz, brauner@kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-mm@kvack.org, linux-ext4@vger.kernel.org, kernel test robot Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Queue-Id: 32D19140010 X-Rspamd-Server: rspam01 X-Stat-Signature: fqzhpjaadfgqzz54u8q14117ts4fkr1p X-HE-Tag: 1731368156-877967 X-HE-Meta: U2FsdGVkX19z0cH5U3jH3S/6ND7OK+A2zoTl84b4Wtr4VK5AvzX7tczFKL4vpqhhWaM0qa0L74c6HEv6ByObb0EfoERQnyfAhe0WrWq85uRhMBr1NPeb0uoIbuIonmwd+EuOFfU2t+lwRn0qNJZGX7xN0QxmFBLl2Ksbmp0aEhkqW/iedKHWBeicL3z9aMrJxgZaX7+5mLFZ18x8wSbtNkpP1i2wj2KZwFjWXiALIOofs03pnJYB/uWrs6UUF8gEMQ4Ajb3T3hmr3ImWIG0Un7qmPiiEgN5nv7VP5nHqRfzrTCAfl7zi4xzMW/btqMpwWtKdTTU4Vu3jUIh5BgnjdwJA4Kprf9QGA49mJ/Qk5b1T4xkxlDevgNPYcO1Yl5B687yg7Qnbky9Np+I7MiOhcyxtuaTzsW9DCPdfJrMQQrJ48dBpykEhQPovp1cppdmfqcuFPF+udVkO2G0hMRaM0IkgtZ4Xnb/IFutYD6yRmYn7rt0sGZxerRWi0wcqSw9BTxxASgv0EUv9wPqMwfujlDT6ocAHA9ERb9OJCJdCwlcilqFbLEBbwmh2u2rUi2gpHSGGO1Nbz6xTCdfBskZd14TaUxfFRexySeUYKxsyZcRueenyV1A6vTdHn8NrAwZDAI/eUB5X4mUJ5SVwMcCQY77QzilZ1mSi1mtm6LsOmq3n+U0orZjA9SIQEgNTTV1rQVfSQOJgFTqNhds7eIXaLecJwcuiTSLpa32S9bFepBk/f8cUQfSJrfKc9iJKC1qaOQb0QrD3y9krNL8Uos434ivw26XLGfNGbe/t+DwOGug8puAjU3lW1Lg4NyI7phgleDwt9Hh3BUT6HBiIDCRiLdzqBqcD0ogyXJpGYIhawxpsPprWHUezQbVZwbEUYoW2bETOrDkATwuWlJ6QtQPgQ3XPrH9/fetoCf9wTMneu7JDieOjwfJ+H3TwE8JnTk2GsisIy68VUwBYWoJYCpJ JXp6zYo+ EjgNzIGjt2xVGa1WiNaKRK7Bci/z/1IitLXu0IYL8GN2fFwGcwVVz2C3W57Y4KT3DuCeloc1TgcsJG+fd9zBDVj4ohnWWDvIf2XrVBLf8Wp/7xKTTBSLtgWptORgd/4HyZQxOUut87PqtTBKDEtYej4YpgJofMUCjeJs84kk7WoJf2FWhh/20ehYLHIAz6yH4g3287mkNBwIpKUpxIt5cwh23fMyETPZ20Wzttg5/4lEQ27FmxSwrRaBmK/6pyiljX9YnOz2ylW5nmaKYpCNcR57jBHn4kutb9vo7i04BTiWyV7PyGNpXQxASLG3Nj0xwXatrVmP1jEZjj1YmumS/qv0JxsYGdeQEK5zloY6Bq9KE68lKQbwMykTcuRX+INGMFQfElA010x5xn/zgiBVdAK8U6GxFY+mugy3oD7oEgJ7UNZZVRmz4n9WpauIWMAaCQ76SzjYAzCtzrecWEICzG/7v5q0cpaz33JJYkWwTQC+t7jE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000016, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Nov 11, 2024 at 10:52=E2=80=AFPM Linus Torvalds wrote: > > On Mon, 11 Nov 2024 at 12:19, Josef Bacik wrote: > > > > --- a/fs/namei.c > > +++ b/fs/namei.c > > @@ -3782,7 +3782,15 @@ static int do_open(struct nameidata *nd, > > + /* > > + * This permission hook is different than fsnotify_open_perm() = hook. > > + * This is a pre-content hook that is called without sb_writers= held > > + * and after the file was truncated. > > + */ > > + return fsnotify_file_area_perm(file, MAY_OPEN, &file->f_pos, 0)= ; > > } > > Stop adding sh*t like this to the VFS layer. > > Seriously. I spend time and effort looking at profiles, and then > people who do *not* seem to spend the time and effort just willy nilly > add fsnotify and security events and show down basic paths. > > I'm going to NAK any new fsnotify and permission hooks unless people > show that they don't add any overhead. > FWIW we did work to eliminate overhead long before posting the new hooks. This prep work has already been merged back in v6.9: https://lore.kernel.org/linux-fsdevel/20240317184154.1200192-1-amir73il@gma= il.com/ After working closely with Oliver and kernel test robot to eliminate the overhead of the pre-content hooks and on the way, also reduce overhead of the existi= ng fanotify permission hooks that were merged back in the day. > Because I'm really really tired of having to wade through various > permission hooks in the profiles that I can not fix or optimize, > because those hoosk have no sane defined semantics, just "let user > space know". > > Yes, right now it's mostly just the security layer. But this really > looks to me like now fsnotify will be the same kind of pain. > > And that location is STUPID. Dammit, it is even *documented* to be > stupid. It's a "pre-content" hook that happens after the contents have > already been truncated. WTF? That's no "pre". > Yeh, I understand why it seems stupid and it may have been poorly documented, but there is actually a good reason for the location of this hook. The problem with the existing fsnotify_open_perm() hook is that with O_CREATE, open_last_lookups() takes sb_writers freeze protection (regardless if file with really created), so we cannot safely use this hook to start writing to file and fill its content. So the important part of the comment is: "This permission hook is different than fsnotify_open_perm() hook. This is a pre-content hook that is called without sb_writers held" The part that follows "and after the file was truncated", simply means that in case of O_TRUNC we won't need to fill file content, because the file will have zero size anyway. And for the record, it is not "pre-open" it is "pre-content-access", so the name may be confusing but it is not wrong. > I tried to follow the deep chain of inlines to see what actually ends > up happening, and it looks like if the *whole* filesystem has no > notify events at all, the fsnotify_sb_has_watchers() check will make > this mostly go away, except for all the D$ accesses needed just to > check for it. > > But even *one* entirely unrelated event will now force every single > open to typically call __fsnotify_parent() (or possibly "just" > fsnotify), because there's no sane "nobody cares about this dentry" > kind of thing. > > So effectively this is a new hook that gets called on every single > open call that nobody else cares about than you, and that people have > lived without for three decades. > That's exactly the reason for the commit a5e57b4d370c fsnotify: optimize the case of no permission event watchers It is supposed to reduce overhead to bare minimum for hooks of events from the class "permission" (FSNOTIFY_PRIO_CONTENT), which are typically only watched for Anti-malware software, so fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_CONTENT) is typically false on any given fs. And same for the new hooks for events of class "pre-content". fsnotify_sb_has_priority_watchers(sb, FSNOTIFY_PRIO_PRE_CONTENT) will only be true for fs of dedicated systems that run an HSM service, wher= e the overhead of the hooks is not going to be a concern. > Stop it, or at least add the code to not do this all completely pointless= ly. > > Because otherwise I will not take this kind of stuff any more. I just > spent time trying to figure out how to avoid the pointless cache > misses we did for every path component traversal. > > So I *really* don't want to see another pointless stupid fsnotify hook > in my profiles. I understand your frustration from crawling performance regressions, I really do. I am personally very grateful for the services of Oliver and his kernel test robot who test my development branches to catch regressions very soon in the development process. We may have missed things along the way and you may yet find more issues that justify another NAK, or more work, but you should know that a lot of c= are was taken to try to avoid inflicting pain on the system. I have been practically doing vfs prep and cleanup work together with Josef and Jan and Christian for at least a year before we got to post v1 of= the pre-content patches. So all I ask in return is that you consider the possibility that this is no= t utter garbage and try to work with us to address your concerns. Pre-content hooks (and HSM) is a very powerful feature that many people are requesting and I believe that we will be able to deliver this feature without crapping all over the VFS. Thanks, Amir.