linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josef Bacik <josef@toxicpanda.com>,
	kernel-team@fb.com, linux-fsdevel@vger.kernel.org,  jack@suse.cz,
	brauner@kernel.org, linux-xfs@vger.kernel.org,
	 linux-btrfs@vger.kernel.org, linux-mm@kvack.org,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH v6 06/17] fsnotify: generate pre-content permission event on open
Date: Tue, 12 Nov 2024 00:59:43 +0100	[thread overview]
Message-ID: <CAOQ4uxgxtQhe_3mj5SwH9568xEFsxtNqexLfw9Wx_53LPmyD=Q@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wj8L=mtcRTi=NECHMGfZQgXOp_uix1YVh04fEmrKaMnXA@mail.gmail.com>

On Tue, Nov 12, 2024 at 12:22 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 11 Nov 2024 at 14:46, Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > Did you see the patch that added the
> > fsnotify_file_has_pre_content_watches() thing?
>
> No, because I had gotten to patch 6/11, and it added this open thing,
> and there was no such thing in any of the patches before it.
>
> It looks like you added FSNOTIFY_PRE_CONTENT_EVENTS in 11/17.
>
> However, at no point does it look like you actually test it at open
> time, so none of this seems to matter.
>
> As far as I can see, even at the end of the series, you will call the
> fsnotify hook at open time even if there are no content watches on the
> file.
>
> So apparently the fsnotify_file_has_pre_content_watches() is not
> called when it should be, and when it *is* called, it's also doing
> completely the wrong thing.
>
> Look, for basic operations THAT DON'T CARE, you now added a function
> call to fsnotify_file_has_pre_content_watches(), that function call
> looks at inode->i_sb->s_iflags (doing two D$ accesses that shouldn't
> be done!), and then after that looks at the i_fsnotify_mask.
>
> THIS IS EXACTLY THE KIND OF GARBAGE I'M TALKING ABOUT.
>
> This code has been written by somebody who NEVER EVER looked at
> profiles. You're following chains of pointers when you never should.
>
> Look, here's a very basic example of the kind of complete mis-design
> I'm talking about:
>
>  - we're doing a basic read() on a file that isn't being watched.
>
>  - we want to maybe do read-ahead
>
>  - the code does
>
>         if (fsnotify_file_has_pre_content_watches(file))
>                 return fpin;
>
>    to say that "don't do read-ahead".
>
> Fine, I understand the concept. But keep in mind that the common case
> is presumably that there are no content watches.
>
> And even ignoring the "common case" issue, that's the one you want to
> OPTIMIZE for. That's the case that matters for performance, because
> clearly if there are content watches, you're going to go into "Go
> Slow" mode anyway and not do pre-fetching. So even if content watches
> are common on some load, they are clearly not the case you should do
> performance optimization for.
>
> With me so far?
>
> So if THAT is the case that matters, then dammit, we shouldn't be
> calling a function at all.
>
> And when calling the function, we shouldn't start out with this
> completely broken logic:
>
>         struct inode *inode = file_inode(file);
>         __u32 mnt_mask = real_mount(file->f_path.mnt)->mnt_fsnotify_mask;
>
>         if (!(inode->i_sb->s_iflags & SB_I_ALLOW_HSM))
>                 return false;
>
> that does random crap and looks up some "mount mask" and looks up the
> superblock flags.
>
> Why shouldn't we do this?
>
> BECAUSE NONE OF THIS MATTERS IF THE FILE HASN'T EVEN BEEN MARKED FOR
> CONTENT MATCHES!
>
> See why I'm shouting? You're doing insane things, and you're doing
> them for all the cases that DO NOT MATTER. You're doing all of this
> for the common case that doesn't want to see that kind of mindless
> overhead.
>
> You literally check for the "do I even care" *last*, when you finally
> do that fsnotify_object_watched() check that looks at the inode. But
> by then you have already wasted all that time and effort, and
> fsnotify_object_watched() is broken anyway, because it's stupidly
> designed to require that mnt_mask that isn't needed if you have
> properly marked each object individually.
>
> So what *should* you have?
>
> You should have had a per-file flag saying "Do I need to even call
> this crud at all", and have it in a location where you don't need to
> look at anything else.
>
> And fsnotify already basically has that flag, except it's mis-designed
> too. We have FMODE_NONOTIFY, which is the wrong way around (saying
> "don't notify", when that should just be the *default*), and the
> fsnotify layer uses it only to mark its own internal files so that it
> doesn't get called recursively. So that flag that *looks* sane and is
> in the right location is actually doing the wrong thing, because it's
> dealing with a rare special case, not the important cases that
> actually matter.
>
> So all of this readahead logic - and all of the read and write hooks -
> should be behind a simple "oh, this file doesn't have any notification
> stuff, so don't bother calling any fsnotify functions".
>
> So I think the pattern should be
>
>     static inline bool fsnotify_file_has_pre_content_watches(struct file *file)
>     {
>         if (unlikely(file->f_mode & FMODE_NOTIFY))
>                 return out_of_line_crud(file);
>         return false;
>     }
>

I think that's a good idea for pre-content events, because it's fine
to say that if the sb/mount was not watched by a pre-content event listener
at the time of file open, then we do not care.

The problem is that legacy inotify/fanotify watches can be added after
file is open,
so that is allegedly why this optimization was not done for fsnotify
hooks in the past.

Thanks,
Amir.


  parent reply	other threads:[~2024-11-12  0:00 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-11 20:17 [PATCH v6 00/17] fanotify: add pre-content hooks Josef Bacik
2024-11-11 20:17 ` [PATCH v6 01/17] fanotify: don't skip extra event info if no info_mode is set Josef Bacik
2024-11-11 20:17 ` [PATCH v6 02/17] fanotify: rename a misnamed constant Josef Bacik
2024-11-11 20:17 ` [PATCH v6 03/17] fanotify: reserve event bit of deprecated FAN_DIR_MODIFY Josef Bacik
2024-11-11 20:17 ` [PATCH v6 04/17] fsnotify: introduce pre-content permission events Josef Bacik
2024-11-11 20:17 ` [PATCH v6 05/17] fsnotify: pass optional file access range in pre-content event Josef Bacik
2024-11-11 20:17 ` [PATCH v6 06/17] fsnotify: generate pre-content permission event on open Josef Bacik
2024-11-11 21:51   ` Linus Torvalds
2024-11-11 22:46     ` Josef Bacik
2024-11-11 23:22       ` Linus Torvalds
2024-11-11 23:39         ` Linus Torvalds
2024-11-11 23:59         ` Amir Goldstein [this message]
2024-11-12  0:37           ` Linus Torvalds
2024-11-12  8:11             ` Amir Goldstein
2024-11-12 13:54               ` Jan Kara
2024-11-12 14:42                 ` Amir Goldstein
2024-11-12 14:28             ` Jan Kara
2024-11-12 15:24         ` Josef Bacik
2024-11-12 17:27           ` Linus Torvalds
2024-11-11 23:36     ` Amir Goldstein
2024-11-11 20:17 ` [PATCH v6 07/17] fsnotify: generate pre-content permission event on truncate Josef Bacik
2024-11-11 20:17 ` [PATCH v6 08/17] fanotify: introduce FAN_PRE_ACCESS permission event Josef Bacik
2024-11-11 20:17 ` [PATCH v6 09/17] fanotify: report file range info with pre-content events Josef Bacik
2024-11-11 20:17 ` [PATCH v6 10/17] fanotify: allow to set errno in FAN_DENY permission response Josef Bacik
2024-11-11 20:18 ` [PATCH v6 11/17] fanotify: add a helper to check for pre content events Josef Bacik
2024-11-11 20:18 ` [PATCH v6 12/17] fanotify: disable readahead if we have pre-content watches Josef Bacik
2024-11-11 20:18 ` [PATCH v6 13/17] mm: don't allow huge faults for files with pre content watches Josef Bacik
2024-11-11 20:18 ` [PATCH v6 14/17] fsnotify: generate pre-content permission event on page fault Josef Bacik
2024-11-11 20:18 ` [PATCH v6 15/17] xfs: add pre-content fsnotify hook for write faults Josef Bacik
2024-11-11 20:18 ` [PATCH v6 16/17] btrfs: disable defrag on pre-content watched files Josef Bacik
2024-11-11 20:18 ` [PATCH v6 17/17] fs: enable pre-content events on supported file systems Josef Bacik
2024-11-11 20:27 ` [PATCH v6 00/17] fanotify: add pre-content hooks Amir Goldstein
2024-11-11 21:55 ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxgxtQhe_3mj5SwH9568xEFsxtNqexLfw9Wx_53LPmyD=Q@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox