From: Jan Kara
To: Josef Bacik
Cc: kernel-team@fb.com, linux-fsdevel@vger.kernel.org, jack@suse.cz,
 amir73il@gmail.com, brauner@kernel.org, linux-xfs@vger.kernel.org,
 linux-bcachefs@vger.kernel.org, linux-btrfs@vger.kernel.org,
 linux-mm@kvack.org
Date: Thu, 5 Sep 2024 14:08:08 +0200
Subject: Re: [PATCH v5 00/18] fanotify: add pre-content hooks
Message-ID: <20240905120808.7fcsnv7nslqsq4t6@quack3>

Hello!

On Wed 04-09-24 16:27:50, Josef Bacik wrote:
> These are the patches for the bare bones pre-content fanotify support. The
> majority of this work is Amir's, my contribution to this has solely been around
> adding the page fault hooks, testing and validating everything. I'm sending it
> because Amir is traveling a bunch, and I touched it last so I'm going to take
> all the hate and he can take all the credit.
>
> There is a PoC that I've been using to validate this work, you can find the git
> repo here
>
> https://github.com/josefbacik/remote-fetch

The test tool seems to be a bit outdated wrt the current series. It took me
quite a while to debug why HSM wasn't working with it (eventually I tracked
it down to the changes in struct fanotify_event_info_range...). Anyway, all
seems to be working (after fixing up some missing export). I've pushed out
the result I have to:

https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify

and will push it to linux-next as well so that it gets some soaking before
the merge window. That being said, I'd still like to get an explicit ack from
the XFS folks (hint), so the patches may still be rebased because of that.

								Honza

> This consists of 3 different tools.
>
> 1. populate. This just creates all the stub files in the directory from the
>    source directory. Just run ./populate ~/linux ~/hsm-linux and it'll
>    recursively create all of the stub files and directories.
> 2. remote-fetch. This is the actual PoC, you just point it at the source and
>    destination directory and then you can do whatever. ./remote-fetch ~/linux
>    ~/hsm-linux.
> 3. mmap-validate. This was to validate the pagefault thing, this is likely what
>    will be turned into the selftest with remote-fetch. It creates a file and
>    then you can validate the file matches the right pattern with both normal
>    reads and mmap.
>    Normally I do something like
>
>    ./mmap-validate create ~/src/foo
>    ./populate ~/src ~/dst
>    ./remote-fetch ~/src ~/dst
>    ./mmap-validate validate ~/dst/foo
>
> I did a bunch of testing, I also got some performance numbers. I copied a
> kernel tree, and then did remote-fetch, and then make -j4
>
> Normal
> real    9m49.709s
> user    28m11.372s
> sys     4m57.304s
>
> HSM
> real    10m6.454s
> user    29m10.517s
> sys     5m2.617s
>
> So ~17 seconds more to build with HSM. I then did a make mrproper on both trees
> to see the size
>
> [root@fedora ~]# du -hs /src/linux
> 1.6G    /src/linux
> [root@fedora ~]# du -hs dst
> 125M    dst
>
> This mirrors the sort of savings we've seen in production.
>
> Meta has had these patches (minus the page fault patch) deployed in production
> for almost a year with our own utility for doing on-demand package fetching.
> The savings from this has been pretty significant.
>
> The page-fault hooks are necessary for the last thing we need, which is
> on-demand range fetching of executables. Some of our binaries are several gigs
> large, having the ability to remote fetch them on demand is a huge win for us
> not only with space savings, but with startup time of containers.
>
> There will be tests for this going into LTP once we're satisfied with the
> patches and they're on their way upstream.
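
For reference, the build-time and disk-usage figures quoted above can be
checked with a few lines of Python. This is only a sketch over the numbers as
posted: the helper name to_seconds is made up for illustration, and the du
values are already rounded, so the space saving is approximate. It works out
to roughly 16.7 seconds (about 2.8%) of extra wall-clock time and a stub tree
that is roughly 92% smaller than the source tree.

```python
def to_seconds(stamp):
    """Convert a time(1)-style stamp such as '9m49.709s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# "real" times from the two make -j4 runs quoted above.
normal = to_seconds("9m49.709s")
hsm = to_seconds("10m6.454s")

delta = hsm - normal                 # extra wall-clock seconds with HSM
overhead_pct = 100 * delta / normal  # relative build-time overhead

# du -hs output after make mrproper; du -h uses powers of 1024,
# and both values are rounded by du, so this is an estimate.
src_mib = 1.6 * 1024                 # 1.6G source tree
dst_mib = 125                        # 125M populated stub tree
saving_pct = 100 * (1 - dst_mib / src_mib)

print(f"extra build time: {delta:.3f}s ({overhead_pct:.1f}%)")
print(f"disk space saved: {saving_pct:.1f}%")
```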
>
> Thanks,
>
> Josef
>
> Amir Goldstein (8):
>   fsnotify: introduce pre-content permission event
>   fsnotify: generate pre-content permission event on open
>   fanotify: introduce FAN_PRE_ACCESS permission event
>   fanotify: introduce FAN_PRE_MODIFY permission event
>   fanotify: pass optional file access range in pre-content event
>   fanotify: rename a misnamed constant
>   fanotify: report file range info with pre-content events
>   fanotify: allow to set errno in FAN_DENY permission response
>
> Josef Bacik (10):
>   fanotify: don't skip extra event info if no info_mode is set
>   fs: add a flag to indicate the fs supports pre-content events
>   fanotify: add a helper to check for pre content events
>   fanotify: disable readahead if we have pre-content watches
>   mm: don't allow huge faults for files with pre content watches
>   fsnotify: generate pre-content permission event on page fault
>   bcachefs: add pre-content fsnotify hook to fault
>   xfs: add pre-content fsnotify hook for write faults
>   btrfs: disable defrag on pre-content watched files
>   fs: enable pre-content events on supported file systems
>
>  fs/bcachefs/fs-io-pagecache.c      |   4 +
>  fs/bcachefs/fs.c                   |   2 +-
>  fs/btrfs/ioctl.c                   |   9 ++
>  fs/btrfs/super.c                   |   3 +-
>  fs/ext4/super.c                    |   6 +-
>  fs/namei.c                         |   9 ++
>  fs/notify/fanotify/fanotify.c      |  33 ++++++--
>  fs/notify/fanotify/fanotify.h      |  15 ++++
>  fs/notify/fanotify/fanotify_user.c | 119 ++++++++++++++++++++++-----
>  fs/notify/fsnotify.c               |  17 +++-
>  fs/xfs/xfs_file.c                  |   4 +
>  fs/xfs/xfs_super.c                 |   2 +-
>  include/linux/fanotify.h           |  20 +++--
>  include/linux/fs.h                 |   1 +
>  include/linux/fsnotify.h           |  58 +++++++++++--
>  include/linux/fsnotify_backend.h   |  59 ++++++++++++-
>  include/linux/mm.h                 |   1 +
>  include/uapi/linux/fanotify.h      |  18 ++++
>  mm/filemap.c                       | 128 +++++++++++++++++++++++++++--
>  mm/memory.c                        |  22 +++++
>  mm/readahead.c                     |  13 +++
>  security/selinux/hooks.c           |   3 +-
>  22 files changed, 489 insertions(+), 57 deletions(-)
>
> --
> 2.43.0
>
--
Jan Kara
SUSE Labs, CR