From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Ian Kent <raven@themaw.net>, Miklos Szeredi <miklos@szeredi.hu>,
NeilBrown <neil@brown.name>,
Andreas Hindborg <a.hindborg@kernel.org>,
linux-mm@kvack.org, linux-efi@vger.kernel.org,
ocfs2-devel@lists.linux.dev, Kees Cook <kees@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
linux-usb@vger.kernel.org, Paul Moore <paul@paul-moore.com>,
Casey Schaufler <casey@schaufler-ca.com>,
linuxppc-dev@lists.ozlabs.org,
John Johansen <john.johansen@canonical.com>,
selinux@vger.kernel.org,
Christian Borntraeger <borntraeger@linux.ibm.com>,
bpf@vger.kernel.org, Chris Mason <clm@meta.com>
Subject: [git pull][vfs] tree-in-dcache stuff
Date: Thu, 4 Dec 2025 06:47:06 +0000 [thread overview]
Message-ID: <20251204064706.GJ1712166@ZenIV> (raw)
The following changes since commit e9a6fb0bcdd7609be6969112f3fbfcce3b1d4a7c:
Linux 6.18-rc5 (2025-11-09 15:10:19 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git tags/pull-persistency
for you to fetch changes up to eb028c33451af08bb34f45c6be6967ef1c98cbd1:
d_make_discardable(): warn if given a non-persistent dentry (2025-11-17 23:59:27 -0500)
Conflicts in Documentation/filesystems/porting.rst, fs/dcache.c,
fs/debugfs/inode.c and security/apparmor/apparmorfs.c; proposed resolution
in #proposed-merge (same as in -next). Details of conflicts:
Documentation/filesystems/porting.rst:
usual trivial append/append conflict
fs/dcache.c in d_instantiate_new():
pulling spin_unlock(&entry->d_lock) from __d_instantiate() into
callers in this branch vs. changes of the lines immediately
following the call of __d_instantiate() in mainline. Obvious
resolution.
fs/debugfs/inode.c in __debugfs_create_file(), debugfs_create_dir()
and debugfs_create_automount():
removal of pointless check in this branch vs. changes of
(unreachable) failure handling in mainline.
Resolution: remove it, it's still a dead code.
fs/debugfs/inode.c in debugfs_create_symlink():
d_instantiate() -> d_make_persistent() in this branch,
end_creating() -> debugfs_end_creating() in mainline.
Obvious resolution...
security/apparmor/apparmorfs.c in aafs_remove():
switch of simple_rmdir() and simple_unlink() to
__simple_{rmdir,unlink}() in this branch vs. switch to
start_removing_dentry()/end_removing() around those calls
in mainline. Resolution: switch simple_{rmdir,unlink}() as in
this branch. TBH, I rather doubt that mainline change there was
worthwhile - aafs interaction with VFS locking is... special,
and _ANY_ changes of locking protocol on our end will have to
consider that location very carefully anyway; making it look
like a regular removal is at that stage is asking for trouble.
----------------------------------------------------------------
Some filesystems use a kinda-sorta controlled dentry refcount leak to pin
dentries of created objects in dcache (and undo it when removing those).
Reference is grabbed and not released, but it's not actually _stored_
anywhere. That works, but it's hard to follow and verify; among other
things, we have no way to tell _which_ of the increments is intended
to be an unpaired one. Worse, on removal we need to decide whether
the reference had already been dropped, which can be non-trivial if
that removal is on umount and we need to figure out if this dentry is
pinned due to e.g. unlink() not done. Usually that is handled by using
kill_litter_super() as ->kill_sb(), but there are open-coded special
cases of the same (consider e.g. /proc/self).
Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT)
marking those "leaked" dentries. Having it set claims responsibility
for +1 in refcount.
The end result this series is aiming for:
* get these unbalanced dget() and dput() replaced with new primitives that
would, in addition to adjusting refcount, set and clear persistency flag.
* instead of having kill_litter_super() mess with removing the remaining
"leaked" references (e.g. for all tmpfs files that hadn't been removed
prior to umount), have the regular shrink_dcache_for_umount() strip
DCACHE_PERSISTENT of all dentries, dropping the corresponding
reference if it had been set. After that kill_litter_super() becomes
an equivalent of kill_anon_super().
Doing that in a single step is not feasible - it would affect too many places
in too many filesystems. It has to be split into a series.
This work has really started early in 2024; quite a few preliminary pieces
have already gone into mainline. This chunk is finally getting to the
meat of that stuff - infrastructure and most of the conversions to it.
Some pieces are still sitting in the local branches, but the bulk of
that stuff is here.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
----------------------------------------------------------------
Al Viro (54):
fuse_ctl_add_conn(): fix nlink breakage in case of early failure
tracefs: fix a leak in eventfs_create_events_dir()
new helper: simple_remove_by_name()
new helper: simple_done_creating()
introduce a flag for explicitly marking persistently pinned dentries
primitives for maintaining persisitency
convert simple_{link,unlink,rmdir,rename,fill_super}() to new primitives
convert ramfs and tmpfs
procfs: make /self and /thread_self dentries persistent
configfs, securityfs: kill_litter_super() not needed
convert xenfs
convert smackfs
convert hugetlbfs
convert mqueue
convert bpf
convert dlmfs
convert fuse_ctl
convert pstore
convert tracefs
convert debugfs
debugfs: remove duplicate checks in callers of start_creating()
convert efivarfs
convert spufs
convert ibmasmfs
ibmasmfs: get rid of ibmasmfs_dir_ops
convert devpts
binderfs: use simple_start_creating()
binderfs_binder_ctl_create(): kill a bogus check
convert binderfs
autofs_{rmdir,unlink}: dentry->d_fsdata->dentry == dentry there
convert autofs
convert binfmt_misc
selinuxfs: don't stash the dentry of /policy_capabilities
selinuxfs: new helper for attaching files to tree
convert selinuxfs
functionfs: don't abuse ffs_data_closed() on fs shutdown
functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}()
functionfs: need to cancel ->reset_work in ->kill_sb()
functionfs: fix the open/removal races
functionfs: switch to simple_remove_by_name()
convert functionfs
gadgetfs: switch to simple_remove_by_name()
convert gadgetfs
hypfs: don't pin dentries twice
hypfs: switch hypfs_create_str() to returning int
hypfs: swich hypfs_create_u64() to returning int
convert hypfs
convert rpc_pipefs
convert nfsctl
convert rust_binderfs
get rid of kill_litter_super()
convert securityfs
kill securityfs_recursive_remove()
d_make_discardable(): warn if given a non-persistent dentry
Documentation/filesystems/porting.rst | 7 ++
arch/powerpc/platforms/cell/spufs/inode.c | 17 ++-
arch/s390/hypfs/hypfs.h | 6 +-
arch/s390/hypfs/hypfs_diag_fs.c | 60 ++++------
arch/s390/hypfs/hypfs_vm_fs.c | 21 ++--
arch/s390/hypfs/inode.c | 82 +++++--------
drivers/android/binder/rust_binderfs.c | 121 ++++++-------------
drivers/android/binderfs.c | 82 +++----------
drivers/base/devtmpfs.c | 2 +-
drivers/misc/ibmasm/ibmasmfs.c | 24 ++--
drivers/usb/gadget/function/f_fs.c | 144 +++++++++++++----------
drivers/usb/gadget/legacy/inode.c | 49 ++++----
drivers/xen/xenfs/super.c | 2 +-
fs/autofs/inode.c | 2 +-
fs/autofs/root.c | 11 +-
fs/binfmt_misc.c | 69 ++++++-----
fs/configfs/dir.c | 10 +-
fs/configfs/inode.c | 3 +-
fs/configfs/mount.c | 2 +-
fs/dcache.c | 111 +++++++++++-------
fs/debugfs/inode.c | 32 ++----
fs/devpts/inode.c | 57 ++++-----
fs/efivarfs/inode.c | 7 +-
fs/efivarfs/super.c | 5 +-
fs/fuse/control.c | 38 +++---
fs/hugetlbfs/inode.c | 12 +-
fs/internal.h | 1 -
fs/libfs.c | 52 +++++++--
fs/nfsd/nfsctl.c | 18 +--
fs/ocfs2/dlmfs/dlmfs.c | 8 +-
fs/proc/base.c | 6 +-
fs/proc/internal.h | 1 +
fs/proc/root.c | 14 +--
fs/proc/self.c | 10 +-
fs/proc/thread_self.c | 11 +-
fs/pstore/inode.c | 7 +-
fs/ramfs/inode.c | 8 +-
fs/super.c | 8 --
fs/tracefs/event_inode.c | 7 +-
fs/tracefs/inode.c | 13 +--
include/linux/dcache.h | 4 +-
include/linux/fs.h | 6 +-
include/linux/proc_fs.h | 2 -
include/linux/security.h | 2 -
init/do_mounts.c | 2 +-
ipc/mqueue.c | 12 +-
kernel/bpf/inode.c | 15 +--
mm/shmem.c | 38 ++----
net/sunrpc/rpc_pipe.c | 27 ++---
security/apparmor/apparmorfs.c | 13 ++-
security/inode.c | 35 +++---
security/selinux/selinuxfs.c | 185 +++++++++++++-----------------
security/smack/smackfs.c | 2 +-
53 files changed, 649 insertions(+), 834 deletions(-)
next reply other threads:[~2025-12-04 6:47 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-04 6:47 Al Viro [this message]
2025-12-06 0:16 ` pr-tracker-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251204064706.GJ1712166@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=a.hindborg@kernel.org \
--cc=borntraeger@linux.ibm.com \
--cc=bpf@vger.kernel.org \
--cc=brauner@kernel.org \
--cc=casey@schaufler-ca.com \
--cc=clm@meta.com \
--cc=gregkh@linuxfoundation.org \
--cc=jack@suse.cz \
--cc=john.johansen@canonical.com \
--cc=kees@kernel.org \
--cc=linux-efi@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-usb@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=miklos@szeredi.hu \
--cc=neil@brown.name \
--cc=ocfs2-devel@lists.linux.dev \
--cc=paul@paul-moore.com \
--cc=raven@themaw.net \
--cc=rostedt@goodmis.org \
--cc=selinux@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox