From: "Darrick J. Wong" <djwong@kernel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
Josef Bacik <josef@toxicpanda.com>,
Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Tejun Heo <tj@kernel.org>, Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Jann Horn <jannh@google.com>,
netdev@vger.kernel.org
Subject: Re: [PATCH 00/14] xattr: rework simple xattrs and support user.* xattrs on sockets
Date: Fri, 20 Feb 2026 16:14:57 -0800 [thread overview]
Message-ID: <20260221001457.GC11076@frogsfrogsfrogs> (raw)
In-Reply-To: <20260220-biobauer-beilhieb-2e792f9453cf@brauner>
On Fri, Feb 20, 2026 at 10:23:55AM +0100, Christian Brauner wrote:
> On Thu, Feb 19, 2026 at 04:44:54PM -0800, Darrick J. Wong wrote:
> > On Mon, Feb 16, 2026 at 02:31:56PM +0100, Christian Brauner wrote:
> > > Hey,
> > >
> > > This reworks the simple_xattr infrastructure and adds support for
> > > user.* extended attributes on sockets.
> > >
> > > The simple_xattr subsystem currently uses an rbtree protected by a
> > > reader-writer spinlock. This series replaces the rbtree with an
> > > rhashtable giving O(1) average-case lookup with RCU-based lockless
> > > reads. This sped up concurrent access patterns on tmpfs quite a bit and
> > > it's an overall easy enough conversion to do and gets rid or rwlock_t.
> > >
> > > The conversion is done incrementally: a new rhashtable path is added
> > > alongside the existing rbtree, consumers are migrated one at a time
> > > (shmem, kernfs, pidfs), and then the rbtree code is removed. All three
> > > consumers switch from embedded structs to pointer-based lazy allocation
> > > so the rhashtable overhead is only paid for inodes that actually use
> > > xattrs.
> >
> > Patches 1-6 look ok to me, at least in the sense that nothing stood out
> > to me as obviously wrong, so
> > Acked-by: "Darrick J. Wong" <djwong@kernel.org>
> >
> > > With this infrastructure in place the series adds support for user.*
> > > xattrs on sockets. Path-based AF_UNIX sockets inherit xattr support
> > > from the underlying filesystem (e.g. tmpfs) but sockets in sockfs -
> > > that is everything created via socket() including abstract namespace
> > > AF_UNIX sockets - had no xattr support at all.
> > >
> > > The xattr_permission() checks are reworked to allow user.* xattrs on
> > > S_IFSOCK inodes. Sockfs sockets get per-inode limits of 128 xattrs and
> > > 128KB total value size matching the limits already in use for kernfs.
> > >
> > > The practical motivation comes from several directions. systemd and
> > > GNOME are expanding their use of Varlink as an IPC mechanism. For D-Bus
> > > there are tools like dbus-monitor that can observe IPC traffic across
> > > the system but this only works because D-Bus has a central broker. For
> > > Varlink there is no broker and there is currently no way to identify
> >
> > Hum. I suppose there's never going to be a central varlink broker, is
> > there? That doesn't sound great for discoverability, unless the plan is
>
> Varlink was explicitly designed to avoid having to have a broker.
> Practically it would have been one option to have a a central registry
> maintained as a bpf socket map. My naive take had always been something
> like: systemd can have a global socket map. sockets are picked up
> whenver the appropriate xattr is set and deleted from the map once the
> socket goes away (or the xattr is unset). Right now this is something
> that would require capabilities. Once signed bpf is more common it is
> easy to load that on per-container basis. But...
>
> > to try to concentrate them in (say) /run/varlink? But even then, could
>
> ... the future is already here :)
>
> https://github.com/systemd/systemd/pull/40590
>
> All public varlink services that are supposed to be announced are now
> symlinked into:
>
> /run/varlink/registry
>
> There are of-course non-public interfaces such as the interface
> between PID 1 and oomd. Such interfaces are not exposed.
>
> It's also possible to have per user registries at e.g.:
>
> /run/user/1000/varlink/registry/
>
> Such varlink services can now also be listed via:
>
> valinkctl list-services
>
> This then ties very neatly into the varlink bridge we're currently
> building:
>
> https://github.com/mvo5/varlink-http-bridge
>
> It takes a directory with varlink sockets (or symlinks to varlink
> sockets) like /run/varlink/registry as the argument and will serve
> whatever it finds in there. Sockets can be added or removed dynamically
> in the dir as needed:
>
> curl -s http://localhost:8080/sockets | jq
> {
> "sockets": [
> "io.systemd.Login",
> "io.systemd.Hostname",
> "io.systemd.sysext",
> "io.systemd.BootControl",
> "io.systemd.Import",
> "io.systemd.Repart",
> "io.systemd.MuteConsole",
> "io.systemd.FactoryReset",
> "io.systemd.Credentials",
> "io.systemd.AskPassword",
> "io.systemd.Manager",
> "io.systemd.ManagedOOM"
> ]
> }
>
> The xattrs allow to have a completely global view of such services and
> the per-user sessions all have their own sub-view.
>
> > you have N services that share the same otherwise private tmpfs in order
> > to talk to each other via a varlink socket? I suppose in that case, the
>
> Yeah sure that's one way.
>
> > N services probably don't care/want others to discover their socket.
> >
> > > which sockets speak Varlink. With user.* xattrs on sockets a service
> > > can label its socket with the IPC protocol it speaks (e.g.,
> > > user.varlink=1) and an eBPF program can then selectively capture
> >
> > Who gets to set xattrs? Can a malicious varlink socket user who has
> > connect() abilities also delete user.varlink to mess with everyone who
> > comes afterwards?
>
> The main focus is AF_UNIX sockets of course so a varlink service does:
>
> fd = socket(AF_UNIX)
> umask(0117);
> bind(fd, "/run/foobar");
> umask(original_umask);
> chown("/run/foobar", -1, MYACCESSGID);
> setxattr("/run/foobar", "user.varlink", "1");
>
> For non-path based sockets the inodes for client and server are
> inherently distinct so they cannot interfer with each other. But even
> then a chmod() + chown(-1, MYACCESSGID) on the sockfs socket fd will
> protect this.
>
> Thanks for the review. Please keep going. :)
The rest look fine too, modulo my comments about the fixed limits.
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
--D
prev parent reply other threads:[~2026-02-21 0:15 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-16 13:31 Christian Brauner
2026-02-16 13:31 ` [PATCH 01/14] xattr: add rcu_head and rhash_head to struct simple_xattr Christian Brauner
2026-02-16 13:31 ` [PATCH 02/14] xattr: add rhashtable-based simple_xattr infrastructure Christian Brauner
2026-02-16 13:31 ` [PATCH 03/14] shmem: adapt to rhashtable-based simple_xattrs with lazy allocation Christian Brauner
2026-02-16 13:32 ` [PATCH 04/14] kernfs: " Christian Brauner
2026-02-16 13:32 ` [PATCH 05/14] pidfs: adapt to rhashtable-based simple_xattrs Christian Brauner
2026-02-16 13:32 ` [PATCH 06/14] xattr: remove rbtree-based simple_xattr infrastructure Christian Brauner
2026-02-16 13:32 ` [PATCH 07/14] xattr: add xattr_permission_error() Christian Brauner
2026-02-16 13:32 ` [PATCH 08/14] xattr: switch xattr_permission() to switch statement Christian Brauner
2026-02-16 13:32 ` [PATCH 09/14] xattr: move user limits for xattrs to generic infra Christian Brauner
2026-02-21 0:03 ` Darrick J. Wong
2026-02-16 13:32 ` [PATCH 10/14] xattr,net: support limited amount of extended attributes on sockfs sockets Christian Brauner
2026-02-16 13:32 ` [PATCH 11/14] xattr: support extended attributes on sockets Christian Brauner
2026-02-16 13:32 ` [PATCH 12/14] selftests/xattr: path-based AF_UNIX socket xattr tests Christian Brauner
2026-02-16 13:32 ` [PATCH 13/14] selftests/xattr: sockfs " Christian Brauner
2026-02-16 13:32 ` [PATCH 14/14] selftests/xattr: test xattrs on various socket families Christian Brauner
2026-02-20 0:44 ` [PATCH 00/14] xattr: rework simple xattrs and support user.* xattrs on sockets Darrick J. Wong
2026-02-20 9:23 ` Christian Brauner
2026-02-21 0:14 ` Darrick J. Wong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260221001457.GC11076@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=brauner@kernel.org \
--cc=edumazet@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=jannh@google.com \
--cc=jlayton@kernel.org \
--cc=josef@toxicpanda.com \
--cc=kuba@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
--cc=tj@kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox