From: Jan Kara <jack@suse.cz>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, Jeff Layton <jlayton@kernel.org>,
Josef Bacik <josef@toxicpanda.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Jan Kara <jack@suse.cz>,
linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Tejun Heo <tj@kernel.org>, Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Jann Horn <jannh@google.com>,
netdev@vger.kernel.org
Subject: Re: [PATCH 05/14] pidfs: adapt to rhashtable-based simple_xattrs
Date: Fri, 27 Feb 2026 16:09:15 +0100
Message-ID: <qxctwu77wp7gv4ua3hn6kg7r2vt57laomn3ebjisemzzaybagy@mvoo2wpvu2ux>
In-Reply-To: <20260216-work-xattr-socket-v1-5-c2efa4f74cb7@kernel.org>
On Mon 16-02-26 14:32:01, Christian Brauner wrote:
> Adapt pidfs to use the rhashtable-based xattr path by switching from a
> dedicated slab cache to simple_xattrs_alloc().
>
> Previously pidfs used a custom kmem_cache (pidfs_xattr_cachep) that
> allocated a struct containing an embedded simple_xattrs plus
> simple_xattrs_init(). Replace this with simple_xattrs_alloc() which
> combines kzalloc + rhashtable_init, and drop the dedicated slab cache
> entirely.
>
> Use simple_xattr_free_rcu() for replaced xattr entries to allow
> concurrent RCU readers to finish.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
One question below:
> +static LLIST_HEAD(pidfs_free_list);
> +
> +static void pidfs_free_attr_work(struct work_struct *work)
> +{
> +	struct pidfs_attr *attr, *next;
> +	struct llist_node *head;
> +
> +	head = llist_del_all(&pidfs_free_list);
> +	llist_for_each_entry_safe(attr, next, head, pidfs_llist) {
> +		struct simple_xattrs *xattrs = attr->xattrs;
> +
> +		if (xattrs) {
> +			simple_xattrs_free(xattrs, NULL);
> +			kfree(xattrs);
> +		}
> +		kfree(attr);
> +	}
> +}
> +
> +static DECLARE_WORK(pidfs_free_work, pidfs_free_attr_work);
> +
So you postpone the freeing to scheduled work because put_pid() can be
called from a context where taking the RCU read lock to iterate the
rhashtable would not be possible? Frankly, I have a hard time imagining
such a context (one where the previous rbtree code wouldn't have had issues
as well), in particular because AFAIR RCU read-side critical sections are
safe to nest arbitrarily. What am I missing?
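
For readers following along, the deferral pattern in the hunk above can be
sketched in userspace. This is only an illustration under assumptions: C11
atomics stand in for the kernel's llist_add()/llist_del_all(), a counter
stands in for schedule_work(), and all names here (list_add, list_del_all,
defer_free_attr, run_free_work) are hypothetical, not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for llist_node / llist_head. */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Push @n; return true if the list was empty before (as llist_add() does). */
static bool list_add(struct node *n, struct list_head *h)
{
	struct node *first = atomic_load(&h->first);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));

	return first == NULL;
}

/* Atomically detach the whole list (as llist_del_all() does). */
static struct node *list_del_all(struct list_head *h)
{
	return atomic_exchange(&h->first, NULL);
}

struct attr {
	struct node llist;	/* first member, so a node pointer is an attr pointer */
	void *xattrs;
};

static struct list_head free_list;
static int work_scheduled;	/* stand-in for schedule_work() calls */

/* Mirrors the shape of the new pidfs_free_pid(): fast path or deferral. */
static void defer_free_attr(struct attr *a)
{
	if (!a->xattrs) {
		free(a);
		return;
	}
	/* Only the push that made the list non-empty schedules the work. */
	if (list_add(&a->llist, &free_list))
		work_scheduled++;
}

/* Mirrors the shape of pidfs_free_attr_work(); returns the number freed. */
static int run_free_work(void)
{
	struct node *n = list_del_all(&free_list);
	int freed = 0;

	while (n) {
		struct node *next = n->next;	/* read before freeing */
		struct attr *a = (struct attr *)n;

		free(a->xattrs);
		free(a);
		freed++;
		n = next;
	}
	return freed;
}
```

The point of the "was empty" return value is that several deferred frees
queued before the work runs cost a single scheduling request; the sketch
says nothing about whether the deferral is needed in the first place.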
Honza
>  void pidfs_free_pid(struct pid *pid)
>  {
> -	struct pidfs_attr *attr __free(kfree) = no_free_ptr(pid->attr);
> -	struct simple_xattrs *xattrs __free(kfree) = NULL;
> +	struct pidfs_attr *attr = pid->attr;
>
>  	/*
>  	 * Any dentry must've been wiped from the pid by now.
> @@ -169,9 +196,10 @@ void pidfs_free_pid(struct pid *pid)
>  	if (IS_ERR(attr))
>  		return;
>
> -	xattrs = no_free_ptr(attr->xattrs);
> -	if (xattrs)
> -		simple_xattrs_free(xattrs, NULL);
> +	if (likely(!attr->xattrs))
> +		kfree(attr);
> +	else if (llist_add(&attr->pidfs_llist, &pidfs_free_list))
> +		schedule_work(&pidfs_free_work);
>  }
>
> #ifdef CONFIG_PROC_FS
> @@ -998,7 +1026,7 @@ static int pidfs_xattr_get(const struct xattr_handler *handler,
>
>  	xattrs = READ_ONCE(attr->xattrs);
>  	if (!xattrs)
> -		return 0;
> +		return -ENODATA;
>
>  	name = xattr_full_name(handler, suffix);
>  	return simple_xattr_get(xattrs, name, value, size);
> @@ -1018,22 +1046,16 @@ static int pidfs_xattr_set(const struct xattr_handler *handler,
>  	/* Ensure we're the only one to set @attr->xattrs. */
>  	WARN_ON_ONCE(!inode_is_locked(inode));
>
> -	xattrs = READ_ONCE(attr->xattrs);
> -	if (!xattrs) {
> -		xattrs = kmem_cache_zalloc(pidfs_xattr_cachep, GFP_KERNEL);
> -		if (!xattrs)
> -			return -ENOMEM;
> -
> -		simple_xattrs_init(xattrs);
> -		smp_store_release(&pid->attr->xattrs, xattrs);
> -	}
> +	xattrs = simple_xattrs_lazy_alloc(&attr->xattrs, value, flags);
> +	if (IS_ERR_OR_NULL(xattrs))
> +		return PTR_ERR(xattrs);
>
>  	name = xattr_full_name(handler, suffix);
>  	old_xattr = simple_xattr_set(xattrs, name, value, size, flags);
>  	if (IS_ERR(old_xattr))
>  		return PTR_ERR(old_xattr);
>
> -	simple_xattr_free(old_xattr);
> +	simple_xattr_free_rcu(old_xattr);
>  	return 0;
>  }
>
> @@ -1108,11 +1130,6 @@ void __init pidfs_init(void)
>  					 (SLAB_HWCACHE_ALIGN | SLAB_RECLAIM_ACCOUNT |
>  					  SLAB_ACCOUNT | SLAB_PANIC), NULL);
>
> -	pidfs_xattr_cachep = kmem_cache_create("pidfs_xattr_cache",
> -					 sizeof(struct simple_xattrs), 0,
> -					 (SLAB_HWCACHE_ALIGN | SLAB_RECLAIM_ACCOUNT |
> -					  SLAB_ACCOUNT | SLAB_PANIC), NULL);
> -
>  	pidfs_mnt = kern_mount(&pidfs_type);
>  	if (IS_ERR(pidfs_mnt))
>  		panic("Failed to mount pidfs pseudo filesystem");
>
> --
> 2.47.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR