Date: Wed, 16 Oct 2024 15:00:55 +0200
From: Christian Brauner
To: Lorenzo Stoakes
Cc: Christian Brauner, Shuah Khan, "Liam R . Howlett", Suren Baghdasaryan, Vlastimil Babka, pedro.falcato@gmail.com, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] pidfd: extend pidfd_get_pid() and de-duplicate pid lookup
Message-ID: <20241016-beinbruch-zeltplatz-4bfdedca1ee8@brauner>
References: <8e7edaf2f648fb01a71def749f17f76c0502dee1.1728643714.git.lorenzo.stoakes@oracle.com>
In-Reply-To: <8e7edaf2f648fb01a71def749f17f76c0502dee1.1728643714.git.lorenzo.stoakes@oracle.com>

On Fri, Oct 11, 2024 at 12:05:55PM +0100, Lorenzo Stoakes wrote:
> The means by which a pid is determined from a pidfd is duplicated, with
> some callers holding a reference to the (pid)fd, and others explicitly
> pinning the pid.
>
> Introduce __pidfd_get_pid() which abstracts both approaches and provide
> optional output parameters for file->f_flags and the fd (the latter of
> which, if provided, prevents the function from decrementing the fd's
> reference count).
>
> Additionally, allow the ability to open a pidfd by opening a /proc/
> directory, utilised by the pidfd_send_signal() system call, providing a
> pidfd_get_pid_proc() helper function to do so.
>
> Doing this allows us to eliminate open-coded pidfd pid lookup and to
> consistently handle this in one place.
>
> This lays the groundwork for a subsequent patch which adds a new sentinel
> pidfd to explicitly reference the current process (i.e. thread group
> leader) without the need for a pidfd.
>
> Signed-off-by: Lorenzo Stoakes
> ---
>  include/linux/pid.h | 42 +++++++++++++++++++++++++++++++-
>  kernel/pid.c        | 58 ++++++++++++++++++++++++++++++---------------
>  kernel/signal.c     | 22 ++++-------------
>  3 files changed, 84 insertions(+), 38 deletions(-)
>
> diff --git a/include/linux/pid.h b/include/linux/pid.h
> index a3aad9b4074c..68b02eab7509 100644
> --- a/include/linux/pid.h
> +++ b/include/linux/pid.h
> @@ -2,6 +2,7 @@
>  #ifndef _LINUX_PID_H
>  #define _LINUX_PID_H
>
> +#include
>  #include
>  #include
>  #include
> @@ -72,8 +73,47 @@ extern struct pid init_struct_pid;
>
>  struct file;
>
> +
> +/**
> + * __pidfd_get_pid() - Retrieve a pid associated with the specified pidfd.
> + *
> + * @pidfd:      The pidfd whose pid we want, or the fd of a /proc/ file if
> + *              @alloc_proc is also set.
> + * @pin_pid:    If set, then the reference counter of the returned pid is
> + *              incremented. If not set, then @fd should be provided to pin the
> + *              pidfd.
> + * @allow_proc: If set, then an fd of a /proc/ file can be passed instead
> + *              of a pidfd, and this will be used to determine the pid.
> + * @flags:      Output variable, if non-NULL, then the file->f_flags of the
> + *              pidfd will be set here.
> + * @fd:         Output variable, if non-NULL, then the pidfd reference will
> + *              remain elevated and the caller will need to decrement it
> + *              themselves.
> + *
> + * Returns: If successful, the pid associated with the pidfd, otherwise an
> + *          error.
> + */
> +struct pid *__pidfd_get_pid(unsigned int pidfd, bool pin_pid,
> +			    bool allow_proc, unsigned int *flags,
> +			    struct fd *fd);
> +
> +static inline struct pid *pidfd_get_pid(unsigned int pidfd, unsigned int *flags)
> +{
> +	return __pidfd_get_pid(pidfd, /* pin_pid = */ true,
> +			       /* allow_proc = */ false,
> +			       flags, /* fd = */ NULL);
> +}
> +
> +static inline struct pid *pidfd_to_pid_proc(unsigned int pidfd,
> +					    unsigned int *flags,
> +					    struct fd *fd)
> +{
> +	return __pidfd_get_pid(pidfd, /* pin_pid = */ false,
> +			       /* allow_proc = */ true,
> +			       flags, fd);
> +}
> +
>  struct pid *pidfd_pid(const struct file *file);
> -struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
>  struct task_struct *pidfd_get_task(int pidfd, unsigned int *flags);
>  int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret);
>  void do_notify_pidfd(struct task_struct *task);
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 2715afb77eab..25cc1c36a1b1 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -36,6 +36,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -534,22 +535,46 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
>  }
>  EXPORT_SYMBOL_GPL(find_ge_pid);
>
> -struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
> +struct pid *__pidfd_get_pid(unsigned int pidfd, bool pin_pid,
> +			    bool allow_proc, unsigned int *flags,
> +			    struct fd *fd)

Hm, we should never return a struct fd. A struct fd is an inherently
scope-bound concept - or at least aims to be. Simply put, we always want
to have the fdget() and the fdput() in the same scope, as the file
pointer you can access via fd_file() is only valid as long as we're in
the syscall.

Ideally we mostly use CLASS(fd/fd_raw) and nearly never fdget(). The
point is that this is the wrong API to expose. It would probably be
wiser if you added a pidfd-based, fdget()-inspired primitive.
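
To illustrate the scoping point, here is a rough userspace sketch, not
kernel code. It assumes a compiler with __attribute__((cleanup)) (GCC or
Clang), which is the mechanism the kernel's CLASS() helpers build on.
Every name in it (mock_file, mock_fd, mock_fdget(), mock_fdput(),
SCOPED_MOCK_FD, mock_pidfd_to_nr()) is a hypothetical stand-in invented
for this example; none of them is an existing kernel interface. The idea
is only that the reference is taken and dropped in one scope, and the
caller gets a plain value back rather than the fd wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file / struct fd; NOT the kernel types. */
struct mock_file {
	int refcount;	/* references currently held on this file */
	int pid_nr;	/* pid number the mock "pidfd" refers to */
};

struct mock_fd {
	struct mock_file *file;	/* NULL if the lookup failed */
};

/* Stand-in for fdget(): take a reference on the file behind @fd, if any. */
static struct mock_fd mock_fdget(struct mock_file *table, size_t n,
				 unsigned int fd)
{
	struct mock_fd f = { .file = NULL };

	if (fd < n) {
		f.file = &table[fd];
		f.file->refcount++;
	}
	return f;
}

/* Stand-in for fdput(): runs automatically when the guard leaves scope. */
static void mock_fdput(struct mock_fd *f)
{
	if (f->file)
		f->file->refcount--;
}

/* Tie the release to scope exit, in the spirit of a CLASS()-style guard. */
#define SCOPED_MOCK_FD(name, table, n, fd)				\
	struct mock_fd name __attribute__((cleanup(mock_fdput))) =	\
		mock_fdget(table, n, fd)

/*
 * Look up the pid number behind a mock pidfd. The reference is acquired
 * and released inside this function; the mock_fd itself never escapes.
 */
static int mock_pidfd_to_nr(struct mock_file *table, size_t n,
			    unsigned int fd)
{
	SCOPED_MOCK_FD(f, table, n, fd);

	if (!f.file)
		return -1;	/* stand-in for an error such as -EBADF */

	assert(f.file->refcount > 0);	/* reference is held while in scope */
	return f.file->pid_nr;		/* mock_fdput() runs after this */
}
```

With this shape there is no output parameter for the fd wrapper at all,
so a caller cannot end up holding the file pointer past the scope in
which the reference is valid.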