Date: Mon, 2 Dec 2024 15:31:13 +0100
From: Christian Brauner
To: Lorenzo Stoakes
Cc: Oleg Nesterov, Christian Brauner, Shuah Khan, "Liam R . Howlett",
 Suren Baghdasaryan, Vlastimil Babka, pedro.falcato@gmail.com,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-kernel@vger.kernel.org, Oliver Sang, John Hubbard
Subject: Re: [PATCH v6 2/5] pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process
Message-ID: <20241202-wahrnehmen-mitten-e330cbd1eaf0@brauner>
References: <8eceec08eb64b744b24bf2aa09d4535e77e1ba47.1729926229.git.lorenzo.stoakes@oracle.com>
 <20241028-gesoffen-drehmoment-5314faba9731@brauner>

On Wed, Oct 30, 2024 at 04:37:37PM +0000, Lorenzo Stoakes wrote:
> On Mon, Oct 28, 2024 at 04:06:07PM +0000, Lorenzo Stoakes wrote:
> > I guess I'll try to adapt that and respin a v7 when I get a chance.
>
> Hm looking at this draft patch, it seems like a total rework of pidfd's
> across the board right (now all pidfd's will need to be converted to
> pid_fd)? Correct me if I'm wrong.
>
> If only for the signal case, it seems like overkill to define a whole
> pid_fd and to use this CLASS() wrapper just for this one instance.
>
> If the intent is to convert _all_ pidfd's to use this type, it feels really
> out of scope for this series and I think we'd probably instead want to go
> off and do that as a separate series and put this on hold until that is
> done.
>
> If instead you mean that we ought to do something like this just for the
> signal case, it feels like it'd be quite a bit of extra abstraction just
> used in this one case but nowhere else, I think if you did an abstraction
> like this it would _have_ to be across the board right?
>
> I agree that the issue is with this one signal case that pins only the fd
> (rather than this pid) where this 'pinning' doesn't _necessary_ mess around
> with reference counts.
>
> So we definitely must address this, but the issue you had with the first
> approach was that I think (correct me if I'm wrong) I was passing a pointer
> to a struct fd which is not permitted right?
>
> Could we pass the struct fd by value to avoid this? I think we'd have to
> unfortunately special-case this and probably duplicate some code which is a
> pity as I liked the idea of abstracting everything to one place, but we can
> obviously do that.
>
> So I guess to TL;DR it, the options are:
>
> 1. Implement pid_fd everywhere, in which case I will leave off on
>    this series and I guess, if I have time I could look at trying to
>    implement that or perhaps you'd prefer to?
>
> 2. We are good for the sake of this series to special-case a pidfd_to_pid()
>    implementation (used only by the pidfd_send_signal() syscall)
>
> 3. Something else, or I am misunderstanding your point :)
>
> Let me know how you want me to proceed on this as we're at v6 already and I
> want to be _really_ sure I'm doing what you want here.

I don't think we get away with abstracting it in one place without this
ending up a pretty janky API. I need to go back and think about calling
conventions for all this stuff.

For now I think I'm fine with something like the below that abstracts the
API to handle mm/ cleanly and then a special-case for pidfd_send_signal():

diff --git a/kernel/pid.c b/kernel/pid.c
index 6131543e7c09..c1857c44d1a3 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -564,15 +564,29 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
  */
 struct task_struct *pidfd_get_task(int pidfd, unsigned int *flags)
 {
-	unsigned int f_flags;
+	unsigned int f_flags = 0;
 	struct pid *pid;
 	struct task_struct *task;
+	enum pid_type type;
 
-	pid = pidfd_get_pid(pidfd, &f_flags);
-	if (IS_ERR(pid))
-		return ERR_CAST(pid);
+	switch (pidfd) {
+	case PIDFD_SELF_THREAD:
+		type = PIDTYPE_PID;
+		pid = get_task_pid(current, type);
+		break;
+	case PIDFD_SELF_THREAD_GROUP:
+		type = PIDTYPE_TGID;
+		pid = get_task_pid(current, type);
+		break;
+	default:
+		pid = pidfd_get_pid(pidfd, &f_flags);
+		if (IS_ERR(pid))
+			return ERR_CAST(pid);
+		type = PIDTYPE_TGID;
+		break;
+	}
 
-	task = get_pid_task(pid, PIDTYPE_TGID);
+	task = get_pid_task(pid, type);
 	put_pid(pid);
 	if (!task)
 		return ERR_PTR(-ESRCH);

That would get you support for PIDFD_SELF_THREAD and PIDFD_SELF_THREAD_GROUP
for process_madvise() and process_mrelease().
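
To make the dispatch in that hunk easier to follow outside the kernel tree,
here is a rough userspace model of it. This is only a sketch, not kernel
code: the model_* names are made up for illustration, and the two sentinel
values are placeholders assumed to be negative numbers that can never be a
valid fd (the real values come from the series itself).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PIDTYPE_PID / PIDTYPE_TGID. */
enum model_pid_type { MODEL_PIDTYPE_PID, MODEL_PIDTYPE_TGID };

/* Placeholder sentinel values (assumption): negative, so never a real fd. */
#define MODEL_PIDFD_SELF_THREAD       (-10000)
#define MODEL_PIDFD_SELF_THREAD_GROUP (-10001)

struct model_lookup {
	enum model_pid_type type; /* which task the pid is resolved against */
	bool needs_fd_lookup;     /* true if the fd table has to be consulted */
};

/*
 * Mirrors the shape of the switch in the draft pidfd_get_task(): the two
 * sentinels resolve against the caller directly, everything else is
 * treated as a real pidfd and keeps the existing thread-group behaviour.
 */
struct model_lookup model_resolve(int pidfd)
{
	struct model_lookup res = { MODEL_PIDTYPE_TGID, true };

	switch (pidfd) {
	case MODEL_PIDFD_SELF_THREAD:
		res.type = MODEL_PIDTYPE_PID;
		res.needs_fd_lookup = false;
		break;
	case MODEL_PIDFD_SELF_THREAD_GROUP:
		res.type = MODEL_PIDTYPE_TGID;
		res.needs_fd_lookup = false;
		break;
	default:
		break;
	}
	return res;
}

/* Small usage example exercising all three branches. */
void model_resolve_self_check(void)
{
	assert(model_resolve(MODEL_PIDFD_SELF_THREAD).type == MODEL_PIDTYPE_PID);
	assert(!model_resolve(MODEL_PIDFD_SELF_THREAD_GROUP).needs_fd_lookup);
	assert(model_resolve(3).needs_fd_lookup);
}
```

The point of the model is that only the default branch touches the fd
table, which is why the mm/ callers can share one helper.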

And for pidfd_send_signal() we could just open code this for now:

diff --git a/kernel/signal.c b/kernel/signal.c
index 989b1cc9116a..a2e4e3a5ee42 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3990,6 +3990,45 @@ static struct pid *pidfd_to_pid(const struct file *file)
 	(PIDFD_SIGNAL_THREAD | PIDFD_SIGNAL_THREAD_GROUP | \
 	 PIDFD_SIGNAL_PROCESS_GROUP)
 
+static int do_pidfd_send_signal(struct pid *pid, int sig, enum pid_type type,
+				siginfo_t __user *info, unsigned int flags)
+{
+	kernel_siginfo_t kinfo;
+
+	switch (flags) {
+	case PIDFD_SIGNAL_THREAD:
+		type = PIDTYPE_PID;
+		break;
+	case PIDFD_SIGNAL_THREAD_GROUP:
+		type = PIDTYPE_TGID;
+		break;
+	case PIDFD_SIGNAL_PROCESS_GROUP:
+		type = PIDTYPE_PGID;
+		break;
+	}
+
+	if (info) {
+		int ret = copy_siginfo_from_user_any(&kinfo, info);
+		if (unlikely(ret))
+			return ret;
+
+		if (unlikely(sig != kinfo.si_signo))
+			return -EINVAL;
+
+		/* Only allow sending arbitrary signals to yourself. */
+		if ((task_pid(current) != pid || type > PIDTYPE_TGID) &&
+		    (kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL))
+			return -EPERM;
+	} else {
+		prepare_kill_siginfo(sig, &kinfo, type);
+	}
+
+	if (type == PIDTYPE_PGID)
+		return kill_pgrp_info(sig, &kinfo, pid);
+
+	return kill_pid_info_type(sig, &kinfo, pid, type);
+}
+
 /**
  * sys_pidfd_send_signal - Signal a process through a pidfd
  * @pidfd:  file descriptor of the process
@@ -4009,7 +4048,6 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
 {
 	int ret;
 	struct pid *pid;
-	kernel_siginfo_t kinfo;
 	enum pid_type type;
 
 	/* Enforce flags be set to 0 until we add an extension. */
@@ -4021,56 +4059,39 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig,
 		return -EINVAL;
 
 	CLASS(fd, f)(pidfd);
-	if (fd_empty(f))
-		return -EBADF;
 
-	/* Is this a pidfd? */
-	pid = pidfd_to_pid(fd_file(f));
-	if (IS_ERR(pid))
-		return PTR_ERR(pid);
+	switch (pidfd) {
+	case PIDFD_SELF_THREAD:
+		pid = get_task_pid(current, PIDTYPE_PID);
+		type = PIDTYPE_PID;
+		break;
+	case PIDFD_SELF_THREAD_GROUP:
+		pid = get_task_pid(current, PIDTYPE_TGID);
+		type = PIDTYPE_TGID;
+		break;
+	default:
+		if (fd_empty(f))
+			return -EBADF;
 
-	if (!access_pidfd_pidns(pid))
-		return -EINVAL;
+		/* Is this a pidfd? */
+		pid = pidfd_to_pid(fd_file(f));
+		if (IS_ERR(pid))
+			return PTR_ERR(pid);
 
-	switch (flags) {
-	case 0:
+		if (!access_pidfd_pidns(pid))
+			return -EINVAL;
 		/* Infer scope from the type of pidfd. */
 		if (fd_file(f)->f_flags & PIDFD_THREAD)
 			type = PIDTYPE_PID;
 		else
 			type = PIDTYPE_TGID;
 		break;
-	case PIDFD_SIGNAL_THREAD:
-		type = PIDTYPE_PID;
-		break;
-	case PIDFD_SIGNAL_THREAD_GROUP:
-		type = PIDTYPE_TGID;
-		break;
-	case PIDFD_SIGNAL_PROCESS_GROUP:
-		type = PIDTYPE_PGID;
-		break;
 	}
 
-	if (info) {
-		ret = copy_siginfo_from_user_any(&kinfo, info);
-		if (unlikely(ret))
-			return ret;
-
-		if (unlikely(sig != kinfo.si_signo))
-			return -EINVAL;
-
-		/* Only allow sending arbitrary signals to yourself. */
-		if ((task_pid(current) != pid || type > PIDTYPE_TGID) &&
-		    (kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL))
-			return -EPERM;
-	} else {
-		prepare_kill_siginfo(sig, &kinfo, type);
-	}
-
-	if (type == PIDTYPE_PGID)
-		return kill_pgrp_info(sig, &kinfo, pid);
-	else
-		return kill_pid_info_type(sig, &kinfo, pid, type);
+	ret = do_pidfd_send_signal(pid, sig, type, info, flags);
+	if (fd_empty(f))
+		put_pid(pid);
+	return ret;
 }
 
 static int
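
For readers who want to poke at the two decisions that move into
do_pidfd_send_signal(), the following is a small userspace model of them:
how an explicit flag overrides the scope inferred from the pidfd, and when
a caller-supplied siginfo is rejected. It is a sketch under assumptions,
not the kernel implementation: every model_* name and MODEL_* constant is
hypothetical and simply mirrors the ordering and checks visible in the
diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Same relative ordering as the diff relies on: PID < TGID < PGID. */
enum model_pid_type {
	MODEL_PIDTYPE_PID,
	MODEL_PIDTYPE_TGID,
	MODEL_PIDTYPE_PGID,
};

/* Hypothetical stand-ins for the PIDFD_SIGNAL_* flags and SI_TKILL. */
#define MODEL_SIGNAL_THREAD        (1u << 0)
#define MODEL_SIGNAL_THREAD_GROUP  (1u << 1)
#define MODEL_SIGNAL_PROCESS_GROUP (1u << 2)
#define MODEL_SI_TKILL             (-6)

/* An explicit flag wins; with no flag the inferred scope is kept. */
enum model_pid_type model_pick_scope(enum model_pid_type inferred,
				     unsigned int flags)
{
	switch (flags) {
	case MODEL_SIGNAL_THREAD:
		return MODEL_PIDTYPE_PID;
	case MODEL_SIGNAL_THREAD_GROUP:
		return MODEL_PIDTYPE_TGID;
	case MODEL_SIGNAL_PROCESS_GROUP:
		return MODEL_PIDTYPE_PGID;
	default:
		return inferred;
	}
}

/*
 * Mirrors the "only allow sending arbitrary signals to yourself" check:
 * kernel-looking si_code values (>= 0, or the tkill code) are refused
 * unless the target is the caller and the scope is no wider than a
 * thread group.
 */
int model_check_siginfo(bool target_is_self, enum model_pid_type type,
			int si_code)
{
	if ((!target_is_self || type > MODEL_PIDTYPE_TGID) &&
	    (si_code >= 0 || si_code == MODEL_SI_TKILL))
		return -EPERM;
	return 0;
}

/* Small usage example covering the override and the permission check. */
void model_signal_self_check(void)
{
	assert(model_pick_scope(MODEL_PIDTYPE_TGID, 0) == MODEL_PIDTYPE_TGID);
	assert(model_check_siginfo(true, MODEL_PIDTYPE_TGID, 0) == 0);
	assert(model_check_siginfo(false, MODEL_PIDTYPE_TGID, 0) == -EPERM);
}
```

Note that the model leaves out the reference handling: in the draft, the
pid is only put when no fd was looked up, since only the sentinel branches
take their own reference.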