From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8295E7719E for ; Mon, 13 Jan 2025 10:26:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 43C3D6B007B; Mon, 13 Jan 2025 05:26:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3EC846B0083; Mon, 13 Jan 2025 05:26:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 28D086B0085; Mon, 13 Jan 2025 05:26:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 0A1956B007B for ; Mon, 13 Jan 2025 05:26:58 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 826B845908 for ; Mon, 13 Jan 2025 10:26:57 +0000 (UTC) X-FDA: 83002050474.19.87088A0 Received: from mail-wm1-f46.google.com (mail-wm1-f46.google.com [209.85.128.46]) by imf29.hostedemail.com (Postfix) with ESMTP id 9518012000A for ; Mon, 13 Jan 2025 10:26:55 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=C7QhUDdN; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf29.hostedemail.com: domain of aliceryhl@google.com designates 209.85.128.46 as permitted sender) smtp.mailfrom=aliceryhl@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736764015; a=rsa-sha256; cv=none; b=aQNfMmN+RA7jEpHhC96Eq8i3x0/lJb6DsekxBOEgCVr2uAvrZ+iLKW3GeVODVubNa1w2e/ qga1JFm5QZaok2VUfRN7khINPjrfaDPRlEzlAa6HxXXxdW9YPTMP2Y40cgolm1GHIHnO3S 7mQXYkabjVTPEpX96zKgFqNVSgrTNzs= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=C7QhUDdN; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf29.hostedemail.com: domain of aliceryhl@google.com designates 209.85.128.46 as permitted sender) smtp.mailfrom=aliceryhl@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736764015; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=WotA+xYJuM3ux57wZ522GGvB6ieWOsCacjrw//zGRFg=; b=r8fFodh3XzKfWAGkEsvLbmhXCggm15JRVFfCGSxas/TYu4XUT6XNUxi2ep+wrvRyVazEWC gihEv2spNqjpqUPzFbel9HRf5YLhw62nh8zTTfIjCgf2EW5C9X+fENdddf+bTOjonY8YUk 3k5fMz/oYoIb1A8i8H7N8nmyb6E698M= Received: by mail-wm1-f46.google.com with SMTP id 5b1f17b1804b1-436341f575fso42054245e9.1 for ; Mon, 13 Jan 2025 02:26:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736764014; x=1737368814; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=WotA+xYJuM3ux57wZ522GGvB6ieWOsCacjrw//zGRFg=; b=C7QhUDdNNZUsZdmKnUR8iNfs5+V3AEkw9hsGsjhnXBZFmfJv7npWgB0oiI9xgZUlft Q3bhfGQlgMIbSYZN2hn9gpzCzYPGQChU3dGzOGVc+3mtO8cwGgOhVH3lRneDbNqQjv2p mZHOLHMTlqn/24WN+DTb0wwophjqyrzm67M19Kousv3fJEKGYTFKANhkLGabASd3vTS/ thi0qpBUj8VMKChI2f8VjZ9nAWkZ+cBx1ruSzqxQQfYWJcjAYPA/qJad14VsuITrbVFN etYHIPu7jyHofXYDt+9KGiXcATY/rblM2j7PmMswgDxR4EVOMCYO45+AoZoOyBMCQMYU swmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736764014; x=1737368814; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WotA+xYJuM3ux57wZ522GGvB6ieWOsCacjrw//zGRFg=; b=DZaGX0czHgMqUijcRvpnnkq+3l/um3JY1RlKs+HsoY01BBGqTwWkfGPQPVuSkH9niQ W6t1seX5gp9cwjCVtPxI+b03YqW2+LqrXEMw95DUcqJy0p4dDKQNdaSXAzcE9UamVcaN SpnH9O5J5QRzG00hjUiKyZgTUbaTCnp6OHhp6l48CpYyBj77EWC3Y6zG5lIW933M4ui+ POQuLks4C0cRa+uL+gATpoP8d+ZI8fR3mwpo5GRX/d2UxdmQbi6EUvF7P0DSCLINk2n/ b6lv+Az9s9q0Q6u0COqXhAMm6nUHwhgQHsYLTQ8olrlwOjy0RHZujjc2KO18SM2f9ywS YnLA== X-Forwarded-Encrypted: i=1; AJvYcCX3teBeRi/vXsMQ9nXlo5iE+dxlxo18tgApuHjRZmhSLkm6Icqho65mdYCtMZlwt7bm9GDEpXqmCg==@kvack.org X-Gm-Message-State: AOJu0Yyp1oNiPGtVsw4nqcKXz63FPr8hF5XXcY3czVL7l/vP1TlIOmm+ yJnIIBPLbpgtgf2xkBd1RVKwL0H1FsXTcjZiMPtJcmOuQAwWcVpEChRHL7a35+TuURp/gaHtbsq mvSleEFpeZuD4bV/PXkpEDDW/R0uejOUTaCj8 X-Gm-Gg: ASbGncvQTyYGPNcpXbd9x/AoRvKHDKYQiTzTgTphIQZCLw8Ik/+3sjFCOfdaZIr6fJE YrjP0ZNFstwQGqOzYgMESXX15DyUWcsu9HumBEffm6/sCvI6hRiqnXPJYDihYw2lptLoW X-Google-Smtp-Source: AGHT+IEdGBC/y1Bvaqxe390XP8IZkL7ccHVLd8tb6uNVNh+W/IRSR2mP1N4tBZgiPoa1YN8GklcEUFkm0PwDHhINESE= X-Received: by 2002:a05:600c:a0a:b0:434:a04d:1670 with SMTP id 5b1f17b1804b1-436e25548e3mr76471545e9.0.1736764013996; Mon, 13 Jan 2025 02:26:53 -0800 (PST) MIME-Version: 1.0 References: <20241211-vma-v11-0-466640428fc3@google.com> <20241211-vma-v11-8-466640428fc3@google.com> <87frmnadmk.fsf@kernel.org> <5qXA_x0NK0TzOkDwdQ8RYbk3SboUQiD0qflQhsoPRyhFxpD7fZHC7uVjI7RQBvvHPVEO1zlMRBq3F8BBImHrYQ==@protonmail.internalid> <87y0zk9y4h.fsf@kernel.org> In-Reply-To: <87y0zk9y4h.fsf@kernel.org> From: Alice Ryhl Date: Mon, 13 Jan 2025 11:26:42 +0100 X-Gm-Features: AbW1kvaPU7jte_DTbz1hQMDwki-WCfamsRDNzYatM7mfSJxRtXANZ6h8Hqjf9II Message-ID: Subject: Re: [PATCH v11 8/8] task: rust: rework how current is accessed To: Andreas Hindborg Cc: Miguel Ojeda , Matthew Wilcox , Lorenzo Stoakes , Vlastimil Babka , John Hubbard , "Liam R. Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Christian Brauner , Jann Horn , Suren Baghdasaryan , Alex Gaynor , Boqun Feng , Gary Guo , =?UTF-8?Q?Bj=C3=B6rn_Roy_Baron?= , Benno Lossin , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 9518012000A X-Stat-Signature: dprfz3y5hg9anucniw9hbjm4m5j8hn5c X-HE-Tag: 1736764015-967966 X-HE-Meta: U2FsdGVkX19u8t6AWkJf37xVE2dcMuEez/yWZSbnPYzhb8FqdKtBJRw8FvarYTebfXUBer0AfjvLugYpZmLdIDJlukIG4+GRJKzajqRIVwM/TpHmxPkZulL6G89U7bTZPig5suI/RYB/5kGqIgFsLF3WDgB8O3a2QJA1pdAsrma3pMhjQP5lxwxpsxV1TR2ruyxsN+T/HXpwLBRAotw2uNOUj79CpGFow2W46/XRW1TYJgDDZLp+ZhdFC+WKycAI11yKdJ2gXm6EMpb73qFRQp3zqicZ5/bm29XpJZClsk5THYwRvR3UJTFw55zMFI9wmo47F5Fl2KYlnlp9A4SvC+4ucF7RIDHTVP1GFkhV7Jz/82DLaX1Djvlww7YZcCAluupTGG6FrduC6ymopom+zJdax0b7ZEZ/C0lm4IUW/JWJU1mAo2GFsSRgBvWUIEvJf5z31Y+Wb0Y2xcsvuIY/WeF6kJdHnK1wio7KK4VMPqxl+/3iNpo35REN6lK0Y6foQ9cap9d9I3NIvkpSagphNLLUqBsXwRUWW3Q+RafHX3jFOQ7WVX89GVWcsRBArzDg0TEJ0KUlZ8tl7JIo8WSqpIXq6yyIaANpJgQkWz6Xyszj0cPr+XV386g0UTF/9n7GRGBAfRImxW7gND1MXe0BQI/5GWE1Y6zCnGKLHRHS5tAf71xad5JQlFn0ADEEu0Vs9fW0hBDNERgIowxBNghL0Ezlu7i9UlsbDjof8vjZp+uObV7HWDXg+GzCRRJedtI3feolwrUyVKAYyBZO3z00Uu5kV4l4Akv3j6fnuYW+h+4pDPdHBdcwBHL7XlwHBTJEx+z445QLmUjZlSeXys7Phcy6yghV0Uk7AKY8+GCCTm3H9DiEzMM/PINL3vyXhhrh4zACEivb9+zzhCfioxwLVZ12tPk7xdqVX9YC4PbCzOyWUSkYm8wXjr99XUdMdn5aQziHXSLDIswxqrT/Ctl h0sPARWg 0HQFJyh+trCeWcXGZhkNrYP9NMj7NFiGMMUQpR32yY1212E6l32sLdjRxbC4Ksx08pZPu8lsldUv3SwM8ByJWFLRfuz4l1wWZ6Yf/LSqY1K9nmCQKqE3WkR5nyCHPLUXT2wPdToQzgN0fbHesOBKuqPgSzhUjFgPdUtphg1FKVrsNkEQ1lTzn5f4swBGrN4X+im2kmiC2TdBodJBTVd8Vr4m7eButbnheK+KZVUvDaGu6ODDNCkqmKO/ZpH0bPEVU12WwLGe06x/tliSefQE+XUdI/oNfSuw2ROAPK+7+fDp+u6A0VPPud+H4JwxVeIYg3Q6D X-Bogosity: Unsure, tests=bogofilter, spamicity=0.495419, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Jan 9, 2025 at 9:42=E2=80=AFAM Andreas Hindborg wrote: > > "Alice Ryhl" writes: > > > On Mon, Dec 16, 2024 at 3:51=E2=80=AFPM Andreas Hindborg wrote: > >> > >> "Alice Ryhl" writes: > >> > + #[inline] > >> > + pub fn active_pid_ns(&self) -> Option<&PidNamespace> { > >> > + // SAFETY: It is safe to call `task_active_pid_ns` without = RCU protection when calling it > >> > + // on the current task. > >> > + let active_ns =3D unsafe { bindings::task_active_pid_ns(sel= f.as_ptr()) }; > >> > + > >> > + if active_ns.is_null() { > >> > + return None; > >> > + } > >> > + > >> > + // The lifetime of `PidNamespace` is bound to `Task` and `s= truct pid`. > >> > + // > >> > + // The `PidNamespace` of a `Task` doesn't ever change once = the `Task` is alive. A > >> > + // `unshare(CLONE_NEWPID)` or `setns(fd_pidns/pidfd, CLONE_= NEWPID)` will not have an effect > >> > + // on the calling `Task`'s pid namespace. It will only effe= ct the pid namespace of children > >> > + // created by the calling `Task`. This invariant guarantees= that after having acquired a > >> > + // reference to a `Task`'s pid namespace it will remain unc= hanged. > >> > + // > >> > + // When a task has exited and been reaped `release_task()` = will be called. This will set > >> > + // the `PidNamespace` of the task to `NULL`. So retrieving = the `PidNamespace` of a task > >> > + // that is dead will return `NULL`. Note, that neither hold= ing the RCU lock nor holding a > >> > + // referencing count to the `Task` will prevent `release_ta= sk()` being called. > >> > + // > >> > + // In order to retrieve the `PidNamespace` of a `Task` the = `task_active_pid_ns()` function > >> > + // can be used. There are two cases to consider: > >> > + // > >> > + // (1) retrieving the `PidNamespace` of the `current` task > >> > + // (2) retrieving the `PidNamespace` of a non-`current` tas= k > >> > + // > >> > + // From system call context retrieving the `PidNamespace` f= or case (1) is always safe and > >> > + // requires neither RCU locking nor a reference count to be= held. Retrieving the > >> > + // `PidNamespace` after `release_task()` for current will r= eturn `NULL` but no codepath > >> > + // like that is exposed to Rust. > >> > + // > >> > + // Retrieving the `PidNamespace` from system call context f= or (2) requires RCU protection. > >> > + // Accessing `PidNamespace` outside of RCU protection requi= res a reference count that > >> > + // must've been acquired while holding the RCU lock. Note t= hat accessing a non-`current` > >> > + // task means `NULL` can be returned as the non-`current` t= ask could have already passed > >> > + // through `release_task()`. > >> > + // > >> > + // To retrieve (1) the `&CurrentTask` type should be used w= hich ensures that the returned > >> > + // `PidNamespace` cannot outlive the current task context. = The `CurrentTask::active_pid_ns` > >> > + // function allows Rust to handle the common case of access= ing `current`'s `PidNamespace` > >> > + // without RCU protection and without having to acquire a r= eference count. > >> > + // > >> > + // For (2) the `task_get_pid_ns()` method must be used. Thi= s will always acquire a > >> > + // reference on `PidNamespace` and will return an `Option` = to force the caller to > >> > + // explicitly handle the case where `PidNamespace` is `None= `, something that tends to be > >> > + // forgotten when doing the equivalent operation in `C`. Mi= ssing RCU primitives make it > >> > + // difficult to perform operations that are otherwise safe = without holding a reference > >> > + // count as long as RCU protection is guaranteed. But it is= not important currently. But we > >> > + // do want it in the future. > >> > + // > >> > + // Note for (2) the required RCU protection around calling = `task_active_pid_ns()` > >> > + // synchronizes against putting the last reference of the a= ssociated `struct pid` of > >> > + // `task->thread_pid`. The `struct pid` stored in that fiel= d is used to retrieve the > >> > + // `PidNamespace` of the caller. When `release_task()` is c= alled `task->thread_pid` will be > >> > + // `NULL`ed and `put_pid()` on said `struct pid` will be de= layed in `free_pid()` via > >> > + // `call_rcu()` allowing everyone with an RCU protected acc= ess to the `struct pid` acquired > >> > + // from `task->thread_pid` to finish. > >> > >> While this comment is a nice piece of documentation, I think we should > >> move it elsewhere, or restrict it to paragraphs pertaining to (1), sin= ce > >> that is the only case we consider here? > > > > Where would you move it? > > The info about (2) should probably be with the implementation for that > case, when it lands. Perhaps we can move it hen? The function already exists. It's called Task::get_pid_ns(). I think the comment makes sense here: get_pid_ns() is the normal case where you don't skip synchronization, and active_pid_ns() is the special case where you can skip RCU due to reasons. This comment explains that normally you cannot skip RCU, but in this special case you can. Alice