From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAF2EE7719A for ; Wed, 8 Jan 2025 12:32:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4845A6B0083; Wed, 8 Jan 2025 07:32:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4341E6B0088; Wed, 8 Jan 2025 07:32:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2AE196B0089; Wed, 8 Jan 2025 07:32:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 0837A6B0083 for ; Wed, 8 Jan 2025 07:32:44 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id A9E741C82BA for ; Wed, 8 Jan 2025 12:32:43 +0000 (UTC) X-FDA: 82984223406.18.7245325 Received: from mail-wm1-f45.google.com (mail-wm1-f45.google.com [209.85.128.45]) by imf17.hostedemail.com (Postfix) with ESMTP id AC1234001B for ; Wed, 8 Jan 2025 12:32:41 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=P+PQIbPQ; spf=pass (imf17.hostedemail.com: domain of aliceryhl@google.com designates 209.85.128.45 as permitted sender) smtp.mailfrom=aliceryhl@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736339561; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Y3Vodk3cvVQRC0E4rbLU9E6VKqAaSnQ+/DTuNiDJXiA=; b=pan2OZXC1ELHAvZljC+IZ8ImGgR1cPibUsiBt/9ByLgI9hIZ8sVHQrf1lD8dhjTuskLeQr KTZYxP+G9hEh+J71jo2mQujLfQ2UM1E6K7TOi3MjCq3/0xRGxM4xZLvOD5nrqJIsPHrZZ+ YiChN4qqPv1UQaLXnbvNfcRMwU0F/j0= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=P+PQIbPQ; spf=pass (imf17.hostedemail.com: domain of aliceryhl@google.com designates 209.85.128.45 as permitted sender) smtp.mailfrom=aliceryhl@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736339561; a=rsa-sha256; cv=none; b=v8wXq79gk5EKoxfAANWGTmYAoUp0OEhzcynQDbqOOcb4KxQjwvjh0TbYiEuKZ3cugyWE5b ZfhmpLhoCrFOgg+hnnPSUkb1yZfSi3Xr3nx8anW+gPvVsF2XZ9gWngoEIFcOdTPnz0xaj5 x3tXK+8f2nNaGR1s3S5rAcMMePhtqsE= Received: by mail-wm1-f45.google.com with SMTP id 5b1f17b1804b1-4362bae4d7dso119618855e9.1 for ; Wed, 08 Jan 2025 04:32:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736339560; x=1736944360; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Y3Vodk3cvVQRC0E4rbLU9E6VKqAaSnQ+/DTuNiDJXiA=; b=P+PQIbPQx9wwz5pZFqnSYjPpjrY1EsGg7XyIhg5VQmQI72C6S4VK+ITTNKjSdeF9lF ZqQ0Sbj54bEQGeKBkdjF7dVYNYuoHkbLAuvaaj+aq5T1HLC7vlyRTDv+Bl0WtGrbX8Lv yy+pzR962bp0vzp6qs/GBCYyvXsa41np95VB5wRFMlWMeNMFfDTmyMNnt356YLsTCoTi TMwelKIWHv3NvMOpTtEd9vmXb4XJkOLjiwc46vmWSPuypc69Ol4Q00+PyTb+v7EpGedF GZMlxD6bfUO1EqiUQfIil8IJ4D5e+tBC7bnX/ut4bfYZza1xKJJ1LBwcYdZqJg9R07xf kfUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736339560; x=1736944360; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Y3Vodk3cvVQRC0E4rbLU9E6VKqAaSnQ+/DTuNiDJXiA=; b=T/1S9Rt6Dbmk8QASW6yzVkl1FsmZoeSMyNuD4YofBUOXx/aPvneE3gatN4xEuxNIGB qYHxg4zWL4VlnRgtNLLipeUdfazM8b/dfRlgd24aK3OAlcFAOnEGWmaL1Il3HdFj37/U 5et6Jbr2T5SoBufU0aNl4Gs/GbLEAoQJ+rHXoL5pac3DyrMcTTsZ8pRZZFFYo065/ZrK H4QYRdJmLgz/apDnMusVMbci/3Lg6CBNlxGQMul/Pm184K5oLqpaEkhiPiJlyY2tlnzA bEnQseD9BfWWSMXY00Q5EgpbBV5pQp8R4CmBeDRrw8w/3PhUmBDpasr/Uv50VLwAb7in ZaEQ== X-Forwarded-Encrypted: i=1; AJvYcCUWeMEj2+AFtMp+Uxl4F16V7rK30y97cpLc9s980cOV4u66abELCHY8D4/LCQVQD8wZ5aPiUCoBJw==@kvack.org X-Gm-Message-State: AOJu0Yz36OQw55g8e2+I5t/huqPVz/HNlItYU4oKIbYWtzv4FPUbBNfY M/Jl1AylE3NdPwG8BoBNV3n4lEu5ih2/g1U8XDyywsNKJd38ALLmTYmnFiTlTVFZvBxTZLvV4rF 37CcsnymDhKDSamKnh5geUO62pYki7XLLLLrG X-Gm-Gg: ASbGncv+rfSQYOnOAJy8H2uU8az2WalkSNONHjCEWXJZDcNegJ5zs8wN2UnGK5BMX8D WDdmk3EFuXmGktSAHIuVDl1sXyGTmX2w8nVnyYr/2c8S5EdiHl7Cu/JnG9r8xGn36MnmP X-Google-Smtp-Source: AGHT+IEskuo3fK2LuhaAm13Ug2sSSFQnbDigYwYbO9yrK57j8PvPXDs4R1cjfZdHdzcXJgq1wwS+JhW9G4C5zCKry+s= X-Received: by 2002:a05:6000:1f8d:b0:385:f6c7:90c6 with SMTP id ffacd0b85a97d-38a8730687emr2336604f8f.20.1736339560020; Wed, 08 Jan 2025 04:32:40 -0800 (PST) MIME-Version: 1.0 References: <20241211-vma-v11-0-466640428fc3@google.com> <20241211-vma-v11-8-466640428fc3@google.com> <87frmnadmk.fsf@kernel.org> In-Reply-To: <87frmnadmk.fsf@kernel.org> From: Alice Ryhl Date: Wed, 8 Jan 2025 13:32:28 +0100 X-Gm-Features: AbW1kvYRGjjTmfVAVHygmi3avDuaVXAHexXRmtBBxOnWAWxHdiZq78_fUYF0gdI Message-ID: Subject: Re: [PATCH v11 8/8] task: rust: rework how current is accessed To: Andreas Hindborg Cc: Miguel Ojeda , Matthew Wilcox , Lorenzo Stoakes , Vlastimil Babka , John Hubbard , "Liam R. Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Christian Brauner , Jann Horn , Suren Baghdasaryan , Alex Gaynor , Boqun Feng , Gary Guo , =?UTF-8?Q?Bj=C3=B6rn_Roy_Baron?= , Benno Lossin , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: AC1234001B X-Rspamd-Server: rspam12 X-Stat-Signature: nwae9uienosjd99facutoiwswbqz51n5 X-Rspam-User: X-HE-Tag: 1736339561-860181 X-HE-Meta: U2FsdGVkX1+tNWhd80nJhGbKm2mts22R0nGtD5bXcBZ1CYKv0ZbMiM+GgVNEkrcycAxV1IqeMWoW6jXb7G2//jjERonz+cO1LDTunPo4dIz2neGaJd6TlbXjjZVbuYEQ6WHj7/jvWTtWgCv/mq/4MXiqr1vdiuWL1gdYixm5KnX6a0svtmpqfQIGTIfuPdXj9q8UxV0CdWK5upqqx66PuO3ep/QJcG7Wl8L6MiDUl5QoWFSqyNcgq4Azx/eSwZDecCdmCGRNpXvmQKJqrbBxfjbjL0+y0tMBmfcxTNamovU6BMgMvhieEhPEK26Z+D6z91BRE1yW8XnbdByyUhDM8ZPpK9JfskwdS9jrwAgfOcoT+4Rn1Ehx6uwk8K5K8UUbD9QP9Oe12CQJkJbLRut0Ri8K1gVEV+WThqh/X7GsO+7sw2tD4BzsGI2kxcVqxtOCQT7n6PzgtgmpAaVP8jX/6E01N2IBRd8tWVcpHCG4YsSD5BOkfWrjruQFyl1R/cafyPz3mDlLVfcAWc8DNj8bTIaJn1dUQW0nT2kK6EIPioZL7fJAWMbRfdgguJ4nSvfLI3qxb+yvu9da+bzpDmtORxwgqyrqT/WcRQc81hyQMCV474qUB6HioHZsFuJglp+Oe5LCdN4ChlsBjg7dga/KLXwl5cDvH1RAaGbbcWXHyO2sAxLvQ8RrfIgOONQKpqLx8dgNqJg48VbF9bDj8CP5vctzDEalatDX5kHuZK3l0OTnikyvryJzH4Q1S/oCvswhzDARDyIFVxIc8YBzwfiH75UNd9Y0d8GWQXg3wjgi8qJuCMHKHDupvFJrndeggLUq9394a9G+0FhAsYYptsRA9kP/fV7Yc9sCgVLgd4BSZy+Owaw9oOZpz+yWg7O6B/R+220HM55hhMKshYlpKXJ5xALFdSDz2pGmG9SUjc/lSL3Gpv53h0jAJvQdAzWC1bbSnBdNM6LMbnfU/We+Eph VpP/PHIl cF4OwOVLmGVcFZYVd61TlAhGdBiIWkxh8MjRwPlXjPtE975RC7nCxen12doxyroilb6cU8Bj4oP9lZOtJq33IMJStdS9UucejFKHgZkPjRdhvhIcyUXaqoPDgZ66PdUkcjGwbbftJ1/pXkdkpxkABQlz8KBXWkcLEJ7GO+hyHz8F234fz5Ssj7LZ0OXrsX5cgiFX+96IhOxOBXaKkTf4725c5MkrZlg6sZkhwtXPrj5O9Dkx2IEsf1je4WFUDRJZatXeerzackA0WX8Go4oz+IcM/RrcV6fHsVtf4boUHfJsuRah14r+5MoZzZ51nwWc52Q4W X-Bogosity: Unsure, tests=bogofilter, spamicity=0.498308, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Dec 16, 2024 at 3:51=E2=80=AFPM Andreas Hindborg wrote: > > "Alice Ryhl" writes: > > +impl CurrentTask { > > + /// Access the address space of the current task. > > + /// > > + /// This function does not touch the refcount of the mm. > > + #[inline] > > + pub fn mm(&self) -> Option<&MmWithUser> { > > + // SAFETY: The `mm` field of `current` is not modified from ot= her threads, so reading it is > > + // not a data race. > > + let mm =3D unsafe { (*self.as_ptr()).mm }; > > + > > + if mm.is_null() { > > + return None; > > + } > > + > > + // SAFETY: If `current->mm` is non-null, then it references a = valid mm with a non-zero > > + // value of `mm_users`. Furthermore, the returned `&MmWithUser= ` borrows from this > > + // `CurrentTask`, so it cannot escape the scope in which the c= urrent pointer was obtained. > > + // > > + // This is safe even if `kthread_use_mm()`/`kthread_unuse_mm()= ` are used. There are two > > + // relevant cases: > > + // * If the `&CurrentTask` was created before `kthread_use_mm(= )`, then it cannot be > > + // accessed during the `kthread_use_mm()`/`kthread_unuse_mm(= )` scope due to the > > + // `NotThreadSafe` field of `CurrentTask`. > > + // * If the `&CurrentTask` was created within a `kthread_use_m= m()`/`kthread_unuse_mm()` > > + // scope, then the `&CurrentTask` cannot escape that scope, = so the returned `&MmWithUser` > > + // also cannot escape that scope. > > + // In either case, it's not possible to read `current->mm` and= keep using it after the > > + // scope is ended with `kthread_unuse_mm()`. > > I guess we don't actually need the last section until we see > `ktread_use_mm` / `kthread_unuse_mm` abstractions in tree? I mean, there could be such a scope in C code that called into Rust? And I don't think there's anything wrong with future-proofing this abstraction towards adding it in the future. > > + Some(unsafe { MmWithUser::from_raw(mm) }) > > + } > > + > > + /// Access the pid namespace of the current task. > > Is it an address space or a memory map(ping)? Can we use consistent vocab= ulary? Neither. It's a pid namespace which has nothing to do with address spaces or memory mappings. This part of this patch is moving an existing abstraction to work with the reworked way to access current. > > + /// > > + /// This function does not touch the refcount of the namespace or = use RCU protection. > > + #[doc(alias =3D "task_active_pid_ns")] > > What is with the alias? This is the Rust equivalent to the C function called task_active_pid_ns. The alias makes it easier to find it. > > + #[inline] > > + pub fn active_pid_ns(&self) -> Option<&PidNamespace> { > > + // SAFETY: It is safe to call `task_active_pid_ns` without RCU= protection when calling it > > + // on the current task. > > + let active_ns =3D unsafe { bindings::task_active_pid_ns(self.a= s_ptr()) }; > > + > > + if active_ns.is_null() { > > + return None; > > + } > > + > > + // The lifetime of `PidNamespace` is bound to `Task` and `stru= ct pid`. > > + // > > + // The `PidNamespace` of a `Task` doesn't ever change once the= `Task` is alive. A > > + // `unshare(CLONE_NEWPID)` or `setns(fd_pidns/pidfd, CLONE_NEW= PID)` will not have an effect > > + // on the calling `Task`'s pid namespace. It will only effect = the pid namespace of children > > + // created by the calling `Task`. This invariant guarantees th= at after having acquired a > > + // reference to a `Task`'s pid namespace it will remain unchan= ged. > > + // > > + // When a task has exited and been reaped `release_task()` wil= l be called. This will set > > + // the `PidNamespace` of the task to `NULL`. So retrieving the= `PidNamespace` of a task > > + // that is dead will return `NULL`. Note, that neither holding= the RCU lock nor holding a > > + // referencing count to the `Task` will prevent `release_task(= )` being called. > > + // > > + // In order to retrieve the `PidNamespace` of a `Task` the `ta= sk_active_pid_ns()` function > > + // can be used. There are two cases to consider: > > + // > > + // (1) retrieving the `PidNamespace` of the `current` task > > + // (2) retrieving the `PidNamespace` of a non-`current` task > > + // > > + // From system call context retrieving the `PidNamespace` for = case (1) is always safe and > > + // requires neither RCU locking nor a reference count to be he= ld. Retrieving the > > + // `PidNamespace` after `release_task()` for current will retu= rn `NULL` but no codepath > > + // like that is exposed to Rust. > > + // > > + // Retrieving the `PidNamespace` from system call context for = (2) requires RCU protection. > > + // Accessing `PidNamespace` outside of RCU protection requires= a reference count that > > + // must've been acquired while holding the RCU lock. Note that= accessing a non-`current` > > + // task means `NULL` can be returned as the non-`current` task= could have already passed > > + // through `release_task()`. > > + // > > + // To retrieve (1) the `&CurrentTask` type should be used whic= h ensures that the returned > > + // `PidNamespace` cannot outlive the current task context. The= `CurrentTask::active_pid_ns` > > + // function allows Rust to handle the common case of accessing= `current`'s `PidNamespace` > > + // without RCU protection and without having to acquire a refe= rence count. > > + // > > + // For (2) the `task_get_pid_ns()` method must be used. This w= ill always acquire a > > + // reference on `PidNamespace` and will return an `Option` to = force the caller to > > + // explicitly handle the case where `PidNamespace` is `None`, = something that tends to be > > + // forgotten when doing the equivalent operation in `C`. Missi= ng RCU primitives make it > > + // difficult to perform operations that are otherwise safe wit= hout holding a reference > > + // count as long as RCU protection is guaranteed. But it is no= t important currently. But we > > + // do want it in the future. > > + // > > + // Note for (2) the required RCU protection around calling `ta= sk_active_pid_ns()` > > + // synchronizes against putting the last reference of the asso= ciated `struct pid` of > > + // `task->thread_pid`. The `struct pid` stored in that field i= s used to retrieve the > > + // `PidNamespace` of the caller. When `release_task()` is call= ed `task->thread_pid` will be > > + // `NULL`ed and `put_pid()` on said `struct pid` will be delay= ed in `free_pid()` via > > + // `call_rcu()` allowing everyone with an RCU protected access= to the `struct pid` acquired > > + // from `task->thread_pid` to finish. > > While this comment is a nice piece of documentation, I think we should > move it elsewhere, or restrict it to paragraphs pertaining to (1), since > that is the only case we consider here? Where would you move it? > > + // > > + // SAFETY: If `current`'s pid ns is non-null, then it referenc= es a valid pid ns. > > + // Furthermore, the returned `&PidNamespace` borrows from this= `CurrentTask`, so it cannot > > + // escape the scope in which the current pointer was obtained. > > + Some(unsafe { PidNamespace::from_ptr(active_ns) }) > > + } > > Can we move the impl block and the struct definition next to each other? I could move the definition of CurrentTask down, but I'm not really convinced that it's an improvement. Alice