From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7ACA8C47DB3 for ; Mon, 29 Jan 2024 21:27:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EE4956B00AA; Mon, 29 Jan 2024 16:27:58 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E95356B00AB; Mon, 29 Jan 2024 16:27:58 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D5D826B00AC; Mon, 29 Jan 2024 16:27:58 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id C55C56B00AA for ; Mon, 29 Jan 2024 16:27:58 -0500 (EST) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 983CEA0291 for ; Mon, 29 Jan 2024 21:27:58 +0000 (UTC) X-FDA: 81733636236.21.C673257 Received: from mail-ua1-f44.google.com (mail-ua1-f44.google.com [209.85.222.44]) by imf01.hostedemail.com (Postfix) with ESMTP id D271740010 for ; Mon, 29 Jan 2024 21:27:56 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=NSor95G3; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf01.hostedemail.com: domain of aliceryhl@google.com designates 209.85.222.44 as permitted sender) smtp.mailfrom=aliceryhl@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1706563676; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=brAWAuHMK3TSM8Xhcazu9ZlxC5Vo2cjLsypBIp5RI5M=; b=AKnkZkdtjSKhAk6V7hQLEZbsVsfYu9jUm2ezCCj8zNVXgfnhg+qXDQCYeSAMVC7Q463Y7k K3mLgyFfo3izTm2rtKaXIiJze1L/C9DfIDSNiEktR64ASJPYuebGwJNQJ0fABriSPV0e2X LoI2aZABSU+emLvYOSkvS5rZsEdD22A= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=NSor95G3; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf01.hostedemail.com: domain of aliceryhl@google.com designates 209.85.222.44 as permitted sender) smtp.mailfrom=aliceryhl@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1706563676; a=rsa-sha256; cv=none; b=16hqzVHtoGEitxQgAne/vK99IkoKsf5SR4ZqiuMb3CG2vxvhMQaSxG+A6cPEHSp39lM2rR pakIK9ZOXwy7m/1m4/CvvOLHEzI+ARDg81e+wPM0L60/dYKgu2d90jAfST3Gg1NvaUvJ76 ArVOnIMlvCN7UXdlC2DSA/yZgEnGY3g= Received: by mail-ua1-f44.google.com with SMTP id a1e0cc1a2514c-7d317aafbd1so1228749241.2 for ; Mon, 29 Jan 2024 13:27:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1706563676; x=1707168476; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=brAWAuHMK3TSM8Xhcazu9ZlxC5Vo2cjLsypBIp5RI5M=; b=NSor95G3OxwWGZUv7tGUVAeoA0yPTzr4ktvSv2vz4f3Pu4ZuuZ4+/vU+dWassMoEX9 /JWF0An+n1slEpx5Iss8JLcw6dmnjRfu84hL+t6zg8bHOtN5+0xrii8Fn58vBvhr5OmO ZjUCUXJc+HKlK/2n/ZVOyxTd9LaR645ZWlipPe7yl74G+tUMHDTIJ+xUr40IVn8eOdPV fVwF8sOc1BY/vFrX6yMX1aQcTljVNwq0EslSXh0LOlpf2GSNLH2p0M0KghkwRp/bxMsD OCPW/DaMT5m7zS1C1WpwYEqh+qfYlPNlM7NYO5N0gTqLBlI5iFmgOeGY9+GQ1QCbgAtu Ys8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706563676; x=1707168476; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=brAWAuHMK3TSM8Xhcazu9ZlxC5Vo2cjLsypBIp5RI5M=; b=mDqa6D9En3SQ7LSODI/Au7Y4ZSjIMiCAJqLVZViVt5R93AhXcuQAfc0IGWVkZss+Oi RFg+c2pZs0TtWDcp0/Z600dr+JkuZn9jqkW8ZbTderwDf1MKhs1rtJkGVJcIYfUxvPHm pseYcJRfdWUILUZ3iv1dcZFr1xyimyqV99X2Qvt3SGrpRC11fiIEMDJyEUb8mKgzTayi ILfb6ZnDjUICPruoEsmlI6n6pwVw++XZt8GxSJ0bED6FP171PBjUkjzqvLD6MjpK5Jue rrmfYvstJLDn8pnENhgGtmc/2rJhbKJI9KMTObLmK2BXztfCcx2uxJAS55eHDZEhMHmD 6QzA== X-Gm-Message-State: AOJu0YzJZ2gcO/w67PnhvV1j/DZG/hHY9RfIbGZdvmk7wiTQgas1IRrv Z+pLF2jZf8wxRx/Z2M9lRs8cTtco1ZAdlVMSQyfKIdpmQnughrSAbskGWwEJ3fHh+G+h8g/6/GH 1gVk/iBi1EUs1gzvi1mhbpuEUaQuf+SbJ1Zm2 X-Google-Smtp-Source: AGHT+IFj5xck2ZZMU+sBSPYdylP/U9nNDCaRvFkvDK5pc2OQBzprbb2S01cfLCABG6cBuYniOj+KiU6IznAyfqEqbhs= X-Received: by 2002:a05:6102:317a:b0:46b:3d9c:99cc with SMTP id l26-20020a056102317a00b0046b3d9c99ccmr2810766vsm.9.1706563675784; Mon, 29 Jan 2024 13:27:55 -0800 (PST) MIME-Version: 1.0 References: <20240124-alice-mm-v1-0-d1abcec83c44@google.com> <20240124-alice-mm-v1-3-d1abcec83c44@google.com> In-Reply-To: From: Alice Ryhl Date: Mon, 29 Jan 2024 22:27:43 +0100 Message-ID: Subject: Re: [PATCH 3/3] rust: add abstraction for `struct page` To: Matthew Wilcox Cc: Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho , Boqun Feng , Gary Guo , =?UTF-8?Q?Bj=C3=B6rn_Roy_Baron?= , Benno Lossin , Andreas Hindborg , Kees Cook , Al Viro , Andrew Morton , Greg Kroah-Hartman , =?UTF-8?B?QXJ2ZSBIasO4bm5ldsOlZw==?= , Todd Kjos , Martijn Coenen , Joel Fernandes , Carlos Llamas , Suren Baghdasaryan , Arnd Bergmann , linux-mm@kvack.org, linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Christian Brauner , liam.howlett@oracle.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: D271740010 X-Stat-Signature: 8if4xa9musw3kin7joa9ky7yaung9h3d X-HE-Tag: 1706563676-868002 X-HE-Meta: U2FsdGVkX19ZO+ghmcjN884JquGcQybAeNhQGP/qdiN5Vkvu5MPpWncn1xBcckoQuyK1Oat/HROpXpAqycJJeCBA4dwrevwz2yZUCaQGR+5Qxr6aNkJSEOObrja3zusyM1d6ie5WKtaNKT9Vn6i2BexKvzWETtfY/i158iAni3yILESOd5PAjVlWmXfgddM4ih2TW4S0XI7Hy/Vt0ayiHjeTeNRW4JfkxcoTV/vaifMjUad62IsheXROd7XobzQmmeNYV2yQpX33Nio5cFKZsrJd/be622SpBtEwULcElEXSUw9Bmv4o6QzUvq0k8FSX5BfztWVHyjjq6t+BS8cCb764qT9rkcEcqz3V7rlPYYAbxpw84JyhAza6Nwyl1IrL8yQ41zvxbN9ScntYeCDs3HTvsUtb5swyEDLTyFTMrcQ+FM0MX927ZOZYLXP3s4VYiSWm4WrDtpGJJnCuqlYBzrR7i4dy7naUb9/VkydwIa6K038wNq+KTWz5kPbqGf6BUoUJtytIzX8PhRak7sQTCqT+j77lDTdGZPD/pqsr2rVxwbd0czz2AGhuTuN+jHS4ArrUjRpGNJL5HyUvZJecvAqY4SaYlRG5YfKGS3gmb/BmYAuPvlHbAUGAOJqlLqcJPYxCoitDB9sw7f9hGq4L7K2fKUnJLtv/oWGUfd65dCddECxUyH9zvwSu7yRjK7mHuXuvwAeEu6ga2koXtuaW0sen3DdtPzdhsXklM6W4sPnZxiM2OgiTn5oDg9H3AIzDcKt4T0jvLewwH/g+x9Y9uS8+lcEuOjQ7EvhKhX5Df2ZFVCLT5HQA3/tSY1qYdc8GW94/wGdWh3WYFZYdArCrKLBWZpDTLYAsRPe2dKyGcj12JFbwHLLakTi2QUUsjp32ziMAaD0Yy34JKKNFLgKeeUdGOWDW6LSFEeUVfnCT0liugy3Sj6H/sleSY+tMzhx/TEm/cBkGYrDf2w5oLXi Li2w8ORV M+fkjiWOuWhVblcv3mI8P76ixVarmNodlsPVQRVgG/e9wukzbA3jhx6lbvDDamuQ3MRLehh2/qe+QxpYOX7/vrPIfQT+CXY5we9Vqdk1/UcQ9pQXgx62sAQcr8a/cva1JyetcsZv/Sv35TcnSosGzCB8WHHefknl7Hu64NxwFEOggb2G78Ehiblf3o8/cKn6KBMOa64MTGH4nuORjGTvR1T5f0prM6CNoyEbeU4BGr+RSr8SlU8rzboFdXln8HhOw1rsAdL2PpRnOQ7MYdfOfkoCgAJO7dQwLN1eAA2SuLOQZ/IGiUvbHiEUN5Qnn1qFlv2yBCRTokOAJuuQzSQad4hawA5ifzLiyRblqS11dnd+yK7JTPGCSLJueBgBFFynjlqV+5C2WslP1GKo0N5WG8Kx2ZzUzjpJLu+WcH9YANULiwlgpNcHnsF+32Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jan 29, 2024 at 6:59=E2=80=AFPM Matthew Wilcox wrote: > > On Wed, Jan 24, 2024 at 11:20:23AM +0000, Alice Ryhl wrote: > > Adds a new struct called `Page` that wraps a pointer to `struct page`. > > This struct is assumed to hold ownership over the page, so that Rust > > code can allocate and manage pages directly. > > OK ... Thank you for taking your time to review my wrappers! > > This patch only adds support for pages of order zero, as that is all > > Rust Binder needs. However, it is written to make it easy to add suppor= t > > for higher-order pages in the future. To do that, you would add a const > > generic parameter to `Page` that specifies the order. Most of the > > methods do not need to be adjusted, as the logic for dealing with > > mapping multiple pages at once can be isolated to just the > > `with_pointer_into_page` method. Finally, the struct can be renamed to > > `Pages`, and the type alias `Page =3D Pages<0>` can be introduce= d. > > This description concerns me because it reads like you're not keeping > up with the current thinking in MM about what pages are and how we're > improving the type hierarchy. As in, we're creating one instead of > allowing the current mish-mash of absolutely everything to continue. That's very possible. I have a good understanding about how C binder interacts with pages, but I don't know too much about the abstractions that Binder is not using. > Are you the right person to ask about the operations that Binder does > with a page so we can figure out where it fits in the type hierarchy? I can definitely answer questions about that. If we can find another abstraction that is not too far away from what we are doing today, then I am open to looking into whether we can do that instead of the current approach. However, I want to avoid large deviations from C Binder, at least before I get Rust binder into the kernel tree. A short overview of what Binder does: * Every process will mmap a region of memory. This memory region will contain the data for all incoming transactions sent to that process. Only the kernel can modify these pages. * Binder has a data structure that keeps track of where the allocations in this mmap'd region are. It has functionality to find an open interval of a given size (so it's essentially an allocator). This is called the "range allocator". * The pages in this region are allocated lazily whenever the range allocator starts using the page. * When the range allocator stops using a page, it is marked as unused and put on an LRU list. * There's a custom shrinker that can free pages not currently used by any allocation. So when process A sends a transaction to process B using the appropriate ioctl, process A will go into the range allocator for process B and reserve a range for the transaction that A is sending. If any pages in the resulting range are missing, then new pages are allocated, and process A will vm_insert_page them into process B's mmap. Then, process A will map the page with kmap_local_page, and use copy_from_user to copy data *directly* from A's userspace into a page in process B's address space. Note that transactions don't have uniform sizes. Think of them as arbitrary buffers provided by userspace. They will generally not be larger than a few hundred bytes each, but larger transactions are possible. The mmap for a process is usually 4 MB. The biggest user of Page is here in the RFC: https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-19-08ba9197f= 637@google.com/ The range allocator is defined here: https://lore.kernel.org/rust-for-linux/20231101-rust-binder-v1-6-08ba9197f6= 37@google.com/ > > Rust Binder needs to manage pages directly as that is how transactions > > are delivered: Each process has an mmap'd region for incoming > > transactions. When an incoming transaction arrives, the Binder driver > > will choose a region in the mmap, allocate and map the relevant pages > > manually, and copy the incoming transaction directly into the page. Thi= s > > architecture allows the driver to copy transactions directly from the > > address space of one process to another, without an intermediate copy > > to a kernel buffer. > > Everything about this says "This is what a first year comp sci student > thinks will be fast". Oh well, the thinking here isn't your fault. Ultimately, I am just replicating what C Binder does. I had a long discussion with Liam Howlett at Plumbers where we discussed various alternatives to the hand-rolled stuff that Binder does. Liam thought that we may be able to replace the entire thing with maple trees. These are things that I definitely want to experiment with, but I am reluctant to try replacing the entire thing with a maple tree, at least until I get Rust Binder into the kernel tree. In general, there are many places in Binder where we are hand-rolling something that has an alternative elsewhere in the kernel, but replacing them is not always trivial. The hand-rolled versions often have Binder-specific optimizations that make it a regression to replace it with the general thing. > > @@ -127,6 +129,24 @@ int rust_helper_signal_pending(struct task_struct = *t) > > } > > EXPORT_SYMBOL_GPL(rust_helper_signal_pending); > > > > +struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int orde= r) > > +{ > > + return alloc_pages(gfp_mask, order); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_alloc_pages); > > + > > +void *rust_helper_kmap_local_page(struct page *page) > > +{ > > + return kmap_local_page(page); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_kmap_local_page); > > + > > +void rust_helper_kunmap_local(const void *addr) > > +{ > > + kunmap_local(addr); > > +} > > +EXPORT_SYMBOL_GPL(rust_helper_kunmap_local); > > I remain opposed to all these fidgetty little helpers. Particularly > when they're noops on machines without HIGHMEM, which is ~all of them. I don't disagree with you, but there's not much I can do about them. I can wrap them in #ifdef HIGHMEM if they are no-ops or exported without HIGHMEM? > > +/// A bitwise shift for the page size. > > +pub const PAGE_SHIFT: usize =3D bindings::PAGE_SHIFT as usize; > > Does PAGE_SHIFT really need to be as large as 'usize'? If it's more > than 63 by the time I retire, I'll be shocked. If it's more than 127 > by the time I die, I'll be even more shocked. And it won't get to 255 > by the heat death of the universe. Rust usually requires that both operands to an integer operation are of the same integer type, requiring explicit conversions if that is not the case. That is why I am using usize here. However, it seems like Rust doesn't actually require that for the << operator, so I guess it doesn't matter much for this particular constant. > > +/// A bitwise mask for the page size. > > +pub const PAGE_MASK: usize =3D PAGE_SIZE - 1; > > Are you trying to get somebody killed? > > include/asm-generic/page.h:#define PAGE_MASK (~(PAGE_SIZE-1)) > > Defining PAGE_MASK to be the opposite set of bits in C and Rust is > going to bite us all day every day for a decade. Oops, that's embarrassing. Thank you for catching that. I'll make sure to change it. > > +impl Page { > > + /// Allocates a new set of contiguous pages. > > + pub fn new() -> Result { > > + // SAFETY: These are the correct arguments to allocate a singl= e page. > > + let page =3D unsafe { > > + bindings::alloc_pages( > > + bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings= ::__GFP_HIGHMEM, > > + 0, > > + ) > > + }; > > This feels too Binder-specific to be 'Page'. Pages are not necessarily > allocated with GFP_HIGHMEM, nor are they necessarily zeroed. Maybe you > want a BinderPage type? We can add a constructor that takes a flag argument, so this type doesn't necessarily have to be tied to those specific arguments. Alice