From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <41ca3445-80cd-43c1-8f9e-634c195c9187@asahilina.net>
Date: Mon, 3 Feb 2025 23:32:21 +0900
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH 0/6] rust: page: Support borrowing `struct page` and physaddr conversion
To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
 Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, Jann Horn,
 Matthew Wilcox, Paolo Bonzini, Danilo Krummrich, Wedson Almeida Filho,
 Valentin Obst, Andrew Morton, linux-mm@kvack.org, airlied@redhat.com,
 Abdiel Janulgue, rust-for-linux@vger.kernel.org,
 linux-kernel@vger.kernel.org, asahi@lists.linux.dev
References: <20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net>
Content-Language: en-US
From: Asahi Lina
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 2/3/25 6:58 PM, Simona Vetter wrote:
> On Sun, Feb 02, 2025 at 10:05:42PM +0900, Asahi Lina wrote:
>> This series refactors the existing Page wrapper to support borrowing
>> `struct page` objects without ownership on the Rust side, and converting
>> page references to/from physical memory addresses.
>>
>> The series overlaps with the earlier submission in [1] and follows a
>> different approach, based on the discussion that happened there.
>>
>> The primary use case for this is implementing IOMMU-style page table
>> management in Rust. This allows drivers for IOMMUs and MMU-containing
>> SoC devices to be written in Rust (such as embedded GPUs). The intended
>> logic is similar to how ARM SMMU page tables are managed in the
>> drivers/iommu tree.
>>
>> First, introduce a concept of Owned<T> and an Ownable trait. These are
>> similar to ARef<T> and AlwaysRefCounted, but are used for types which
>> are not ref counted but rather have a single intended owner.
>>
>> Then, refactor the existing Page support to use the new mechanism.
>> Pages returned from the page allocator are not intended to be ref
>> counted by consumers (see previous discussion in [1]), so this keeps
>> Rust's view of page ownership as a simple "owned or not". Of course,
>> this is still composable as Arc<Owned<Page>> if Rust code needs to
>> reference count its own Page allocations for whatever reason.
>
> I think there's a bit of a potential mess here because the conversion to
> folios isn't far enough yet that we can entirely ignore page refcounts and
> just use folio refcounts. But I guess we can deal with that oddity if we
> hit it (maybe folio conversion moves fast enough), since this only really
> starts to become relevant for hmm/svm gpu stuff.
>
> iow I think anticipating the future where struct page really doesn't have
> a refcount is the right move. Aside from that it's really not a refcount
> that works in the rust ARef sense, since struct page cannot disappear for
> system memory, and for dev_pagemap memory it's an entirely different
> reference you need (and then there's a few more special cases).

Right, as far as this abstraction is concerned, all that needs to hold
is that:

- alloc_pages() and __free_pages() work as intended, however that may
  be, to reserve and return one page (for now, though I think extending
  the Rust abstraction to handle higher-order folios is pretty easy, but
  that can happen later).
- Whatever borrows pages knows what it's doing. In this case there's
  only support for borrowing pages by physaddr, and it's only going to
  be used in a driver for a platform without memory hot remove (so far)
  and only for pages which have known usage (in principle) and are
  either explicitly allocated or known pinned or reserved, so it's not a
  problem right now. Future abstractions that return borrowed pages can
  do their own locking/bookkeeping/whatever is necessary to keep it
  safe.
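
To make the "owned or not" model concrete, here is a rough userspace
sketch of the idea. It is not the code from the series: `FakePage`,
`alloc_fake_page()`, `released_count()` and the exact shape of
`Ownable`/`Owned` below are assumptions made up for illustration, with
`Box` standing in for alloc_pages()/__free_pages().

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the trait: a type with a single owner that
/// knows how to release itself when that owner goes away.
pub unsafe trait Ownable {
    /// # Safety
    /// `this` must be the unique owning pointer and must not be used again.
    unsafe fn release(this: NonNull<Self>);
}

/// Hypothetical owning handle: no refcount, just "owned or not".
pub struct Owned<T: Ownable> {
    ptr: NonNull<T>,
    _marker: PhantomData<T>,
}

impl<T: Ownable> Owned<T> {
    /// # Safety
    /// `ptr` must be valid and the caller must hand over sole ownership.
    pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
        Self { ptr, _marker: PhantomData }
    }
}

impl<T: Ownable> Deref for Owned<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: `Owned` holds the only owning pointer, so it is still live.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T: Ownable> Drop for Owned<T> {
    fn drop(&mut self) {
        // SAFETY: we are the unique owner and never touch `ptr` afterwards.
        unsafe { T::release(self.ptr) }
    }
}

/// Toy page: a heap allocation instead of a real `struct page`.
pub struct FakePage {
    pub data: [u8; 16],
}

/// Counts releases so the behaviour can be observed.
static RELEASED: AtomicUsize = AtomicUsize::new(0);

unsafe impl Ownable for FakePage {
    unsafe fn release(this: NonNull<Self>) {
        // SAFETY: the pointer came from `Box::into_raw` in `alloc_fake_page`.
        drop(unsafe { Box::from_raw(this.as_ptr()) });
        RELEASED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Allocates a toy page and wraps it in the owning handle.
pub fn alloc_fake_page() -> Owned<FakePage> {
    let raw = Box::into_raw(Box::new(FakePage { data: [0; 16] }));
    // SAFETY: `Box::into_raw` never returns null and we own the allocation.
    unsafe { Owned::from_raw(NonNull::new_unchecked(raw)) }
}

/// Number of toy pages released so far.
pub fn released_count() -> usize {
    RELEASED.load(Ordering::SeqCst)
}
```

Dropping the handle releases the page exactly once, and wrapping the
handle in an `Arc` layers reference counting on top without the page
itself needing a refcount.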

I would like to hear how memory hot-remove is supposed to work though,
to see if we should be doing something to make the abstraction safer
(though it's still unsafe and always will be). Is there a chance a
`struct page` could vanish out from under us under some conditions?

For dev_pagemap memory I imagine we'd have an entirely different
abstraction wrapping that, that can just return a borrowed &Page to give
the user access to page operations without going through Owned<Page>.

> For dma/iommu stuff there's also a push to move towards pfn + metadata
> model, so that p2pdma doesn't need struct page. But I haven't looked into
> that much yet.

Yeah, I don't know how that stuff works...

>
> Cheers, Sima
>
>> Then, make some existing private methods public, since this is needed to
>> reasonably use allocated pages as IOMMU page tables.
>>
>> Along the way we also add a small module to represent core kernel
>> address types (PhysicalAddr, DmaAddr, ResourceSize, Pfn). In the future,
>> this might grow with helpers to make address math safer and more
>> Rust-like.
>>
>> Finally, add methods to:
>> - Get a page's physical address
>> - Convert an owned Page into its physical address
>> - Convert a physical address back to its owned Page
>> - Borrow a Page from a physical address, in both checked (with checks
>>   that a struct page exists and is accessible as regular RAM) and not
>>   checked forms (useful when the user knows the physaddr is valid,
>>   for example because it got it from Page::into_phys()).
>>
>> Of course, all but the first two have to be `unsafe` by nature, but that
>> comes with the territory of writing low level memory management code.
>>
>> These methods allow page table code to know the physical address of
>> pages (needed to build intermediate level PTEs) and to essentially
>> transfer ownership of the pages into the page table structure itself,
>> and back into Page objects when freeing page tables.
>> Without that, the code would have to keep track of page allocations in
>> duplicate, once in Rust code and once in the page table structure
>> itself, which is less desirable.
>>
>> For Apple GPUs, the address space shared between firmware and the driver
>> is actually pre-allocated by the bootloader, with the top level page
>> table already pre-allocated, and the firmware owning some PTEs within it
>> while the kernel populates others. This cooperation works well when the
>> kernel can reference this top level page table by physical address. The
>> only thing the driver needs to ensure is that it never attempts to free
>> it in this case, nor the page tables corresponding to virtual address
>> ranges it doesn't own. Without the ability to just borrow the
>> pre-allocated top level page and access it, the driver would have to
>> special-case this and manually manage the top level PTEs outside the
>> main page table code, as well as introduce different page table
>> configurations with different numbers of levels so the kernel's view is
>> one level shallower.
>>
>> The physical address borrow feature is also useful to generate virtual
>> address space dumps for crash dumps, including firmware pages. The
>> intent is that firmware pages are configured in the Device Tree as
>> reserved System RAM (without no-map), which creates struct page objects
>> for them and makes them available in the kernel's direct map. Then the
>> driver's page table code can walk the page tables and make a snapshot of
>> the entire address space, including firmware code and data pages,
>> pre-allocated shared segments, and driver-allocated objects (which are
>> GEM objects), again without special casing anything. The checks in
>> `Page::borrow_phys()` should ensure that the page is safe to access as
>> RAM, so this will skip MMIO pages and anything that wasn't declared to
>> the kernel in the DT.
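
The ownership transfer described in the quoted text can be modelled in
plain userspace Rust. This is only a sketch under loud assumptions:
`Box<Page>` stands in for an owned page, a heap pointer value stands in
for a physical address, and the signatures of `into_phys()`,
`from_phys()` and `borrow_phys()` as well as the `PageTable` type are
hypothetical, not the ones from the series.

```rust
/// Stand-in for a physical address; here it is just a heap pointer value.
pub type PhysicalAddr = usize;

/// Number of entries in the toy table.
pub const ENTRIES: usize = 4;

/// Toy page holding page table entries; 0 means "not present".
pub struct Page {
    pub entries: [PhysicalAddr; ENTRIES],
}

impl Page {
    /// Allocates a zeroed toy page (hypothetical helper).
    pub fn alloc() -> Box<Page> {
        Box::new(Page { entries: [0; ENTRIES] })
    }

    /// Gives up ownership and returns the address to store in a PTE.
    pub fn into_phys(page: Box<Page>) -> PhysicalAddr {
        Box::into_raw(page) as PhysicalAddr
    }

    /// # Safety
    /// `addr` must come from `into_phys` and must not have been reclaimed.
    pub unsafe fn from_phys(addr: PhysicalAddr) -> Box<Page> {
        unsafe { Box::from_raw(addr as *mut Page) }
    }

    /// # Safety
    /// `addr` must point at a live `Page` for as long as the borrow lasts.
    pub unsafe fn borrow_phys<'a>(addr: &'a PhysicalAddr) -> &'a Page {
        unsafe { &*(*addr as *const Page) }
    }
}

/// Top level table that owns its children only through stored addresses,
/// so allocations are tracked once, in the table itself.
pub struct PageTable {
    root: Box<Page>,
}

impl PageTable {
    pub fn new() -> Self {
        Self { root: Page::alloc() }
    }

    /// Allocates a child table and moves its ownership into slot `index`.
    pub fn install_child(&mut self, index: usize) -> PhysicalAddr {
        assert_eq!(self.root.entries[index], 0, "slot already populated");
        let addr = Page::into_phys(Page::alloc());
        self.root.entries[index] = addr;
        addr
    }

    /// Borrows the child in slot `index` without taking ownership.
    pub fn child(&self, index: usize) -> Option<&Page> {
        let slot = &self.root.entries[index];
        if *slot == 0 {
            None
        } else {
            // SAFETY: non-zero slots only ever hold addresses from `into_phys`.
            Some(unsafe { Page::borrow_phys(slot) })
        }
    }

    /// Number of populated slots.
    pub fn populated(&self) -> usize {
        self.root.entries.iter().filter(|&&e| e != 0).count()
    }
}

impl Drop for PageTable {
    fn drop(&mut self) {
        for slot in self.root.entries.iter_mut() {
            if *slot != 0 {
                // SAFETY: the slot owns this page; turn it back into a Box to free it.
                drop(unsafe { Page::from_phys(*slot) });
                *slot = 0;
            }
        }
    }
}
```

A table that must not free a pre-allocated root (the bootloader case
above) would borrow the root by address instead of holding a
`Box<Page>`, and would skip slots it does not own in `Drop`.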
>>
>> Example usage:
>> https://github.com/AsahiLinux/linux/blob/gpu/rust-wip/drivers/gpu/drm/asahi/pgtable.rs
>>
>> The last patch is a minor cleanup to the Page abstraction noticed while
>> preparing this series.
>>
>> [1] https://lore.kernel.org/lkml/20241119112408.779243-1-abdiel.janulgue@gmail.com/T/#u
>>
>> Signed-off-by: Asahi Lina
>> ---
>> Asahi Lina (6):
>>       rust: types: Add Ownable/Owned types
>>       rust: page: Convert to Ownable
>>       rust: page: Make with_page_mapped() and with_pointer_into_page() public
>>       rust: addr: Add a module to declare core address types
>>       rust: page: Add physical address conversion functions
>>       rust: page: Make Page::as_ptr() pub(crate)
>>
>>  rust/helpers/page.c  |  26 ++++++++++++
>>  rust/kernel/addr.rs  |  15 +++++++
>>  rust/kernel/lib.rs   |   1 +
>>  rust/kernel/page.rs  | 101 ++++++++++++++++++++++++++++++++++++++--------
>>  rust/kernel/types.rs | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 236 insertions(+), 17 deletions(-)
>> ---
>> base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
>> change-id: 20250202-rust-page-80892069fc78
>>
>> Cheers,
>> ~~ Lina
>>
>

~~ Lina