From: Alice Ryhl <aliceryhl@google.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: "Miguel Ojeda" <ojeda@kernel.org>,
"Will Deacon" <will@kernel.org>,
"Daniel Almeida" <daniel.almeida@collabora.com>,
"Boris Brezillon" <boris.brezillon@collabora.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Andreas Hindborg" <a.hindborg@kernel.org>,
"Trevor Gross" <tmgross@umich.edu>,
"Danilo Krummrich" <dakr@kernel.org>,
"Joerg Roedel" <joro@8bytes.org>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Asahi Lina" <lina+kernel@asahilina.net>,
linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
iommu@lists.linux.dev, linux-mm@kvack.org
Subject: Re: [PATCH v3] io: add io_pgtable abstraction
Date: Fri, 28 Nov 2025 12:27:44 +0000 [thread overview]
Message-ID: <aSmVQDMXqwc8ctDf@google.com> (raw)
In-Reply-To: <12d99a54-e111-4877-b8cd-cb1e58cd6d30@arm.com>
On Fri, Nov 28, 2025 at 11:56:17AM +0000, Robin Murphy wrote:
> On 2025-11-12 10:15 am, Alice Ryhl wrote:
> > From: Asahi Lina <lina+kernel@asahilina.net>
> >
> > This will be used by the Tyr driver to create and modify the page table
> > of each address space on the GPU. Each time a mapping gets created or
> > removed by userspace, Tyr will call into GPUVM, which will figure out
> > which calls to map_pages and unmap_pages are required to map the data in
> > question in the page table so that the GPU may access those pages when
> > using that address space.
> >
> > The Rust type wraps the struct using a raw pointer rather than the usual
> > Opaque+ARef approach because Opaque+ARef requires the target type to be
> > refcounted.
> >
> > Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
> > Co-Developed-by: Alice Ryhl <aliceryhl@google.com>
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > +/// Protection flags used with IOMMU mappings.
> > +pub mod prot {
> > + /// Read access.
> > + pub const READ: u32 = bindings::IOMMU_READ;
> > + /// Write access.
> > + pub const WRITE: u32 = bindings::IOMMU_WRITE;
> > + /// Request cache coherency.
> > + pub const CACHE: u32 = bindings::IOMMU_CACHE;
> > + /// Request no-execute permission.
> > + pub const NOEXEC: u32 = bindings::IOMMU_NOEXEC;
> > + /// MMIO peripheral mapping.
> > + pub const MMIO: u32 = bindings::IOMMU_MMIO;
> > + /// Privileged mapping.
> > + pub const PRIV: u32 = bindings::IOMMU_PRIV;
>
> Nit: probably best to call this PRIVILEGED from day 1 for clarity - some day
> we may eventually get round to renaming the C symbol too, especially if we
> revisit the notion of "private" mappings (that's still on my ideas list...)
Sure, will rename.
> > + /// Map a physically contiguous range of pages of the same size.
> > + ///
> > + /// # Safety
> > + ///
> > + /// * This page table must not contain any mapping that overlaps with the mapping created by
> > + /// this call.
>
> As mentioned this isn't necessarily true of io-pgtable itself, but since
> you've not included QUIRK_NO_WARN in the abstraction then it's fair if this
> layer wants to be a little stricter toward Rust users.
Assuming that we don't allow QUICK_NO_WARN, would you say that it's
precise as-is?
> > + /// * If this page table is live, then the caller must ensure that it's okay to access the
> > + /// physical address being mapped for the duration in which it is mapped.
> > + #[inline]
> > + pub unsafe fn map_pages(
> > + &self,
> > + iova: usize,
> > + paddr: PhysAddr,
> > + pgsize: usize,
> > + pgcount: usize,
> > + prot: u32,
> > + flags: alloc::Flags,
> > + ) -> Result<usize> {
> > + let mut mapped: usize = 0;
> > +
> > + // SAFETY: The `map_pages` function in `io_pgtable_ops` is never null.
> > + let map_pages = unsafe { (*self.raw_ops()).map_pages.unwrap_unchecked() };
> > +
> > + // SAFETY: The safety requirements of this method are sufficient to call `map_pages`.
> > + to_result(unsafe {
> > + (map_pages)(
> > + self.raw_ops(),
> > + iova,
> > + paddr,
> > + pgsize,
> > + pgcount,
> > + prot as i32,
> > + flags.as_raw(),
> > + &mut mapped,
> > + )
> > + })?;
> > +
> > + Ok(mapped)
>
> Just to double-check since I'm a bit unclear on the Rust semantics, this can
> correctly reflect all 4 outcomes back to the caller, right? I.e.:
>
> - no error, mapped == pgcount * pgsize (success)
> - no error, mapped < pgcount * pgsize (call again with the remainder)
> - error, mapped > 0 (probably unmap that bit, unless clever trickery where
> an error was expected)
> - error, mapped == 0 (nothing was done, straightforward failure)
>
> (the only case not permitted is "no error, mapped == 0" - failure to make
> any progress must always be an error)
>
> Alternatively you might want to consider encapsulating the partial-mapping
> handling in this layer as well - in the C code that's done at the level of
> the IOMMU API calls that io-pgtable-using IOMMU drivers are merely passing
> through, hence why panfrost/panthor have to open-code their own equivalents,
> but there's no particular reason to follow the *exact* same pattern here.
Ah, no this signature does not reflect all of those cases. The return
type is Result<usize>, which corresponds to:
struct my_return_type {
bool success;
union {
size_t ok;
int err; // an errno
}
};
We need a different signature if it's possible to have mapped != 0 when
returning an error.
> > + }
> > +
> > + /// Unmap a range of virtually contiguous pages of the same size.
> > + ///
> > + /// # Safety
> > + ///
> > + /// This page table must contain a mapping at `iova` that consists of exactly `pgcount` pages
> > + /// of size `pgsize`.
>
> Again, the underlying requirement here is only that pgsize * pgcount
> represents the IOVA range of one or more consecutive ranges previously
> mapped, i.e.:
>
> map(0, 4KB * 256);
> map(1MB, 4KB * 256);
> unmap(0, 2MB * 1);
>
> is legal, since it's generally impractical for callers to know and keep
> track of the *exact* structure of a given pagetable. In this case there
> isn't really any good reason to try to be stricter.
How about this wording?
This page table must contain one or more consecutive mappings starting
at `iova` whose total size is `pgcount*pgsize`.
Alice
next prev parent reply other threads:[~2025-11-28 12:27 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-12 10:15 Alice Ryhl
2025-11-12 12:57 ` Daniel Almeida
2025-11-17 16:34 ` Alice Ryhl
2025-11-19 8:59 ` Boris Brezillon
2025-11-19 10:53 ` Boris Brezillon
2025-11-19 10:56 ` Boris Brezillon
2025-11-28 11:56 ` Robin Murphy
2025-11-28 12:27 ` Alice Ryhl [this message]
2025-11-28 16:47 ` Robin Murphy
2025-12-01 9:58 ` Alice Ryhl
2025-12-01 13:55 ` Robin Murphy
2025-11-28 18:02 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aSmVQDMXqwc8ctDf@google.com \
--to=aliceryhl@google.com \
--cc=Liam.Howlett@oracle.com \
--cc=a.hindborg@kernel.org \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=boris.brezillon@collabora.com \
--cc=dakr@kernel.org \
--cc=daniel.almeida@collabora.com \
--cc=gary@garyguo.net \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=lina+kernel@asahilina.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=lossin@kernel.org \
--cc=ojeda@kernel.org \
--cc=robin.murphy@arm.com \
--cc=rust-for-linux@vger.kernel.org \
--cc=tmgross@umich.edu \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox