* [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap
@ 2024-11-20 14:49 Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct Alice Ryhl
` (6 more replies)
0 siblings, 7 replies; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
This updates the vm_area_struct support to use the approach we discussed
at LPC where there are several different Rust wrappers for
vm_area_struct depending on the kind of access you have to the vma. Each
case allows a different set of operations on the vma.
This series depends on vfs.rust.file and char-misc-next which are
scheduled to land upstream in the current merge window.
---
Changes in v8:
- Split series into more commits to ease review.
- Improve read locks based on Lorenzo's doc: either the mmap or vma lock
can be used.
- Get rid of mmap write lock because it's possible to avoid the need for
it.
- Do not allow invalid flag combinations on VmAreaNew.
- Link to v7: https://lore.kernel.org/r/20241014-vma-v7-0-01e32f861195@google.com
Changes in v7:
- Make the mmap read/write lock guards respect strict owner semantics.
- Link to v6: https://lore.kernel.org/r/20241010-vma-v6-0-d89039b6f573@google.com
Changes in v6:
- Introduce VmArea{Ref,Mut,New} distinction.
- Add a second patchset for miscdevice.
- Rebase on char-misc-next (currently on v6.12-rc2).
- Link to v5: https://lore.kernel.org/r/20240806-vma-v5-1-04018f05de2b@google.com
Changes in v5:
- Rename VmArea::from_raw_vma to from_raw.
- Use Pin for mutable VmArea references.
- Go through `ARef::from` in `mmgrab_current`.
- Link to v4: https://lore.kernel.org/r/20240802-vma-v4-1-091a87058a43@google.com
Changes in v4:
- Pull out ARef::into_raw into a separate patch.
- Update invariants and struct documentation.
- Rename from_raw_mm to from_raw.
- Link to v3: https://lore.kernel.org/r/20240801-vma-v3-1-db6c1c0afda9@google.com
Changes in v3:
- Reorder entries in mm.rs.
- Use ARef for mmput_async helper.
- Clarify that VmArea requires you to hold the mmap read or write lock.
- Link to v2: https://lore.kernel.org/r/20240727-vma-v2-1-ab3e5927dc3a@google.com
Changes in v2:
- mm.rs is redesigned from scratch making use of AsRef
- Add notes about whether destructors may sleep
- Rename Area to VmArea
- Link to v1: https://lore.kernel.org/r/20240723-vma-v1-1-32ad5a0118ee@google.com
---
Alice Ryhl (7):
mm: rust: add abstraction for struct mm_struct
mm: rust: add vm_area_struct methods that require read access
mm: rust: add vm_insert_page
mm: rust: add lock_vma_under_rcu
mm: rust: add mmput_async support
mm: rust: add VmAreaNew
rust: miscdevice: add mmap support
rust/helpers/helpers.c | 1 +
rust/helpers/mm.c | 50 ++++++
rust/kernel/lib.rs | 1 +
rust/kernel/miscdevice.rs | 28 ++++
rust/kernel/mm.rs | 345 +++++++++++++++++++++++++++++++++++++++
rust/kernel/mm/virt.rs | 404 ++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 829 insertions(+)
---
base-commit: 7f219f5002eb0ce2b12562127aeb0521bbf2146e
change-id: 20240723-vma-f80119f9fb35
Best regards,
--
Alice Ryhl <aliceryhl@google.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
@ 2024-11-20 14:49 ` Alice Ryhl
2024-11-20 18:13 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access Alice Ryhl
` (5 subsequent siblings)
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
These abstractions allow you to reference a `struct mm_struct` using
both mmgrab and mmget refcounts. This is done using two Rust types:
* Mm - represents an mm_struct where you don't know anything about the
value of mm_users.
* MmWithUser - represents an mm_struct where you know at compile time
that mm_users is non-zero.
This allows us to encode in the type system whether a method requires
that mm_users is non-zero or not. For instance, you can always call
`mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
non-zero.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/helpers/helpers.c | 1 +
rust/helpers/mm.c | 39 +++++++++
rust/kernel/lib.rs | 1 +
rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 260 insertions(+)
diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 20a0c69d5cc7..60a488eb4efe 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -13,6 +13,7 @@
#include "build_bug.c"
#include "err.c"
#include "kunit.c"
+#include "mm.c"
#include "mutex.c"
#include "page.c"
#include "rbtree.c"
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
new file mode 100644
index 000000000000..7201747a5d31
--- /dev/null
+++ b/rust/helpers/mm.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+
+void rust_helper_mmgrab(struct mm_struct *mm)
+{
+ mmgrab(mm);
+}
+
+void rust_helper_mmdrop(struct mm_struct *mm)
+{
+ mmdrop(mm);
+}
+
+void rust_helper_mmget(struct mm_struct *mm)
+{
+ mmget(mm);
+}
+
+bool rust_helper_mmget_not_zero(struct mm_struct *mm)
+{
+ return mmget_not_zero(mm);
+}
+
+void rust_helper_mmap_read_lock(struct mm_struct *mm)
+{
+ mmap_read_lock(mm);
+}
+
+bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
+{
+ return mmap_read_trylock(mm);
+}
+
+void rust_helper_mmap_read_unlock(struct mm_struct *mm)
+{
+ mmap_read_unlock(mm);
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 66f5cde7f322..cc1963510cdf 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -43,6 +43,7 @@
pub mod kunit;
pub mod list;
pub mod miscdevice;
+pub mod mm;
#[cfg(CONFIG_NET)]
pub mod net;
pub mod page;
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
new file mode 100644
index 000000000000..84cba581edaa
--- /dev/null
+++ b/rust/kernel/mm.rs
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Memory management.
+//!
+//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
+
+use crate::{
+ bindings,
+ types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
+};
+use core::{ops::Deref, ptr::NonNull};
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
+/// [`mmget_not_zero`] to be able to access the address space.
+///
+/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmgrab`.
+///
+/// [`mmget_not_zero`]: Mm::mmget_not_zero
+#[repr(transparent)]
+pub struct Mm {
+ mm: Opaque<bindings::mm_struct>,
+}
+
+// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
+unsafe impl Send for Mm {}
+// SAFETY: All methods on `Mm` can be called in parallel from several threads.
+unsafe impl Sync for Mm {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for Mm {
+ #[inline]
+ fn inc_ref(&self) {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmgrab(self.as_raw()) };
+ }
+
+ #[inline]
+ unsafe fn dec_ref(obj: NonNull<Self>) {
+ // SAFETY: The caller is giving up their refcount.
+ unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
+ }
+}
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
+/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
+/// refcount. It can be used to access the associated address space.
+///
+/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUser {
+ mm: Mm,
+}
+
+// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUser {}
+// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUser {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUser {
+ #[inline]
+ fn inc_ref(&self) {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmget(self.as_raw()) };
+ }
+
+ #[inline]
+ unsafe fn dec_ref(obj: NonNull<Self>) {
+ // SAFETY: The caller is giving up their refcount.
+ unsafe { bindings::mmput(obj.cast().as_ptr()) };
+ }
+}
+
+// Make all `Mm` methods available on `MmWithUser`.
+impl Deref for MmWithUser {
+ type Target = Mm;
+
+ #[inline]
+ fn deref(&self) -> &Mm {
+ &self.mm
+ }
+}
+
+// These methods are safe to call even if `mm_users` is zero.
+impl Mm {
+ /// Call `mmgrab` on `current.mm`.
+ #[inline]
+ pub fn mmgrab_current() -> Option<ARef<Mm>> {
+ // SAFETY: It's safe to get the `mm` field from current.
+ let mm = unsafe {
+ let current = bindings::get_current();
+ (*current).mm
+ };
+
+ if mm.is_null() {
+ return None;
+ }
+
+ // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
+ // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
+ // duration of this function, and `current->mm` will stay valid for that long.
+ let mm = unsafe { Mm::from_raw(mm) };
+
+ // This increments the refcount using `mmgrab`.
+ Some(ARef::from(mm))
+ }
+
+ /// Returns a raw pointer to the inner `mm_struct`.
+ #[inline]
+ pub fn as_raw(&self) -> *mut bindings::mm_struct {
+ self.mm.get()
+ }
+
+ /// Obtain a reference from a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
+ /// during the lifetime 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
+ // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
+ // repr(transparent).
+ unsafe { &*ptr.cast() }
+ }
+
+ /// Calls `mmget_not_zero` and returns a handle if it succeeds.
+ #[inline]
+ pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
+ // SAFETY: The pointer is valid since self is a reference.
+ let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
+
+ if success {
+ // SAFETY: We just created an `mmget` refcount.
+ Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
+ } else {
+ None
+ }
+ }
+}
+
+// These methods require `mm_users` to be non-zero.
+impl MmWithUser {
+ /// Obtain a reference from a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
+ /// non-zero for the duration of the lifetime 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
+ // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
+ // to repr(transparent).
+ unsafe { &*ptr.cast() }
+ }
+
+ /// Lock the mmap read lock.
+ #[inline]
+ pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmap_read_lock(self.as_raw()) };
+
+ // INVARIANT: We just acquired the read lock.
+ MmapReadGuard {
+ mm: self,
+ _nts: NotThreadSafe,
+ }
+ }
+
+ /// Try to lock the mmap read lock.
+ #[inline]
+ pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
+ // SAFETY: The pointer is valid since self is a reference.
+ let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
+
+ if success {
+ // INVARIANT: We just acquired the read lock.
+ Some(MmapReadGuard {
+ mm: self,
+ _nts: NotThreadSafe,
+ })
+ } else {
+ None
+ }
+ }
+}
+
+/// A guard for the mmap read lock.
+///
+/// # Invariants
+///
+/// This `MmapReadGuard` guard owns the mmap read lock.
+pub struct MmapReadGuard<'a> {
+ mm: &'a MmWithUser,
+ // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
+ _nts: NotThreadSafe,
+}
+
+impl Drop for MmapReadGuard<'_> {
+ #[inline]
+ fn drop(&mut self) {
+ // SAFETY: We hold the read lock by the type invariants.
+ unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
+ }
+}
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct Alice Ryhl
@ 2024-11-20 14:49 ` Alice Ryhl
2024-11-20 19:07 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 3/7] mm: rust: add vm_insert_page Alice Ryhl
` (4 subsequent siblings)
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
This adds a type called VmAreaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold
either the mmap read lock or the vma read lock (or stronger).
Additionally, a vma_lookup method is added to the mmap read guard, which
enables you to obtain a &VmAreaRef in safe Rust code.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/helpers/mm.c | 6 ++
rust/kernel/mm.rs | 21 ++++++
rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 198 insertions(+)
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7201747a5d31..7b72eb065a3e 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
{
mmap_read_unlock(mm);
}
+
+struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
+ unsigned long addr)
+{
+ return vma_lookup(mm, addr);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index 84cba581edaa..ace8e7d57afe 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -12,6 +12,8 @@
};
use core::{ops::Deref, ptr::NonNull};
+pub mod virt;
+
/// A wrapper for the kernel's `struct mm_struct`.
///
/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
@@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
_nts: NotThreadSafe,
}
+impl<'a> MmapReadGuard<'a> {
+ /// Look up a vma at the given address.
+ #[inline]
+ pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
+ // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
+ // for `vma_addr`.
+ let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
+
+ if vma.is_null() {
+ None
+ } else {
+ // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
+ // the returned area will borrow from this read lock guard, so it can only be used
+ // while the mmap read lock is still held.
+ unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
+ }
+ }
+}
+
impl Drop for MmapReadGuard<'_> {
#[inline]
fn drop(&mut self) {
diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
new file mode 100644
index 000000000000..1e755dca46dd
--- /dev/null
+++ b/rust/kernel/mm/virt.rs
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Virtual memory.
+
+use crate::{bindings, types::Opaque};
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access.
+///
+/// It represents an area of virtual memory.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock.
+#[repr(transparent)]
+pub struct VmAreaRef {
+ vma: Opaque<bindings::vm_area_struct>,
+}
+
+// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
+// matter what the vma flags are.
+impl VmAreaRef {
+ /// Access a virtual memory area given a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
+ /// (or stronger) is held for at least the duration of 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+ // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+ unsafe { &*vma.cast() }
+ }
+
+ /// Returns a raw pointer to this area.
+ #[inline]
+ pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
+ self.vma.get()
+ }
+
+ /// Returns the flags associated with the virtual memory area.
+ ///
+ /// The possible flags are a combination of the constants in [`flags`].
+ #[inline]
+ pub fn flags(&self) -> vm_flags_t {
+ // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+ // access is not a data race.
+ unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
+ }
+
+ /// Returns the start address of the virtual memory area.
+ #[inline]
+ pub fn start(&self) -> usize {
+ // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+ // access is not a data race.
+ unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
+ }
+
+ /// Returns the end address of the virtual memory area.
+ #[inline]
+ pub fn end(&self) -> usize {
+ // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+ // access is not a data race.
+ unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
+ }
+
+ /// Unmap pages in the given page range.
+ #[inline]
+ pub fn zap_page_range_single(&self, address: usize, size: usize) {
+ // SAFETY: By the type invariants, the caller has read access to this VMA, which is
+ // sufficient for this method call. This method has no requirements on the vma flags. Any
+ // value of `address` and `size` is allowed.
+ unsafe {
+ bindings::zap_page_range_single(
+ self.as_ptr(),
+ address as _,
+ size as _,
+ core::ptr::null_mut(),
+ )
+ };
+ }
+}
+
+/// The integer type used for vma flags.
+#[doc(inline)]
+pub use bindings::vm_flags_t;
+
+/// All possible flags for [`VmAreaRef`].
+pub mod flags {
+ use super::vm_flags_t;
+ use crate::bindings;
+
+ /// No flags are set.
+ pub const NONE: vm_flags_t = bindings::VM_NONE as _;
+
+ /// Mapping allows reads.
+ pub const READ: vm_flags_t = bindings::VM_READ as _;
+
+ /// Mapping allows writes.
+ pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
+
+ /// Mapping allows execution.
+ pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
+
+ /// Mapping is shared.
+ pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
+
+ /// Mapping may be updated to allow reads.
+ pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
+
+ /// Mapping may be updated to allow writes.
+ pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
+
+ /// Mapping may be updated to allow execution.
+ pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
+
+ /// Mapping may be updated to be shared.
+ pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
+
+ /// Page-ranges managed without `struct page`, just pure PFN.
+ pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
+
+ /// Memory mapped I/O or similar.
+ pub const IO: vm_flags_t = bindings::VM_IO as _;
+
+ /// Do not copy this vma on fork.
+ pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
+
+ /// Cannot expand with mremap().
+ pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
+
+ /// Lock the pages covered when they are faulted in.
+ pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
+
+ /// Is a VM accounted object.
+ pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
+
+ /// should the VM suppress accounting.
+ pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
+
+ /// Huge TLB Page VM.
+ pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
+
+ /// Synchronous page faults.
+ pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
+
+ /// Architecture-specific flag.
+ pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
+
+ /// Wipe VMA contents in child..
+ pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
+
+ /// Do not include in the core dump.
+ pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
+
+ /// Not soft dirty clean area.
+ pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
+
+ /// Can contain `struct page` and pure PFN pages.
+ pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
+
+ /// MADV_HUGEPAGE marked this vma.
+ pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
+
+ /// MADV_NOHUGEPAGE marked this vma.
+ pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
+
+ /// KSM may merge identical pages.
+ pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
+}
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 3/7] mm: rust: add vm_insert_page
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access Alice Ryhl
@ 2024-11-20 14:49 ` Alice Ryhl
2024-11-20 19:20 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu Alice Ryhl
` (3 subsequent siblings)
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
flag, so we introduce a new type to keep track of such vmas.
The approach used in this patch assumes that we will not need to encode
many flag combinations in the type. I don't think we need to encode more
than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
becomes necessary, using generic parameters in a single type would scale
better as the number of flags increases.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/kernel/mm/virt.rs | 68 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 67 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index 1e755dca46dd..de7f2338810a 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -4,7 +4,14 @@
//! Virtual memory.
-use crate::{bindings, types::Opaque};
+use crate::{
+ bindings,
+ error::{to_result, Result},
+ page::Page,
+ types::Opaque,
+};
+
+use core::ops::Deref;
/// A wrapper for the kernel's `struct vm_area_struct` with read access.
///
@@ -80,6 +87,65 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
)
};
}
+
+ /// Check whether the `VM_MIXEDMAP` flag is set.
+ #[inline]
+ pub fn check_mixedmap(&self) -> Option<&VmAreaMixedMap> {
+ if self.flags() & flags::MIXEDMAP != 0 {
+ // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
+ // satisfied by the type invariants of `VmAreaRef`.
+ Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
+ } else {
+ None
+ }
+ }
+}
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access and `VM_MIXEDMAP` set.
+///
+/// It represents an area of virtual memory.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
+/// set.
+#[repr(transparent)]
+pub struct VmAreaMixedMap {
+ vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
+impl Deref for VmAreaMixedMap {
+ type Target = VmAreaRef;
+
+ #[inline]
+ fn deref(&self) -> &VmAreaRef {
+ &self.vma
+ }
+}
+
+impl VmAreaMixedMap {
+ /// Access a virtual memory area given a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
+ /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
+ #[inline]
+ pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+ // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+ unsafe { &*vma.cast() }
+ }
+
+ /// Maps a single page at the given address within the virtual memory area.
+ ///
+ /// This operation does not take ownership of the page.
+ #[inline]
+ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
+ // SAFETY: The caller has read access and has verified that `VM_MIXEDMAP` is set. The page
+ // is order 0. The address is checked on the C side so it can take any value.
+ to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address as _, page.as_ptr()) })
+ }
}
/// The integer type used for vma flags.
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
` (2 preceding siblings ...)
2024-11-20 14:49 ` [PATCH v8 3/7] mm: rust: add vm_insert_page Alice Ryhl
@ 2024-11-20 14:49 ` Alice Ryhl
2024-11-20 19:29 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 5/7] mm: rust: add mmput_async support Alice Ryhl
` (2 subsequent siblings)
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
All of Rust Binder's existing calls to `vm_insert_page` could be
optimized to first attempt to use `lock_vma_under_rcu`. This patch
provides an abstraction to enable that.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/helpers/mm.c | 5 +++++
rust/kernel/mm.rs | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7b72eb065a3e..81b510c96fd2 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
{
return vma_lookup(mm, addr);
}
+
+void rust_helper_vma_end_read(struct vm_area_struct *vma)
+{
+ vma_end_read(vma);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index ace8e7d57afe..a15acb546f68 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -13,6 +13,7 @@
use core::{ops::Deref, ptr::NonNull};
pub mod virt;
+use virt::VmAreaRef;
/// A wrapper for the kernel's `struct mm_struct`.
///
@@ -170,6 +171,32 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
unsafe { &*ptr.cast() }
}
+ /// Try to lock the vma read lock under rcu.
+ ///
+ /// If this operation fails, the vma may still exist. In that case, you should take the mmap
+ /// read lock and try to use `vma_lookup` instead.
+ ///
+ /// When per-vma locks are disabled, this always returns `None`.
+ #[inline]
+ pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
+ #[cfg(CONFIG_PER_VMA_LOCK)]
+ {
+ // SAFETY: Calling `bindings::lock_vma_under_rcu` is always okay given an mm where
+ // `mm_users` is non-zero.
+ let vma = unsafe { bindings::lock_vma_under_rcu(self.as_raw(), vma_addr as _) };
+ if !vma.is_null() {
+ return Some(VmaReadGuard {
+ // SAFETY: If `lock_vma_under_rcu` returns a non-null ptr, then it points at a
+ // valid vma. The vma is stable for as long as the vma read lock is held.
+ vma: unsafe { VmAreaRef::from_raw(vma) },
+ _nts: NotThreadSafe,
+ });
+ }
+ }
+
+ None
+ }
+
/// Lock the mmap read lock.
#[inline]
pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
@@ -238,3 +265,32 @@ fn drop(&mut self) {
unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
}
}
+
+/// A guard for the vma read lock.
+///
+/// # Invariants
+///
+/// This `VmaReadGuard` guard owns the vma read lock.
+pub struct VmaReadGuard<'a> {
+ vma: &'a VmAreaRef,
+ // `vma_end_read` must be called on the same thread as where the lock was taken
+ _nts: NotThreadSafe,
+}
+
+// Make all `VmAreaRef` methods available on `VmaReadGuard`.
+impl Deref for VmaReadGuard<'_> {
+ type Target = VmAreaRef;
+
+ #[inline]
+ fn deref(&self) -> &VmAreaRef {
+ self.vma
+ }
+}
+
+impl Drop for VmaReadGuard<'_> {
+ #[inline]
+ fn drop(&mut self) {
+ // SAFETY: We hold the read lock by the type invariants.
+ unsafe { bindings::vma_end_read(self.vma.as_ptr()) };
+ }
+}
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 5/7] mm: rust: add mmput_async support
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
` (3 preceding siblings ...)
2024-11-20 14:49 ` [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu Alice Ryhl
@ 2024-11-20 14:49 ` Alice Ryhl
2024-11-20 19:46 ` Lorenzo Stoakes
2024-11-20 14:50 ` [PATCH v8 6/7] mm: rust: add VmAreaNew Alice Ryhl
2024-11-20 14:50 ` [PATCH v8 7/7] rust: miscdevice: add mmap support Alice Ryhl
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:49 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
Adds an MmWithUserAsync type that uses mmput_async when dropped but is
otherwise identical to MmWithUser. This has to be done using a separate
type because the thing we are changing is the destructor.
Rust Binder needs this to avoid a certain deadlock. See commit
9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for
details. It's also needed in the shrinker to avoid cleaning up the mm in
the shrinker's context.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index a15acb546f68..f800b6c244cd 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -98,6 +98,48 @@ fn deref(&self) -> &Mm {
}
}
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a
+/// refcount. This means that the destructor of `ARef<MmWithUserAsync>` is safe to call in atomic
+/// context.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUserAsync {
+ mm: MmWithUser,
+}
+
+// SAFETY: It is safe to call `mmput_async` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUserAsync {}
+// SAFETY: All methods on `MmWithUserAsync` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUserAsync {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUserAsync {
+ fn inc_ref(&self) {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmget(self.as_raw()) };
+ }
+
+ unsafe fn dec_ref(obj: NonNull<Self>) {
+ // SAFETY: The caller is giving up their refcount.
+ unsafe { bindings::mmput_async(obj.cast().as_ptr()) };
+ }
+}
+
+// Make all `MmWithUser` methods available on `MmWithUserAsync`.
+impl Deref for MmWithUserAsync {
+ type Target = MmWithUser;
+
+ #[inline]
+ fn deref(&self) -> &MmWithUser {
+ &self.mm
+ }
+}
+
// These methods are safe to call even if `mm_users` is zero.
impl Mm {
/// Call `mmgrab` on `current.mm`.
@@ -171,6 +213,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
unsafe { &*ptr.cast() }
}
+ /// Use `mmput_async` when dropping this refcount.
+ #[inline]
+ pub fn use_mmput_async(me: ARef<MmWithUser>) -> ARef<MmWithUserAsync> {
+ // SAFETY: The layouts and invariants are compatible.
+ unsafe { ARef::from_raw(ARef::into_raw(me).cast()) }
+ }
+
/// Try to lock the vma read lock under rcu.
///
/// If this operation fails, the vma may still exist. In that case, you should take the mmap
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 6/7] mm: rust: add VmAreaNew
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
` (4 preceding siblings ...)
2024-11-20 14:49 ` [PATCH v8 5/7] mm: rust: add mmput_async support Alice Ryhl
@ 2024-11-20 14:50 ` Alice Ryhl
2024-11-20 20:02 ` Lorenzo Stoakes
2024-11-20 14:50 ` [PATCH v8 7/7] rust: miscdevice: add mmap support Alice Ryhl
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:50 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
When setting up a new vma in an mmap call, you have a bunch of extra
permissions that you normally don't have. This introduces a new
VmAreaNew type that is used for this case.
To avoid setting invalid flag values, the methods for clearing
VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
the return value results in a compilation error because the `Result`
type is marked #[must_use].
For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
we add a VM_PFNMAP method, we will need some way to prevent you from
setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/kernel/mm/virt.rs | 169 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 168 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index de7f2338810a..22aff8e77854 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -6,7 +6,7 @@
use crate::{
bindings,
- error::{to_result, Result},
+ error::{code::EINVAL, to_result, Result},
page::Page,
types::Opaque,
};
@@ -148,6 +148,173 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
}
}
+/// A builder for setting up a vma in an `mmap` call.
+///
+/// # Invariants
+///
+/// For the duration of 'a, the referenced vma must be undergoing initialization.
+pub struct VmAreaNew {
+ vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaNew`.
+impl Deref for VmAreaNew {
+ type Target = VmAreaRef;
+
+ #[inline]
+ fn deref(&self) -> &VmAreaRef {
+ &self.vma
+ }
+}
+
+impl VmAreaNew {
+ /// Access a virtual memory area given a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+ // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+ unsafe { &*vma.cast() }
+ }
+
+ /// Internal method for updating the vma flags.
+ ///
+ /// # Safety
+ ///
+ /// This must not be used to set the flags to an invalid value.
+ #[inline]
+ unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
+ let mut flags = self.flags();
+ flags |= set;
+ flags &= !unset;
+
+ // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
+ // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
+ // The caller promises that this does not set the flags to an invalid value.
+ unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };
+ }
+
+ /// Set the `VM_MIXEDMAP` flag on this vma.
+ ///
+ /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
+ /// that can be used to call `vm_insert_page` on the vma.
+ #[inline]
+ pub fn set_mixedmap(&self) -> &VmAreaMixedMap {
+ // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
+ // invalid state.
+ unsafe { self.update_flags(flags::MIXEDMAP, 0) };
+
+ // SAFETY: We just set `VM_MIXEDMAP` on the vma.
+ unsafe { VmAreaMixedMap::from_raw(self.vma.as_ptr()) }
+ }
+
+ /// Set the `VM_IO` flag on this vma.
+ ///
+ /// This marks the vma as being a memory-mapped I/O region.
+ #[inline]
+ pub fn set_io(&self) {
+ // SAFETY: Setting the VM_IO flag is always okay.
+ unsafe { self.update_flags(flags::IO, 0) };
+ }
+
+ /// Set the `VM_DONTEXPAND` flag on this vma.
+ ///
+ /// This prevents the vma from being expanded with `mremap()`.
+ #[inline]
+ pub fn set_dontexpand(&self) {
+ // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
+ unsafe { self.update_flags(flags::DONTEXPAND, 0) };
+ }
+
+ /// Set the `VM_DONTCOPY` flag on this vma.
+ ///
+ /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
+ /// is set.
+ #[inline]
+ pub fn set_dontcopy(&self) {
+ // SAFETY: Setting the VM_DONTCOPY flag is always okay.
+ unsafe { self.update_flags(flags::DONTCOPY, 0) };
+ }
+
+ /// Set the `VM_DONTDUMP` flag on this vma.
+ ///
+ /// This prevents the vma from being included in core dumps. This option is only permanent if
+ /// `VM_IO` is set.
+ #[inline]
+ pub fn set_dontdump(&self) {
+ // SAFETY: Setting the VM_DONTDUMP flag is always okay.
+ unsafe { self.update_flags(flags::DONTDUMP, 0) };
+ }
+
+ /// Returns whether `VM_READ` is set.
+ ///
+ /// This flag indicates whether userspace is mapping this vma as readable.
+ #[inline]
+ pub fn get_read(&self) -> bool {
+ (self.flags() & flags::READ) != 0
+ }
+
+ /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
+ ///
+ /// This flag indicates whether userspace is allowed to make this vma readable with
+ /// `mprotect()`.
+ #[inline]
+ pub fn try_clear_mayread(&self) -> Result {
+ if self.get_read() {
+ return Err(EINVAL);
+ }
+ // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
+ unsafe { self.update_flags(0, flags::MAYREAD) };
+ Ok(())
+ }
+
+ /// Returns whether `VM_WRITE` is set.
+ ///
+ /// This flag indicates whether userspace is mapping this vma as writable.
+ #[inline]
+ pub fn get_write(&self) -> bool {
+ (self.flags() & flags::WRITE) != 0
+ }
+
+ /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
+ ///
+ /// This flag indicates whether userspace is allowed to make this vma writable with
+ /// `mprotect()`.
+ #[inline]
+ pub fn try_clear_maywrite(&self) -> Result {
+ if self.get_write() {
+ return Err(EINVAL);
+ }
+ // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
+ unsafe { self.update_flags(0, flags::MAYWRITE) };
+ Ok(())
+ }
+
+ /// Returns whether `VM_EXEC` is set.
+ ///
+ /// This flag indicates whether userspace is mapping this vma as executable.
+ #[inline]
+ pub fn get_exec(&self) -> bool {
+ (self.flags() & flags::EXEC) != 0
+ }
+
+ /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
+ ///
+ /// This flag indicates whether userspace is allowed to make this vma executable with
+ /// `mprotect()`.
+ #[inline]
+ pub fn try_clear_mayexec(&self) -> Result {
+ if self.get_exec() {
+ return Err(EINVAL);
+ }
+ // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
+ unsafe { self.update_flags(0, flags::MAYEXEC) };
+ Ok(())
+ }
+}
+
/// The integer type used for vma flags.
#[doc(inline)]
pub use bindings::vm_flags_t;
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v8 7/7] rust: miscdevice: add mmap support
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
` (5 preceding siblings ...)
2024-11-20 14:50 ` [PATCH v8 6/7] mm: rust: add VmAreaNew Alice Ryhl
@ 2024-11-20 14:50 ` Alice Ryhl
2024-11-20 20:07 ` Lorenzo Stoakes
6 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-20 14:50 UTC (permalink / raw)
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, linux-kernel, linux-mm, rust-for-linux, Alice Ryhl,
Andreas Hindborg
Using the vma support introduced by the previous commit, introduce mmap
support for miscdevices. The mmap call is given a vma that is undergoing
initial setup, so the VmAreaNew type is used.
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/kernel/miscdevice.rs | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
index 7e2a79b3ae26..4e4b9476e092 100644
--- a/rust/kernel/miscdevice.rs
+++ b/rust/kernel/miscdevice.rs
@@ -11,6 +11,7 @@
use crate::{
bindings,
error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR},
+ mm::virt::VmAreaNew,
prelude::*,
str::CStr,
types::{ForeignOwnable, Opaque},
@@ -110,6 +111,11 @@ fn release(device: Self::Ptr) {
drop(device);
}
+ /// Handle for mmap.
+ fn mmap(_device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>, _vma: &VmAreaNew) -> Result {
+ kernel::build_error(VTABLE_DEFAULT_ERROR)
+ }
+
/// Handler for ioctls.
///
/// The `cmd` argument is usually manipulated using the utilties in [`kernel::ioctl`].
@@ -156,6 +162,7 @@ impl<T: MiscDevice> VtableHelper<T> {
const VTABLE: bindings::file_operations = bindings::file_operations {
open: Some(fops_open::<T>),
release: Some(fops_release::<T>),
+ mmap: maybe_fn(T::HAS_MMAP, fops_mmap::<T>),
unlocked_ioctl: maybe_fn(T::HAS_IOCTL, fops_ioctl::<T>),
#[cfg(CONFIG_COMPAT)]
compat_ioctl: if T::HAS_COMPAT_IOCTL {
@@ -216,6 +223,27 @@ impl<T: MiscDevice> VtableHelper<T> {
0
}
+/// # Safety
+///
+/// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`.
+/// `vma` must be a vma that is currently being mmap'ed with this file.
+unsafe extern "C" fn fops_mmap<T: MiscDevice>(
+ file: *mut bindings::file,
+ vma: *mut bindings::vm_area_struct,
+) -> c_int {
+ // SAFETY: The mmap call of a file can access the private data.
+ let private = unsafe { (*file).private_data };
+ // SAFETY: Mmap calls can borrow the private data of the file.
+ let device = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) };
+ // SAFETY: The caller provides a vma that is undergoing initial VMA setup.
+ let area = unsafe { VmAreaNew::from_raw(vma) };
+
+ match T::mmap(device, area) {
+ Ok(()) => 0,
+ Err(err) => err.to_errno() as c_int,
+ }
+}
+
/// # Safety
///
/// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`.
--
2.47.0.371.ga323438b13-goog
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-20 14:49 ` [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct Alice Ryhl
@ 2024-11-20 18:13 ` Lorenzo Stoakes
2024-11-21 9:52 ` Alice Ryhl
2024-11-22 7:42 ` Paolo Bonzini
0 siblings, 2 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 18:13 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:49:55PM +0000, Alice Ryhl wrote:
> These abstractions allow you to reference a `struct mm_struct` using
> both mmgrab and mmget refcounts. This is done using two Rust types:
>
> * Mm - represents an mm_struct where you don't know anything about the
> value of mm_users.
> * MmWithUser - represents an mm_struct where you know at compile time
> that mm_users is non-zero.
>
> This allows us to encode in the type system whether a method requires
> that mm_users is non-zero or not. For instance, you can always call
> `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
> non-zero.
It's kind of interesting to represent these things this way, I like the
self-documenting element of that.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
So obviously I'm not a rust person (yet... yet :) so from my side I can
only look at things from an mm perspective conceptually. To avoid boring
everyone I won't repeat this and instead you can take it as read.
I will obviously inevitably ask a TON of questions as a result of not being
a rust person so, bear with me...
> ---
> rust/helpers/helpers.c | 1 +
> rust/helpers/mm.c | 39 +++++++++
> rust/kernel/lib.rs | 1 +
> rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 260 insertions(+)
>
> diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
> index 20a0c69d5cc7..60a488eb4efe 100644
> --- a/rust/helpers/helpers.c
> +++ b/rust/helpers/helpers.c
> @@ -13,6 +13,7 @@
> #include "build_bug.c"
> #include "err.c"
> #include "kunit.c"
> +#include "mm.c"
> #include "mutex.c"
> #include "page.c"
> #include "rbtree.c"
> diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> new file mode 100644
> index 000000000000..7201747a5d31
> --- /dev/null
> +++ b/rust/helpers/mm.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/mm.h>
> +#include <linux/sched/mm.h>
> +
> +void rust_helper_mmgrab(struct mm_struct *mm)
> +{
> + mmgrab(mm);
> +}
> +
> +void rust_helper_mmdrop(struct mm_struct *mm)
> +{
> + mmdrop(mm);
> +}
> +
> +void rust_helper_mmget(struct mm_struct *mm)
> +{
> + mmget(mm);
> +}
> +
> +bool rust_helper_mmget_not_zero(struct mm_struct *mm)
> +{
> + return mmget_not_zero(mm);
> +}
> +
> +void rust_helper_mmap_read_lock(struct mm_struct *mm)
> +{
> + mmap_read_lock(mm);
> +}
> +
> +bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
> +{
> + return mmap_read_trylock(mm);
> +}
> +
> +void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> +{
> + mmap_read_unlock(mm);
> +}
I guess at this point we're only interested in reading?
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index 66f5cde7f322..cc1963510cdf 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -43,6 +43,7 @@
> pub mod kunit;
> pub mod list;
> pub mod miscdevice;
> +pub mod mm;
> #[cfg(CONFIG_NET)]
> pub mod net;
> pub mod page;
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> new file mode 100644
> index 000000000000..84cba581edaa
> --- /dev/null
> +++ b/rust/kernel/mm.rs
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Memory management.
> +//!
> +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> +
> +use crate::{
> + bindings,
> + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> +};
> +use core::{ops::Deref, ptr::NonNull};
> +
> +/// A wrapper for the kernel's `struct mm_struct`.
> +///
> +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> +/// [`mmget_not_zero`] to be able to access the address space.
> +///
> +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmgrab`.
> +///
> +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> +#[repr(transparent)]
> +pub struct Mm {
> + mm: Opaque<bindings::mm_struct>,
> +}
Does this tie this type to the C struct mm_struct type?
Does 'Opaque' mean it is a pointer to a type which is 'opaque' in the sense
that rust can't see into its internals?
> +
> +// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
> +unsafe impl Send for Mm {}
> +// SAFETY: All methods on `Mm` can be called in parallel from several threads.
> +unsafe impl Sync for Mm {}
> +
> +// SAFETY: By the type invariants, this type is always refcounted.
> +unsafe impl AlwaysRefCounted for Mm {
> + #[inline]
> + fn inc_ref(&self) {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmgrab(self.as_raw()) };
> + }
> +
> + #[inline]
> + unsafe fn dec_ref(obj: NonNull<Self>) {
> + // SAFETY: The caller is giving up their refcount.
> + unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
> + }
> +}
Under what circumstances would these be taken? Same question for MmWthUser.
> +
> +/// A wrapper for the kernel's `struct mm_struct`.
> +///
> +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> +/// refcount. It can be used to access the associated address space.
> +///
> +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> +#[repr(transparent)]
> +pub struct MmWithUser {
> + mm: Mm,
> +}
Why does Mm have this as a Opaque<bindings::mm_struct> and this sort of
nests it?
Does this somehow amount to the same thing, or would you probably never
actually reference this mm field?
> +
> +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> +unsafe impl Send for MmWithUser {}
> +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> +unsafe impl Sync for MmWithUser {}
> +
> +// SAFETY: By the type invariants, this type is always refcounted.
> +unsafe impl AlwaysRefCounted for MmWithUser {
> + #[inline]
> + fn inc_ref(&self) {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmget(self.as_raw()) };
> + }
> +
> + #[inline]
> + unsafe fn dec_ref(obj: NonNull<Self>) {
> + // SAFETY: The caller is giving up their refcount.
> + unsafe { bindings::mmput(obj.cast().as_ptr()) };
> + }
Hm, why is it we mmget(self.as_raw()) but mmput(obj.cast().as_ptr())?
Also I guess relatedly, why does one refer to &self and the other as a
NonNull<Self>?
I'm guessing as a non-rust person a 'reference' is like a C++ reference in
the sense that (well it is _assumed_ in C++ anyway) it acts like a pointer
for the type which should never not be there, but we need .as_raw() to get
the raw C pointer?
And I guess in the dec_ref() case we need the .cast().as_ptr() because obj
'boxes' around self (I guess equivalent to 'this' in C++ kinda)
guaranteeing that it can provide non-null pointer to the current object?
> +}
> +
> +// Make all `Mm` methods available on `MmWithUser`.
> +impl Deref for MmWithUser {
> + type Target = Mm;
> +
> + #[inline]
> + fn deref(&self) -> &Mm {
> + &self.mm
> + }
> +}
I rubber ducked myself a bit on this, so I guess this makes it possible to
dereference the object, and it
> +
> +// These methods are safe to call even if `mm_users` is zero.
> +impl Mm {
> + /// Call `mmgrab` on `current.mm`.
> + #[inline]
> + pub fn mmgrab_current() -> Option<ARef<Mm>> {
> + // SAFETY: It's safe to get the `mm` field from current.
> + let mm = unsafe {
> + let current = bindings::get_current();
> + (*current).mm
> + };
I don't see an equivalent means of obtaining mm from current for
MmWithUser, is that intentional, would there be another means of obtaining
an mm? (perhaps via vma->vm_mm I guess?)
An aside -->
If we're grabbing from current, and this is non-NULL (i.e. not a kernel
thread), this is kinda MmWithUser to _start out_ with, but I guess if
you're grabbing the current one you might not expect it.
I guess one thing I want to point out (maybe here is wrong place) is that
the usual way of interacting with current->mm is that we do _not_ increment
mm->mm_count, mm->mm_users or any refernce count, as while we are in the
kernel portion of the task, we are guaranteed the mm and the backing
virtual address space will stick around.
With reference to MmWithUser, in fact, if you look up users of
mmget()/mmput() it is pretty rare to do that.
So ideally we'd avoid doing this if we could (though having the semantics
of grabbing a reference if we were to copy the object somehow again or hold
its state or something would be nice).
I guess this might actually be tricky in rust, because we'd probably need
to somehow express the current task's lifetime and tie this to that
and... yeah.
<-- aside
> +
> + if mm.is_null() {
> + return None;
> + }
> +
> + // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
> + // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
> + // duration of this function, and `current->mm` will stay valid for that long.
> + let mm = unsafe { Mm::from_raw(mm) };
Hm does mm now reference something with a different type, as in before it
was a 'raw' pointer or such, and now it's a reference to an Mm right?
Also I guess the 'duration of this function' is because we put this in the
'Aref' smart pointer which kinda takes over the lifetime of the reference
by wrapping it right?
I mean I'm not a rust person so actually I have no business _commenting_ on
this :P as this may very well be idiomatic rust, but I'm just curious about
this.
It's nitty but I feel like maybe we're somewhat overloading 'mm's here a
bit though? As we have our wrapped Mm type and then an internal raw mm
type. On the other hand, it's hard to now have horribly awkward and
confusing naming here I guess, and perhaps this is fine.
Not a big deal though!
> +
> + // This increments the refcount using `mmgrab`.
> + Some(ARef::from(mm))
So I get that Some() means this is like a discriminated union or such,
where we can return None (as above) or Some, which contains the value is of
type ARef<Mm>. And I guess this moves the 'lifetime' of mm which was
previously with the function into that of the ARef<>?
Does the ARef<> 'just know' to use the AlwaysRefCounted methods?
> + }
> +
> + /// Returns a raw pointer to the inner `mm_struct`.
> + #[inline]
> + pub fn as_raw(&self) -> *mut bindings::mm_struct {
> + self.mm.get()
> + }
I guess this .get() method is on the Opaque<> object and returns a raw ptr?
> +
> + /// Obtain a reference from a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
> + /// during the lifetime 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
I'm guessing this funny 'a syntax, based on the comment, refers to the
lifetime of the object?
> + // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
> + // repr(transparent).
God I love these SAFETY comments (I mean actually, sorry I realise it's
almost impossible to convey 'not sarcastically, actually' in text form
:). Is that something that gets parsed somewhere, or a convention or?
I like that there is a discipline of expressing under what circumstances we
are permitted to reference things.
> + unsafe { &*ptr.cast() }
> + }
> +
> + /// Calls `mmget_not_zero` and returns a handle if it succeeds.
> + #[inline]
> + pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
I actually kinda love that this takes an mm and guarantees that you get an
MmWithUser out of it which is implied by the fact this succeeds.
However as to the point above, I'm concerned that this might be seen as
'the way' to access an mm, i.e. mm.mmgrab_current().mmget_not_zero() or
something.
Whereas, the usual way of referencing current->mm is to not increment any
reference counts at all (assuming what you are doing resides in the same
lifetime as the task).
Obviously if you step outside of that lifetime, then you _do_ have to pin
the mm (nearly always you want to grab rather than get though in that
circumstance).
> + // SAFETY: The pointer is valid since self is a reference.
> + let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
> +
> + if success {
> + // SAFETY: We just created an `mmget` refcount.
> + Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
When you do this, does it cause the reference count to increment, or does
it assume it's already at 1?
> + } else {
> + None
> + }
> + }
> +}
> +
> +// These methods require `mm_users` to be non-zero.
> +impl MmWithUser {
> + /// Obtain a reference from a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
> + /// non-zero for the duration of the lifetime 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> + // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
> + // to repr(transparent).
> + unsafe { &*ptr.cast() }
> + }
I guess this is another means by which you can get the mm. I'd say an
equivalent for getting from current is highly relevant.
> +
> + /// Lock the mmap read lock.
> + #[inline]
> + pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmap_read_lock(self.as_raw()) };
> +
> + // INVARIANT: We just acquired the read lock.
> + MmapReadGuard {
> + mm: self,
> + _nts: NotThreadSafe,
I'm sure this is a rusty thing, but curious as to why this is like that?
What is '_nts', etc.?
> + }
> + }
> +
> + /// Try to lock the mmap read lock.
> + #[inline]
> + pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
> + // SAFETY: The pointer is valid since self is a reference.
> + let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
> +
> + if success {
> + // INVARIANT: We just acquired the read lock.
> + Some(MmapReadGuard {
> + mm: self,
> + _nts: NotThreadSafe,
> + })
> + } else {
> + None
> + }
> + }
> +}
> +
> +/// A guard for the mmap read lock.
> +///
> +/// # Invariants
> +///
> +/// This `MmapReadGuard` guard owns the mmap read lock.
> +pub struct MmapReadGuard<'a> {
> + mm: &'a MmWithUser,
> + // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
> + _nts: NotThreadSafe,
> +}
> +
> +impl Drop for MmapReadGuard<'_> {
> + #[inline]
> + fn drop(&mut self) {
> + // SAFETY: We hold the read lock by the type invariants.
> + unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
> + }
> +}
Ah that's nice, an actual guard for it :) I'm guessing the fact this
implements the guard implies that you _must_ hold the lock first right?
>
> --
> 2.47.0.371.ga323438b13-goog
>
Sorry for the numerous questions, I'm afraid there'll be a lot of this for
rust stuff for the time being. Perhaps advent of code will help improve
things on the rust front for me ;)
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access
2024-11-20 14:49 ` [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access Alice Ryhl
@ 2024-11-20 19:07 ` Lorenzo Stoakes
2024-11-21 10:23 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 19:07 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> This adds a type called VmAreaRef which is used when referencing a vma
> that you have read access to. Here, read access means that you hold
> either the mmap read lock or the vma read lock (or stronger).
This is pure nit and not important but...
I know we screwed up naming in mm, with the god awful vm_area_struct (we
abbreviate, for 2 letters, get bored, stop abbreviating, but throw in a
struct for a laugh) or 'VMA', but I wonder if this would be better as
VmaRef? Then again that's kinda terrible too.
Sorry about that. I mean this isn't hugely important + ultimately _our
fault_.
VirtualMemoryAreaRef is worse... VirtMemAreaRef? OK. Maybe VMAreaRef is the
least bad...
>
> Additionally, a vma_lookup method is added to the mmap read guard, which
> enables you to obtain a &VmAreaRef in safe Rust code.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/helpers/mm.c | 6 ++
> rust/kernel/mm.rs | 21 ++++++
> rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 198 insertions(+)
>
> diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> index 7201747a5d31..7b72eb065a3e 100644
> --- a/rust/helpers/mm.c
> +++ b/rust/helpers/mm.c
> @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> {
> mmap_read_unlock(mm);
> }
> +
> +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> + unsigned long addr)
> +{
> + return vma_lookup(mm, addr);
> +}
I wonder whether we want to abstract some of the VMA iterator stuff,
because it's inefficient to look up VMAs by doing this each time if you are
looking at, for instance, adjacent VMAs.
But perhaps reasonable to defer doing that to later work?
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> index 84cba581edaa..ace8e7d57afe 100644
> --- a/rust/kernel/mm.rs
> +++ b/rust/kernel/mm.rs
> @@ -12,6 +12,8 @@
> };
> use core::{ops::Deref, ptr::NonNull};
>
> +pub mod virt;
> +
> /// A wrapper for the kernel's `struct mm_struct`.
> ///
> /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> @@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
> _nts: NotThreadSafe,
> }
>
> +impl<'a> MmapReadGuard<'a> {
> + /// Look up a vma at the given address.
> + #[inline]
> + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
Kind of lovely to have functions under the read guard and then knowing that
hey - we must hold the read lock to be able to do this.
Imagine, a programming language with actual types... :P
> + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> + // for `vma_addr`.
> + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
Why do we say 'as _' here?
> +
> + if vma.is_null() {
> + None
> + } else {
> + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> + // the returned area will borrow from this read lock guard, so it can only be used
> + // while the mmap read lock is still held.
Lovely!
> + unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> + }
> + }
> +}
> +
> impl Drop for MmapReadGuard<'_> {
> #[inline]
> fn drop(&mut self) {
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> new file mode 100644
> index 000000000000..1e755dca46dd
> --- /dev/null
> +++ b/rust/kernel/mm/virt.rs
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Virtual memory.
Trivial rust q again but does this result in a heading in generated docs or
something?
> +
> +use crate::{bindings, types::Opaque};
> +
> +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> +///
> +/// It represents an area of virtual memory.
> +///
> +/// # Invariants
> +///
> +/// The caller must hold the mmap read lock or the vma read lock.
Might be worth mentioning here that you have yet to abstract the vma lock?
> +#[repr(transparent)]
> +pub struct VmAreaRef {
> + vma: Opaque<bindings::vm_area_struct>,
> +}
> +
> +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> +// matter what the vma flags are.
typo: 'or strong' -> 'or stronger'.
Nitty, but perhaps say 'Methods must be usable' rather than 'they' to be
totally clear.
> +impl VmAreaRef {
> + /// Access a virtual memory area given a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> + /// (or stronger) is held for at least the duration of 'a.
Well, VMA locks make this more complicated, in that we can just hold a VMA
lock. But again, perhaps worth doing this incrementally and just talk about
mmap locks to start with.
However since we reference VMA locks elsewhere, we should reference them
here (and probably worth mentioning that they are not yet abstracted).
> + #[inline]
> + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> + unsafe { &*vma.cast() }
> + }
> +
> + /// Returns a raw pointer to this area.
> + #[inline]
> + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> + self.vma.get()
> + }
> +
> + /// Returns the flags associated with the virtual memory area.
> + ///
> + /// The possible flags are a combination of the constants in [`flags`].
> + #[inline]
> + pub fn flags(&self) -> vm_flags_t {
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
Hm what's up with this __bindgen_anon_N stuff?
> + }
> +
> + /// Returns the start address of the virtual memory area.
> + #[inline]
> + pub fn start(&self) -> usize {
Is usize guranteed to be equivalent to unsigned long?
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> + }
> +
> + /// Returns the end address of the virtual memory area.
Worth mentioning that it's an _exclusive_ end.
> + #[inline]
> + pub fn end(&self) -> usize {
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> + }
> +
> + /// Unmap pages in the given page range.
This needs some more description, as 'unmapping' pages is unfortunately an
overloaded term in the kernel and this very much might confuse people as
opposed to e.g. munmap()'ing a range.
I'd say something like 'clear page table mappings for the range at the leaf
level, leaving all other page tables intact, freeing any memory referenced
by the VMA in this range (anonymous memory is completely freed, file-backed
memory has its reference count on page cache folio's dropped, any dirty
data will still be written back to disk as usual)'.
> + #[inline]
> + pub fn zap_page_range_single(&self, address: usize, size: usize) {
> + // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> + // sufficient for this method call. This method has no requirements on the vma flags. Any
> + // value of `address` and `size` is allowed.
> + unsafe {
> + bindings::zap_page_range_single(
Hm weirdly I see this in rust/bindings/bindings_generated.rs but not in
rust/helpers/mm.c is this intended?
Is this meant to be generated _from_ somewhere? Is something missing for
that?
> + self.as_ptr(),
> + address as _,
> + size as _,
> + core::ptr::null_mut(),
> + )
> + };
> + }
> +}
> +
> +/// The integer type used for vma flags.
> +#[doc(inline)]
> +pub use bindings::vm_flags_t;
Where do you declare this type?
> +
> +/// All possible flags for [`VmAreaRef`].
Well you've not specified all as they're some speciailist ones and ones
that depend on config flags, so maybe worth just saying 'core' flags?
> +pub mod flags {
Pure rust noobie question (again...) but what is a 'mod'?
> + use super::vm_flags_t;
> + use crate::bindings;
> +
> + /// No flags are set.
> + pub const NONE: vm_flags_t = bindings::VM_NONE as _;
This is strictly not a flag, as is the 0 value (and is used 'cleverly' in
kernel code when we have flags that are defined under some circumstances
but not others, where we can then assign these values to VM_NONE if not
available, ensuring all |= ... operations are no-ops and all &
... expressions evaluate to false) but I guess it doesn't matter all too
much right?
> +
> + /// Mapping allows reads.
> + pub const READ: vm_flags_t = bindings::VM_READ as _;
> +
> + /// Mapping allows writes.
> + pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
> +
> + /// Mapping allows execution.
> + pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
> +
> + /// Mapping is shared.
> + pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
> +
> + /// Mapping may be updated to allow reads.
> + pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
> +
> + /// Mapping may be updated to allow writes.
> + pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
> +
> + /// Mapping may be updated to allow execution.
> + pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
> +
> + /// Mapping may be updated to be shared.
> + pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
> +
> + /// Page-ranges managed without `struct page`, just pure PFN.
> + pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
> +
> + /// Memory mapped I/O or similar.
> + pub const IO: vm_flags_t = bindings::VM_IO as _;
> +
> + /// Do not copy this vma on fork.
> + pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
> +
> + /// Cannot expand with mremap().
> + pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
> +
> + /// Lock the pages covered when they are faulted in.
> + pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
> +
> + /// Is a VM accounted object.
> + pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
> +
> + /// should the VM suppress accounting.
> + pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
> +
> + /// Huge TLB Page VM.
> + pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
> +
> + /// Synchronous page faults.
Might be worth mentioning that this is DAX-specific only.
> + pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
> +
> + /// Architecture-specific flag.
> + pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
> +
> + /// Wipe VMA contents in child..
Typo '..' - also probably worth saying 'wipe VMA contents in child on
fork'.
> + pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
> +
> + /// Do not include in the core dump.
> + pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
> +
> + /// Not soft dirty clean area.
> + pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
> +
> + /// Can contain `struct page` and pure PFN pages.
> + pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
> +
> + /// MADV_HUGEPAGE marked this vma.
> + pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
> +
> + /// MADV_NOHUGEPAGE marked this vma.
> + pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
> +
> + /// KSM may merge identical pages.
> + pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
> +}
>
I'd say these comments are a bit sparse and need more detail, but they are
literally comments from include/linux/mm.h and therefore, again, our fault
so this is fine :)
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 3/7] mm: rust: add vm_insert_page
2024-11-20 14:49 ` [PATCH v8 3/7] mm: rust: add vm_insert_page Alice Ryhl
@ 2024-11-20 19:20 ` Lorenzo Stoakes
2024-11-21 10:38 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 19:20 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:49:57PM +0000, Alice Ryhl wrote:
> The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
> flag, so we introduce a new type to keep track of such vmas.
Worth mentioning that it can be used for VMAs without this flag set, but if
so we must hold a _write_ lock to be able to do so, so it can update the
flag itself, however we intend only to use it with VMAs which already have
this flag set.
>
> The approach used in this patch assumes that we will not need to encode
> many flag combinations in the type. I don't think we need to encode more
> than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
> becomes necessary, using generic parameters in a single type would scale
> better as the number of flags increases.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/kernel/mm/virt.rs | 68 +++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 67 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> index 1e755dca46dd..de7f2338810a 100644
> --- a/rust/kernel/mm/virt.rs
> +++ b/rust/kernel/mm/virt.rs
> @@ -4,7 +4,14 @@
>
> //! Virtual memory.
>
> -use crate::{bindings, types::Opaque};
> +use crate::{
> + bindings,
> + error::{to_result, Result},
> + page::Page,
> + types::Opaque,
> +};
> +
> +use core::ops::Deref;
>
> /// A wrapper for the kernel's `struct vm_area_struct` with read access.
> ///
> @@ -80,6 +87,65 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
> )
> };
> }
> +
> + /// Check whether the `VM_MIXEDMAP` flag is set.
> + #[inline]
> + pub fn check_mixedmap(&self) -> Option<&VmAreaMixedMap> {
Nitty + a little bikesheddy (this may be my mistake as I am unfamiliar with
rust idioms also) but shouldn't this be 'as_mixedmap_vma()' or something?
> + if self.flags() & flags::MIXEDMAP != 0 {
> + // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
> + // satisfied by the type invariants of `VmAreaRef`.
> + Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
> + } else {
> + None
> + }
> + }
> +}
> +
> +/// A wrapper for the kernel's `struct vm_area_struct` with read access and `VM_MIXEDMAP` set.
> +///
> +/// It represents an area of virtual memory.
> +///
> +/// # Invariants
> +///
> +/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
> +/// set.
> +#[repr(transparent)]
> +pub struct VmAreaMixedMap {
> + vma: VmAreaRef,
> +}
> +
> +// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
> +impl Deref for VmAreaMixedMap {
> + type Target = VmAreaRef;
> +
> + #[inline]
> + fn deref(&self) -> &VmAreaRef {
> + &self.vma
> + }
> +}
Ah OK, thinking back to the 'impl Deref' from the other patch, I guess this
allows you to deref VmAreaMixedMap as a VmAreaRef, does it all sort of
automagically merge methods for you as if they were mix-ins then?
> +
> +impl VmAreaMixedMap {
> + /// Access a virtual memory area given a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> + /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
> + #[inline]
> + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> + unsafe { &*vma.cast() }
> + }
> +
> + /// Maps a single page at the given address within the virtual memory area.
> + ///
> + /// This operation does not take ownership of the page.
> + #[inline]
> + pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
I'm guessing this 'Result' type encodes 0 vs. -Exxx error codes?
> + // SAFETY: The caller has read access and has verified that `VM_MIXEDMAP` is set. The page
> + // is order 0. The address is checked on the C side so it can take any value.
> + to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address as _, page.as_ptr()) })
> + }
> }
It's really nice to abstract this as a seprate type and then to know its
methods only apply in known circumstances... I guess in future we can use
clever generic types when it comes to combinations of characteristics that
would otherwise result in a combinatorial explosion?
>
> /// The integer type used for vma flags.
>
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu
2024-11-20 14:49 ` [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu Alice Ryhl
@ 2024-11-20 19:29 ` Lorenzo Stoakes
2024-11-21 10:44 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 19:29 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:49:58PM +0000, Alice Ryhl wrote:
> All of Rust Binder's existing calls to `vm_insert_page` could be
> optimized to first attempt to use `lock_vma_under_rcu`. This patch
> provides an abstraction to enable that.
I think there should be a blurb about what the VMA locks are, how they avoid
contention on the mmap read lock etc. before talking about a use case (though
it's useful to mention the motivating reason!)
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/helpers/mm.c | 5 +++++
> rust/kernel/mm.rs | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)
>
> diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> index 7b72eb065a3e..81b510c96fd2 100644
> --- a/rust/helpers/mm.c
> +++ b/rust/helpers/mm.c
> @@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> {
> return vma_lookup(mm, addr);
> }
> +
> +void rust_helper_vma_end_read(struct vm_area_struct *vma)
> +{
> + vma_end_read(vma);
> +}
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> index ace8e7d57afe..a15acb546f68 100644
> --- a/rust/kernel/mm.rs
> +++ b/rust/kernel/mm.rs
> @@ -13,6 +13,7 @@
> use core::{ops::Deref, ptr::NonNull};
>
> pub mod virt;
> +use virt::VmAreaRef;
>
> /// A wrapper for the kernel's `struct mm_struct`.
> ///
> @@ -170,6 +171,32 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> unsafe { &*ptr.cast() }
> }
>
> + /// Try to lock the vma read lock under rcu.
This reads oddly, I'd say 'try to acquire the VMA read lock'. It's not really
necessary to mention RCU here I'd say, as while lock_vma_under_rcu() acquires
the RCU lock in order to try to get the VMA read lock, it releases it afterwards
and you hold the VMA read luck until you are done with it and don't need to hold
an RCU lock.
A reader might otherwise be confused and think an RCU read lock is required to
be held throughout too which isn't the case (this is maybe a critique of the
name of the function too, sorry Suren :P).
> + ///
> + /// If this operation fails, the vma may still exist. In that case, you should take the mmap
> + /// read lock and try to use `vma_lookup` instead.
This also reads oddly, you're more likely (assuming you are not arbitrarily
trying to acquire a lock on an address likely to be unmapped soon) to have
failed due to lock contention.
So I'd say 'this is an optimistic try lock operation, so it may fail, in which
case you should fall back to taking the mmap read lock'.
I'm not sure it's necessary to reference vma_lookup() either, especially as in
future versions of this code we might want to use a VMA iterator instead.
> + ///
> + /// When per-vma locks are disabled, this always returns `None`.
> + #[inline]
> + pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
Ah I love having lock guards available... Something I miss from C++ :>)
> + #[cfg(CONFIG_PER_VMA_LOCK)]
Ah interesting, so we have an abstraction for kernel config operations!
> + {
> + // SAFETY: Calling `bindings::lock_vma_under_rcu` is always okay given an mm where
> + // `mm_users` is non-zero.
> + let vma = unsafe { bindings::lock_vma_under_rcu(self.as_raw(), vma_addr as _) };
> + if !vma.is_null() {
> + return Some(VmaReadGuard {
> + // SAFETY: If `lock_vma_under_rcu` returns a non-null ptr, then it points at a
> + // valid vma. The vma is stable for as long as the vma read lock is held.
> + vma: unsafe { VmAreaRef::from_raw(vma) },
> + _nts: NotThreadSafe,
> + });
> + }
> + }
> +
> + None
> + }
> +
> /// Lock the mmap read lock.
> #[inline]
> pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
> @@ -238,3 +265,32 @@ fn drop(&mut self) {
> unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
> }
> }
> +
> +/// A guard for the vma read lock.
> +///
> +/// # Invariants
> +///
> +/// This `VmaReadGuard` guard owns the vma read lock.
> +pub struct VmaReadGuard<'a> {
> + vma: &'a VmAreaRef,
> + // `vma_end_read` must be called on the same thread as where the lock was taken
> + _nts: NotThreadSafe,
> +}
> +
> +// Make all `VmAreaRef` methods available on `VmaReadGuard`.
> +impl Deref for VmaReadGuard<'_> {
> + type Target = VmAreaRef;
> +
> + #[inline]
> + fn deref(&self) -> &VmAreaRef {
> + self.vma
> + }
> +}
> +
> +impl Drop for VmaReadGuard<'_> {
> + #[inline]
> + fn drop(&mut self) {
> + // SAFETY: We hold the read lock by the type invariants.
> + unsafe { bindings::vma_end_read(self.vma.as_ptr()) };
Extremely nice to know it is _guaranteed_ this will eventually be called and
that we can be sure that the VMA is valid by the fact we hold it already and
etc.
Selling me on this rust thing here... ;)
> + }
> +}
>
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 5/7] mm: rust: add mmput_async support
2024-11-20 14:49 ` [PATCH v8 5/7] mm: rust: add mmput_async support Alice Ryhl
@ 2024-11-20 19:46 ` Lorenzo Stoakes
2024-11-21 11:39 ` Alice Ryhl
2024-11-21 13:04 ` Lorenzo Stoakes
0 siblings, 2 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 19:46 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:49:59PM +0000, Alice Ryhl wrote:
> Adds an MmWithUserAsync type that uses mmput_async when dropped but is
> otherwise identical to MmWithUser. This has to be done using a separate
> type because the thing we are changing is the destructor.
>
> Rust Binder needs this to avoid a certain deadlock. See commit
> 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for
> details. It's also needed in the shrinker to avoid cleaning up the mm in
> the shrinker's context.
Oh Lord, I didn't even know this existed... I see it was (re-)added in commit
a1b2289cef92 ("android: binder: drop lru lock in isolate callback") back in 2017
so quite a history of being necessary for binder.
I see mmdrop_async(), I guess that's not needed for anything binder-ish? A quick
look in the code suggests this is invoked in free_signal_struct() and is there
due to some softirq stuff on x86... so yeah I guess not :)
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> index a15acb546f68..f800b6c244cd 100644
> --- a/rust/kernel/mm.rs
> +++ b/rust/kernel/mm.rs
> @@ -98,6 +98,48 @@ fn deref(&self) -> &Mm {
> }
> }
>
> +/// A wrapper for the kernel's `struct mm_struct`.
> +///
> +/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a
> +/// refcount. This means that the destructor of `ARef<MmWithUserAsync>` is safe to call in atomic
> +/// context.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> +#[repr(transparent)]
> +pub struct MmWithUserAsync {
> + mm: MmWithUser,
> +}
> +
> +// SAFETY: It is safe to call `mmput_async` on another thread than where `mmget` was called.
> +unsafe impl Send for MmWithUserAsync {}
> +// SAFETY: All methods on `MmWithUserAsync` can be called in parallel from several threads.
> +unsafe impl Sync for MmWithUserAsync {}
> +
> +// SAFETY: By the type invariants, this type is always refcounted.
> +unsafe impl AlwaysRefCounted for MmWithUserAsync {
> + fn inc_ref(&self) {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmget(self.as_raw()) };
> + }
> +
> + unsafe fn dec_ref(obj: NonNull<Self>) {
> + // SAFETY: The caller is giving up their refcount.
> + unsafe { bindings::mmput_async(obj.cast().as_ptr()) };
> + }
> +}
> +
> +// Make all `MmWithUser` methods available on `MmWithUserAsync`.
> +impl Deref for MmWithUserAsync {
> + type Target = MmWithUser;
> +
> + #[inline]
> + fn deref(&self) -> &MmWithUser {
> + &self.mm
> + }
> +}
> +
> // These methods are safe to call even if `mm_users` is zero.
> impl Mm {
> /// Call `mmgrab` on `current.mm`.
> @@ -171,6 +213,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> unsafe { &*ptr.cast() }
> }
>
> + /// Use `mmput_async` when dropping this refcount.
> + #[inline]
> + pub fn use_mmput_async(me: ARef<MmWithUser>) -> ARef<MmWithUserAsync> {
Again, nitty, but I wonder if this should be as_xxx()?
But I guess this makes sense too.
> + // SAFETY: The layouts and invariants are compatible.
> + unsafe { ARef::from_raw(ARef::into_raw(me).cast()) }
> + }
> +
> /// Try to lock the vma read lock under rcu.
> ///
> /// If this operation fails, the vma may still exist. In that case, you should take the mmap
>
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 6/7] mm: rust: add VmAreaNew
2024-11-20 14:50 ` [PATCH v8 6/7] mm: rust: add VmAreaNew Alice Ryhl
@ 2024-11-20 20:02 ` Lorenzo Stoakes
2024-11-21 11:47 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 20:02 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:50:00PM +0000, Alice Ryhl wrote:
> When setting up a new vma in an mmap call, you have a bunch of extra
> permissions that you normally don't have. This introduces a new
> VmAreaNew type that is used for this case.
Hm I'm confused by what you mean here. What permissions do you mean?
Is this to abstract a VMA as passed by f_op->mmap()? I think it would be
better to explicitly say this if so.
>
> To avoid setting invalid flag values, the methods for clearing
> VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
> if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
> the return value results in a compilation error because the `Result`
> type is marked #[must_use].
This is nice.
Though note that, it is explicitly not permitted to permit writability for
a VMA that previously had it disallowed, and we explicitly WARN_ON() this
now. Concretely that means a VMA where !(vma->vm_flags & VM_MAYWRITE), you
must not vma->vm_flags |= VM_MAYWRITE.
>
> For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
> we add a VM_PFNMAP method, we will need some way to prevent you from
> setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.
Yes this would be unwise.
An aside here - really you should _only_ change flags in this hook (perhaps
of course also initialising vma->vm_private_data state), trying to change
anything _core_ here would be deeply dangerous.
We are far too permissive with this right now, and it's something we want
to try ideally to limit in the future.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/kernel/mm/virt.rs | 169 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 168 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> index de7f2338810a..22aff8e77854 100644
> --- a/rust/kernel/mm/virt.rs
> +++ b/rust/kernel/mm/virt.rs
> @@ -6,7 +6,7 @@
>
> use crate::{
> bindings,
> - error::{to_result, Result},
> + error::{code::EINVAL, to_result, Result},
> page::Page,
> types::Opaque,
> };
> @@ -148,6 +148,173 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
> }
> }
>
> +/// A builder for setting up a vma in an `mmap` call.
Would be better to explicitly reference the struct file_operations->mmap()
hook and to say that we should only be updating flags and vm_private_data
here (though perhaps not worth mentioning _that_ if not explicitly exposed
by your interface).
I'm guessing fields are, unless a setter/builder is established, immutable?
> +///
> +/// # Invariants
> +///
> +/// For the duration of 'a, the referenced vma must be undergoing initialization.
> +pub struct VmAreaNew {
> + vma: VmAreaRef,
> +}
> +
> +// Make all `VmAreaRef` methods available on `VmAreaNew`.
> +impl Deref for VmAreaNew {
> + type Target = VmAreaRef;
> +
> + #[inline]
> + fn deref(&self) -> &VmAreaRef {
> + &self.vma
> + }
> +}
> +
> +impl VmAreaNew {
> + /// Access a virtual memory area given a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> + unsafe { &*vma.cast() }
> + }
> +
> + /// Internal method for updating the vma flags.
> + ///
> + /// # Safety
> + ///
> + /// This must not be used to set the flags to an invalid value.
> + #[inline]
> + unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
> + let mut flags = self.flags();
> + flags |= set;
> + flags &= !unset;
> +
> + // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
> + // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
> + // The caller promises that this does not set the flags to an invalid value.
> + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };
Hm not sure if this is correct. We explicitly maintain a union in struct vm_area_struct as:
union {
const vm_flags_t vm_flags;
vm_flags_t __private __vm_flags;
};
Where vma->vm_flags is const, and then use helpers like vm_flags_init() to
set them, which also do things like assert locks (though not in the init
case, of course).
So erally we should at least be updating __vm_flags here, though I'm not
sure how bindgen treats it?
> + }
> +
> + /// Set the `VM_MIXEDMAP` flag on this vma.
> + ///
> + /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
> + /// that can be used to call `vm_insert_page` on the vma.
> + #[inline]
> + pub fn set_mixedmap(&self) -> &VmAreaMixedMap {
> + // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
> + // invalid state.
> + unsafe { self.update_flags(flags::MIXEDMAP, 0) };
> +
> + // SAFETY: We just set `VM_MIXEDMAP` on the vma.
> + unsafe { VmAreaMixedMap::from_raw(self.vma.as_ptr()) }
> + }
> +
> + /// Set the `VM_IO` flag on this vma.
> + ///
> + /// This marks the vma as being a memory-mapped I/O region.
> + #[inline]
> + pub fn set_io(&self) {
> + // SAFETY: Setting the VM_IO flag is always okay.
> + unsafe { self.update_flags(flags::IO, 0) };
> + }
> +
> + /// Set the `VM_DONTEXPAND` flag on this vma.
> + ///
> + /// This prevents the vma from being expanded with `mremap()`.
> + #[inline]
> + pub fn set_dontexpand(&self) {
> + // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
> + unsafe { self.update_flags(flags::DONTEXPAND, 0) };
> + }
> +
> + /// Set the `VM_DONTCOPY` flag on this vma.
> + ///
> + /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
> + /// is set.
> + #[inline]
> + pub fn set_dontcopy(&self) {
> + // SAFETY: Setting the VM_DONTCOPY flag is always okay.
> + unsafe { self.update_flags(flags::DONTCOPY, 0) };
> + }
> +
> + /// Set the `VM_DONTDUMP` flag on this vma.
> + ///
> + /// This prevents the vma from being included in core dumps. This option is only permanent if
> + /// `VM_IO` is set.
> + #[inline]
> + pub fn set_dontdump(&self) {
> + // SAFETY: Setting the VM_DONTDUMP flag is always okay.
> + unsafe { self.update_flags(flags::DONTDUMP, 0) };
> + }
> +
> + /// Returns whether `VM_READ` is set.
> + ///
> + /// This flag indicates whether userspace is mapping this vma as readable.
> + #[inline]
> + pub fn get_read(&self) -> bool {
> + (self.flags() & flags::READ) != 0
> + }
> +
> + /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
> + ///
> + /// This flag indicates whether userspace is allowed to make this vma readable with
> + /// `mprotect()`.
> + #[inline]
> + pub fn try_clear_mayread(&self) -> Result {
> + if self.get_read() {
> + return Err(EINVAL);
> + }
This is quite nice! Strong(er) typing for the win, again :>)
> + // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
> + unsafe { self.update_flags(0, flags::MAYREAD) };
> + Ok(())
> + }
> +
> + /// Returns whether `VM_WRITE` is set.
> + ///
> + /// This flag indicates whether userspace is mapping this vma as writable.
> + #[inline]
> + pub fn get_write(&self) -> bool {
> + (self.flags() & flags::WRITE) != 0
> + }
> +
> + /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
> + ///
> + /// This flag indicates whether userspace is allowed to make this vma writable with
> + /// `mprotect()`.
> + #[inline]
> + pub fn try_clear_maywrite(&self) -> Result {
> + if self.get_write() {
> + return Err(EINVAL);
> + }
> + // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
> + unsafe { self.update_flags(0, flags::MAYWRITE) };
> + Ok(())
> + }
> +
> + /// Returns whether `VM_EXEC` is set.
> + ///
> + /// This flag indicates whether userspace is mapping this vma as executable.
> + #[inline]
> + pub fn get_exec(&self) -> bool {
> + (self.flags() & flags::EXEC) != 0
> + }
> +
> + /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
> + ///
> + /// This flag indicates whether userspace is allowed to make this vma executable with
> + /// `mprotect()`.
> + #[inline]
> + pub fn try_clear_mayexec(&self) -> Result {
> + if self.get_exec() {
> + return Err(EINVAL);
> + }
> + // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
> + unsafe { self.update_flags(0, flags::MAYEXEC) };
> + Ok(())
> + }
> +}
> +
> /// The integer type used for vma flags.
> #[doc(inline)]
> pub use bindings::vm_flags_t;
>
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 7/7] rust: miscdevice: add mmap support
2024-11-20 14:50 ` [PATCH v8 7/7] rust: miscdevice: add mmap support Alice Ryhl
@ 2024-11-20 20:07 ` Lorenzo Stoakes
2024-11-21 10:08 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-20 20:07 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 02:50:01PM +0000, Alice Ryhl wrote:
> Using the vma support introduced by the previous commit, introduce mmap
> support for miscdevices. The mmap call is given a vma that is undergoing
> initial setup, so the VmAreaNew type is used.
Again, I'd be explicit about the VMA being passed to a struct
file_operations->mmap() hook on mmap. Otherwise this seems super vague to
me!
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/kernel/miscdevice.rs | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
> index 7e2a79b3ae26..4e4b9476e092 100644
> --- a/rust/kernel/miscdevice.rs
> +++ b/rust/kernel/miscdevice.rs
> @@ -11,6 +11,7 @@
> use crate::{
> bindings,
> error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR},
> + mm::virt::VmAreaNew,
> prelude::*,
> str::CStr,
> types::{ForeignOwnable, Opaque},
> @@ -110,6 +111,11 @@ fn release(device: Self::Ptr) {
> drop(device);
> }
>
> + /// Handle for mmap.
> + fn mmap(_device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>, _vma: &VmAreaNew) -> Result {
You will have to help me with this :) ForeignOwnable, Borrowed<'_>, _vma I'm a
bit lost here!
> + kernel::build_error(VTABLE_DEFAULT_ERROR)
What is this? Is this not yet implemented or something, and this is a
placeholder or something?
> + }
> +
> /// Handler for ioctls.
> ///
> /// The `cmd` argument is usually manipulated using the utilties in [`kernel::ioctl`].
> @@ -156,6 +162,7 @@ impl<T: MiscDevice> VtableHelper<T> {
> const VTABLE: bindings::file_operations = bindings::file_operations {
> open: Some(fops_open::<T>),
> release: Some(fops_release::<T>),
> + mmap: maybe_fn(T::HAS_MMAP, fops_mmap::<T>),
> unlocked_ioctl: maybe_fn(T::HAS_IOCTL, fops_ioctl::<T>),
> #[cfg(CONFIG_COMPAT)]
> compat_ioctl: if T::HAS_COMPAT_IOCTL {
> @@ -216,6 +223,27 @@ impl<T: MiscDevice> VtableHelper<T> {
> 0
> }
>
> +/// # Safety
> +///
> +/// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`.
> +/// `vma` must be a vma that is currently being mmap'ed with this file.
> +unsafe extern "C" fn fops_mmap<T: MiscDevice>(
> + file: *mut bindings::file,
> + vma: *mut bindings::vm_area_struct,
> +) -> c_int {
> + // SAFETY: The mmap call of a file can access the private data.
> + let private = unsafe { (*file).private_data };
> + // SAFETY: Mmap calls can borrow the private data of the file.
> + let device = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) };
> + // SAFETY: The caller provides a vma that is undergoing initial VMA setup.
> + let area = unsafe { VmAreaNew::from_raw(vma) };
> +
> + match T::mmap(device, area) {
> + Ok(()) => 0,
> + Err(err) => err.to_errno() as c_int,
> + }
> +}
> +
> /// # Safety
> ///
> /// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`.
>
> --
> 2.47.0.371.ga323438b13-goog
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-20 18:13 ` Lorenzo Stoakes
@ 2024-11-21 9:52 ` Alice Ryhl
2024-11-21 12:37 ` Lorenzo Stoakes
2024-11-22 7:42 ` Paolo Bonzini
1 sibling, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 9:52 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 7:13 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:49:55PM +0000, Alice Ryhl wrote:
> > These abstractions allow you to reference a `struct mm_struct` using
> > both mmgrab and mmget refcounts. This is done using two Rust types:
> >
> > * Mm - represents an mm_struct where you don't know anything about the
> > value of mm_users.
> > * MmWithUser - represents an mm_struct where you know at compile time
> > that mm_users is non-zero.
> >
> > This allows us to encode in the type system whether a method requires
> > that mm_users is non-zero or not. For instance, you can always call
> > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
> > non-zero.
>
> It's kind of interesting to represent these things this way, I like the
> self-documenting element of that.
>
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
>
> So obviously I'm not a rust person (yet... yet :) so from my side I can
> only look at things from an mm perspective conceptually. To avoid boring
> everyone I won't repeat this and instead you can take it as read.
>
> I will obviously inevitably ask a TON of questions as a result of not being
> a rust person so, bear with me...
>
> > ---
> > rust/helpers/helpers.c | 1 +
> > rust/helpers/mm.c | 39 +++++++++
> > rust/kernel/lib.rs | 1 +
> > rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 260 insertions(+)
> >
> > diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
> > index 20a0c69d5cc7..60a488eb4efe 100644
> > --- a/rust/helpers/helpers.c
> > +++ b/rust/helpers/helpers.c
> > @@ -13,6 +13,7 @@
> > #include "build_bug.c"
> > #include "err.c"
> > #include "kunit.c"
> > +#include "mm.c"
> > #include "mutex.c"
> > #include "page.c"
> > #include "rbtree.c"
> > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > new file mode 100644
> > index 000000000000..7201747a5d31
> > --- /dev/null
> > +++ b/rust/helpers/mm.c
> > @@ -0,0 +1,39 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/mm.h>
> > +#include <linux/sched/mm.h>
> > +
> > +void rust_helper_mmgrab(struct mm_struct *mm)
> > +{
> > + mmgrab(mm);
> > +}
> > +
> > +void rust_helper_mmdrop(struct mm_struct *mm)
> > +{
> > + mmdrop(mm);
> > +}
> > +
> > +void rust_helper_mmget(struct mm_struct *mm)
> > +{
> > + mmget(mm);
> > +}
> > +
> > +bool rust_helper_mmget_not_zero(struct mm_struct *mm)
> > +{
> > + return mmget_not_zero(mm);
> > +}
> > +
> > +void rust_helper_mmap_read_lock(struct mm_struct *mm)
> > +{
> > + mmap_read_lock(mm);
> > +}
> > +
> > +bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
> > +{
> > + return mmap_read_trylock(mm);
> > +}
> > +
> > +void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> > +{
> > + mmap_read_unlock(mm);
> > +}
>
> I guess at this point we're only interested in reading?
Yeah. The write lock would be very similar.
> > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > index 66f5cde7f322..cc1963510cdf 100644
> > --- a/rust/kernel/lib.rs
> > +++ b/rust/kernel/lib.rs
> > @@ -43,6 +43,7 @@
> > pub mod kunit;
> > pub mod list;
> > pub mod miscdevice;
> > +pub mod mm;
> > #[cfg(CONFIG_NET)]
> > pub mod net;
> > pub mod page;
> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > new file mode 100644
> > index 000000000000..84cba581edaa
> > --- /dev/null
> > +++ b/rust/kernel/mm.rs
> > @@ -0,0 +1,219 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +// Copyright (C) 2024 Google LLC.
> > +
> > +//! Memory management.
> > +//!
> > +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> > +
> > +use crate::{
> > + bindings,
> > + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> > +};
> > +use core::{ops::Deref, ptr::NonNull};
> > +
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > +/// [`mmget_not_zero`] to be able to access the address space.
> > +///
> > +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted using `mmgrab`.
> > +///
> > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > +#[repr(transparent)]
> > +pub struct Mm {
> > + mm: Opaque<bindings::mm_struct>,
> > +}
>
> Does this tie this type to the C struct mm_struct type?
>
> Does 'Opaque' mean it is a pointer to a type which is 'opaque' in the sense
> that rust can't see into its internals?
This declaration defines a Rust type called Mm which has the same
size, alignment, and contents as `struct mm_struct`. The purpose of
Opaque is to tell Rust that it can't assume anything about the
contents at all; we do that to leave it up to C.
For example, normally if you have an immutable reference &i32, then
Rust is going to assume that the contents behind the reference are in
fact immutable. Opaque turns that off, meaning that an `&Opaque<i32>`
is allowed to reference an integer as it gets modified. It makes all
access to the contents unsafe, though.
Note that Opaque is *not* a pointer type. We're going to be dealing
with types like &Mm or ARef<Mm> where &_ and ARef<_> are two different
kinds of pointers.
> > +// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
> > +unsafe impl Send for Mm {}
> > +// SAFETY: All methods on `Mm` can be called in parallel from several threads.
> > +unsafe impl Sync for Mm {}
> > +
> > +// SAFETY: By the type invariants, this type is always refcounted.
> > +unsafe impl AlwaysRefCounted for Mm {
> > + #[inline]
> > + fn inc_ref(&self) {
> > + // SAFETY: The pointer is valid since self is a reference.
> > + unsafe { bindings::mmgrab(self.as_raw()) };
> > + }
> > +
> > + #[inline]
> > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > + // SAFETY: The caller is giving up their refcount.
> > + unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
> > + }
> > +}
>
> Under what circumstances would these be taken? Same question for MmWthUser.
This makes `Mm` compatible with the pointer type called ARef<_>. This
pointer type is used to represent ownership of a refcount. So whenever
a variable of type ARef<_> goes out of scope, the refcount is
decremented, and whenever an ARef<_> is cloned, the refcount is
incremented.
The way this works is that ARef<_> is implemented to use the
AlwaysRefCounted trait to understand how to manipulate the count. Only
types that implement the trait with an impl block like above can be
used with ARef<_>.
> > +
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> > +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> > +/// refcount. It can be used to access the associated address space.
> > +///
> > +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> > +#[repr(transparent)]
> > +pub struct MmWithUser {
> > + mm: Mm,
> > +}
>
> Why does Mm have this as a Opaque<bindings::mm_struct> and this sort of
> nests it?
>
> Does this somehow amount to the same thing, or would you probably never
> actually reference this mm field?
It amounts to the same thing as Opaque<bindings::mm_struct>.
> > +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> > +unsafe impl Send for MmWithUser {}
> > +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> > +unsafe impl Sync for MmWithUser {}
> > +
> > +// SAFETY: By the type invariants, this type is always refcounted.
> > +unsafe impl AlwaysRefCounted for MmWithUser {
> > + #[inline]
> > + fn inc_ref(&self) {
> > + // SAFETY: The pointer is valid since self is a reference.
> > + unsafe { bindings::mmget(self.as_raw()) };
> > + }
> > +
> > + #[inline]
> > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > + // SAFETY: The caller is giving up their refcount.
> > + unsafe { bindings::mmput(obj.cast().as_ptr()) };
> > + }
>
> Hm, why is it we mmget(self.as_raw()) but mmput(obj.cast().as_ptr())?
There's one assumption about references that Opaque doesn't turn off:
The memory behind the reference must not get deallocated while the
reference exists. We can't use a reference in dec_ref because the
memory might get deallocated during the call to dec_ref.
> Also I guess relatedly, why does one refer to &self and the other as a
> NonNull<Self>?
Writing `&self` means that the "self parameter" has type `&Self`,
which in this case is the same as `&MmWithUser`.
The type `NonNull<Self>` is the same as `NonNull<MmWithUser>`. The
NonNull type is a raw pointer that can't be null. Other than the
non-null requirement, nothing is assumed about the raw pointer.
> I'm guessing as a non-rust person a 'reference' is like a C++ reference in
> the sense that (well it is _assumed_ in C++ anyway) it acts like a pointer
> for the type which should never not be there, but we need .as_raw() to get
> the raw C pointer?
Yeah, Rust references come with an assumption that the object is not
deallocated while the reference exists.
The .as_raw() call converts from &MmWithUser to `*mut
bindings::mm_struct`. So note that it not only converts from reference
to raw pointer, but it also changes the target type from MmWithUser to
bindings::mm_struct.
> And I guess in the dec_ref() case we need the .cast().as_ptr() because obj
> 'boxes' around self (I guess equivalent to 'this' in C++ kinda)
> guaranteeing that it can provide non-null pointer to the current object?
Well, the thing that is equivalent to "this" would be "self".
> > +// Make all `Mm` methods available on `MmWithUser`.
> > +impl Deref for MmWithUser {
> > + type Target = Mm;
> > +
> > + #[inline]
> > + fn deref(&self) -> &Mm {
> > + &self.mm
> > + }
> > +}
>
> I rubber ducked myself a bit on this, so I guess this makes it possible to
> dereference the object, and it
It lets you transparently obtain an &Mm from an &MmWithUser, making
all Mm methods available on MmWithUser.
> > +// These methods are safe to call even if `mm_users` is zero.
> > +impl Mm {
> > + /// Call `mmgrab` on `current.mm`.
> > + #[inline]
> > + pub fn mmgrab_current() -> Option<ARef<Mm>> {
> > + // SAFETY: It's safe to get the `mm` field from current.
> > + let mm = unsafe {
> > + let current = bindings::get_current();
> > + (*current).mm
> > + };
>
> I don't see an equivalent means of obtaining mm from current for
> MmWithUser, is that intentional, would there be another means of obtaining
> an mm? (perhaps via vma->vm_mm I guess?)
>
> An aside -->
>
> If we're grabbing from current, and this is non-NULL (i.e. not a kernel
> thread), this is kinda MmWithUser to _start out_ with, but I guess if
> you're grabbing the current one you might not expect it.
>
> I guess one thing I want to point out (maybe here is wrong place) is that
> the usual way of interacting with current->mm is that we do _not_ increment
> mm->mm_count, mm->mm_users or any refernce count, as while we are in the
> kernel portion of the task, we are guaranteed the mm and the backing
> virtual address space will stick around.
>
> With reference to MmWithUser, in fact, if you look up users of
> mmget()/mmput() it is pretty rare to do that.
>
> So ideally we'd avoid doing this if we could (though having the semantics
> of grabbing a reference if we were to copy the object somehow again or hold
> its state or something would be nice).
>
> I guess this might actually be tricky in rust, because we'd probably need
> to somehow express the current task's lifetime and tie this to that
> and... yeah.
>
> <-- aside
Ah, yeah, I guess this API is incomplete. We could have an API that
lets you obtain an &MmWithUser instead. Then, if the user wants to
increment the refcount, they can manually convert that into an
ARef<Mm> or ARef<MmWithUser>.
It's true that it's slightly tricky to express in Rust, but it's
possible. We already have a way to get a &Task pointing at current.
> > +
> > + if mm.is_null() {
> > + return None;
> > + }
> > +
> > + // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
> > + // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
> > + // duration of this function, and `current->mm` will stay valid for that long.
> > + let mm = unsafe { Mm::from_raw(mm) };
>
> Hm does mm now reference something with a different type, as in before it
> was a 'raw' pointer or such, and now it's a reference to an Mm right?
Yes ... this is using shadowing to change the type of the variable.
It's actually rather common in Rust.
The former mm variable had type `*mut bindings::mm_struct`. The latter
mm variable has type `&Mm`.
> Also I guess the 'duration of this function' is because we put this in the
> 'Aref' smart pointer which kinda takes over the lifetime of the reference
> by wrapping it right?
Yeah, basically, the lifetime gets inferred from how we use it.
> I mean I'm not a rust person so actually I have no business _commenting_ on
> this :P as this may very well be idiomatic rust, but I'm just curious about
> this.
>
> It's nitty but I feel like maybe we're somewhat overloading 'mm's here a
> bit though? As we have our wrapped Mm type and then an internal raw mm
> type. On the other hand, it's hard to now have horribly awkward and
> confusing naming here I guess, and perhaps this is fine.
>
> Not a big deal though!
>
> > +
> > + // This increments the refcount using `mmgrab`.
> > + Some(ARef::from(mm))
>
> So I get that Some() means this is like a discriminated union or such,
> where we can return None (as above) or Some, which contains the value is of
> type ARef<Mm>. And I guess this moves the 'lifetime' of mm which was
> previously with the function into that of the ARef<>?
Yes, Some() is normally a discriminated union, but in this particular
case since ARef uses a NonNull pointer for its only field, the
compiler is smart enough to represent None as a null pointer instead
of using a discriminated union with a separate tag.
As for the "lifetime" of the `mm`, that's not really the lifetime of
the mm_struct. Rather, it's the duration for which the &Mm reference
is assumed to be valid for. That lifetime ends right after the
ARef::from call, because that's our last use of the &Mm.
> Does the ARef<> 'just know' to use the AlwaysRefCounted methods?
Yep! See e.g. the destructor of ARef:
impl<T: AlwaysRefCounted> Drop for ARef<T> {
fn drop(&mut self) {
unsafe { T::dec_ref(self.ptr) };
}
}
Due to the `T: AlwaysRefCounted`, the type `T` must be a type that
implements the `AlwaysRefCounted` trait, and the compiler is able to
resolve T::dec_ref based on that.
> > + }
> > +
> > + /// Returns a raw pointer to the inner `mm_struct`.
> > + #[inline]
> > + pub fn as_raw(&self) -> *mut bindings::mm_struct {
> > + self.mm.get()
> > + }
>
> I guess this .get() method is on the Opaque<> object and returns a raw ptr?
Yes, it just creates a raw pointer.
> > + /// Obtain a reference from a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
> > + /// during the lifetime 'a.
> > + #[inline]
> > + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
>
> I'm guessing this funny 'a syntax, based on the comment, refers to the
> lifetime of the object?
It's a lifetime, but not necessarily the lifetime of the object.
Rather, it's the maximum duration in which the reference is assumed to
be valid. It must not be longer than the lifetime of the mm_struct of
course, but it's usually going to be much shorter than the lifetime of
the mm_struct.
> > + // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
> > + // repr(transparent).
>
> God I love these SAFETY comments (I mean actually, sorry I realise it's
> almost impossible to convey 'not sarcastically, actually' in text form
> :). Is that something that gets parsed somewhere, or a convention or?
>
> I like that there is a discipline of expressing under what circumstances we
> are permitted to reference things.
They don't get parsed anywhere, except that not using a SAFETY comment
at all is a compilation warning.
> > + unsafe { &*ptr.cast() }
> > + }
> > +
> > + /// Calls `mmget_not_zero` and returns a handle if it succeeds.
> > + #[inline]
> > + pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
>
> I actually kinda love that this takes an mm and guarantees that you get an
> MmWithUser out of it which is implied by the fact this succeeds.
>
> However as to the point above, I'm concerned that this might be seen as
> 'the way' to access an mm, i.e. mm.mmgrab_current().mmget_not_zero() or
> something.
>
> Whereas, the usual way of referencing current->mm is to not increment any
> reference counts at all (assuming what you are doing resides in the same
> lifetime as the task).
>
> Obviously if you step outside of that lifetime, then you _do_ have to pin
> the mm (nearly always you want to grab rather than get though in that
> circumstance).
I can add a way to obtain an &MmWithUser from current without
incrementing the refcount.
> > + // SAFETY: The pointer is valid since self is a reference.
> > + let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
> > +
> > + if success {
> > + // SAFETY: We just created an `mmget` refcount.
> > + Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
>
> When you do this, does it cause the reference count to increment, or does
> it assume it's already at 1?
This uses `from_raw` which by convention never increments the
refcount. Semantically we're taking ownership of the increment
performed by bindings::mmget_not_zero.
> > + } else {
> > + None
> > + }
> > + }
> > +}
> > +
> > +// These methods require `mm_users` to be non-zero.
> > +impl MmWithUser {
> > + /// Obtain a reference from a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
> > + /// non-zero for the duration of the lifetime 'a.
> > + #[inline]
> > + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > + // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
> > + // to repr(transparent).
> > + unsafe { &*ptr.cast() }
> > + }
>
> I guess this is another means by which you can get the mm. I'd say an
> equivalent for getting from current is highly relevant.
This lets you write MmWithUser::from_raw in unsafe code.
> > +
> > + /// Lock the mmap read lock.
> > + #[inline]
> > + pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
> > + // SAFETY: The pointer is valid since self is a reference.
> > + unsafe { bindings::mmap_read_lock(self.as_raw()) };
> > +
> > + // INVARIANT: We just acquired the read lock.
> > + MmapReadGuard {
> > + mm: self,
> > + _nts: NotThreadSafe,
>
> I'm sure this is a rusty thing, but curious as to why this is like that?
> What is '_nts', etc.?
_nts is the name of a field. The NotThreadSafe type has size zero, so
it doesn't exist in the compiled code. It exists only to mark that the
MmapReadGuard cannot be transferred across thread boundaries.
> > + }
> > + }
> > +
> > + /// Try to lock the mmap read lock.
> > + #[inline]
> > + pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
> > + // SAFETY: The pointer is valid since self is a reference.
> > + let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
> > +
> > + if success {
> > + // INVARIANT: We just acquired the read lock.
> > + Some(MmapReadGuard {
> > + mm: self,
> > + _nts: NotThreadSafe,
> > + })
> > + } else {
> > + None
> > + }
> > + }
> > +}
> > +
> > +/// A guard for the mmap read lock.
> > +///
> > +/// # Invariants
> > +///
> > +/// This `MmapReadGuard` guard owns the mmap read lock.
> > +pub struct MmapReadGuard<'a> {
> > + mm: &'a MmWithUser,
> > + // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
> > + _nts: NotThreadSafe,
> > +}
> > +
> > +impl Drop for MmapReadGuard<'_> {
> > + #[inline]
> > + fn drop(&mut self) {
> > + // SAFETY: We hold the read lock by the type invariants.
> > + unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
> > + }
> > +}
>
> Ah that's nice, an actual guard for it :) I'm guessing the fact this
> implements the guard implies that you _must_ hold the lock first right?
Yeah so this code runs when a variable of type MmapReadGuard goes out
of scope. We don't provide any way to obtain an MmapReadGuard without
locking the mmap read guard.
> >
> > --
> > 2.47.0.371.ga323438b13-goog
> >
>
> Sorry for the numerous questions, I'm afraid there'll be a lot of this for
> rust stuff for the time being. Perhaps advent of code will help improve
> things on the rust front for me ;)
Thanks for all the excellent questions!
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 7/7] rust: miscdevice: add mmap support
2024-11-20 20:07 ` Lorenzo Stoakes
@ 2024-11-21 10:08 ` Alice Ryhl
2024-11-21 13:11 ` Lorenzo Stoakes
0 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 10:08 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 9:07 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:50:01PM +0000, Alice Ryhl wrote:
> > Using the vma support introduced by the previous commit, introduce mmap
> > support for miscdevices. The mmap call is given a vma that is undergoing
> > initial setup, so the VmAreaNew type is used.
>
> Again, I'd be explicit about the VMA being passed to a struct
> file_operations->mmap() hook on mmap. Otherwise this seems super vague to
> me!
I wasn't sure if vmas could be constructed in any context other than
mmap. Will reword.
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/kernel/miscdevice.rs | 28 ++++++++++++++++++++++++++++
> > 1 file changed, 28 insertions(+)
> >
> > diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
> > index 7e2a79b3ae26..4e4b9476e092 100644
> > --- a/rust/kernel/miscdevice.rs
> > +++ b/rust/kernel/miscdevice.rs
> > @@ -11,6 +11,7 @@
> > use crate::{
> > bindings,
> > error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR},
> > + mm::virt::VmAreaNew,
> > prelude::*,
> > str::CStr,
> > types::{ForeignOwnable, Opaque},
> > @@ -110,6 +111,11 @@ fn release(device: Self::Ptr) {
> > drop(device);
> > }
> >
> > + /// Handle for mmap.
> > + fn mmap(_device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>, _vma: &VmAreaNew) -> Result {
>
> You will have to help me with this :) ForeignOwnable, Borrowed<'_>, _vma I'm a
> bit lost here!
When you implement the Miscdevice for your own type, you write a block
like this one:
impl Miscdevice for MyType {
type Ptr = Arc<MyType>;
...
}
Here Arc is a pointer type (very similar to ARef) that represents
ownership over a refcount to a refcounted value.
In this case:
* Self refers to MyType.
* Self::Ptr refers to Arc<MyType>
* <Self::Ptr as ForeignOwnable>::Borrowed<'_> refers to ArcBorrow<MyType>
The last step is resolved to ArcBorrow<MyType> because of this impl
block in rust/kernel/sync/arc.rs:
impl<T: 'static> ForeignOwnable for Arc<T> {
type Borrowed<'a> = ArcBorrow<'a, T>;
...
}
Note that Self::Ptr is short-hand for <Self as Miscdevice>::Ptr.
> > + kernel::build_error(VTABLE_DEFAULT_ERROR)
>
> What is this? Is this not yet implemented or something, and this is a
> placeholder or something?
It's a build-time assertion that this function is dead-code eliminated.
There are two cases:
* Either the driver does not override mmap. In this case, we store a
null pointer in the fops table.
* Or the driver overrides mmap with their own implementation. In this
case, we store a function pointer to whichever implementation they
provided.
In neither case is the above function present anywhere in the final executable.
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access
2024-11-20 19:07 ` Lorenzo Stoakes
@ 2024-11-21 10:23 ` Alice Ryhl
2024-11-21 12:50 ` Lorenzo Stoakes
0 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 10:23 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 8:07 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> > This adds a type called VmAreaRef which is used when referencing a vma
> > that you have read access to. Here, read access means that you hold
> > either the mmap read lock or the vma read lock (or stronger).
>
> This is pure nit and not important but...
>
> I know we screwed up naming in mm, with the god awful vm_area_struct (we
> abbreviate, for 2 letters, get bored, stop abbreviating, but throw in a
> struct for a laugh) or 'VMA', but I wonder if this would be better as
> VmaRef? Then again that's kinda terrible too.
>
> Sorry about that. I mean this isn't hugely important + ultimately _our
> fault_.
>
> VirtualMemoryAreaRef is worse... VirtMemAreaRef? OK. Maybe VMAreaRef is the
> least bad...
Happy to use whichever name you prefer. :)
> > Additionally, a vma_lookup method is added to the mmap read guard, which
> > enables you to obtain a &VmAreaRef in safe Rust code.
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/helpers/mm.c | 6 ++
> > rust/kernel/mm.rs | 21 ++++++
> > rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 198 insertions(+)
> >
> > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > index 7201747a5d31..7b72eb065a3e 100644
> > --- a/rust/helpers/mm.c
> > +++ b/rust/helpers/mm.c
> > @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> > {
> > mmap_read_unlock(mm);
> > }
> > +
> > +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> > + unsigned long addr)
> > +{
> > + return vma_lookup(mm, addr);
> > +}
>
> I wonder whether we want to abstract some of the VMA iterator stuff,
> because it's inefficient to look up VMAs by doing this each time if you are
> looking at, for instance, adjacent VMAs.
>
> But perhaps reasonable to defer doing that to later work?
Yeah, that should probably be deferred. Binder only has one vma per
mm, so iteration is not useful.
> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > index 84cba581edaa..ace8e7d57afe 100644
> > --- a/rust/kernel/mm.rs
> > +++ b/rust/kernel/mm.rs
> > @@ -12,6 +12,8 @@
> > };
> > use core::{ops::Deref, ptr::NonNull};
> >
> > +pub mod virt;
> > +
> > /// A wrapper for the kernel's `struct mm_struct`.
> > ///
> > /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > @@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
> > _nts: NotThreadSafe,
> > }
> >
> > +impl<'a> MmapReadGuard<'a> {
> > + /// Look up a vma at the given address.
> > + #[inline]
> > + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
>
> Kind of lovely to have functions under the read guard and then knowing that
> hey - we must hold the read lock to be able to do this.
>
> Imagine, a programming language with actual types... :P
:)
> > + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> > + // for `vma_addr`.
> > + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
>
> Why do we say 'as _' here?
This is an integer cast where the target type of the cast is inferred
by the compiler. It will go away once this lands:
https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
> > +
> > + if vma.is_null() {
> > + None
> > + } else {
> > + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> > + // the returned area will borrow from this read lock guard, so it can only be used
> > + // while the mmap read lock is still held.
>
> Lovely!
>
> > + unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> > + }
> > + }
> > +}
> > +
> > impl Drop for MmapReadGuard<'_> {
> > #[inline]
> > fn drop(&mut self) {
> > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > new file mode 100644
> > index 000000000000..1e755dca46dd
> > --- /dev/null
> > +++ b/rust/kernel/mm/virt.rs
> > @@ -0,0 +1,171 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +// Copyright (C) 2024 Google LLC.
> > +
> > +//! Virtual memory.
>
> Trivial rust q again but does this result in a heading in generated docs or
> something?
Yes, this becomes module documentation. Check out:
https://rust.docs.kernel.org/kernel/
or try out `make LLVM=1 rustdoc`
> > +use crate::{bindings, types::Opaque};
> > +
> > +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> > +///
> > +/// It represents an area of virtual memory.
> > +///
> > +/// # Invariants
> > +///
> > +/// The caller must hold the mmap read lock or the vma read lock.
>
> Might be worth mentioning here that you have yet to abstract the vma lock?
I do that in the next patch.
> > +#[repr(transparent)]
> > +pub struct VmAreaRef {
> > + vma: Opaque<bindings::vm_area_struct>,
> > +}
> > +
> > +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> > +// matter what the vma flags are.
>
> typo: 'or strong' -> 'or stronger'.
>
> Nitty, but perhaps say 'Methods must be usable' rather than 'they' to be
> totally clear.
Will reword.
> > +impl VmAreaRef {
> > + /// Access a virtual memory area given a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > + /// (or stronger) is held for at least the duration of 'a.
>
> Well, VMA locks make this more complicated, in that we can just hold a VMA
> lock. But again, perhaps worth doing this incrementally and just talk about
> mmap locks to start with.
>
> However since we reference VMA locks elsewhere, we should reference them
> here (and probably worth mentioning that they are not yet abstracted).
Oh I forgot to update this, it should say "mmap or vma read lock".
> > + #[inline]
> > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> > + unsafe { &*vma.cast() }
> > + }
> > +
> > + /// Returns a raw pointer to this area.
> > + #[inline]
> > + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> > + self.vma.get()
> > + }
> > +
> > + /// Returns the flags associated with the virtual memory area.
> > + ///
> > + /// The possible flags are a combination of the constants in [`flags`].
> > + #[inline]
> > + pub fn flags(&self) -> vm_flags_t {
> > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > + // access is not a data race.
> > + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
>
> Hm what's up with this __bindgen_anon_N stuff?
Whenever you have a `struct { ... }` or `union { ... }` in the middle
of a struct, bindgen generates additional types for them because Rust
doesn't have a direct equivalent.
> > + }
> > +
> > + /// Returns the start address of the virtual memory area.
> > + #[inline]
> > + pub fn start(&self) -> usize {
>
> Is usize guranteed to be equivalent to unsigned long?
They are guaranteed to have the same size, yes.
But again, right now `unsigned long` is being translated to whichever
one of u32 or u64 has the same size as usize, instead of just being
translated to usize. Thus the annoying casts. We can get rid of them
as soon as
https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
lands.
> > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > + // access is not a data race.
> > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> > + }
> > +
> > + /// Returns the end address of the virtual memory area.
>
> Worth mentioning that it's an _exclusive_ end.
Sure, I'll add that.
> > + #[inline]
> > + pub fn end(&self) -> usize {
> > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > + // access is not a data race.
> > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> > + }
> > +
> > + /// Unmap pages in the given page range.
>
> This needs some more description, as 'unmapping' pages is unfortunately an
> overloaded term in the kernel and this very much might confuse people as
> opposed to e.g. munmap()'ing a range.
>
> I'd say something like 'clear page table mappings for the range at the leaf
> level, leaving all other page tables intact, freeing any memory referenced
> by the VMA in this range (anonymous memory is completely freed, file-backed
> memory has its reference count on page cache folio's dropped, any dirty
> data will still be written back to disk as usual)'.
Sure, will add that to the docs.
> > + #[inline]
> > + pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > + // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> > + // sufficient for this method call. This method has no requirements on the vma flags. Any
> > + // value of `address` and `size` is allowed.
> > + unsafe {
> > + bindings::zap_page_range_single(
>
> Hm weirdly I see this in rust/bindings/bindings_generated.rs but not in
> rust/helpers/mm.c is this intended?
>
> Is this meant to be generated _from_ somewhere? Is something missing for
> that?
The bindings_generated.rs file is generated at built-time from C
headers. The zap_page_range_single is a real function, not a fake
static inline header-only function, so bindgen is able to generate it
without anything in rust/helpers.
> > + self.as_ptr(),
> > + address as _,
> > + size as _,
> > + core::ptr::null_mut(),
> > + )
> > + };
> > + }
> > +}
> > +
> > +/// The integer type used for vma flags.
> > +#[doc(inline)]
> > +pub use bindings::vm_flags_t;
>
> Where do you declare this type?
It's declared in include/linux/mm_types.h
> > +
> > +/// All possible flags for [`VmAreaRef`].
>
> Well you've not specified all as they're some speciailist ones and ones
> that depend on config flags, so maybe worth just saying 'core' flags?
Sure.
> > +pub mod flags {
>
> Pure rust noobie question (again...) but what is a 'mod'?
It's a module.
> > + use super::vm_flags_t;
> > + use crate::bindings;
> > +
> > + /// No flags are set.
> > + pub const NONE: vm_flags_t = bindings::VM_NONE as _;
>
> This is strictly not a flag, as is the 0 value (and is used 'cleverly' in
> kernel code when we have flags that are defined under some circumstances
> but not others, where we can then assign these values to VM_NONE if not
> available, ensuring all |= ... operations are no-ops and all &
> ... expressions evaluate to false) but I guess it doesn't matter all too
> much right?
Ultimately it's just a bunch of integer constants. We can include or
exclude whichever ones we prefer.
> > + /// Mapping allows reads.
> > + pub const READ: vm_flags_t = bindings::VM_READ as _;
> > +
> > + /// Mapping allows writes.
> > + pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
> > +
> > + /// Mapping allows execution.
> > + pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
> > +
> > + /// Mapping is shared.
> > + pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
> > +
> > + /// Mapping may be updated to allow reads.
> > + pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
> > +
> > + /// Mapping may be updated to allow writes.
> > + pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
> > +
> > + /// Mapping may be updated to allow execution.
> > + pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
> > +
> > + /// Mapping may be updated to be shared.
> > + pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
> > +
> > + /// Page-ranges managed without `struct page`, just pure PFN.
> > + pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
> > +
> > + /// Memory mapped I/O or similar.
> > + pub const IO: vm_flags_t = bindings::VM_IO as _;
> > +
> > + /// Do not copy this vma on fork.
> > + pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
> > +
> > + /// Cannot expand with mremap().
> > + pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
> > +
> > + /// Lock the pages covered when they are faulted in.
> > + pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
> > +
> > + /// Is a VM accounted object.
> > + pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
> > +
> > + /// should the VM suppress accounting.
> > + pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
> > +
> > + /// Huge TLB Page VM.
> > + pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
> > +
> > + /// Synchronous page faults.
>
> Might be worth mentioning that this is DAX-specific only.
Will add.
> > + pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
> > +
> > + /// Architecture-specific flag.
> > + pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
> > +
> > + /// Wipe VMA contents in child..
>
> Typo '..' - also probably worth saying 'wipe VMA contents in child on
> fork'.
Will add.
> > + pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
> > +
> > + /// Do not include in the core dump.
> > + pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
> > +
> > + /// Not soft dirty clean area.
> > + pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
> > +
> > + /// Can contain `struct page` and pure PFN pages.
> > + pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
> > +
> > + /// MADV_HUGEPAGE marked this vma.
> > + pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
> > +
> > + /// MADV_NOHUGEPAGE marked this vma.
> > + pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
> > +
> > + /// KSM may merge identical pages.
> > + pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
> > +}
> >
>
> I'd say these comments are a bit sparse and need more detail, but they are
> literally comments from include/linux/mm.h and therefore, again, our fault
> so this is fine :)
Happy to add more if you tell me what you'd like to see. ;)
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 3/7] mm: rust: add vm_insert_page
2024-11-20 19:20 ` Lorenzo Stoakes
@ 2024-11-21 10:38 ` Alice Ryhl
2024-11-21 12:55 ` Lorenzo Stoakes
0 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 10:38 UTC (permalink / raw)
To: Lorenzo Stoakes, Jann Horn
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 8:20 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:49:57PM +0000, Alice Ryhl wrote:
> > The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
> > flag, so we introduce a new type to keep track of such vmas.
>
> Worth mentioning that it can be used for VMAs without this flag set, but if
> so we must hold a _write_ lock to be able to do so, so it can update the
> flag itself, however we intend only to use it with VMAs which already have
> this flag set.
I got the impression from Jann that the automatically-set-VM_MIXEDMAP
thing should only be used during mmap, and that VM_MIXEDMAP should not
get set after initialization even with a write lock?
https://lore.kernel.org/all/CAG48ez3gXicVYXiPsQDmYuPSsKMbES2KRQDk+0ANWSS0zDkqSw@mail.gmail.com/
> > The approach used in this patch assumes that we will not need to encode
> > many flag combinations in the type. I don't think we need to encode more
> > than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
> > becomes necessary, using generic parameters in a single type would scale
> > better as the number of flags increases.
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/kernel/mm/virt.rs | 68 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 67 insertions(+), 1 deletion(-)
> >
> > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > index 1e755dca46dd..de7f2338810a 100644
> > --- a/rust/kernel/mm/virt.rs
> > +++ b/rust/kernel/mm/virt.rs
> > @@ -4,7 +4,14 @@
> >
> > //! Virtual memory.
> >
> > -use crate::{bindings, types::Opaque};
> > +use crate::{
> > + bindings,
> > + error::{to_result, Result},
> > + page::Page,
> > + types::Opaque,
> > +};
> > +
> > +use core::ops::Deref;
> >
> > /// A wrapper for the kernel's `struct vm_area_struct` with read access.
> > ///
> > @@ -80,6 +87,65 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > )
> > };
> > }
> > +
> > + /// Check whether the `VM_MIXEDMAP` flag is set.
> > + #[inline]
> > + pub fn check_mixedmap(&self) -> Option<&VmAreaMixedMap> {
>
> Nitty + a little bikesheddy (this may be my mistake as I am unfamiliar with
> rust idioms also) but shouldn't this be 'as_mixedmap_vma()' or something?
You're probably right that this name is more consistent with Rust
naming conventions. :)
> > + if self.flags() & flags::MIXEDMAP != 0 {
> > + // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
> > + // satisfied by the type invariants of `VmAreaRef`.
> > + Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
> > + } else {
> > + None
> > + }
> > + }
> > +}
> > +
> > +/// A wrapper for the kernel's `struct vm_area_struct` with read access and `VM_MIXEDMAP` set.
> > +///
> > +/// It represents an area of virtual memory.
> > +///
> > +/// # Invariants
> > +///
> > +/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
> > +/// set.
> > +#[repr(transparent)]
> > +pub struct VmAreaMixedMap {
> > + vma: VmAreaRef,
> > +}
> > +
> > +// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
> > +impl Deref for VmAreaMixedMap {
> > + type Target = VmAreaRef;
> > +
> > + #[inline]
> > + fn deref(&self) -> &VmAreaRef {
> > + &self.vma
> > + }
> > +}
>
> Ah OK, thinking back to the 'impl Deref' from the other patch, I guess this
> allows you to deref VmAreaMixedMap as a VmAreaRef, does it all sort of
> automagically merge methods for you as if they were mix-ins then?
Yeah, I use it to "merge" the method sets to avoid duplication.
The main limitation is that you can only deref to one other type, so
you can't have "diamond inheritance".
> > +impl VmAreaMixedMap {
> > + /// Access a virtual memory area given a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > + /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
> > + #[inline]
> > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> > + unsafe { &*vma.cast() }
> > + }
> > +
> > + /// Maps a single page at the given address within the virtual memory area.
> > + ///
> > + /// This operation does not take ownership of the page.
> > + #[inline]
> > + pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
>
> I'm guessing this 'Result' type encodes 0 vs. -Exxx error codes?
In this particular case, yes.
More generally, Result is a discriminated union containing either
Ok(val) for success or Err(err) with some error type. But the default
success type is the unit type () and the default error type is
kernel::error::Error which corresponds to errno numbers.
In this case, the compiler is clever enough to not use a discriminated
union and instead represents Ok(()) as 0 and Err(err) as just the
negative errno. This works since kernel::error::Error uses NonZeroI32
as its only field (as of 6.13).
> > + // SAFETY: The caller has read access and has verified that `VM_MIXEDMAP` is set. The page
> > + // is order 0. The address is checked on the C side so it can take any value.
> > + to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address as _, page.as_ptr()) })
> > + }
> > }
>
> It's really nice to abstract this as a seprate type and then to know its
> methods only apply in known circumstances... I guess in future we can use
> clever generic types when it comes to combinations of characteristics that
> would otherwise result in a combinatorial explosion?
Yeah, so the alternate approach I mention in the commit message would
be to have something like this:
struct VmAreaRef<const MIXEDMAP: bool, const PFNMAP: bool, const
MAYWRITE: bool> { ... }
listing all the flags we care about. This way, we get 2^3 different
types by writing just one.
(There are a few different variations on how to do this, instead of
bools, we may want to allow three options: true, false, unknown.)
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu
2024-11-20 19:29 ` Lorenzo Stoakes
@ 2024-11-21 10:44 ` Alice Ryhl
2024-11-21 12:59 ` Lorenzo Stoakes
0 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 10:44 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 8:29 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:49:58PM +0000, Alice Ryhl wrote:
> > All of Rust Binder's existing calls to `vm_insert_page` could be
> > optimized to first attempt to use `lock_vma_under_rcu`. This patch
> > provides an abstraction to enable that.
>
> I think there should be a blurb about what the VMA locks are, how they avoid
> contention on the mmap read lock etc. before talking about a use case (though
> it's useful to mention the motivating reason!)
>
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/helpers/mm.c | 5 +++++
> > rust/kernel/mm.rs | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 61 insertions(+)
> >
> > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > index 7b72eb065a3e..81b510c96fd2 100644
> > --- a/rust/helpers/mm.c
> > +++ b/rust/helpers/mm.c
> > @@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> > {
> > return vma_lookup(mm, addr);
> > }
> > +
> > +void rust_helper_vma_end_read(struct vm_area_struct *vma)
> > +{
> > + vma_end_read(vma);
> > +}
> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > index ace8e7d57afe..a15acb546f68 100644
> > --- a/rust/kernel/mm.rs
> > +++ b/rust/kernel/mm.rs
> > @@ -13,6 +13,7 @@
> > use core::{ops::Deref, ptr::NonNull};
> >
> > pub mod virt;
> > +use virt::VmAreaRef;
> >
> > /// A wrapper for the kernel's `struct mm_struct`.
> > ///
> > @@ -170,6 +171,32 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > unsafe { &*ptr.cast() }
> > }
> >
> > + /// Try to lock the vma read lock under rcu.
>
> This reads oddly, I'd say 'try to acquire the VMA read lock'. It's not really
> necessary to mention RCU here I'd say, as while lock_vma_under_rcu() acquires
> the RCU lock in order to try to get the VMA read lock, it releases it afterwards
> and you hold the VMA read luck until you are done with it and don't need to hold
> an RCU lock.
>
> A reader might otherwise be confused and think an RCU read lock is required to
> be held throughout too which isn't the case (this is maybe a critique of the
> name of the function too, sorry Suren :P).
>
> > + /// If this operation fails, the vma may still exist. In that case, you should take the mmap
> > + /// read lock and try to use `vma_lookup` instead.
>
> This also reads oddly, you're more likely (assuming you are not arbitrarily
> trying to acquire a lock on an address likely to be unmapped soon) to have
> failed due to lock contention.
>
> So I'd say 'this is an optimistic try lock operation, so it may fail, in which
> case you should fall back to taking the mmap read lock'.
>
> I'm not sure it's necessary to reference vma_lookup() either, especially as in
> future versions of this code we might want to use a VMA iterator instead.
Thanks for the doc suggestions, they sound great.
> > + ///
> > + /// When per-vma locks are disabled, this always returns `None`.
> > + #[inline]
> > + pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
>
> Ah I love having lock guards available... Something I miss from C++ :>)
I've heard that C is starting to get lock guards recently!
> > + #[cfg(CONFIG_PER_VMA_LOCK)]
>
> Ah interesting, so we have an abstraction for kernel config operations!
Yeah, it's basically an #ifdef, but the block must still parse even if
the config is disabled.
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 5/7] mm: rust: add mmput_async support
2024-11-20 19:46 ` Lorenzo Stoakes
@ 2024-11-21 11:39 ` Alice Ryhl
2024-11-21 13:04 ` Lorenzo Stoakes
2024-11-21 13:04 ` Lorenzo Stoakes
1 sibling, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 11:39 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 8:47 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:49:59PM +0000, Alice Ryhl wrote:
> > Adds an MmWithUserAsync type that uses mmput_async when dropped but is
> > otherwise identical to MmWithUser. This has to be done using a separate
> > type because the thing we are changing is the destructor.
> >
> > Rust Binder needs this to avoid a certain deadlock. See commit
> > 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for
> > details. It's also needed in the shrinker to avoid cleaning up the mm in
> > the shrinker's context.
>
> Oh Lord, I didn't even know this existed... I see it was (re-)added in commit
> a1b2289cef92 ("android: binder: drop lru lock in isolate callback") back in 2017
> so quite a history of being necessary for binder.
>
> I see mmdrop_async(), I guess that's not needed for anything binder-ish? A quick
> look in the code suggests this is invoked in free_signal_struct() and is there
> due to some softirq stuff on x86... so yeah I guess not :)
I didn't know it was so binder-specific. I assumed it would be a
relatively common use-case.
> > // These methods are safe to call even if `mm_users` is zero.
> > impl Mm {
> > /// Call `mmgrab` on `current.mm`.
> > @@ -171,6 +213,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > unsafe { &*ptr.cast() }
> > }
> >
> > + /// Use `mmput_async` when dropping this refcount.
> > + #[inline]
> > + pub fn use_mmput_async(me: ARef<MmWithUser>) -> ARef<MmWithUserAsync> {
>
> Again, nitty, but I wonder if this should be as_xxx()?
>
> But I guess this makes sense too.
In this case, the as_ prefix is incorrect because this is an owned ->
owned conversion. See the API guidelines:
https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv
If we wish to use a prefix, the correct prefix is into_.
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 6/7] mm: rust: add VmAreaNew
2024-11-20 20:02 ` Lorenzo Stoakes
@ 2024-11-21 11:47 ` Alice Ryhl
2024-11-21 13:08 ` Lorenzo Stoakes
0 siblings, 1 reply; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 11:47 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 9:02 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Wed, Nov 20, 2024 at 02:50:00PM +0000, Alice Ryhl wrote:
> > When setting up a new vma in an mmap call, you have a bunch of extra
> > permissions that you normally don't have. This introduces a new
> > VmAreaNew type that is used for this case.
>
> Hm I'm confused by what you mean here. What permissions do you mean?
I just mean that there are additional things you can do, e.g. you can
set VM_MIXEDMAP.
> Is this to abstract a VMA as passed by f_op->mmap()? I think it would be
> better to explicitly say this if so.
Yeah, the VmAreaNew type is specifically for f_op->mmap(). I'll be
more explicit about that.
> > To avoid setting invalid flag values, the methods for clearing
> > VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
> > if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
> > the return value results in a compilation error because the `Result`
> > type is marked #[must_use].
>
> This is nice.
>
> Though note that, it is explicitly not permitted to permit writability for
> a VMA that previously had it disallowed, and we explicitly WARN_ON() this
> now. Concretely that means a VMA where !(vma->vm_flags & VM_MAYWRITE), you
> must not vma->vm_flags |= VM_MAYWRITE.
Got it. The API here doesn't allow that, but good to know we can't add
it in the future.
> > For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
> > we add a VM_PFNMAP method, we will need some way to prevent you from
> > setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.
>
> Yes this would be unwise.
>
> An aside here - really you should _only_ change flags in this hook (perhaps
> of course also initialising vma->vm_private_data state), trying to change
> anything _core_ here would be deeply dangerous.
>
> We are far too permissive with this right now, and it's something we want
> to try ideally to limit in the future.
The previous version just had a function called "set_flags" that could
be used to set any flags you want. Given Jann's feedback, I had
restricted it to only allow certain flag changes.
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/kernel/mm/virt.rs | 169 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 168 insertions(+), 1 deletion(-)
> >
> > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > index de7f2338810a..22aff8e77854 100644
> > --- a/rust/kernel/mm/virt.rs
> > +++ b/rust/kernel/mm/virt.rs
> > @@ -6,7 +6,7 @@
> >
> > use crate::{
> > bindings,
> > - error::{to_result, Result},
> > + error::{code::EINVAL, to_result, Result},
> > page::Page,
> > types::Opaque,
> > };
> > @@ -148,6 +148,173 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
> > }
> > }
> >
> > +/// A builder for setting up a vma in an `mmap` call.
>
> Would be better to explicitly reference the struct file_operations->mmap()
> hook and to say that we should only be updating flags and vm_private_data
> here (though perhaps not worth mentioning _that_ if not explicitly exposed
> by your interface).
I guess also vm_ops?
> I'm guessing fields are, unless a setter/builder is established, immutable?
If you have a mutable reference, you can always modify fields. When
you don't want that, fields are made private.
> > + /// Internal method for updating the vma flags.
> > + ///
> > + /// # Safety
> > + ///
> > + /// This must not be used to set the flags to an invalid value.
> > + #[inline]
> > + unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
> > + let mut flags = self.flags();
> > + flags |= set;
> > + flags &= !unset;
> > +
> > + // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
> > + // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
> > + // The caller promises that this does not set the flags to an invalid value.
> > + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };
>
> Hm not sure if this is correct. We explicitly maintain a union in struct vm_area_struct as:
>
> union {
> const vm_flags_t vm_flags;
> vm_flags_t __private __vm_flags;
> };
>
> Where vma->vm_flags is const, and then use helpers like vm_flags_init() to
> set them, which also do things like assert locks (though not in the init
> case, of course).
>
> So erally we should at least be updating __vm_flags here, though I'm not
> sure how bindgen treats it?
I don't think using vm_flags vs __vm_flags changes anything on the
Rust side. The modifications are happening unsafely through a raw
pointer to something inside Opaque, so Rust isn't going to care about
stuff like your const annotation; it's unsafe either way.
I'll update this to use __vm_flags instead.
> > + /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
> > + ///
> > + /// This flag indicates whether userspace is allowed to make this vma readable with
> > + /// `mprotect()`.
> > + #[inline]
> > + pub fn try_clear_mayread(&self) -> Result {
> > + if self.get_read() {
> > + return Err(EINVAL);
> > + }
>
> This is quite nice! Strong(er) typing for the win, again :>)
:)
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-21 9:52 ` Alice Ryhl
@ 2024-11-21 12:37 ` Lorenzo Stoakes
2024-11-21 14:35 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 12:37 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
I'm generally fine with this patch (other than rust specifics which I leave
to the rust experts), however I'm a little concerned about us taking
reference counts when we don't need to so this is something I'd like to see
addressed for v9 or, at least to be confident we're not doing this in the
binder code unnecessarily.
Thanks!
On Thu, Nov 21, 2024 at 10:52:09AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 7:13 PM Lorenzo Stoakes
> > > +void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> > > +{
> > > + mmap_read_unlock(mm);
> > > +}
> >
> > I guess at this point we're only interested in reading?
>
> Yeah. The write lock would be very similar.
Ack
>
> > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > index 66f5cde7f322..cc1963510cdf 100644
> > > --- a/rust/kernel/lib.rs
> > > +++ b/rust/kernel/lib.rs
...
> > > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > > +#[repr(transparent)]
> > > +pub struct Mm {
> > > + mm: Opaque<bindings::mm_struct>,
> > > +}
> >
> > Does this tie this type to the C struct mm_struct type?
> >
> > Does 'Opaque' mean it is a pointer to a type which is 'opaque' in the sense
> > that rust can't see into its internals?
>
> This declaration defines a Rust type called Mm which has the same
> size, alignment, and contents as `struct mm_struct`. The purpose of
> Opaque is to tell Rust that it can't assume anything about the
> contents at all; we do that to leave it up to C.
>
> For example, normally if you have an immutable reference &i32, then
> Rust is going to assume that the contents behind the reference are in
> fact immutable. Opaque turns that off, meaning that an `&Opaque<i32>`
> is allowed to reference an integer as it gets modified. It makes all
> access to the contents unsafe, though.
>
> Note that Opaque is *not* a pointer type. We're going to be dealing
> with types like &Mm or ARef<Mm> where &_ and ARef<_> are two different
> kinds of pointers.
OK so when you reference Mm.mm that is in effect referencing the struct
mm_struct object rather than a pointer to a struct mm_struct? and
>
> > > +// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
> > > +unsafe impl Send for Mm {}
> > > +// SAFETY: All methods on `Mm` can be called in parallel from several threads.
> > > +unsafe impl Sync for Mm {}
> > > +
> > > +// SAFETY: By the type invariants, this type is always refcounted.
> > > +unsafe impl AlwaysRefCounted for Mm {
> > > + #[inline]
> > > + fn inc_ref(&self) {
> > > + // SAFETY: The pointer is valid since self is a reference.
> > > + unsafe { bindings::mmgrab(self.as_raw()) };
> > > + }
> > > +
> > > + #[inline]
> > > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > > + // SAFETY: The caller is giving up their refcount.
> > > + unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
> > > + }
> > > +}
> >
> > Under what circumstances would these be taken? Same question for MmWthUser.
>
> This makes `Mm` compatible with the pointer type called ARef<_>. This
> pointer type is used to represent ownership of a refcount. So whenever
> a variable of type ARef<_> goes out of scope, the refcount is
> decremented, and whenever an ARef<_> is cloned, the refcount is
> incremented.
>
> The way this works is that ARef<_> is implemented to use the
> AlwaysRefCounted trait to understand how to manipulate the count. Only
> types that implement the trait with an impl block like above can be
> used with ARef<_>.
OK so when you first instantiate an ARef<_> you don't increment the
reference count, only if it is cloned from there on in?
>
> > > +
> > > +/// A wrapper for the kernel's `struct mm_struct`.
> > > +///
> > > +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> > > +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> > > +/// refcount. It can be used to access the associated address space.
> > > +///
> > > +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> > > +#[repr(transparent)]
> > > +pub struct MmWithUser {
> > > + mm: Mm,
> > > +}
> >
> > Why does Mm have this as a Opaque<bindings::mm_struct> and this sort of
> > nests it?
> >
> > Does this somehow amount to the same thing, or would you probably never
> > actually reference this mm field?
>
> It amounts to the same thing as Opaque<bindings::mm_struct>.
Ack.
>
> > > +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> > > +unsafe impl Send for MmWithUser {}
> > > +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> > > +unsafe impl Sync for MmWithUser {}
> > > +
> > > +// SAFETY: By the type invariants, this type is always refcounted.
> > > +unsafe impl AlwaysRefCounted for MmWithUser {
> > > + #[inline]
> > > + fn inc_ref(&self) {
> > > + // SAFETY: The pointer is valid since self is a reference.
> > > + unsafe { bindings::mmget(self.as_raw()) };
> > > + }
> > > +
> > > + #[inline]
> > > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > > + // SAFETY: The caller is giving up their refcount.
> > > + unsafe { bindings::mmput(obj.cast().as_ptr()) };
> > > + }
> >
> > Hm, why is it we mmget(self.as_raw()) but mmput(obj.cast().as_ptr())?
>
> There's one assumption about references that Opaque doesn't turn off:
> The memory behind the reference must not get deallocated while the
> reference exists. We can't use a reference in dec_ref because the
> memory might get deallocated during the call to dec_ref.
OK I see.
>
> > Also I guess relatedly, why does one refer to &self and the other as a
> > NonNull<Self>?
>
> Writing `&self` means that the "self parameter" has type `&Self`,
> which in this case is the same as `&MmWithUser`.
>
> The type `NonNull<Self>` is the same as `NonNull<MmWithUser>`. The
> NonNull type is a raw pointer that can't be null. Other than the
> non-null requirement, nothing is assumed about the raw pointer.
Ack
>
> > I'm guessing as a non-rust person a 'reference' is like a C++ reference in
> > the sense that (well it is _assumed_ in C++ anyway) it acts like a pointer
> > for the type which should never not be there, but we need .as_raw() to get
> > the raw C pointer?
>
> Yeah, Rust references come with an assumption that the object is not
> deallocated while the reference exists.
Makes sense.
>
> The .as_raw() call converts from &MmWithUser to `*mut
> bindings::mm_struct`. So note that it not only converts from reference
> to raw pointer, but it also changes the target type from MmWithUser to
> bindings::mm_struct.
Ah I see.
>
> > And I guess in the dec_ref() case we need the .cast().as_ptr() because obj
> > 'boxes' around self (I guess equivalent to 'this' in C++ kinda)
> > guaranteeing that it can provide non-null pointer to the current object?
>
> Well, the thing that is equivalent to "this" would be "self".
Yeah, but in this case we actually need the 'only assume we have a non-null
pointer' value obj, which we cast to a pointer. OK cull.
>
> > > +// Make all `Mm` methods available on `MmWithUser`.
> > > +impl Deref for MmWithUser {
> > > + type Target = Mm;
> > > +
> > > + #[inline]
> > > + fn deref(&self) -> &Mm {
> > > + &self.mm
> > > + }
> > > +}
> >
> > I rubber ducked myself a bit on this, so I guess this makes it possible to
> > dereference the object, and it
>
> It lets you transparently obtain an &Mm from an &MmWithUser, making
> all Mm methods available on MmWithUser.
Hm did I not finish my sentence haha. Yeah I think I kind of rubber ducked
myself further and saw that this was the case.
>
> > > +// These methods are safe to call even if `mm_users` is zero.
> > > +impl Mm {
> > > + /// Call `mmgrab` on `current.mm`.
> > > + #[inline]
> > > + pub fn mmgrab_current() -> Option<ARef<Mm>> {
> > > + // SAFETY: It's safe to get the `mm` field from current.
> > > + let mm = unsafe {
> > > + let current = bindings::get_current();
> > > + (*current).mm
> > > + };
> >
> > I don't see an equivalent means of obtaining mm from current for
> > MmWithUser, is that intentional, would there be another means of obtaining
> > an mm? (perhaps via vma->vm_mm I guess?)
> >
> > An aside -->
> >
> > If we're grabbing from current, and this is non-NULL (i.e. not a kernel
> > thread), this is kinda MmWithUser to _start out_ with, but I guess if
> > you're grabbing the current one you might not expect it.
> >
> > I guess one thing I want to point out (maybe here is wrong place) is that
> > the usual way of interacting with current->mm is that we do _not_ increment
> > mm->mm_count, mm->mm_users or any refernce count, as while we are in the
> > kernel portion of the task, we are guaranteed the mm and the backing
> > virtual address space will stick around.
> >
> > With reference to MmWithUser, in fact, if you look up users of
> > mmget()/mmput() it is pretty rare to do that.
> >
> > So ideally we'd avoid doing this if we could (though having the semantics
> > of grabbing a reference if we were to copy the object somehow again or hold
> > its state or something would be nice).
> >
> > I guess this might actually be tricky in rust, because we'd probably need
> > to somehow express the current task's lifetime and tie this to that
> > and... yeah.
> >
> > <-- aside
>
> Ah, yeah, I guess this API is incomplete. We could have an API that
> lets you obtain an &MmWithUser instead. Then, if the user wants to
> increment the refcount, they can manually convert that into an
> ARef<Mm> or ARef<MmWithUser>.
>
> It's true that it's slightly tricky to express in Rust, but it's
> possible. We already have a way to get a &Task pointing at current.
Yeah, but I think we should do this incrementally as I want this initial
series to merge first, after which we can tweak things.
>
>
> > > +
> > > + if mm.is_null() {
> > > + return None;
> > > + }
> > > +
> > > + // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
> > > + // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
> > > + // duration of this function, and `current->mm` will stay valid for that long.
> > > + let mm = unsafe { Mm::from_raw(mm) };
> >
> > Hm does mm now reference something with a different type, as in before it
> > was a 'raw' pointer or such, and now it's a reference to an Mm right?
>
> Yes ... this is using shadowing to change the type of the variable.
> It's actually rather common in Rust.
>
> The former mm variable had type `*mut bindings::mm_struct`. The latter
> mm variable has type `&Mm`.
Ack yeah I wondered if this might be an idiomatic rust thing.
>
> > Also I guess the 'duration of this function' is because we put this in the
> > 'Aref' smart pointer which kinda takes over the lifetime of the reference
> > by wrapping it right?
>
> Yeah, basically, the lifetime gets inferred from how we use it.
Interesting, I do like this 'compiler consciously aware of lifetimes'
thing.
>
> > I mean I'm not a rust person so actually I have no business _commenting_ on
> > this :P as this may very well be idiomatic rust, but I'm just curious about
> > this.
> >
> > It's nitty but I feel like maybe we're somewhat overloading 'mm's here a
> > bit though? As we have our wrapped Mm type and then an internal raw mm
> > type. On the other hand, it's hard to now have horribly awkward and
> > confusing naming here I guess, and perhaps this is fine.
> >
> > Not a big deal though!
> >
> > > +
> > > + // This increments the refcount using `mmgrab`.
> > > + Some(ARef::from(mm))
> >
> > So I get that Some() means this is like a discriminated union or such,
> > where we can return None (as above) or Some, which contains the value is of
> > type ARef<Mm>. And I guess this moves the 'lifetime' of mm which was
> > previously with the function into that of the ARef<>?
>
> Yes, Some() is normally a discriminated union, but in this particular
> case since ARef uses a NonNull pointer for its only field, the
> compiler is smart enough to represent None as a null pointer instead
> of using a discriminated union with a separate tag.
Ohhhh nice!
>
> As for the "lifetime" of the `mm`, that's not really the lifetime of
> the mm_struct. Rather, it's the duration for which the &Mm reference
> is assumed to be valid for. That lifetime ends right after the
> ARef::from call, because that's our last use of the &Mm.
Ack.
>
> > Does the ARef<> 'just know' to use the AlwaysRefCounted methods?
>
> Yep! See e.g. the destructor of ARef:
>
> impl<T: AlwaysRefCounted> Drop for ARef<T> {
> fn drop(&mut self) {
> unsafe { T::dec_ref(self.ptr) };
> }
> }
>
> Due to the `T: AlwaysRefCounted`, the type `T` must be a type that
> implements the `AlwaysRefCounted` trait, and the compiler is able to
> resolve T::dec_ref based on that.
Ah so this is mandatory for using an ARef<> in general then?
>
> > > + }
> > > +
> > > + /// Returns a raw pointer to the inner `mm_struct`.
> > > + #[inline]
> > > + pub fn as_raw(&self) -> *mut bindings::mm_struct {
> > > + self.mm.get()
> > > + }
> >
> > I guess this .get() method is on the Opaque<> object and returns a raw ptr?
>
> Yes, it just creates a raw pointer.
Ack
>
> > > + /// Obtain a reference from a raw pointer.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
> > > + /// during the lifetime 'a.
> > > + #[inline]
> > > + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
> >
> > I'm guessing this funny 'a syntax, based on the comment, refers to the
> > lifetime of the object?
>
> It's a lifetime, but not necessarily the lifetime of the object.
> Rather, it's the maximum duration in which the reference is assumed to
> be valid. It must not be longer than the lifetime of the mm_struct of
> course, but it's usually going to be much shorter than the lifetime of
> the mm_struct.
Ack, I like how explicit this is...
>
> > > + // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
> > > + // repr(transparent).
> >
> > God I love these SAFETY comments (I mean actually, sorry I realise it's
> > almost impossible to convey 'not sarcastically, actually' in text form
> > :). Is that something that gets parsed somewhere, or a convention or?
> >
> > I like that there is a discipline of expressing under what circumstances we
> > are permitted to reference things.
>
> They don't get parsed anywhere, except that not using a SAFETY comment
> at all is a compilation warning.
Ah, nice!
>
> > > + unsafe { &*ptr.cast() }
> > > + }
> > > +
> > > + /// Calls `mmget_not_zero` and returns a handle if it succeeds.
> > > + #[inline]
> > > + pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
> >
> > I actually kinda love that this takes an mm and guarantees that you get an
> > MmWithUser out of it which is implied by the fact this succeeds.
> >
> > However as to the point above, I'm concerned that this might be seen as
> > 'the way' to access an mm, i.e. mm.mmgrab_current().mmget_not_zero() or
> > something.
> >
> > Whereas, the usual way of referencing current->mm is to not increment any
> > reference counts at all (assuming what you are doing resides in the same
> > lifetime as the task).
> >
> > Obviously if you step outside of that lifetime, then you _do_ have to pin
> > the mm (nearly always you want to grab rather than get though in that
> > circumstance).
>
> I can add a way to obtain an &MmWithUser from current without
> incrementing the refcount.
Yeah, I feel like we definitely need this.
However we'd need to _not_ drop the refcount when it goes out of scope too
in this case.
I'm not sure how you'd express that, however.
>
> > > + // SAFETY: The pointer is valid since self is a reference.
> > > + let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
> > > +
> > > + if success {
> > > + // SAFETY: We just created an `mmget` refcount.
> > > + Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
> >
> > When you do this, does it cause the reference count to increment, or does
> > it assume it's already at 1?
>
> This uses `from_raw` which by convention never increments the
> refcount. Semantically we're taking ownership of the increment
> performed by bindings::mmget_not_zero.
Cool, good!
Though as above, we definitely need a way to get this without fiddling with
reference count.
>
> > > + } else {
> > > + None
> > > + }
> > > + }
> > > +}
> > > +
> > > +// These methods require `mm_users` to be non-zero.
> > > +impl MmWithUser {
> > > + /// Obtain a reference from a raw pointer.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
> > > + /// non-zero for the duration of the lifetime 'a.
> > > + #[inline]
> > > + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > > + // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
> > > + // to repr(transparent).
> > > + unsafe { &*ptr.cast() }
> > > + }
> >
> > I guess this is another means by which you can get the mm. I'd say an
> > equivalent for getting from current is highly relevant.
>
> This lets you write MmWithUser::from_raw in unsafe code.
Ack.
>
> > > +
> > > + /// Lock the mmap read lock.
> > > + #[inline]
> > > + pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
> > > + // SAFETY: The pointer is valid since self is a reference.
> > > + unsafe { bindings::mmap_read_lock(self.as_raw()) };
> > > +
> > > + // INVARIANT: We just acquired the read lock.
> > > + MmapReadGuard {
> > > + mm: self,
> > > + _nts: NotThreadSafe,
> >
> > I'm sure this is a rusty thing, but curious as to why this is like that?
> > What is '_nts', etc.?
>
> _nts is the name of a field. The NotThreadSafe type has size zero, so
> it doesn't exist in the compiled code. It exists only to mark that the
> MmapReadGuard cannot be transferred across thread boundaries.
Ah right I see, wanting compiler-specific behaviour from the type but not
to actually modify the compiled object in reality.
>
> > > + }
> > > + }
> > > +
> > > + /// Try to lock the mmap read lock.
> > > + #[inline]
> > > + pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
> > > + // SAFETY: The pointer is valid since self is a reference.
> > > + let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
> > > +
> > > + if success {
> > > + // INVARIANT: We just acquired the read lock.
> > > + Some(MmapReadGuard {
> > > + mm: self,
> > > + _nts: NotThreadSafe,
> > > + })
> > > + } else {
> > > + None
> > > + }
> > > + }
> > > +}
> > > +
> > > +/// A guard for the mmap read lock.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// This `MmapReadGuard` guard owns the mmap read lock.
> > > +pub struct MmapReadGuard<'a> {
> > > + mm: &'a MmWithUser,
> > > + // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
> > > + _nts: NotThreadSafe,
> > > +}
> > > +
> > > +impl Drop for MmapReadGuard<'_> {
> > > + #[inline]
> > > + fn drop(&mut self) {
> > > + // SAFETY: We hold the read lock by the type invariants.
> > > + unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
> > > + }
> > > +}
> >
> > Ah that's nice, an actual guard for it :) I'm guessing the fact this
> > implements the guard implies that you _must_ hold the lock first right?
>
> Yeah so this code runs when a variable of type MmapReadGuard goes out
> of scope. We don't provide any way to obtain an MmapReadGuard without
> locking the mmap read guard.
Delightful!
>
> > >
> > > --
> > > 2.47.0.371.ga323438b13-goog
> > >
> >
> > Sorry for the numerous questions, I'm afraid there'll be a lot of this for
> > rust stuff for the time being. Perhaps advent of code will help improve
> > things on the rust front for me ;)
>
> Thanks for all the excellent questions!
Thanks for taking the time to answer them!
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access
2024-11-21 10:23 ` Alice Ryhl
@ 2024-11-21 12:50 ` Lorenzo Stoakes
2024-11-21 14:39 ` Alice Ryhl
0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 12:50 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 11:23:39AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 8:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> > > This adds a type called VmAreaRef which is used when referencing a vma
> > > that you have read access to. Here, read access means that you hold
> > > either the mmap read lock or the vma read lock (or stronger).
> >
> > This is pure nit and not important but...
> >
> > I know we screwed up naming in mm, with the god awful vm_area_struct (we
> > abbreviate, for 2 letters, get bored, stop abbreviating, but throw in a
> > struct for a laugh) or 'VMA', but I wonder if this would be better as
> > VmaRef? Then again that's kinda terrible too.
> >
> > Sorry about that. I mean this isn't hugely important + ultimately _our
> > fault_.
> >
> > VirtualMemoryAreaRef is worse... VirtMemAreaRef? OK. Maybe VMAreaRef is the
> > least bad...
>
> Happy to use whichever name you prefer. :)
Let's leave it as-is, this is kinda our fault* in mm :)
* pun intended
>
> > > Additionally, a vma_lookup method is added to the mmap read guard, which
> > > enables you to obtain a &VmAreaRef in safe Rust code.
> > >
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Other than trivial doc/comment tweaks this looks fine to me from mm
perspective so:
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > > ---
> > > rust/helpers/mm.c | 6 ++
> > > rust/kernel/mm.rs | 21 ++++++
> > > rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 198 insertions(+)
> > >
> > > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > > index 7201747a5d31..7b72eb065a3e 100644
> > > --- a/rust/helpers/mm.c
> > > +++ b/rust/helpers/mm.c
> > > @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> > > {
> > > mmap_read_unlock(mm);
> > > }
> > > +
> > > +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> > > + unsigned long addr)
> > > +{
> > > + return vma_lookup(mm, addr);
> > > +}
> >
> > I wonder whether we want to abstract some of the VMA iterator stuff,
> > because it's inefficient to look up VMAs by doing this each time if you are
> > looking at, for instance, adjacent VMAs.
> >
> > But perhaps reasonable to defer doing that to later work?
>
> Yeah, that should probably be deferred. Binder only has one vma per
> mm, so iteration is not useful.
Sure, no problem!
>
> > > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > > index 84cba581edaa..ace8e7d57afe 100644
> > > --- a/rust/kernel/mm.rs
> > > +++ b/rust/kernel/mm.rs
> > > @@ -12,6 +12,8 @@
> > > };
> > > use core::{ops::Deref, ptr::NonNull};
> > >
> > > +pub mod virt;
> > > +
> > > /// A wrapper for the kernel's `struct mm_struct`.
> > > ///
> > > /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > > @@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
> > > _nts: NotThreadSafe,
> > > }
> > >
> > > +impl<'a> MmapReadGuard<'a> {
> > > + /// Look up a vma at the given address.
> > > + #[inline]
> > > + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
> >
> > Kind of lovely to have functions under the read guard and then knowing that
> > hey - we must hold the read lock to be able to do this.
> >
> > Imagine, a programming language with actual types... :P
>
> :)
>
> > > + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> > > + // for `vma_addr`.
> > > + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
> >
> > Why do we say 'as _' here?
>
> This is an integer cast where the target type of the cast is inferred
> by the compiler. It will go away once this lands:
> https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
OK cool.
>
> > > +
> > > + if vma.is_null() {
> > > + None
> > > + } else {
> > > + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> > > + // the returned area will borrow from this read lock guard, so it can only be used
> > > + // while the mmap read lock is still held.
> >
> > Lovely!
> >
> > > + unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> > > + }
> > > + }
> > > +}
> > > +
> > > impl Drop for MmapReadGuard<'_> {
> > > #[inline]
> > > fn drop(&mut self) {
> > > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > > new file mode 100644
> > > index 000000000000..1e755dca46dd
> > > --- /dev/null
> > > +++ b/rust/kernel/mm/virt.rs
> > > @@ -0,0 +1,171 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +// Copyright (C) 2024 Google LLC.
> > > +
> > > +//! Virtual memory.
> >
> > Trivial rust q again but does this result in a heading in generated docs or
> > something?
>
> Yes, this becomes module documentation. Check out:
> https://rust.docs.kernel.org/kernel/
> or try out `make LLVM=1 rustdoc`
Cool thanks, thought this might be the case!
>
> > > +use crate::{bindings, types::Opaque};
> > > +
> > > +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> > > +///
> > > +/// It represents an area of virtual memory.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// The caller must hold the mmap read lock or the vma read lock.
> >
> > Might be worth mentioning here that you have yet to abstract the vma lock?
>
> I do that in the next patch.
Yeah saw, one of those 'comment in patch X, addressed in patch X+1' scenarios :)
>
> > > +#[repr(transparent)]
> > > +pub struct VmAreaRef {
> > > + vma: Opaque<bindings::vm_area_struct>,
> > > +}
> > > +
> > > +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> > > +// matter what the vma flags are.
> >
> > typo: 'or strong' -> 'or stronger'.
> >
> > Nitty, but perhaps say 'Methods must be usable' rather than 'they' to be
> > totally clear.
>
> Will reword.
Thanks!
>
> > > +impl VmAreaRef {
> > > + /// Access a virtual memory area given a raw pointer.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > > + /// (or stronger) is held for at least the duration of 'a.
> >
> > Well, VMA locks make this more complicated, in that we can just hold a VMA
> > lock. But again, perhaps worth doing this incrementally and just talk about
> > mmap locks to start with.
> >
> > However since we reference VMA locks elsewhere, we should reference them
> > here (and probably worth mentioning that they are not yet abstracted).
>
> Oh I forgot to update this, it should say "mmap or vma read lock".
Cool, one to tweak on respin also then.
>
> > > + #[inline]
> > > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > > + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> > > + unsafe { &*vma.cast() }
> > > + }
> > > +
> > > + /// Returns a raw pointer to this area.
> > > + #[inline]
> > > + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> > > + self.vma.get()
> > > + }
> > > +
> > > + /// Returns the flags associated with the virtual memory area.
> > > + ///
> > > + /// The possible flags are a combination of the constants in [`flags`].
> > > + #[inline]
> > > + pub fn flags(&self) -> vm_flags_t {
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
> >
> > Hm what's up with this __bindgen_anon_N stuff?
>
> Whenever you have a `struct { ... }` or `union { ... }` in the middle
> of a struct, bindgen generates additional types for them because Rust
> doesn't have a direct equivalent.
OK I see.
>
> > > + }
> > > +
> > > + /// Returns the start address of the virtual memory area.
> > > + #[inline]
> > > + pub fn start(&self) -> usize {
> >
> > Is usize guranteed to be equivalent to unsigned long?
>
> They are guaranteed to have the same size, yes.
>
> But again, right now `unsigned long` is being translated to whichever
> one of u32 or u64 has the same size as usize, instead of just being
> translated to usize. Thus the annoying casts. We can get rid of them
> as soon as
> https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
> lands.
Ack cool.
>
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> > > + }
> > > +
> > > + /// Returns the end address of the virtual memory area.
> >
> > Worth mentioning that it's an _exclusive_ end.
>
> Sure, I'll add that.
Thanks
>
> > > + #[inline]
> > > + pub fn end(&self) -> usize {
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> > > + }
> > > +
> > > + /// Unmap pages in the given page range.
> >
> > This needs some more description, as 'unmapping' pages is unfortunately an
> > overloaded term in the kernel and this very much might confuse people as
> > opposed to e.g. munmap()'ing a range.
> >
> > I'd say something like 'clear page table mappings for the range at the leaf
> > level, leaving all other page tables intact, freeing any memory referenced
> > by the VMA in this range (anonymous memory is completely freed, file-backed
> > memory has its reference count on page cache folio's dropped, any dirty
> > data will still be written back to disk as usual)'.
>
> Sure, will add that to the docs.
Thanks, I assume you mean this comment, which will form part of the docs? As
here we should at least replace the 'unmap' with 'zap' to avoid confusion
vs. munmap().
>
> > > + #[inline]
> > > + pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > > + // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> > > + // sufficient for this method call. This method has no requirements on the vma flags. Any
> > > + // value of `address` and `size` is allowed.
> > > + unsafe {
> > > + bindings::zap_page_range_single(
> >
> > Hm weirdly I see this in rust/bindings/bindings_generated.rs but not in
> > rust/helpers/mm.c is this intended?
> >
> > Is this meant to be generated _from_ somewhere? Is something missing for
> > that?
>
> The bindings_generated.rs file is generated at built-time from C
> headers. The zap_page_range_single is a real function, not a fake
> static inline header-only function, so bindgen is able to generate it
> without anything in rust/helpers.
>
> > > + self.as_ptr(),
> > > + address as _,
> > > + size as _,
> > > + core::ptr::null_mut(),
> > > + )
> > > + };
> > > + }
> > > +}
> > > +
> > > +/// The integer type used for vma flags.
> > > +#[doc(inline)]
> > > +pub use bindings::vm_flags_t;
> >
> > Where do you declare this type?
>
> It's declared in include/linux/mm_types.h
I meant from a rust perspective, but I guess bindgen handles this?
>
> > > +
> > > +/// All possible flags for [`VmAreaRef`].
> >
> > Well you've not specified all as they're some speciailist ones and ones
> > that depend on config flags, so maybe worth just saying 'core' flags?
>
> Sure.
Thanks.
>
> > > +pub mod flags {
> >
> > Pure rust noobie question (again...) but what is a 'mod'?
>
> It's a module.
Ack.
>
> > > + use super::vm_flags_t;
> > > + use crate::bindings;
> > > +
> > > + /// No flags are set.
> > > + pub const NONE: vm_flags_t = bindings::VM_NONE as _;
> >
> > This is strictly not a flag, as is the 0 value (and is used 'cleverly' in
> > kernel code when we have flags that are defined under some circumstances
> > but not others, where we can then assign these values to VM_NONE if not
> > available, ensuring all |= ... operations are no-ops and all &
> > ... expressions evaluate to false) but I guess it doesn't matter all too
> > much right?
>
> Ultimately it's just a bunch of integer constants. We can include or
> exclude whichever ones we prefer.
Yeah not a big deal.
>
> > > + /// Mapping allows reads.
> > > + pub const READ: vm_flags_t = bindings::VM_READ as _;
> > > +
> > > + /// Mapping allows writes.
> > > + pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
> > > +
> > > + /// Mapping allows execution.
> > > + pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
> > > +
> > > + /// Mapping is shared.
> > > + pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
> > > +
> > > + /// Mapping may be updated to allow reads.
> > > + pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
> > > +
> > > + /// Mapping may be updated to allow writes.
> > > + pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
> > > +
> > > + /// Mapping may be updated to allow execution.
> > > + pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
> > > +
> > > + /// Mapping may be updated to be shared.
> > > + pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
> > > +
> > > + /// Page-ranges managed without `struct page`, just pure PFN.
> > > + pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
> > > +
> > > + /// Memory mapped I/O or similar.
> > > + pub const IO: vm_flags_t = bindings::VM_IO as _;
> > > +
> > > + /// Do not copy this vma on fork.
> > > + pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
> > > +
> > > + /// Cannot expand with mremap().
> > > + pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
> > > +
> > > + /// Lock the pages covered when they are faulted in.
> > > + pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
> > > +
> > > + /// Is a VM accounted object.
> > > + pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
> > > +
> > > + /// should the VM suppress accounting.
> > > + pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
> > > +
> > > + /// Huge TLB Page VM.
> > > + pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
> > > +
> > > + /// Synchronous page faults.
> >
> > Might be worth mentioning that this is DAX-specific only.
>
> Will add.
Ack.
>
> > > + pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
> > > +
> > > + /// Architecture-specific flag.
> > > + pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
> > > +
> > > + /// Wipe VMA contents in child..
> >
> > Typo '..' - also probably worth saying 'wipe VMA contents in child on
> > fork'.
>
> Will add.
Ack, thanks.
>
> > > + pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
> > > +
> > > + /// Do not include in the core dump.
> > > + pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
> > > +
> > > + /// Not soft dirty clean area.
> > > + pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
> > > +
> > > + /// Can contain `struct page` and pure PFN pages.
> > > + pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
> > > +
> > > + /// MADV_HUGEPAGE marked this vma.
> > > + pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
> > > +
> > > + /// MADV_NOHUGEPAGE marked this vma.
> > > + pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
> > > +
> > > + /// KSM may merge identical pages.
> > > + pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
> > > +}
> > >
> >
> > I'd say these comments are a bit sparse and need more detail, but they are
> > literally comments from include/linux/mm.h and therefore, again, our fault
> > so this is fine :)
>
> Happy to add more if you tell me what you'd like to see. ;)
Sure, this is fine for now.
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 3/7] mm: rust: add vm_insert_page
2024-11-21 10:38 ` Alice Ryhl
@ 2024-11-21 12:55 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 12:55 UTC (permalink / raw)
To: Alice Ryhl
Cc: Jann Horn, Miguel Ojeda, Matthew Wilcox, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 11:38:29AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 8:20 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:49:57PM +0000, Alice Ryhl wrote:
> > > The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
> > > flag, so we introduce a new type to keep track of such vmas.
> >
> > Worth mentioning that it can be used for VMAs without this flag set, but if
> > so we must hold a _write_ lock to be able to do so, so it can update the
> > flag itself, however we intend only to use it with VMAs which already have
> > this flag set.
>
> I got the impression from Jann that the automatically-set-VM_MIXEDMAP
> thing should only be used during mmap, and that VM_MIXEDMAP should not
> get set after initialization even with a write lock?
> https://lore.kernel.org/all/CAG48ez3gXicVYXiPsQDmYuPSsKMbES2KRQDk+0ANWSS0zDkqSw@mail.gmail.com/
Yeah he's right, but we do allow that to happen, perhaps we shouldn't. A
separate patch idea ;)
In that case let's leave this bit out, as while we allow it, it seems
inadvisible except on very early internal mm initialisation perhaps.
>
> > > The approach used in this patch assumes that we will not need to encode
> > > many flag combinations in the type. I don't think we need to encode more
> > > than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
> > > becomes necessary, using generic parameters in a single type would scale
> > > better as the number of flags increases.
> > >
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > > ---
> > > rust/kernel/mm/virt.rs | 68 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 67 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > > index 1e755dca46dd..de7f2338810a 100644
> > > --- a/rust/kernel/mm/virt.rs
> > > +++ b/rust/kernel/mm/virt.rs
> > > @@ -4,7 +4,14 @@
> > >
> > > //! Virtual memory.
> > >
> > > -use crate::{bindings, types::Opaque};
> > > +use crate::{
> > > + bindings,
> > > + error::{to_result, Result},
> > > + page::Page,
> > > + types::Opaque,
> > > +};
> > > +
> > > +use core::ops::Deref;
> > >
> > > /// A wrapper for the kernel's `struct vm_area_struct` with read access.
> > > ///
> > > @@ -80,6 +87,65 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > > )
> > > };
> > > }
> > > +
> > > + /// Check whether the `VM_MIXEDMAP` flag is set.
> > > + #[inline]
> > > + pub fn check_mixedmap(&self) -> Option<&VmAreaMixedMap> {
> >
> > Nitty + a little bikesheddy (this may be my mistake as I am unfamiliar with
> > rust idioms also) but shouldn't this be 'as_mixedmap_vma()' or something?
>
> You're probably right that this name is more consistent with Rust
> naming conventions. :)
Yeah, I think it's reasonable in rust code to follow rust idioms and
conventions.
>
> > > + if self.flags() & flags::MIXEDMAP != 0 {
> > > + // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
> > > + // satisfied by the type invariants of `VmAreaRef`.
> > > + Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
> > > + } else {
> > > + None
> > > + }
> > > + }
> > > +}
> > > +
> > > +/// A wrapper for the kernel's `struct vm_area_struct` with read access and `VM_MIXEDMAP` set.
> > > +///
> > > +/// It represents an area of virtual memory.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
> > > +/// set.
> > > +#[repr(transparent)]
> > > +pub struct VmAreaMixedMap {
> > > + vma: VmAreaRef,
> > > +}
> > > +
> > > +// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
> > > +impl Deref for VmAreaMixedMap {
> > > + type Target = VmAreaRef;
> > > +
> > > + #[inline]
> > > + fn deref(&self) -> &VmAreaRef {
> > > + &self.vma
> > > + }
> > > +}
> >
> > Ah OK, thinking back to the 'impl Deref' from the other patch, I guess this
> > allows you to deref VmAreaMixedMap as a VmAreaRef, does it all sort of
> > automagically merge methods for you as if they were mix-ins then?
>
> Yeah, I use it to "merge" the method sets to avoid duplication.
>
> The main limitation is that you can only deref to one other type, so
> you can't have "diamond inheritance".
Aww, diamond inheritance is such fun though! ;)
>
> > > +impl VmAreaMixedMap {
> > > + /// Access a virtual memory area given a raw pointer.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > > + /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
> > > + #[inline]
> > > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > > + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> > > + unsafe { &*vma.cast() }
> > > + }
> > > +
> > > + /// Maps a single page at the given address within the virtual memory area.
> > > + ///
> > > + /// This operation does not take ownership of the page.
> > > + #[inline]
> > > + pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
> >
> > I'm guessing this 'Result' type encodes 0 vs. -Exxx error codes?
>
> In this particular case, yes.
>
> More generally, Result is a discriminated union containing either
> Ok(val) for success or Err(err) with some error type. But the default
> success type is the unit type () and the default error type is
> kernel::error::Error which corresponds to errno numbers.
>
> In this case, the compiler is clever enough to not use a discriminated
> union and instead represents Ok(()) as 0 and Err(err) as just the
> negative errno. This works since kernel::error::Error uses NonZeroI32
> as its only field (as of 6.13).
Kinda selling me on rust again...
>
> > > + // SAFETY: The caller has read access and has verified that `VM_MIXEDMAP` is set. The page
> > > + // is order 0. The address is checked on the C side so it can take any value.
> > > + to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address as _, page.as_ptr()) })
> > > + }
> > > }
> >
> > It's really nice to abstract this as a seprate type and then to know its
> > methods only apply in known circumstances... I guess in future we can use
> > clever generic types when it comes to combinations of characteristics that
> > would otherwise result in a combinatorial explosion?
>
> Yeah, so the alternate approach I mention in the commit message would
> be to have something like this:
>
> struct VmAreaRef<const MIXEDMAP: bool, const PFNMAP: bool, const
> MAYWRITE: bool> { ... }
>
> listing all the flags we care about. This way, we get 2^3 different
> types by writing just one.
>
> (There are a few different variations on how to do this, instead of
> bools, we may want to allow three options: true, false, unknown.)
Sure, it's something we can come to later I guess as we progress.
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu
2024-11-21 10:44 ` Alice Ryhl
@ 2024-11-21 12:59 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 12:59 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 11:44:52AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 8:29 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:49:58PM +0000, Alice Ryhl wrote:
> > > All of Rust Binder's existing calls to `vm_insert_page` could be
> > > optimized to first attempt to use `lock_vma_under_rcu`. This patch
> > > provides an abstraction to enable that.
> >
> > I think there should be a blurb about what the VMA locks are, how they avoid
> > contention on the mmap read lock etc. before talking about a use case (though
> > it's useful to mention the motivating reason!)
^ I think we should update the commit message to add this at the start then the
binder stuff underneath.
> >
> > >
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Other than the doc stuff, this looks fine, so:
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > > ---
> > > rust/helpers/mm.c | 5 +++++
> > > rust/kernel/mm.rs | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 61 insertions(+)
> > >
> > > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > > index 7b72eb065a3e..81b510c96fd2 100644
> > > --- a/rust/helpers/mm.c
> > > +++ b/rust/helpers/mm.c
> > > @@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> > > {
> > > return vma_lookup(mm, addr);
> > > }
> > > +
> > > +void rust_helper_vma_end_read(struct vm_area_struct *vma)
> > > +{
> > > + vma_end_read(vma);
> > > +}
> > > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > > index ace8e7d57afe..a15acb546f68 100644
> > > --- a/rust/kernel/mm.rs
> > > +++ b/rust/kernel/mm.rs
> > > @@ -13,6 +13,7 @@
> > > use core::{ops::Deref, ptr::NonNull};
> > >
> > > pub mod virt;
> > > +use virt::VmAreaRef;
> > >
> > > /// A wrapper for the kernel's `struct mm_struct`.
> > > ///
> > > @@ -170,6 +171,32 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > > unsafe { &*ptr.cast() }
> > > }
> > >
> > > + /// Try to lock the vma read lock under rcu.
> >
> > This reads oddly, I'd say 'try to acquire the VMA read lock'. It's not really
> > necessary to mention RCU here I'd say, as while lock_vma_under_rcu() acquires
> > the RCU lock in order to try to get the VMA read lock, it releases it afterwards
> > and you hold the VMA read luck until you are done with it and don't need to hold
> > an RCU lock.
> >
> > A reader might otherwise be confused and think an RCU read lock is required to
> > be held throughout too which isn't the case (this is maybe a critique of the
> > name of the function too, sorry Suren :P).
> >
> > > + /// If this operation fails, the vma may still exist. In that case, you should take the mmap
> > > + /// read lock and try to use `vma_lookup` instead.
> >
> > This also reads oddly, you're more likely (assuming you are not arbitrarily
> > trying to acquire a lock on an address likely to be unmapped soon) to have
> > failed due to lock contention.
> >
> > So I'd say 'this is an optimistic try lock operation, so it may fail, in which
> > case you should fall back to taking the mmap read lock'.
> >
> > I'm not sure it's necessary to reference vma_lookup() either, especially as in
> > future versions of this code we might want to use a VMA iterator instead.
>
> Thanks for the doc suggestions, they sound great.
Thanks :)
>
> > > + ///
> > > + /// When per-vma locks are disabled, this always returns `None`.
> > > + #[inline]
> > > + pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
> >
> > Ah I love having lock guards available... Something I miss from C++ :>)
>
> I've heard that C is starting to get lock guards recently!
Yeah there are some (e.g. [0]) but the weak typing hinders things imo and the
syntax is not fun.
This is wrt to the _kernel_ C rather than C in general though in case you were
referring to the newer standard or such!
[0]: https://elixir.bootlin.com/linux/v6.12/source/include/linux/cleanup.h#L307
>
> > > + #[cfg(CONFIG_PER_VMA_LOCK)]
> >
> > Ah interesting, so we have an abstraction for kernel config operations!
>
> Yeah, it's basically an #ifdef, but the block must still parse even if
> the config is disabled.
>
Right, kinda sane to actually make sure it parses too... :)
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 5/7] mm: rust: add mmput_async support
2024-11-21 11:39 ` Alice Ryhl
@ 2024-11-21 13:04 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 13:04 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 12:39:37PM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 8:47 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:49:59PM +0000, Alice Ryhl wrote:
> > > Adds an MmWithUserAsync type that uses mmput_async when dropped but is
> > > otherwise identical to MmWithUser. This has to be done using a separate
> > > type because the thing we are changing is the destructor.
> > >
> > > Rust Binder needs this to avoid a certain deadlock. See commit
> > > 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for
> > > details. It's also needed in the shrinker to avoid cleaning up the mm in
> > > the shrinker's context.
> >
> > Oh Lord, I didn't even know this existed... I see it was (re-)added in commit
> > a1b2289cef92 ("android: binder: drop lru lock in isolate callback") back in 2017
> > so quite a history of being necessary for binder.
> >
> > I see mmdrop_async(), I guess that's not needed for anything binder-ish? A quick
> > look in the code suggests this is invoked in free_signal_struct() and is there
> > due to some softirq stuff on x86... so yeah I guess not :)
>
> I didn't know it was so binder-specific. I assumed it would be a
> relatively common use-case.
Yeah, seems not so much :>)
>
> > > // These methods are safe to call even if `mm_users` is zero.
> > > impl Mm {
> > > /// Call `mmgrab` on `current.mm`.
> > > @@ -171,6 +213,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > > unsafe { &*ptr.cast() }
> > > }
> > >
> > > + /// Use `mmput_async` when dropping this refcount.
> > > + #[inline]
> > > + pub fn use_mmput_async(me: ARef<MmWithUser>) -> ARef<MmWithUserAsync> {
> >
> > Again, nitty, but I wonder if this should be as_xxx()?
> >
> > But I guess this makes sense too.
>
> In this case, the as_ prefix is incorrect because this is an owned ->
> owned conversion. See the API guidelines:
> https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv
>
> If we wish to use a prefix, the correct prefix is into_.
Ack.
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 5/7] mm: rust: add mmput_async support
2024-11-20 19:46 ` Lorenzo Stoakes
2024-11-21 11:39 ` Alice Ryhl
@ 2024-11-21 13:04 ` Lorenzo Stoakes
1 sibling, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 13:04 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Wed, Nov 20, 2024 at 07:46:46PM +0000, Lorenzo Stoakes wrote:
> On Wed, Nov 20, 2024 at 02:49:59PM +0000, Alice Ryhl wrote:
> > Adds an MmWithUserAsync type that uses mmput_async when dropped but is
> > otherwise identical to MmWithUser. This has to be done using a separate
> > type because the thing we are changing is the destructor.
> >
> > Rust Binder needs this to avoid a certain deadlock. See commit
> > 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for
> > details. It's also needed in the shrinker to avoid cleaning up the mm in
> > the shrinker's context.
>
> Oh Lord, I didn't even know this existed... I see it was (re-)added in commit
> a1b2289cef92 ("android: binder: drop lru lock in isolate callback") back in 2017
> so quite a history of being necessary for binder.
>
> I see mmdrop_async(), I guess that's not needed for anything binder-ish? A quick
> look in the code suggests this is invoked in free_signal_struct() and is there
> due to some softirq stuff on x86... so yeah I guess not :)
>
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Other than queries, looks good to me from mm side so:
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > ---
> > rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 49 insertions(+)
> >
> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > index a15acb546f68..f800b6c244cd 100644
> > --- a/rust/kernel/mm.rs
> > +++ b/rust/kernel/mm.rs
> > @@ -98,6 +98,48 @@ fn deref(&self) -> &Mm {
> > }
> > }
> >
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a
> > +/// refcount. This means that the destructor of `ARef<MmWithUserAsync>` is safe to call in atomic
> > +/// context.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> > +#[repr(transparent)]
> > +pub struct MmWithUserAsync {
> > + mm: MmWithUser,
> > +}
> > +
> > +// SAFETY: It is safe to call `mmput_async` on another thread than where `mmget` was called.
> > +unsafe impl Send for MmWithUserAsync {}
> > +// SAFETY: All methods on `MmWithUserAsync` can be called in parallel from several threads.
> > +unsafe impl Sync for MmWithUserAsync {}
> > +
> > +// SAFETY: By the type invariants, this type is always refcounted.
> > +unsafe impl AlwaysRefCounted for MmWithUserAsync {
> > + fn inc_ref(&self) {
> > + // SAFETY: The pointer is valid since self is a reference.
> > + unsafe { bindings::mmget(self.as_raw()) };
> > + }
> > +
> > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > + // SAFETY: The caller is giving up their refcount.
> > + unsafe { bindings::mmput_async(obj.cast().as_ptr()) };
> > + }
> > +}
> > +
> > +// Make all `MmWithUser` methods available on `MmWithUserAsync`.
> > +impl Deref for MmWithUserAsync {
> > + type Target = MmWithUser;
> > +
> > + #[inline]
> > + fn deref(&self) -> &MmWithUser {
> > + &self.mm
> > + }
> > +}
> > +
> > // These methods are safe to call even if `mm_users` is zero.
> > impl Mm {
> > /// Call `mmgrab` on `current.mm`.
> > @@ -171,6 +213,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
> > unsafe { &*ptr.cast() }
> > }
> >
> > + /// Use `mmput_async` when dropping this refcount.
> > + #[inline]
> > + pub fn use_mmput_async(me: ARef<MmWithUser>) -> ARef<MmWithUserAsync> {
>
> Again, nitty, but I wonder if this should be as_xxx()?
>
> But I guess this makes sense too.
>
> > + // SAFETY: The layouts and invariants are compatible.
> > + unsafe { ARef::from_raw(ARef::into_raw(me).cast()) }
> > + }
> > +
> > /// Try to lock the vma read lock under rcu.
> > ///
> > /// If this operation fails, the vma may still exist. In that case, you should take the mmap
> >
> > --
> > 2.47.0.371.ga323438b13-goog
> >
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 6/7] mm: rust: add VmAreaNew
2024-11-21 11:47 ` Alice Ryhl
@ 2024-11-21 13:08 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 13:08 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 12:47:33PM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 9:02 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:50:00PM +0000, Alice Ryhl wrote:
> > > When setting up a new vma in an mmap call, you have a bunch of extra
> > > permissions that you normally don't have. This introduces a new
> > > VmAreaNew type that is used for this case.
> >
> > Hm I'm confused by what you mean here. What permissions do you mean?
>
> I just mean that there are additional things you can do, e.g. you can
> set VM_MIXEDMAP.
>
> > Is this to abstract a VMA as passed by f_op->mmap()? I think it would be
> > better to explicitly say this if so.
>
> Yeah, the VmAreaNew type is specifically for f_op->mmap(). I'll be
> more explicit about that.
Yeah this is really critical on this one for me.
>
> > > To avoid setting invalid flag values, the methods for clearing
> > > VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
> > > if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
> > > the return value results in a compilation error because the `Result`
> > > type is marked #[must_use].
> >
> > This is nice.
> >
> > Though note that, it is explicitly not permitted to permit writability for
> > a VMA that previously had it disallowed, and we explicitly WARN_ON() this
> > now. Concretely that means a VMA where !(vma->vm_flags & VM_MAYWRITE), you
> > must not vma->vm_flags |= VM_MAYWRITE.
>
> Got it. The API here doesn't allow that, but good to know we can't add
> it in the future.
>
> > > For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
> > > we add a VM_PFNMAP method, we will need some way to prevent you from
> > > setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.
> >
> > Yes this would be unwise.
> >
> > An aside here - really you should _only_ change flags in this hook (perhaps
> > of course also initialising vma->vm_private_data state), trying to change
> > anything _core_ here would be deeply dangerous.
> >
> > We are far too permissive with this right now, and it's something we want
> > to try ideally to limit in the future.
>
> The previous version just had a function called "set_flags" that could
> be used to set any flags you want. Given Jann's feedback, I had
> restricted it to only allow certain flag changes.
Yes this is a good idea.
>
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
This stuff is sensitive (again, we really need to do something about this in the
VMA code, TBD) so will want to see the v9 respin before ack.
> > > ---
> > > rust/kernel/mm/virt.rs | 169 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 168 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > > index de7f2338810a..22aff8e77854 100644
> > > --- a/rust/kernel/mm/virt.rs
> > > +++ b/rust/kernel/mm/virt.rs
> > > @@ -6,7 +6,7 @@
> > >
> > > use crate::{
> > > bindings,
> > > - error::{to_result, Result},
> > > + error::{code::EINVAL, to_result, Result},
> > > page::Page,
> > > types::Opaque,
> > > };
> > > @@ -148,6 +148,173 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
> > > }
> > > }
> > >
> > > +/// A builder for setting up a vma in an `mmap` call.
> >
> > Would be better to explicitly reference the struct file_operations->mmap()
> > hook and to say that we should only be updating flags and vm_private_data
> > here (though perhaps not worth mentioning _that_ if not explicitly exposed
> > by your interface).
>
> I guess also vm_ops?
>
> > I'm guessing fields are, unless a setter/builder is established, immutable?
>
> If you have a mutable reference, you can always modify fields. When
> you don't want that, fields are made private.
>
> > > + /// Internal method for updating the vma flags.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// This must not be used to set the flags to an invalid value.
> > > + #[inline]
> > > + unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
> > > + let mut flags = self.flags();
> > > + flags |= set;
> > > + flags &= !unset;
> > > +
> > > + // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
> > > + // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
> > > + // The caller promises that this does not set the flags to an invalid value.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags = flags };
> >
> > Hm not sure if this is correct. We explicitly maintain a union in struct vm_area_struct as:
> >
> > union {
> > const vm_flags_t vm_flags;
> > vm_flags_t __private __vm_flags;
> > };
> >
> > Where vma->vm_flags is const, and then use helpers like vm_flags_init() to
> > set them, which also do things like assert locks (though not in the init
> > case, of course).
> >
> > So erally we should at least be updating __vm_flags here, though I'm not
> > sure how bindgen treats it?
>
> I don't think using vm_flags vs __vm_flags changes anything on the
> Rust side. The modifications are happening unsafely through a raw
> pointer to something inside Opaque, so Rust isn't going to care about
> stuff like your const annotation; it's unsafe either way.
>
> I'll update this to use __vm_flags instead.
Yeah, I mean _in practice_ byte-for-byte it will make no difference but it'd be
nice just for the purposes of maintaining a similar abstraction.
It might be worth wrapping in something that explicitly references that this is
on init only but I guess _the whole type_ does this.
>
> > > + /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
> > > + ///
> > > + /// This flag indicates whether userspace is allowed to make this vma readable with
> > > + /// `mprotect()`.
> > > + #[inline]
> > > + pub fn try_clear_mayread(&self) -> Result {
> > > + if self.get_read() {
> > > + return Err(EINVAL);
> > > + }
> >
> > This is quite nice! Strong(er) typing for the win, again :>)
>
> :)
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 7/7] rust: miscdevice: add mmap support
2024-11-21 10:08 ` Alice Ryhl
@ 2024-11-21 13:11 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-21 13:11 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 11:08:05AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 9:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:50:01PM +0000, Alice Ryhl wrote:
> > > Using the vma support introduced by the previous commit, introduce mmap
> > > support for miscdevices. The mmap call is given a vma that is undergoing
> > > initial setup, so the VmAreaNew type is used.
> >
> > Again, I'd be explicit about the VMA being passed to a struct
> > file_operations->mmap() hook on mmap. Otherwise this seems super vague to
> > me!
>
> I wasn't sure if vmas could be constructed in any context other than
> mmap. Will reword.
Thanks.
>
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
_Definitely_ need the commit msg update above, but other than this from mm side
looks reasonable, so:
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > > ---
> > > rust/kernel/miscdevice.rs | 28 ++++++++++++++++++++++++++++
> > > 1 file changed, 28 insertions(+)
> > >
> > > diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
> > > index 7e2a79b3ae26..4e4b9476e092 100644
> > > --- a/rust/kernel/miscdevice.rs
> > > +++ b/rust/kernel/miscdevice.rs
> > > @@ -11,6 +11,7 @@
> > > use crate::{
> > > bindings,
> > > error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR},
> > > + mm::virt::VmAreaNew,
> > > prelude::*,
> > > str::CStr,
> > > types::{ForeignOwnable, Opaque},
> > > @@ -110,6 +111,11 @@ fn release(device: Self::Ptr) {
> > > drop(device);
> > > }
> > >
> > > + /// Handle for mmap.
> > > + fn mmap(_device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>, _vma: &VmAreaNew) -> Result {
> >
> > You will have to help me with this :) ForeignOwnable, Borrowed<'_>, _vma I'm a
> > bit lost here!
>
> When you implement the Miscdevice for your own type, you write a block
> like this one:
>
> impl Miscdevice for MyType {
> type Ptr = Arc<MyType>;
> ...
> }
>
> Here Arc is a pointer type (very similar to ARef) that represents
> ownership over a refcount to a refcounted value.
>
> In this case:
> * Self refers to MyType.
> * Self::Ptr refers to Arc<MyType>
> * <Self::Ptr as ForeignOwnable>::Borrowed<'_> refers to ArcBorrow<MyType>
>
> The last step is resolved to ArcBorrow<MyType> because of this impl
> block in rust/kernel/sync/arc.rs:
>
> impl<T: 'static> ForeignOwnable for Arc<T> {
> type Borrowed<'a> = ArcBorrow<'a, T>;
> ...
> }
>
> Note that Self::Ptr is short-hand for <Self as Miscdevice>::Ptr.
Thanks!
>
> > > + kernel::build_error(VTABLE_DEFAULT_ERROR)
> >
> > What is this? Is this not yet implemented or something, and this is a
> > placeholder or something?
>
> It's a build-time assertion that this function is dead-code eliminated.
>
> There are two cases:
> * Either the driver does not override mmap. In this case, we store a
> null pointer in the fops table.
> * Or the driver overrides mmap with their own implementation. In this
> case, we store a function pointer to whichever implementation they
> provided.
> In neither case is the above function present anywhere in the final executable.
Ah nice, zero-cost stuff.
>
> Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-21 12:37 ` Lorenzo Stoakes
@ 2024-11-21 14:35 ` Alice Ryhl
0 siblings, 0 replies; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 14:35 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 1:37 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> I'm generally fine with this patch (other than rust specifics which I leave
> to the rust experts), however I'm a little concerned about us taking
> reference counts when we don't need to so this is something I'd like to see
> addressed for v9 or, at least to be confident we're not doing this in the
> binder code unnecessarily.
>
> Thanks!
The refcount increment is *not* unnecessary in Binder. For the C
equivalent, see the implementation of `binder_alloc_init`. It's used
because Binder will access the mm of the recipient from the sender's
scope, so it must hold on to an mm_struct reference.
> > > > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > > > +#[repr(transparent)]
> > > > +pub struct Mm {
> > > > + mm: Opaque<bindings::mm_struct>,
> > > > +}
> > >
> > > Does this tie this type to the C struct mm_struct type?
> > >
> > > Does 'Opaque' mean it is a pointer to a type which is 'opaque' in the sense
> > > that rust can't see into its internals?
> >
> > This declaration defines a Rust type called Mm which has the same
> > size, alignment, and contents as `struct mm_struct`. The purpose of
> > Opaque is to tell Rust that it can't assume anything about the
> > contents at all; we do that to leave it up to C.
> >
> > For example, normally if you have an immutable reference &i32, then
> > Rust is going to assume that the contents behind the reference are in
> > fact immutable. Opaque turns that off, meaning that an `&Opaque<i32>`
> > is allowed to reference an integer as it gets modified. It makes all
> > access to the contents unsafe, though.
> >
> > Note that Opaque is *not* a pointer type. We're going to be dealing
> > with types like &Mm or ARef<Mm> where &_ and ARef<_> are two different
> > kinds of pointers.
>
> OK so when you reference Mm.mm that is in effect referencing the struct
> mm_struct object rather than a pointer to a struct mm_struct? and
Yes.
> > > > +// SAFETY: By the type invariants, this type is always refcounted.
> > > > +unsafe impl AlwaysRefCounted for Mm {
> > > > + #[inline]
> > > > + fn inc_ref(&self) {
> > > > + // SAFETY: The pointer is valid since self is a reference.
> > > > + unsafe { bindings::mmgrab(self.as_raw()) };
> > > > + }
> > > > +
> > > > + #[inline]
> > > > + unsafe fn dec_ref(obj: NonNull<Self>) {
> > > > + // SAFETY: The caller is giving up their refcount.
> > > > + unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
> > > > + }
> > > > +}
> > >
> > > Under what circumstances would these be taken? Same question for MmWthUser.
> >
> > This makes `Mm` compatible with the pointer type called ARef<_>. This
> > pointer type is used to represent ownership of a refcount. So whenever
> > a variable of type ARef<_> goes out of scope, the refcount is
> > decremented, and whenever an ARef<_> is cloned, the refcount is
> > incremented.
> >
> > The way this works is that ARef<_> is implemented to use the
> > AlwaysRefCounted trait to understand how to manipulate the count. Only
> > types that implement the trait with an impl block like above can be
> > used with ARef<_>.
>
> OK so when you first instantiate an ARef<_> you don't increment the
> reference count, only if it is cloned from there on in?
Well it depends on which ARef constructor you are using. But the uses
of ARef::from_raw do not increment the count.
> > > > +// These methods are safe to call even if `mm_users` is zero.
> > > > +impl Mm {
> > > > + /// Call `mmgrab` on `current.mm`.
> > > > + #[inline]
> > > > + pub fn mmgrab_current() -> Option<ARef<Mm>> {
> > > > + // SAFETY: It's safe to get the `mm` field from current.
> > > > + let mm = unsafe {
> > > > + let current = bindings::get_current();
> > > > + (*current).mm
> > > > + };
> > >
> > > I don't see an equivalent means of obtaining mm from current for
> > > MmWithUser, is that intentional, would there be another means of obtaining
> > > an mm? (perhaps via vma->vm_mm I guess?)
> > >
> > > An aside -->
> > >
> > > If we're grabbing from current, and this is non-NULL (i.e. not a kernel
> > > thread), this is kinda MmWithUser to _start out_ with, but I guess if
> > > you're grabbing the current one you might not expect it.
> > >
> > > I guess one thing I want to point out (maybe here is wrong place) is that
> > > the usual way of interacting with current->mm is that we do _not_ increment
> > > mm->mm_count, mm->mm_users or any refernce count, as while we are in the
> > > kernel portion of the task, we are guaranteed the mm and the backing
> > > virtual address space will stick around.
> > >
> > > With reference to MmWithUser, in fact, if you look up users of
> > > mmget()/mmput() it is pretty rare to do that.
> > >
> > > So ideally we'd avoid doing this if we could (though having the semantics
> > > of grabbing a reference if we were to copy the object somehow again or hold
> > > its state or something would be nice).
> > >
> > > I guess this might actually be tricky in rust, because we'd probably need
> > > to somehow express the current task's lifetime and tie this to that
> > > and... yeah.
> > >
> > > <-- aside
> >
> > Ah, yeah, I guess this API is incomplete. We could have an API that
> > lets you obtain an &MmWithUser instead. Then, if the user wants to
> > increment the refcount, they can manually convert that into an
> > ARef<Mm> or ARef<MmWithUser>.
> >
> > It's true that it's slightly tricky to express in Rust, but it's
> > possible. We already have a way to get a &Task pointing at current.
>
> Yeah, but I think we should do this incrementally as I want this initial
> series to merge first, after which we can tweak things.
Rest assured that the API can be written to not automatically
increment the refcount when accessing current. That's just binder's
case.
> > > > + unsafe { &*ptr.cast() }
> > > > + }
> > > > +
> > > > + /// Calls `mmget_not_zero` and returns a handle if it succeeds.
> > > > + #[inline]
> > > > + pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
> > >
> > > I actually kinda love that this takes an mm and guarantees that you get an
> > > MmWithUser out of it which is implied by the fact this succeeds.
> > >
> > > However as to the point above, I'm concerned that this might be seen as
> > > 'the way' to access an mm, i.e. mm.mmgrab_current().mmget_not_zero() or
> > > something.
> > >
> > > Whereas, the usual way of referencing current->mm is to not increment any
> > > reference counts at all (assuming what you are doing resides in the same
> > > lifetime as the task).
> > >
> > > Obviously if you step outside of that lifetime, then you _do_ have to pin
> > > the mm (nearly always you want to grab rather than get though in that
> > > circumstance).
> >
> > I can add a way to obtain an &MmWithUser from current without
> > incrementing the refcount.
>
> Yeah, I feel like we definitely need this.
>
> However we'd need to _not_ drop the refcount when it goes out of scope too
> in this case.
>
> I'm not sure how you'd express that, however.
The way you express that is by giving the user a &Mm or &MmWithUser
instead of giving the user an ARef<Mm> or ARef<MmWithUser>. Using a
normal reference implies that you don't have ownership over the
refcount, and the reference has no destructor when it goes out of
scope. The only slightly tricky piece is tying the lifetime of that
reference to the scope of the current task, but this is a problem with
a known solution.
For more on this, see the discussion on the various versions of
Christian's PidNamespace series:
https://lore.kernel.org/rust-for-linux/20241002-brauner-rust-pid_namespace-v5-1-a90e70d44fde@kernel.org/
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access
2024-11-21 12:50 ` Lorenzo Stoakes
@ 2024-11-21 14:39 ` Alice Ryhl
0 siblings, 0 replies; 34+ messages in thread
From: Alice Ryhl @ 2024-11-21 14:39 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Thu, Nov 21, 2024 at 1:50 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Thu, Nov 21, 2024 at 11:23:39AM +0100, Alice Ryhl wrote:
> > On Wed, Nov 20, 2024 at 8:07 PM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > >
> > > On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> > > > + #[inline]
> > > > + pub fn end(&self) -> usize {
> > > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > > + // access is not a data race.
> > > > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> > > > + }
> > > > +
> > > > + /// Unmap pages in the given page range.
> > >
> > > This needs some more description, as 'unmapping' pages is unfortunately an
> > > overloaded term in the kernel and this very much might confuse people as
> > > opposed to e.g. munmap()'ing a range.
> > >
> > > I'd say something like 'clear page table mappings for the range at the leaf
> > > level, leaving all other page tables intact, freeing any memory referenced
> > > by the VMA in this range (anonymous memory is completely freed, file-backed
> > > memory has its reference count on page cache folio's dropped, any dirty
> > > data will still be written back to disk as usual)'.
> >
> > Sure, will add that to the docs.
>
> Thanks, I assume you mean this comment, which will form part of the docs? As
> here we should at least replace the 'unmap' with 'zap' to avoid confusion
> vs. munmap().
Yes. Comments with three slashes are rendered in the html documentation.
> > > > + #[inline]
> > > > + pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > > > + // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> > > > + // sufficient for this method call. This method has no requirements on the vma flags. Any
> > > > + // value of `address` and `size` is allowed.
> > > > + unsafe {
> > > > + bindings::zap_page_range_single(
> > >
> > > Hm weirdly I see this in rust/bindings/bindings_generated.rs but not in
> > > rust/helpers/mm.c is this intended?
> > >
> > > Is this meant to be generated _from_ somewhere? Is something missing for
> > > that?
> >
> > The bindings_generated.rs file is generated at built-time from C
> > headers. The zap_page_range_single is a real function, not a fake
> > static inline header-only function, so bindgen is able to generate it
> > without anything in rust/helpers.
> >
> > > > + self.as_ptr(),
> > > > + address as _,
> > > > + size as _,
> > > > + core::ptr::null_mut(),
> > > > + )
> > > > + };
> > > > + }
> > > > +}
> > > > +
> > > > +/// The integer type used for vma flags.
> > > > +#[doc(inline)]
> > > > +pub use bindings::vm_flags_t;
> > >
> > > Where do you declare this type?
> >
> > It's declared in include/linux/mm_types.h
>
> I meant from a rust perspective, but I guess bindgen handles this?
Yes, anything in `bindings::` is output from bindgen based on C headers.
Alice
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-20 18:13 ` Lorenzo Stoakes
2024-11-21 9:52 ` Alice Ryhl
@ 2024-11-22 7:42 ` Paolo Bonzini
2024-11-22 17:41 ` Lorenzo Stoakes
1 sibling, 1 reply; 34+ messages in thread
From: Paolo Bonzini @ 2024-11-22 7:42 UTC (permalink / raw)
To: Lorenzo Stoakes, Alice Ryhl
Cc: Miguel Ojeda, Matthew Wilcox, Vlastimil Babka, John Hubbard,
Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On 11/20/24 19:13, Lorenzo Stoakes wrote:
> I will obviously inevitably ask a TON of questions as a result of not being
> a rust person so, bear with me...
I'm just a lurker on the rust-for-linux mailing list but... thanks *so
much* Lorenzo and Alice for this thread. It's the best introduction to
Rust idioms for kernel programmers I've ever seen.
Paolo
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct
2024-11-22 7:42 ` Paolo Bonzini
@ 2024-11-22 17:41 ` Lorenzo Stoakes
0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2024-11-22 17:41 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Alice Ryhl, Miguel Ojeda, Matthew Wilcox, Vlastimil Babka,
John Hubbard, Liam R. Howlett, Andrew Morton, Greg Kroah-Hartman,
Arnd Bergmann, Christian Brauner, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, linux-kernel,
linux-mm, rust-for-linux, Andreas Hindborg
On Fri, Nov 22, 2024 at 08:42:57AM +0100, Paolo Bonzini wrote:
> On 11/20/24 19:13, Lorenzo Stoakes wrote:
> > I will obviously inevitably ask a TON of questions as a result of not being
> > a rust person so, bear with me...
>
> I'm just a lurker on the rust-for-linux mailing list but... thanks *so much*
> Lorenzo and Alice for this thread. It's the best introduction to Rust
> idioms for kernel programmers I've ever seen.
>
> Paolo
>
Thanks :) I am merely asking questions from mm side, so all credit goes to
Alice for her patient and detailed explanations! :)
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2024-12-05 15:29 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-20 14:49 [PATCH v8 0/7] Rust support for mm_struct, vm_area_struct, and mmap Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 1/7] mm: rust: add abstraction for struct mm_struct Alice Ryhl
2024-11-20 18:13 ` Lorenzo Stoakes
2024-11-21 9:52 ` Alice Ryhl
2024-11-21 12:37 ` Lorenzo Stoakes
2024-11-21 14:35 ` Alice Ryhl
2024-11-22 7:42 ` Paolo Bonzini
2024-11-22 17:41 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require read access Alice Ryhl
2024-11-20 19:07 ` Lorenzo Stoakes
2024-11-21 10:23 ` Alice Ryhl
2024-11-21 12:50 ` Lorenzo Stoakes
2024-11-21 14:39 ` Alice Ryhl
2024-11-20 14:49 ` [PATCH v8 3/7] mm: rust: add vm_insert_page Alice Ryhl
2024-11-20 19:20 ` Lorenzo Stoakes
2024-11-21 10:38 ` Alice Ryhl
2024-11-21 12:55 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 4/7] mm: rust: add lock_vma_under_rcu Alice Ryhl
2024-11-20 19:29 ` Lorenzo Stoakes
2024-11-21 10:44 ` Alice Ryhl
2024-11-21 12:59 ` Lorenzo Stoakes
2024-11-20 14:49 ` [PATCH v8 5/7] mm: rust: add mmput_async support Alice Ryhl
2024-11-20 19:46 ` Lorenzo Stoakes
2024-11-21 11:39 ` Alice Ryhl
2024-11-21 13:04 ` Lorenzo Stoakes
2024-11-21 13:04 ` Lorenzo Stoakes
2024-11-20 14:50 ` [PATCH v8 6/7] mm: rust: add VmAreaNew Alice Ryhl
2024-11-20 20:02 ` Lorenzo Stoakes
2024-11-21 11:47 ` Alice Ryhl
2024-11-21 13:08 ` Lorenzo Stoakes
2024-11-20 14:50 ` [PATCH v8 7/7] rust: miscdevice: add mmap support Alice Ryhl
2024-11-20 20:07 ` Lorenzo Stoakes
2024-11-21 10:08 ` Alice Ryhl
2024-11-21 13:11 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox