From: Alice Ryhl <aliceryhl@google.com>
To: Benno Lossin <benno.lossin@proton.me>
Cc: a.hindborg@samsung.com, akpm@linux-foundation.org,
alex.gaynor@gmail.com, arnd@arndb.de, arve@android.com,
bjorn3_gh@protonmail.com, boqun.feng@gmail.com,
brauner@kernel.org, cmllamas@google.com, gary@garyguo.net,
gregkh@linuxfoundation.org, joel@joelfernandes.org,
keescook@chromium.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, maco@android.com, ojeda@kernel.org,
rust-for-linux@vger.kernel.org, surenb@google.com,
tkjos@android.com, viro@zeniv.linux.org.uk, wedsonaf@gmail.com,
willy@infradead.org
Subject: Re: [PATCH v3 4/4] rust: add abstraction for `struct page`
Date: Thu, 21 Mar 2024 15:16:07 +0100 [thread overview]
Message-ID: <CAH5fLgiHcru2uFmfwSvTnZvcnxh5Av-UWx1EEvttA0OyzhXSVQ@mail.gmail.com> (raw)
In-Reply-To: <CAH5fLggy1ci+gu2eVzjo0nHtNzT=+NgaaU5WAwq8qHrhu1QCYw@mail.gmail.com>
On Thu, Mar 21, 2024 at 3:11 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> On Thu, Mar 21, 2024 at 2:56 PM Benno Lossin <benno.lossin@proton.me> wrote:
> >
> > On 3/21/24 14:42, Alice Ryhl wrote:
> > > On Thu, Mar 21, 2024 at 2:16 PM Benno Lossin <benno.lossin@proton.me> wrote:
> > >>
> > >> On 3/20/24 09:46, Alice Ryhl wrote:
> > >>>> On 3/11/24 11:47, Alice Ryhl wrote:
> > >>>>> +/// A pointer to a page that owns the page allocation.
> > >>>>> +///
> > >>>>> +/// # Invariants
> > >>>>> +///
> > >>>>> +/// The pointer points at a page, and has ownership over the page.
> > >>>>
> > >>>> Why not "`page` is valid"?
> > >>>> Do you mean by ownership of the page that `page` has ownership of the
> > >>>> allocation, or does that entail any other property/privilege?
> > >>>
> > >>> I can add "at a valid page".
> > >>
> > >> I don't think that helps, what you need as an invariant is that the
> > >> pointer is valid.
> > >
> > > To me "points at a page" implies that the pointer is valid. I mean, if
> > > it was dangling, it would not point at a page?
> > >
> > > But I can reword to something else if you have a preferred phrasing.
> >
> > I would just say "`page` is valid" or "`self.page` is valid".
> >
> > >>>>> + /// Runs a piece of code with this page mapped to an address.
> > >>>>> + ///
> > >>>>> + /// The page is unmapped when this call returns.
> > >>>>> + ///
> > >>>>> + /// It is up to the caller to use the provided raw pointer correctly.
> > >>>>
> > >>>> This says nothing about what 'correctly' means. What I gathered from the
> > >>>> implementation is that the supplied pointer is valid for the execution
> > >>>> of `f` for `PAGE_SIZE` bytes.
> > >>>> What other things are you allowed to rely upon?
> > >>>>
> > >>>> Is it really OK for this function to be called from multiple threads?
> > >>>> Could that not result in the same page being mapped multiple times? If
> > >>>> that is fine, what about potential data races when two threads write to
> > >>>> the pointer given to `f`?
> > >>>>
> > >>>>> + pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut u8) -> T) -> T {
> > >>>
> > >>> I will say:
> > >>>
> > >>> /// It is up to the caller to use the provided raw pointer correctly.
> > >>> /// The pointer is valid for `PAGE_SIZE` bytes and for the duration in
> > >>> /// which the closure is called. Depending on the gfp flags and kernel
> > >>> /// configuration, the pointer may only be mapped on the current thread,
> > >>> /// and in those cases, dereferencing it on other threads is UB. Other
> > >>> /// than that, the usual rules for dereferencing a raw pointer apply.
> > >>> /// (E.g., don't cause data races, the memory may be uninitialized, and
> > >>> /// so on.)
> > >>
> > >> I would simplify and drop "depending on the gfp flags and kernel..." and
> > >> just say that the pointer is only valid on the current thread.
> > >
> > > Sure, that works for me.
> > >
> > >> Also would it make sense to make the pointer type *mut [u8; PAGE_SIZE]?
> > >
> > > I think it's a trade-off. That makes the code more error-prone, since
> > > `pointer::add` now doesn't move by a number of bytes, but a number of
> > > pages.
> >
> > Yeah. As long as you document that the pointer is valid for r/w with
> > offsets in `0..PAGE_SIZE` bytes, leaving the type as is, is fine by me.
> >
> >
> > >>> It's okay to map it multiple times from different threads.
> > >>
> > >> Do you still need to take care of data races?
> > >> So would it be fine to execute this code on two threads in parallel?
> > >>
> > >> static PAGE: Page = ...; // assume we have a page accessible by both threads
> > >>
> > >> PAGE.with_page_mapped(|ptr| {
> > >> loop {
> > >> unsafe { ptr.write(0) };
> > >> pr_info!("{}", unsafe { ptr.read() });
> > >> }
> > >> });
> > >
> > > Like I said, the usual pointer rules apply. Two threads can access it
> > > in parallel as long as one of the following are satisfied:
> > >
> > > * Both accesses are reads.
> > > * Both accesses are atomic.
> > > * They access disjoint byte ranges.
> > >
> > > Other than the fact that it uses a thread-local mapping on machines
> > > that can't address all of their memory at the same time, it's
> > > completely normal memory. It's literally just a PAGE_SIZE-aligned
> > > allocation of PAGE_SIZE bytes.
> >
> > Thanks for the info, what do you think of this?:
> >
> > /// It is up to the caller to use the provided raw pointer correctly. The pointer is valid for reads
> > /// and writes for `PAGE_SIZE` bytes and for the duration in which the closure is called. The
> > /// pointer must only be used on the current thread. The caller must also ensure that no data races
> > /// occur: when mapping the same page on two threads accesses to memory with the same offset must be
> > /// synchronized.
>
> I would much rather phrase it in terms of "the usual pointer" rules. I
> mean, the memory could also be uninitialized if you don't pass
> __GFP_ZERO when you create it, so you also have to make sure to follow
> the rules about uninitialized memory. I don't want to be in the
> business of listing all requirements for accessing memory here.
>
> > >> If this is not allowed, I don't really like the API. As a raw version it
> > >> would be fine, but I think we should have a safer version (eg by taking
> > >> `&mut self`).
> > >
> > > I don't understand what you mean. It is the *most* raw API that `Page`
> > > has. I can make them private if you want me to. The API cannot take
> > > `&mut self` because I need to be able to unsafely perform concurrent
> > > writes to disjoint byte ranges.
> >
> > If you don't need these functions to be public, I think we should
> > definitely make them private.
> > Also we could add a `raw` suffix to the functions to make it clear that
> > it is a primitive API. If you think that it is highly unlikely that we
> > get a safer version, then I don't think there is value in adding the
> > suffix.
>
> The old code on the Rust branch didn't have these functions, but
> that's because the old `read_raw` and `write_raw` methods did all of
> these things directly in their implementation:
>
> * Map the memory so we can get a pointer.
> * Get a pointer to a subslice (with bounds checks!)
> * Do the actual read/write.
>
> I thought that doing this many things in a single function was
> convoluted, so I decided to refactor the code by extracting the "get a
> pointer to the page" logic into `with_page_mapped` and the "point to
> subslice with bounds check" logic into `with_pointer_into_page`. That
> way, each function has only one responsibility, instead of mixing
> three responsibilities into one.
>
> So even if we get a safer version, I would not want to get rid of this
> method. I don't want to inline its implementation into more
> complicated functions. The safer method would call the raw method, and
> then do whatever additional logic it wants to do on top of that.
Adding to this: To me, we *do* already have safer versions of this
method. Those are the read_raw and write_raw and fill_zero and
copy_from_user_slice methods.
Alice
next prev parent reply other threads:[~2024-03-21 14:16 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-11 10:47 [PATCH v3 0/4] Memory management patches needed by Rust Binder Alice Ryhl
2024-03-11 10:47 ` [PATCH v3 1/4] rust: uaccess: add userspace pointers Alice Ryhl
2024-03-16 14:16 ` Benno Lossin
2024-03-18 18:59 ` Boqun Feng
2024-03-18 19:12 ` Alice Ryhl
2024-03-18 19:33 ` Boqun Feng
2024-03-18 20:10 ` Alice Ryhl
2024-03-18 21:07 ` Boqun Feng
2024-03-20 2:27 ` Boqun Feng
2024-03-20 18:39 ` Boqun Feng
2024-03-11 10:47 ` [PATCH v3 2/4] uaccess: always export _copy_[from|to]_user with CONFIG_RUST Alice Ryhl
2024-03-18 21:16 ` Boqun Feng
2024-03-11 10:47 ` [PATCH v3 3/4] rust: uaccess: add typed accessors for userspace pointers Alice Ryhl
2024-03-16 14:56 ` Benno Lossin
2024-03-19 19:32 ` Boqun Feng
2024-03-11 10:47 ` [PATCH v3 4/4] rust: add abstraction for `struct page` Alice Ryhl
2024-03-11 10:50 ` Alice Ryhl
2024-03-15 8:16 ` Andreas Hindborg
2024-03-16 15:30 ` Matthew Wilcox
2024-03-16 20:39 ` Benno Lossin
2024-03-20 8:46 ` Alice Ryhl
2024-03-21 13:15 ` Benno Lossin
2024-03-21 13:42 ` Alice Ryhl
2024-03-21 13:56 ` Benno Lossin
2024-03-21 14:11 ` Alice Ryhl
2024-03-21 14:16 ` Alice Ryhl [this message]
2024-03-21 14:19 ` Benno Lossin
2024-03-19 22:16 ` Boqun Feng
2024-03-19 22:28 ` Benno Lossin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAH5fLgiHcru2uFmfwSvTnZvcnxh5Av-UWx1EEvttA0OyzhXSVQ@mail.gmail.com \
--to=aliceryhl@google.com \
--cc=a.hindborg@samsung.com \
--cc=akpm@linux-foundation.org \
--cc=alex.gaynor@gmail.com \
--cc=arnd@arndb.de \
--cc=arve@android.com \
--cc=benno.lossin@proton.me \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=brauner@kernel.org \
--cc=cmllamas@google.com \
--cc=gary@garyguo.net \
--cc=gregkh@linuxfoundation.org \
--cc=joel@joelfernandes.org \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=maco@android.com \
--cc=ojeda@kernel.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=surenb@google.com \
--cc=tkjos@android.com \
--cc=viro@zeniv.linux.org.uk \
--cc=wedsonaf@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox