From: "Gary Guo" <gary@garyguo.net>
To: "Andreas Hindborg" <a.hindborg@kernel.org>,
"Gary Guo" <gary@garyguo.net>, "Boqun Feng" <boqun@kernel.org>
Cc: "Alice Ryhl" <aliceryhl@google.com>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Benno Lossin" <lossin@kernel.org>,
"Trevor Gross" <tmgross@umich.edu>,
"Danilo Krummrich" <dakr@kernel.org>,
linux-mm@kvack.org, rust-for-linux@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rust: page: add volatile memory copy methods
Date: Sat, 31 Jan 2026 20:48:55 +0000
Message-ID: <DG32JI45HFKS.29745T7AZGFTV@garyguo.net>
In-Reply-To: <87bji9r0cp.fsf@t14s.mail-host-address-is-not-set>
On Sat Jan 31, 2026 at 8:30 PM GMT, Andreas Hindborg wrote:
> "Gary Guo" <gary@garyguo.net> writes:
>
>> On Sat Jan 31, 2026 at 1:34 PM GMT, Andreas Hindborg wrote:
>>> "Boqun Feng" <boqun@kernel.org> writes:
>>>
>>>> On Fri, Jan 30, 2026 at 01:41:05PM -0800, Boqun Feng wrote:
>>>>> On Fri, Jan 30, 2026 at 05:20:11PM +0100, Andreas Hindborg wrote:
>>>>> [...]
>>>>> > >> In the last discussions we had on this, the conclusion was to use
>>>>> > >> `volatile_copy_memory` whenever that is available, or write a volatile
>>>>> > >> copy function in assembly.
>>>>> > >>
>>>>> > >> Using memcpy_{from,to}io is the latter solution. These functions are
>>>>> > >> simply volatile memcpy implemented in assembly.
>>>>> > >>
>>>>> > >> There is nothing special about MMIO. These functions are named as they
>>>>> > >> are because they are useful for MMIO.
>>>>> > >
>>>>> > > No. MMIO are really special. A few architectures require them to be accessed
>>>>> > > completely differently compared to normal memory. We also have things like
>>>>> > > INDIRECT_IOMEM. memcpy_{from,to}io are special as they use MMIO accessors such as
>>>>> > > readb to perform access on the __iomem pointer. They should not be mixed with
>>>>> > > normal memory. They must be treated as if they're from a completely separate
>>>>> > > address space.
>>>>> > >
>>>>> > > Normal memory vs DMA vs MMIO are all distinct, and this is demonstrated by the
>>>>> > > different types of barriers needed to order things correctly for each type of
>>>>> > > memory region.
>>>>> > >
>>>>> > > Userspace-mapped memory (that is also mapped in the kernel space, not __user) is
>>>>> > > the least special one out of these. They could practically share all atomic infra
>>>>> > > available for the kernel, hence the suggestion of using byte-wise atomic memcpy.
>>>>> >
>>>>> > I see. I did not consider this.
>>>>> >
>>>>> > At any rate, I still don't understand why I need an atomic copy function, or why I
>>>>> > need a byte-wise copy function. A volatile copy function should be fine, no?
>>>>> >
>>>>>
>>>>> but memcpy_{from,to}io() are not just volatile copy functions, they have
>>>>> additional side effects for MMIO ;-)
>>>>>
>>>>
>>>> For example, powerpc's memcpy_fromio() has eieio() in it, which we don't
>>>> need for normal (user -> kernel) memory copy.
>>>
>>> Ok, I see. Thanks for explaining. I was only looking at the x86
>>> implementation, which is of course not enough.
>>>
>>>>
>>>>> > And what is the exact problem in using memcpy_{from,to}io. Looking at
>>>>
>>>> I think the main problem of using memcpy_{from,to}io here is not that
>>>> they are not volatile memcpy (they might be), but it's because we
>>>> wouldn't use them for the same thing in C, because they are designed for
>>>> memory copying between MMIO and kernel memory (RAM).
>>>>
>>>> For MMIO, as Gary mentioned, because they are different than the normal
>>>> memory, special instructions or extra barriers are needed.
>>>
>>> I see, I was not aware.
>>>
>>>>
>>>> For DMA memory, it can be almost treated as external normal memory,
>>>> however, different architectures/systems/platforms may have different
>>>> requirements regarding cache coherence between CPU and devices, so special
>>>> mappings or special instructions may be needed.
>>>
>>> Cache flushing and barriers, got it.
>>>
>>>>
>>>> For __user memory, because the kernel is only given a userspace address, and
>>>> userspace can lie or unmap the address while the kernel is accessing it,
>>>> copy_{from,to}_user() is needed to handle page faults.
>>>
>>> Just to clarify, for my use case, the page is already mapped to kernel
>>> space, and it is guaranteed to be mapped for the duration of the call
>>> where I do the copy. Also, it _may_ be a user page, but it might not
>>> always be the case.
>>
>> In that case you should also assume there might be other kernel-space users.
>> Byte-wise atomic memcpy would be the best tool.
>
> Other concurrent kernel readers/writers would be a kernel bug in my use
> case. We could add this to the safety requirements.
>
>>
>>>
>>>>
>>>> Your use case (copying between userspace-mapped memory and kernel
>>>> memory) is, as Gary said, the least special here. So using
>>>> memcpy_{from,to}io() would be overkill and probably misleading.
>>>
>>> Ok, I understand.
>>>
>>>> I
>>>> suggest we use `{read,write}_volatile()` (unless I'm missing something
>>>> subtle of course), however `{read,write}_volatile()` only works on Sized
>>>> types,
>>>
>>> We can copy as u8? Or would it be more efficient to copy as a larger size?
>>
>> Byte-wise atomic means that the atomicity is restricted to byte level (hence
>> it's okay if you read a u32 with it and do not observe an atomic
>> update). It does not mean that the access needs to be byte-wise, so it's
>> perfectly fine to do a 32-bit load and it'll still be byte-wise atomic.
>
> Ah.
>
>>
>>>
>>> You suggested atomic in the other email, did you abandon that idea?
>>
>> The semantics we want are byte-wise atomic, although as an impl detail, using
>> volatile for now is all that we need.
>>
>>>
>>>> so we may have to use `bindings::memcpy()` or
>>>> core::intrinsics::volatile_copy_memory() [1]
>>>
>>> I was looking at this one, but it is unstable behind `core_intrinsics`.
>>> I was uncertain about pulling in additional unstable features. This is
>>> why I was looking for something in the C kernel to use.
>>>
>>> I think `bindings::memcpy` is not guaranteed to be implemented as inline
>>> assembly, so it may not have volatile semantics?
>>
>> In absence of full language LTO as we have today, it'll be fine (in practice).
>> Unlike C, if you reference a symbol called "memcpy", it won't be treated as
>> special and get turned into non-volatile memcpy.
>>
>> If the volatile memcpy intrinsic becomes stable, then we can switch to using that.
>
> Got it, this aligns with what Boqun is writing. Let's go for that.
>
> It also looks like memcpy is implemented in assembly for arm, arm32,
> x86_64. Which would exempt it from LTO. Not sure about 32bit x86 though.
> It defers to `__memcpy`. I could not figure out what that resolves to.
> Is it from the compiler?

I think it's the one in arch/x86/include/asm/string_32.h? That is also inline
assembly.

There's no need to worry about whether things can be optimized wrongly. I haven't
looked at the current defence against LTO when the code is implemented in C, but
as Boqun pointed out, the `memcpy` and `memmove` symbols are assumed to have
volatile semantics anyway. So the issue is not unique to Rust (also, we're
immune at the moment as there's no linker-plugin LTO support for Rust).

Ultimately, `volatile_copy_nonoverlapping_memory` is translated to `memcpy`
(similarly, `volatile_copy_memory` is `memmove`). The benefit of the intrinsics
is that if the size is fixed, the copy can be optimized into a single volatile
load/store by LLVM.
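
To illustrate the fixed-size point, here is a rough sketch in stable Rust. The
helper names `volatile_copy_bytes` and `volatile_copy_u32` are hypothetical
(they are not in the kernel crate or in `core`), and this is not how the
intrinsic is implemented. It only contrasts a per-byte volatile loop with the
single wide volatile load/store that a fixed size makes possible. Note that
volatile accesses are not formally atomic in the Rust memory model, so this
relies on the "volatile as an impl detail" assumption discussed above.

```rust
use core::ptr;

/// Hypothetical helper: copies `len` bytes using one volatile load and one
/// volatile store per byte, so the compiler may not elide or merge them.
///
/// # Safety
///
/// `src` must be valid for reads of `len` bytes, `dst` must be valid for
/// writes of `len` bytes, and the two ranges must not overlap.
pub unsafe fn volatile_copy_bytes(dst: *mut u8, src: *const u8, len: usize) {
    for i in 0..len {
        // SAFETY: the caller guarantees both ranges are valid for `len` bytes.
        unsafe {
            let byte = ptr::read_volatile(src.add(i));
            ptr::write_volatile(dst.add(i), byte);
        }
    }
}

/// Hypothetical helper for the fixed-size case: a single 32-bit volatile
/// load followed by a single 32-bit volatile store.
///
/// # Safety
///
/// `src` must be valid for reads and `dst` valid for writes of a `u32`,
/// and both must be suitably aligned for `u32`.
pub unsafe fn volatile_copy_u32(dst: *mut u32, src: *const u32) {
    // SAFETY: the caller guarantees validity and alignment of both pointers.
    unsafe { ptr::write_volatile(dst, ptr::read_volatile(src)) }
}
```

A caller with a statically known 4-byte, aligned copy would use the second
form; anything with a runtime length falls back to the first (or, in the
kernel, to `bindings::memcpy()` as discussed).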
Best,
Gary
>
>
> Best regards,
> Andreas Hindborg