From: Ryan Roberts <ryan.roberts@arm.com>
To: Yang Shi <shy828301@gmail.com>,
lsf-pc@lists.linux-foundation.org, Linux MM <linux-mm@kvack.org>,
"Christoph Lameter (Ampere)" <cl@gentwo.org>,
dennis@kernel.org, Tejun Heo <tj@kernel.org>,
urezki@gmail.com, Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Subject: Re: [LSF/MM/BPF TOPIC] Improve this_cpu_ops performance for ARM64 (and potentially other architectures)
Date: Thu, 12 Feb 2026 18:41:57 +0000
Message-ID: <5a648f49-97b1-4195-a825-47f3261225eb@arm.com>
In-Reply-To: <CAHbLzkpcN-T8MH6=W3jCxcFj1gVZp8fRqe231yzZT-rV_E_org@mail.gmail.com>
On 11/02/2026 23:14, Yang Shi wrote:
> Background
> =========
> The this_cpu_*() APIs operate on the current processor's local copy of
> a percpu variable. In order to obtain the address of this cpu-specific
> variable, a cpu-specific offset has to be added to the variable's base
> address.
> On x86 this address calculation can be performed by prefixing an
> instruction with a segment register, so x86 can increment a percpu
> counter with a single instruction. Since the address calculation and
> the RMW operation occur within one instruction, the whole sequence is
> atomic vs the scheduler and preemption does not need to be disabled.
> e.g.:
> incq %gs:my_counter
> See https://www.kernel.org/doc/Documentation/this_cpu_ops.txt for more details.
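For anyone following along, the generic API in question looks roughly like
this (a minimal sketch; DEFINE_PER_CPU() and this_cpu_inc() are the real
kernel interfaces, my_counter is just an example name):

    DEFINE_PER_CPU(unsigned long, my_counter);

    /* on a hot path: */
    this_cpu_inc(my_counter);

On x86 that compiles down to the single gs-relative increment above, with
no preempt_disable()/preempt_enable() bracketing emitted around it.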
>
> ARM64 and some other non-x86 architectures don't have a segment
> register. The address of the current processor's percpu variable has to
> be calculated first, and only then can that address be used to operate
> on the percpu data. This whole sequence must be atomic vs the
> scheduler, so it is necessary to disable preemption, perform the
> address calculation and then the increment. On ARM64 the cpu-specific
> offset additionally lives in a system register that has to be read each
> time. The code flow looks like:
> Disable preemption
> Calculate the address of this CPU's copy using the offset
> Manipulate the counter
> Enable preemption
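Concretely, on arm64 today each this_cpu_add()/inc() expands to roughly
the following (a simplified sketch of the protected-op pattern; in the
kernel the percpu offset is read from tpidr_el1):

    preempt_disable();
    unsigned long *p = raw_cpu_ptr(&my_counter); /* base + percpu offset */
    *p += 1;                                     /* plain or LSE atomic add */
    preempt_enable();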
By massive coincidence, Dev Jain and I have been investigating a large
regression seen in a munmap micro-benchmark on 6.19, which we root-caused
to a change that ends up using this_cpu_*() much more heavily on that path.
We have concluded that we can simplify this_cpu_read() to not bother
disabling/enabling preemption: the operation is read-only, and a migration
between the address calculation and the load is indistinguishable from a
migration just after the load. I believe Dev is planning to post a patch
to the list soon. This will solve our immediate regression.
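Something along these lines (a sketch of the shape of the change, not
necessarily Dev's exact patch):

    /* today (protected): */
    preempt_disable();
    val = READ_ONCE(*raw_cpu_ptr(&my_counter));
    preempt_enable();

    /* simplified: for a pure read, migrating between the address
     * calculation and the load gives the same result as migrating
     * immediately after the load, so the bracketing buys nothing: */
    val = READ_ONCE(*raw_cpu_ptr(&my_counter));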
But we can't do the same trick for ops that write. See [1].
[1] https://lore.kernel.org/all/20190311164837.GD24275@lakrids.cambridge.arm.com/
>
> This process is inefficient relative to x86 and has to be repeated for
> every access to percpu data.
> ARM64 does have an atomic increment instruction, but it takes its
> address from a general purpose register; there is no segment-register
> style addressing as on x86. So a separate address calculation is always
> necessary, even when the atomic instruction is used.
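To make that concrete, even with LSE atomics the percpu offset has to be
fetched and applied first, e.g. (a sketch; tpidr_el1 is where the kernel
keeps the percpu offset):

    mrs   x1, tpidr_el1        // this CPU's percpu offset
    add   x0, x0, x1           // x0 = address of this CPU's copy
    mov   x2, #1
    stadd x2, [x0]             // atomic add, but only after the address calc

and the window from the tpidr_el1 read to the store completing is exactly
what the preemption bracketing has to cover.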
> A page table allows us to remap addresses. So if the atomic instruction
> used a fixed virtual address, and each processor's page tables mapped
> that area to its own local percpu data, then we could get down to a
> single instruction on ARM64 too (and hopefully on some other non-x86
> architectures) and be as efficient as x86.
>
> So, the code flow should just become:
> INC VIRTUAL_BASE + percpu_variable_offset
>
> In order to do that we need the same virtual address to be mapped
> differently on each processor, which means a separate set of page
> tables per processor. These page tables can map almost all of the
> address space identically; the only special area is the one starting
> at VIRTUAL_BASE.
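IIUC, the generated code could then collapse to something like this
(PCPU_WINDOW being a made-up name for the remapped area):

    adrp  x0, PCPU_WINDOW + my_counter_off   // fixed VA, same on every CPU
    add   x0, x0, :lo12:(PCPU_WINDOW + my_counter_off)
    mov   x1, #1
    stadd x1, [x0]

with no tpidr_el1 read and, crucially, no preemption bracketing: whichever
CPU ends up executing the stadd increments its own copy, because its own
page tables resolve the VA.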
This is an interesting idea. I'm keen to be involved in discussions.
My immediate concern is that this would not be compatible with FEAT_TTCNP,
which allows multiple PEs (ARM-speak for CPUs) to share TLB entries - e.g.
for SMT. I'm not sure whether that would be the end of the world; the perf
numbers below are compelling. I'll defer to others' opinions on that.
Thanks,
Ryan