From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 17 Jul 2025 15:05:44 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH RFC 2/2] riscv: introduce percpu.h into include/asm
From: Alexandre Ghiti <alex@ghiti.fr>
To: Yunhui Cui <cuiyunhui@bytedance.com>, yury.norov@gmail.com,
 linux@rasmusvillemoes.dk, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, dennis@kernel.org, tj@kernel.org,
 cl@gentwo.org, linux-mm@kvack.org
References: <20250618034328.21904-1-cuiyunhui@bytedance.com>
 <20250618034328.21904-2-cuiyunhui@bytedance.com>
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
On 7/17/25 15:04, Alexandre Ghiti wrote:
> Hi Yunhui,
>
> On 6/18/25 05:43, Yunhui Cui wrote:
>> Current percpu operations rely on generic implementations, where
>> raw_local_irq_save() introduces substantial overhead. Optimization
>> is achieved through atomic operations and preemption disabling.
>>
>> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
>> ---
>>   arch/riscv/include/asm/percpu.h | 138 ++++++++++++++++++++++++++++++++
>>   1 file changed, 138 insertions(+)
>>   create mode 100644 arch/riscv/include/asm/percpu.h
>>
>> diff --git a/arch/riscv/include/asm/percpu.h b/arch/riscv/include/asm/percpu.h
>> new file mode 100644
>> index 0000000000000..423c0d01f874c
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/percpu.h
>> @@ -0,0 +1,138 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#ifndef __ASM_PERCPU_H
>> +#define __ASM_PERCPU_H
>> +
>> +#include <linux/preempt.h>
>> +
>> +#define PERCPU_RW_OPS(sz)						\
>> +static inline unsigned long __percpu_read_##sz(void *ptr)		\
>> +{									\
>> +	return READ_ONCE(*(u##sz *)ptr);				\
>> +}									\
>> +									\
>> +static inline void __percpu_write_##sz(void *ptr, unsigned long val)	\
>> +{									\
>> +	WRITE_ONCE(*(u##sz *)ptr, (u##sz)val);				\
>> +}
>> +
>> +#define __PERCPU_AMO_OP_CASE(sfx, name, sz, amo_insn)			\
>> +static inline void							\
>> +__percpu_##name##_amo_case_##sz(void *ptr, unsigned long val)		\
>> +{									\
>> +	asm volatile (							\
>> +	"amo" #amo_insn #sfx " zero, %[val], %[ptr]"			\
>> +	: [ptr] "+A" (*(u##sz *)ptr)					\
>> +	: [val] "r" ((u##sz)(val))					\
>> +	: "memory");							\
>> +}
>> +
>> +#define __PERCPU_AMO_RET_OP_CASE(sfx, name, sz, amo_insn)		\
>> +static inline u##sz							\
>> +__percpu_##name##_return_amo_case_##sz(void *ptr, unsigned long val)	\
>> +{									\
>> +	register u##sz ret;						\
>> +									\
>> +	asm volatile (							\
>> +	"amo" #amo_insn #sfx " %[ret], %[val], %[ptr]"			\
>> +	: [ptr] "+A" (*(u##sz *)ptr), [ret] "=r" (ret)			\
>> +	: [val] "r" ((u##sz)(val))					\
>> +	: "memory");							\
>> +									\
>> +	return ret + val;						\
>> +}
>> +
>> +#define PERCPU_OP(name, amo_insn)					\
>> +	__PERCPU_AMO_OP_CASE(.b, name, 8, amo_insn)			\
>> +	__PERCPU_AMO_OP_CASE(.h, name, 16, amo_insn)			\
>> +	__PERCPU_AMO_OP_CASE(.w, name, 32, amo_insn)			\
>> +	__PERCPU_AMO_OP_CASE(.d, name, 64, amo_insn)			\
>> +
>> +#define PERCPU_RET_OP(name, amo_insn)					\
>> +	__PERCPU_AMO_RET_OP_CASE(.b, name, 8, amo_insn)			\
>> +	__PERCPU_AMO_RET_OP_CASE(.h, name, 16, amo_insn)		\
>> +	__PERCPU_AMO_RET_OP_CASE(.w, name, 32, amo_insn)		\
>> +	__PERCPU_AMO_RET_OP_CASE(.d, name, 64, amo_insn)
>> +
>> +PERCPU_RW_OPS(8)
>> +PERCPU_RW_OPS(16)
>> +PERCPU_RW_OPS(32)
>> +PERCPU_RW_OPS(64)
>> +
>> +PERCPU_OP(add, add)
>> +PERCPU_OP(andnot, and)
>> +PERCPU_OP(or, or)
>> +PERCPU_RET_OP(add, add)
>> +
>> +#undef PERCPU_RW_OPS
>> +#undef __PERCPU_AMO_OP_CASE
>> +#undef __PERCPU_AMO_RET_OP_CASE
>> +#undef PERCPU_OP
>> +#undef PERCPU_RET_OP
>> +
>> +#define _pcp_protect(op, pcp, ...)					\
>> +({									\
>> +	preempt_disable_notrace();					\
>> +	op(raw_cpu_ptr(&(pcp)), __VA_ARGS__);				\
>> +	preempt_enable_notrace();					\
>> +})
>> +
>> +#define _pcp_protect_return(op, pcp, args...)				\
>> +({									\
>> +	typeof(pcp) __retval;						\
>> +	preempt_disable_notrace();					\
>> +	__retval = (typeof(pcp))op(raw_cpu_ptr(&(pcp)), ##args);	\
>> +	preempt_enable_notrace();					\
>> +	__retval;							\
>> +})
>> +
>> +#define this_cpu_read_1(pcp) _pcp_protect_return(__percpu_read_8, pcp)
>> +#define this_cpu_read_2(pcp) _pcp_protect_return(__percpu_read_16, pcp)
>> +#define this_cpu_read_4(pcp) _pcp_protect_return(__percpu_read_32, pcp)
>> +#define this_cpu_read_8(pcp) _pcp_protect_return(__percpu_read_64, pcp)
>> +
>> +#define this_cpu_write_1(pcp, val) _pcp_protect(__percpu_write_8, pcp, (unsigned long)val)
>> +#define this_cpu_write_2(pcp, val) _pcp_protect(__percpu_write_16, pcp, (unsigned long)val)
>> +#define this_cpu_write_4(pcp, val) _pcp_protect(__percpu_write_32, pcp, (unsigned long)val)
>> +#define this_cpu_write_8(pcp, val) _pcp_protect(__percpu_write_64, pcp, (unsigned long)val)
>> +
>> +#define this_cpu_add_1(pcp, val) _pcp_protect(__percpu_add_amo_case_8, pcp, val)
>> +#define this_cpu_add_2(pcp, val) _pcp_protect(__percpu_add_amo_case_16, pcp, val)
>> +#define this_cpu_add_4(pcp, val) _pcp_protect(__percpu_add_amo_case_32, pcp, val)
>> +#define this_cpu_add_8(pcp, val) _pcp_protect(__percpu_add_amo_case_64, pcp, val)
>> +
>> +#define this_cpu_add_return_1(pcp, val)		\
>> +_pcp_protect_return(__percpu_add_return_amo_case_8, pcp, val)
>> +
>> +#define this_cpu_add_return_2(pcp, val)		\
>> +_pcp_protect_return(__percpu_add_return_amo_case_16, pcp, val)
>> +
>> +#define this_cpu_add_return_4(pcp, val)		\
>> +_pcp_protect_return(__percpu_add_return_amo_case_32, pcp, val)
>> +
>> +#define this_cpu_add_return_8(pcp, val)		\
>> +_pcp_protect_return(__percpu_add_return_amo_case_64, pcp, val)
>> +
>> +#define this_cpu_and_1(pcp, val) _pcp_protect(__percpu_andnot_amo_case_8, pcp, ~val)
>> +#define this_cpu_and_2(pcp, val) _pcp_protect(__percpu_andnot_amo_case_16, pcp, ~val)
>> +#define this_cpu_and_4(pcp, val) _pcp_protect(__percpu_andnot_amo_case_32, pcp, ~val)
>> +#define this_cpu_and_8(pcp, val) _pcp_protect(__percpu_andnot_amo_case_64, pcp, ~val)
>
>
> Why do we define __percpu_andnot based on amoand, and use
> __percpu_andnot with ~val here? Can't we just define __percpu_and?
>
>
>> +
>> +#define this_cpu_or_1(pcp, val) _pcp_protect(__percpu_or_amo_case_8, pcp, val)
>> +#define this_cpu_or_2(pcp, val) _pcp_protect(__percpu_or_amo_case_16, pcp, val)
>> +#define this_cpu_or_4(pcp, val) _pcp_protect(__percpu_or_amo_case_32, pcp, val)
>> +#define this_cpu_or_8(pcp, val) _pcp_protect(__percpu_or_amo_case_64, pcp, val)
>> +
>> +#define this_cpu_xchg_1(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
>> +#define this_cpu_xchg_2(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
>> +#define this_cpu_xchg_4(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
>> +#define this_cpu_xchg_8(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
>> +
>> +#define this_cpu_cmpxchg_1(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
>> +#define this_cpu_cmpxchg_2(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
>> +#define this_cpu_cmpxchg_4(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
>> +#define this_cpu_cmpxchg_8(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
>> +
>> +#include <asm-generic/percpu.h>
>> +
>> +#endif /* __ASM_PERCPU_H */
>
>
> It all looks good to me, just one thing, can you also implement
> this_cpu_cmpxchg64/128()?
>

One last thing sorry, can you add a cover letter too?

Thanks!

Alex

> And since this is almost a copy/paste from arm64, either mention it at
> the top of the file or (better) merge both implementations somewhere
> to avoid redefining existing code :) But up to you.
>
> Reviewed-by: Alexandre Ghiti
>
> Thanks,
>
> Alex
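[Editor's note: for readers following the thread outside the kernel tree, the pattern the commit message describes — a single relaxed AMO under preempt_disable instead of an irq save/restore window — can be modeled in portable userspace C. This is only an illustrative sketch: `__atomic_fetch_add` stands in for the `amoadd.*` instruction, and the `preempt_*_notrace` stubs and the model function names below are hypothetical, not kernel API.]

```c
#include <stdint.h>

/* Stubs modeling preempt_disable_notrace()/preempt_enable_notrace(). */
static int preempt_depth;
static void preempt_disable_notrace(void) { preempt_depth++; }
static void preempt_enable_notrace(void)  { preempt_depth--; }

/* Model of __percpu_add_return_amo_case_32(): one relaxed atomic add
 * replaces the generic irq-save / load / add / store / irq-restore
 * sequence.  amoadd returns the OLD value, which is why the patch
 * computes "ret + val" to get the new value. */
static uint32_t percpu_add_return_32(uint32_t *ptr, uint32_t val)
{
	uint32_t old = __atomic_fetch_add(ptr, val, __ATOMIC_RELAXED);
	return old + val;
}

/* Model of _pcp_protect_return(): preemption off around the access so
 * the task cannot migrate between resolving the per-CPU pointer and
 * performing the atomic operation. */
static uint32_t this_cpu_add_return_model(uint32_t *pcp, uint32_t val)
{
	uint32_t ret;

	preempt_disable_notrace();
	ret = percpu_add_return_32(pcp, val);
	preempt_enable_notrace();
	return ret;
}
```

The atomicity of the single AMO is what makes the irq disable window unnecessary: an interrupt landing between the pointer resolution and the AMO can no longer tear the read-modify-write.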
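[Editor's note: on the `__percpu_andnot` question above — the patch maps `this_cpu_and_N()` onto an `amoand`-based `andnot` primitive and negates `val` at every call site; the suggestion is to apply `amoand` to `val` directly so the `~` disappears. A minimal userspace sketch of that shape, with the GNU `__atomic_fetch_and` builtin standing in for the `amoand.*` instruction; the macro and function names are hypothetical.]

```c
#include <stdint.h>

/* Direct "and" primitive: the atomic AND applies val as-is, so a call
 * site like this_cpu_and_4(pcp, val) would no longer need ~val. */
#define __PERCPU_AND_OP_CASE(sz)					\
static inline void __percpu_and_amo_case_##sz(void *ptr,		\
					      unsigned long val)	\
{									\
	/* Stands in for: amoand.<sfx> zero, val, (ptr) */		\
	__atomic_fetch_and((uint##sz##_t *)ptr, (uint##sz##_t)val,	\
			   __ATOMIC_RELAXED);				\
}

__PERCPU_AND_OP_CASE(8)
__PERCPU_AND_OP_CASE(16)
__PERCPU_AND_OP_CASE(32)
__PERCPU_AND_OP_CASE(64)
```

Either spelling generates the same `amoand` instruction; the difference is only where the complement happens, so the direct form trades four `~val` call sites for one extra primitive name.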