Subject: Re: [PATCH RFC 2/2] riscv: introduce percpu.h into include/asm
From: Alexandre Ghiti
To: Yunhui Cui, yury.norov@gmail.com, linux@rasmusvillemoes.dk, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, dennis@kernel.org, tj@kernel.org, cl@gentwo.org, linux-mm@kvack.org
Date: Thu, 17 Jul 2025 15:04:22 +0200
In-Reply-To: <20250618034328.21904-2-cuiyunhui@bytedance.com>
References: <20250618034328.21904-1-cuiyunhui@bytedance.com> <20250618034328.21904-2-cuiyunhui@bytedance.com>

Hi Yunhui,

On 6/18/25 05:43, Yunhui Cui wrote:
> Current percpu operations rely on generic implementations, where
> raw_local_irq_save() introduces substantial overhead. Optimization
> is achieved through atomic operations and preemption disabling.
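
For context (a rough sketch of my understanding, not a verbatim copy of the
asm-generic code): without an arch-specific percpu.h, something like
this_cpu_add(pcp, val) falls back to the generic helpers, which bracket a
plain read-modify-write with an irq save/restore pair, roughly:

        unsigned long __flags;

        raw_local_irq_save(__flags);
        *raw_cpu_ptr(&pcp) += val;      /* plain load + add + store */
        raw_local_irq_restore(__flags);

So replacing that pair with a single AMO (or a relaxed xchg/cmpxchg) under
preempt_disable_notrace()/preempt_enable_notrace() is indeed where the win
comes from.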
>
> Signed-off-by: Yunhui Cui
> ---
>  arch/riscv/include/asm/percpu.h | 138 ++++++++++++++++++++++++++++++++
>  1 file changed, 138 insertions(+)
>  create mode 100644 arch/riscv/include/asm/percpu.h
>
> diff --git a/arch/riscv/include/asm/percpu.h b/arch/riscv/include/asm/percpu.h
> new file mode 100644
> index 0000000000000..423c0d01f874c
> --- /dev/null
> +++ b/arch/riscv/include/asm/percpu.h
> @@ -0,0 +1,138 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_PERCPU_H
> +#define __ASM_PERCPU_H
> +
> +#include
> +
> +#define PERCPU_RW_OPS(sz) \
> +static inline unsigned long __percpu_read_##sz(void *ptr) \
> +{ \
> +        return READ_ONCE(*(u##sz *)ptr); \
> +} \
> + \
> +static inline void __percpu_write_##sz(void *ptr, unsigned long val) \
> +{ \
> +        WRITE_ONCE(*(u##sz *)ptr, (u##sz)val); \
> +}
> +
> +#define __PERCPU_AMO_OP_CASE(sfx, name, sz, amo_insn) \
> +static inline void \
> +__percpu_##name##_amo_case_##sz(void *ptr, unsigned long val) \
> +{ \
> +        asm volatile ( \
> +        "amo" #amo_insn #sfx " zero, %[val], %[ptr]" \
> +        : [ptr] "+A" (*(u##sz *)ptr) \
> +        : [val] "r" ((u##sz)(val)) \
> +        : "memory"); \
> +}
> +
> +#define __PERCPU_AMO_RET_OP_CASE(sfx, name, sz, amo_insn) \
> +static inline u##sz \
> +__percpu_##name##_return_amo_case_##sz(void *ptr, unsigned long val) \
> +{ \
> +        register u##sz ret; \
> + \
> +        asm volatile ( \
> +        "amo" #amo_insn #sfx " %[ret], %[val], %[ptr]" \
> +        : [ptr] "+A" (*(u##sz *)ptr), [ret] "=r" (ret) \
> +        : [val] "r" ((u##sz)(val)) \
> +        : "memory"); \
> + \
> +        return ret + val; \
> +}
> +
> +#define PERCPU_OP(name, amo_insn) \
> +        __PERCPU_AMO_OP_CASE(.b, name, 8, amo_insn) \
> +        __PERCPU_AMO_OP_CASE(.h, name, 16, amo_insn) \
> +        __PERCPU_AMO_OP_CASE(.w, name, 32, amo_insn) \
> +        __PERCPU_AMO_OP_CASE(.d, name, 64, amo_insn) \
> +
> +#define PERCPU_RET_OP(name, amo_insn) \
> +        __PERCPU_AMO_RET_OP_CASE(.b, name, 8, amo_insn) \
> +        __PERCPU_AMO_RET_OP_CASE(.h, name, 16, amo_insn) \
> +        __PERCPU_AMO_RET_OP_CASE(.w, name, 32, amo_insn) \
> +        __PERCPU_AMO_RET_OP_CASE(.d, name, 64, amo_insn)
> +
> +PERCPU_RW_OPS(8)
> +PERCPU_RW_OPS(16)
> +PERCPU_RW_OPS(32)
> +PERCPU_RW_OPS(64)
> +
> +PERCPU_OP(add, add)
> +PERCPU_OP(andnot, and)
> +PERCPU_OP(or, or)
> +PERCPU_RET_OP(add, add)
> +
> +#undef PERCPU_RW_OPS
> +#undef __PERCPU_AMO_OP_CASE
> +#undef __PERCPU_AMO_RET_OP_CASE
> +#undef PERCPU_OP
> +#undef PERCPU_RET_OP
> +
> +#define _pcp_protect(op, pcp, ...) \
> +({ \
> +        preempt_disable_notrace(); \
> +        op(raw_cpu_ptr(&(pcp)), __VA_ARGS__); \
> +        preempt_enable_notrace(); \
> +})
> +
> +#define _pcp_protect_return(op, pcp, args...) \
> +({ \
> +        typeof(pcp) __retval; \
> +        preempt_disable_notrace(); \
> +        __retval = (typeof(pcp))op(raw_cpu_ptr(&(pcp)), ##args); \
> +        preempt_enable_notrace(); \
> +        __retval; \
> +})
> +
> +#define this_cpu_read_1(pcp) _pcp_protect_return(__percpu_read_8, pcp)
> +#define this_cpu_read_2(pcp) _pcp_protect_return(__percpu_read_16, pcp)
> +#define this_cpu_read_4(pcp) _pcp_protect_return(__percpu_read_32, pcp)
> +#define this_cpu_read_8(pcp) _pcp_protect_return(__percpu_read_64, pcp)
> +
> +#define this_cpu_write_1(pcp, val) _pcp_protect(__percpu_write_8, pcp, (unsigned long)val)
> +#define this_cpu_write_2(pcp, val) _pcp_protect(__percpu_write_16, pcp, (unsigned long)val)
> +#define this_cpu_write_4(pcp, val) _pcp_protect(__percpu_write_32, pcp, (unsigned long)val)
> +#define this_cpu_write_8(pcp, val) _pcp_protect(__percpu_write_64, pcp, (unsigned long)val)
> +
> +#define this_cpu_add_1(pcp, val) _pcp_protect(__percpu_add_amo_case_8, pcp, val)
> +#define this_cpu_add_2(pcp, val) _pcp_protect(__percpu_add_amo_case_16, pcp, val)
> +#define this_cpu_add_4(pcp, val) _pcp_protect(__percpu_add_amo_case_32, pcp, val)
> +#define this_cpu_add_8(pcp, val) _pcp_protect(__percpu_add_amo_case_64, pcp, val)
> +
> +#define this_cpu_add_return_1(pcp, val) \
> +_pcp_protect_return(__percpu_add_return_amo_case_8, pcp, val)
> +
> +#define this_cpu_add_return_2(pcp, val) \
> +_pcp_protect_return(__percpu_add_return_amo_case_16, pcp, val)
> +
> +#define this_cpu_add_return_4(pcp, val) \
> +_pcp_protect_return(__percpu_add_return_amo_case_32, pcp, val)
> +
> +#define this_cpu_add_return_8(pcp, val) \
> +_pcp_protect_return(__percpu_add_return_amo_case_64, pcp, val)
> +
> +#define this_cpu_and_1(pcp, val) _pcp_protect(__percpu_andnot_amo_case_8, pcp, ~val)
> +#define this_cpu_and_2(pcp, val) _pcp_protect(__percpu_andnot_amo_case_16, pcp, ~val)
> +#define this_cpu_and_4(pcp, val) _pcp_protect(__percpu_andnot_amo_case_32, pcp, ~val)
> +#define this_cpu_and_8(pcp, val) _pcp_protect(__percpu_andnot_amo_case_64, pcp, ~val)

Why do we define __percpu_andnot based on amoand and then use it with ~val
here? Can't we just define __percpu_and (see the sketch at the end of this
mail)?

> +
> +#define this_cpu_or_1(pcp, val) _pcp_protect(__percpu_or_amo_case_8, pcp, val)
> +#define this_cpu_or_2(pcp, val) _pcp_protect(__percpu_or_amo_case_16, pcp, val)
> +#define this_cpu_or_4(pcp, val) _pcp_protect(__percpu_or_amo_case_32, pcp, val)
> +#define this_cpu_or_8(pcp, val) _pcp_protect(__percpu_or_amo_case_64, pcp, val)
> +
> +#define this_cpu_xchg_1(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
> +#define this_cpu_xchg_2(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
> +#define this_cpu_xchg_4(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
> +#define this_cpu_xchg_8(pcp, val) _pcp_protect_return(xchg_relaxed, pcp, val)
> +
> +#define this_cpu_cmpxchg_1(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> +#define this_cpu_cmpxchg_2(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> +#define this_cpu_cmpxchg_4(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> +#define this_cpu_cmpxchg_8(pcp, o, n) _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> +
> +#include
> +
> +#endif /* __ASM_PERCPU_H */

It all looks good to me, just one thing: can you also implement
this_cpu_cmpxchg64/128()?

And since this is almost a copy/paste from arm64, either mention that at the
top of the file or (better) merge both implementations somewhere to avoid
redefining existing code :) But that's up to you.
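
To make the __percpu_and remark above concrete, here is the kind of thing I
have in mind, as an untested sketch built on the macros already in this patch
(amoand takes the mask directly, so the ~val indirection that arm64 presumably
needs for its bit-clear instruction shouldn't be necessary here):

        PERCPU_OP(and, and)

        #define this_cpu_and_1(pcp, val) _pcp_protect(__percpu_and_amo_case_8, pcp, val)
        #define this_cpu_and_2(pcp, val) _pcp_protect(__percpu_and_amo_case_16, pcp, val)
        #define this_cpu_and_4(pcp, val) _pcp_protect(__percpu_and_amo_case_32, pcp, val)
        #define this_cpu_and_8(pcp, val) _pcp_protect(__percpu_and_amo_case_64, pcp, val)

And for the cmpxchg64/128 request, the 64-bit variant should just be able to
reuse the existing 8-byte helper, roughly (again untested; the 128-bit variant
would additionally need a working cmpxchg128_local() on riscv, which I haven't
checked):

        #define this_cpu_cmpxchg64(pcp, o, n)   this_cpu_cmpxchg_8(pcp, o, n)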

Reviewed-by: Alexandre Ghiti

Thanks,

Alex