From: Clément Léger
Date: Mon, 7 Jul 2025 09:55:19 +0200
Subject: Re: [PATCH RFC] RISC-V: Fix a register to store the percpu offset
To: Yunhui Cui, masahiroy@kernel.org, nathan@kernel.org,
 nicolas.schier@linux.dev, dennis@kernel.org, tj@kernel.org, cl@gentwo.org,
 paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 alex@ghiti.fr, andybnac@gmail.com, bjorn@rivosinc.com,
 cyrilbur@tenstorrent.com, rostedt@goodmis.org, puranjay@kernel.org,
 ben.dooks@codethink.co.uk, zhangchunyan@iscas.ac.cn, ruanjinjie@huawei.com,
 jszhang@kernel.org, charlie@rivosinc.com, antonb@tenstorrent.com,
 ajones@ventanamicro.com, debug@rivosinc.com, haibo1.xu@intel.com,
 samuel.holland@sifive.com, linux-kbuild@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org
Message-ID: <977847c2-d94b-44a9-9c1f-f1ff95ffb985@rivosinc.com>
In-Reply-To: <20250704084500.62688-1-cuiyunhui@bytedance.com>
References: <20250704084500.62688-1-cuiyunhui@bytedance.com>

On 04/07/2025 10:45, Yunhui Cui wrote:
> The following data was collected from tests conducted on the
> Spacemit(R) X60 using the fixed register method:
>
> No modifications:
> 6.77, 6.791, 6.792, 6.826,
> 6.784, 6.839, 6.776, 6.733, 6.795, 6.763
> Average: 6.786839305
>
> ffix-x27:
> 7.106, 7.035, 7.362, 7.141, 7.096, 7.182, 7.109, 7.126, 7.186, 7.12
> Average: 7.145826539
>
> ffix-x27 + x27(s11) used for offset optimization:
> 7.059, 6.951, 6.961, 6.985, 6.93, 6.964, 6.977, 6.907, 6.983, 6.944
> Average: 6.965993173

Hi Yunhui,

Out of curiosity, did you try using a register other than x27?

Thanks,

Clément

>
> Analysis:
> The fixed register method reduced performance by 5.29%.
> The per-CPU offset optimization improved performance by 2.52%.
>
> Issues with the fixed register method (beyond code size):
> Performance degradation due to the loss of one general-purpose register.
> Each handle_exception() call requires loading the per-CPU offset into the
> fixed register.
>
> Signed-off-by: Yunhui Cui
> ---
>  Makefile                        |  4 ++--
>  arch/riscv/include/asm/percpu.h | 22 ++++++++++++++++++++++
>  arch/riscv/kernel/asm-offsets.c |  1 +
>  arch/riscv/kernel/entry.S       | 11 +++++++++--
>  arch/riscv/kernel/smpboot.c     |  3 +++
>  5 files changed, 37 insertions(+), 4 deletions(-)
>  create mode 100644 arch/riscv/include/asm/percpu.h
>
> diff --git a/Makefile b/Makefile
> index b7d5f2f0def0..e291f865adc4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1026,8 +1026,8 @@ include $(addprefix $(srctree)/, $(include-y))
>
>  # Add user supplied CPPFLAGS, AFLAGS, CFLAGS and RUSTFLAGS as the last assignments
>  KBUILD_CPPFLAGS += $(KCPPFLAGS)
> -KBUILD_AFLAGS += $(KAFLAGS)
> -KBUILD_CFLAGS += $(KCFLAGS)
> +KBUILD_AFLAGS += $(KAFLAGS) -ffixed-x27
> +KBUILD_CFLAGS += $(KCFLAGS) -ffixed-x27
>  KBUILD_RUSTFLAGS += $(KRUSTFLAGS)
>
>  KBUILD_LDFLAGS_MODULE += --build-id=sha1
> diff --git a/arch/riscv/include/asm/percpu.h b/arch/riscv/include/asm/percpu.h
> new file mode 100644
> index 000000000000..5d6b109cfab7
> --- /dev/null
> +++ b/arch/riscv/include/asm/percpu.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ASM_PERCPU_H
> +#define __ASM_PERCPU_H
> +
> +static inline void set_my_cpu_offset(unsigned long off)
> +{
> +	asm volatile("addi s11, %0, 0" :: "r" (off));
> +}
> +
> +static inline unsigned long __kern_my_cpu_offset(void)
> +{
> +	unsigned long off;
> +	asm ("mv %0, s11" :"=r" (off) :);
> +	return off;
> +}
> +
> +#define __my_cpu_offset __kern_my_cpu_offset()
> +
> +#include
> +
> +#endif
> diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
> index a03129f40c46..0ce96f30bf32 100644
> --- a/arch/riscv/kernel/asm-offsets.c
> +++ b/arch/riscv/kernel/asm-offsets.c
> @@ -35,6 +35,7 @@ void asm_offsets(void)
>  	OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
>  	OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
>  	OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
> +	OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
>  	OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
>  	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
>  	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 9d1a305d5508..529d6576265e 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -77,6 +77,13 @@ SYM_CODE_START(handle_exception)
>  	 */
>  	csrw CSR_SCRATCH, x0
>
> +	/* load __per_cpu_offset[cpu] to s11*/
> +	REG_L t6, TASK_TI_CPU(tp)
> +	slli t6, t6, 3
> +	la s11, __per_cpu_offset
> +	add s11, s11, t6
> +	REG_L s11, 0(s11)
> +
>  	/* Load the global pointer */
>  	load_global_pointer
>
> @@ -298,7 +305,7 @@ SYM_FUNC_START(__switch_to)
>  	REG_S s8, TASK_THREAD_S8_RA(a3)
>  	REG_S s9, TASK_THREAD_S9_RA(a3)
>  	REG_S s10, TASK_THREAD_S10_RA(a3)
> -	REG_S s11, TASK_THREAD_S11_RA(a3)
> +	/* REG_S s11, TASK_THREAD_S11_RA(a3) */
>  	/* Save the kernel shadow call stack pointer */
>  	scs_save_current
>  	/* Restore context from next->thread */
> @@ -315,7 +322,7 @@ SYM_FUNC_START(__switch_to)
>  	REG_L s8, TASK_THREAD_S8_RA(a4)
>  	REG_L s9, TASK_THREAD_S9_RA(a4)
>  	REG_L s10, TASK_THREAD_S10_RA(a4)
> -	REG_L s11, TASK_THREAD_S11_RA(a4)
> +	/* REG_L s11, TASK_THREAD_S11_RA(a4) */
>  	/* The offset of thread_info in task_struct is zero. */
>  	move tp, a1
>  	/* Switch to the next shadow call stack */
> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
> index fb6ab7f8bfbd..6fa12cc84523 100644
> --- a/arch/riscv/kernel/smpboot.c
> +++ b/arch/riscv/kernel/smpboot.c
> @@ -43,6 +43,7 @@ static DECLARE_COMPLETION(cpu_running);
>
>  void __init smp_prepare_boot_cpu(void)
>  {
> +	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
>  }
>
>  void __init smp_prepare_cpus(unsigned int max_cpus)
> @@ -240,6 +241,8 @@ asmlinkage __visible void smp_callin(void)
>  	mmgrab(mm);
>  	current->active_mm = mm;
>
> +	set_my_cpu_offset(per_cpu_offset(curr_cpuid));
> +
>  	store_cpu_topology(curr_cpuid);
>  	notify_cpu_starting(curr_cpuid);
>
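
For reference, the percentages in the quoted analysis can be reproduced from the three reported averages. The snippet below is only a user-space sketch of that arithmetic: the helper names (pct_change, print_summary) and the constant names are made up for illustration and are not part of the patch. It assumes the 2.52% figure is measured against the ffix-x27 average rather than against the unmodified kernel, which is the reading that matches the reported numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Averages copied from the quoted results (unit not stated in the thread). */
static const double AVG_BASELINE  = 6.786839305; /* no modifications */
static const double AVG_FIXED     = 7.145826539; /* ffix-x27 only */
static const double AVG_FIXED_OPT = 6.965993173; /* ffix-x27 + s11 used for the per-CPU offset */

/* Hypothetical helper: relative change of `value` with respect to `base`, in percent. */
double pct_change(double base, double value)
{
	assert(base > 0.0);
	return (value - base) / base * 100.0;
}

/* Hypothetical helper: print the comparisons discussed in the quoted analysis. */
void print_summary(void)
{
	printf("ffix-x27 vs. unmodified:        %+.2f%%\n",
	       pct_change(AVG_BASELINE, AVG_FIXED));
	printf("ffix-x27 + offset vs. ffix-x27: %+.2f%%\n",
	       pct_change(AVG_FIXED, AVG_FIXED_OPT));
	printf("ffix-x27 + offset vs. unmodified: %+.2f%%\n",
	       pct_change(AVG_BASELINE, AVG_FIXED_OPT));
}
```

Calling print_summary() from a main() shows the first two lines matching the 5.29% and 2.52% figures above. Under the same assumption, the third line suggests the combined configuration still measures a few percent above the unmodified kernel, which is part of why the choice of register seems worth checking.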