From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yunhui Cui
To: masahiroy@kernel.org, nathan@kernel.org, nicolas.schier@linux.dev,
 dennis@kernel.org, tj@kernel.org, cl@gentwo.org,
 paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 alex@ghiti.fr, andybnac@gmail.com, bjorn@rivosinc.com,
 cyrilbur@tenstorrent.com, rostedt@goodmis.org, puranjay@kernel.org,
 ben.dooks@codethink.co.uk, zhangchunyan@iscas.ac.cn,
 ruanjinjie@huawei.com, jszhang@kernel.org, charlie@rivosinc.com,
 cleger@rivosinc.com, antonb@tenstorrent.com, ajones@ventanamicro.com,
 debug@rivosinc.com, cuiyunhui@bytedance.com, haibo1.xu@intel.com,
 samuel.holland@sifive.com, linux-kbuild@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org
Subject: [PATCH RFC] RISC-V: Fix a register to store the percpu offset
Date: Fri, 4 Jul 2025 16:45:00 +0800
Message-Id: <20250704084500.62688-1-cuiyunhui@bytedance.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The following data was collected from tests conducted on the
Spacemit(R) X60 using the fixed register method:

No modifications:
6.77, 6.791, 6.792, 6.826, 6.784, 6.839, 6.776, 6.733, 6.795, 6.763
Average: 6.786839305

-ffixed-x27:
7.106, 7.035, 7.362, 7.141, 7.096, 7.182, 7.109, 7.126, 7.186, 7.12
Average: 7.145826539

-ffixed-x27 + x27(s11) used for offset optimization:
7.059, 6.951, 6.961, 6.985, 6.93, 6.964, 6.977, 6.907, 6.983, 6.944
Average: 6.965993173

Analysis:
Reserving x27 with -ffixed-x27 reduced performance by 5.29% relative to
the unmodified kernel.
Using x27 (s11) to hold the per-CPU offset improved performance by 2.52%
relative to the -ffixed-x27 build.

Issues with the fixed register method (beyond code size):
- Performance degrades because one general-purpose register is lost.
- Each handle_exception() call must load the per-CPU offset into the
  fixed register.

Signed-off-by: Yunhui Cui
---
 Makefile                        |  4 ++--
 arch/riscv/include/asm/percpu.h | 22 ++++++++++++++++++++++
 arch/riscv/kernel/asm-offsets.c |  1 +
 arch/riscv/kernel/entry.S       | 11 +++++++++--
 arch/riscv/kernel/smpboot.c     |  3 +++
 5 files changed, 37 insertions(+), 4 deletions(-)
 create mode 100644 arch/riscv/include/asm/percpu.h

diff --git a/Makefile b/Makefile
index b7d5f2f0def0..e291f865adc4 100644
--- a/Makefile
+++ b/Makefile
@@ -1026,8 +1026,8 @@ include $(addprefix $(srctree)/, $(include-y))
 
 # Add user supplied CPPFLAGS, AFLAGS, CFLAGS and RUSTFLAGS as the last assignments
 KBUILD_CPPFLAGS += $(KCPPFLAGS)
-KBUILD_AFLAGS += $(KAFLAGS)
-KBUILD_CFLAGS += $(KCFLAGS)
+KBUILD_AFLAGS += $(KAFLAGS) -ffixed-x27
+KBUILD_CFLAGS += $(KCFLAGS) -ffixed-x27
 KBUILD_RUSTFLAGS += $(KRUSTFLAGS)
 
 KBUILD_LDFLAGS_MODULE += --build-id=sha1
diff --git a/arch/riscv/include/asm/percpu.h b/arch/riscv/include/asm/percpu.h
new file mode 100644
index 000000000000..5d6b109cfab7
--- /dev/null
+++ b/arch/riscv/include/asm/percpu.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_PERCPU_H
+#define __ASM_PERCPU_H
+
+static inline void set_my_cpu_offset(unsigned long off)
+{
+	asm volatile("addi s11, %0, 0" :: "r" (off));
+}
+
+static inline unsigned long __kern_my_cpu_offset(void)
+{
+	unsigned long off;
+	asm ("mv %0, s11" :"=r" (off) :);
+	return off;
+}
+
+#define __my_cpu_offset __kern_my_cpu_offset()
+
+#include
+
+#endif
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index a03129f40c46..0ce96f30bf32 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -35,6 +35,7 @@ void asm_offsets(void)
 	OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
 	OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
 	OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
+	OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
 	OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
 	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
 	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 9d1a305d5508..529d6576265e 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -77,6 +77,13 @@ SYM_CODE_START(handle_exception)
 	 */
 	csrw CSR_SCRATCH, x0
 
+	/* load __per_cpu_offset[cpu] to s11*/
+	REG_L t6, TASK_TI_CPU(tp)
+	slli t6, t6, 3
+	la s11, __per_cpu_offset
+	add s11, s11, t6
+	REG_L s11, 0(s11)
+
 	/* Load the global pointer */
 	load_global_pointer
 
@@ -298,7 +305,7 @@ SYM_FUNC_START(__switch_to)
 	REG_S s8, TASK_THREAD_S8_RA(a3)
 	REG_S s9, TASK_THREAD_S9_RA(a3)
 	REG_S s10, TASK_THREAD_S10_RA(a3)
-	REG_S s11, TASK_THREAD_S11_RA(a3)
+	/* REG_S s11, TASK_THREAD_S11_RA(a3) */
 	/* Save the kernel shadow call stack pointer */
 	scs_save_current
 	/* Restore context from next->thread */
@@ -315,7 +322,7 @@ SYM_FUNC_START(__switch_to)
 	REG_L s8, TASK_THREAD_S8_RA(a4)
 	REG_L s9, TASK_THREAD_S9_RA(a4)
 	REG_L s10, TASK_THREAD_S10_RA(a4)
-	REG_L s11, TASK_THREAD_S11_RA(a4)
+	/* REG_L s11, TASK_THREAD_S11_RA(a4) */
 	/* The offset of thread_info in task_struct is zero. */
 	move tp, a1
 	/* Switch to the next shadow call stack */
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index fb6ab7f8bfbd..6fa12cc84523 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -43,6 +43,7 @@ static DECLARE_COMPLETION(cpu_running);
 
 void __init smp_prepare_boot_cpu(void)
 {
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 }
 
 void __init smp_prepare_cpus(unsigned int max_cpus)
@@ -240,6 +241,8 @@ asmlinkage __visible void smp_callin(void)
 	mmgrab(mm);
 	current->active_mm = mm;
 
+	set_my_cpu_offset(per_cpu_offset(curr_cpuid));
+
 	store_cpu_topology(curr_cpuid);
 	notify_cpu_starting(curr_cpuid);
 
-- 
2.43.0
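
Note on the figures in the commit message: the two percentages in the
analysis can be reproduced from the three reported averages. The sketch
below is only an illustration and is not part of the patch. It assumes
that a larger measurement means a slower run, as the analysis implies,
and that each percentage is taken relative to the configuration it is
compared against. The names relative_change_pct() and print_analysis()
are hypothetical helpers introduced here.

```c
#include <assert.h>
#include <stdio.h>

/* Averages copied verbatim from the measurements in the commit message. */
const double AVG_BASELINE  = 6.786839305; /* no modifications */
const double AVG_FIXED     = 7.145826539; /* x27 reserved via -ffixed-x27 */
const double AVG_FIXED_OPT = 6.965993173; /* x27 reserved and holding the per-CPU offset */

/* Hypothetical helper: change of `value` relative to `base`, in percent. */
double relative_change_pct(double base, double value)
{
	assert(base > 0.0);
	return (value - base) / base * 100.0;
}

/* Hypothetical helper: print the two figures quoted in the analysis. */
void print_analysis(void)
{
	/* Cost of giving up x27, measured against the unmodified kernel. */
	printf("cost of reserving x27: %.2f%%\n",
	       relative_change_pct(AVG_BASELINE, AVG_FIXED));

	/* Gain from keeping the offset in s11, measured against the
	 * build that only reserves x27. The sign is flipped so that an
	 * improvement prints as a positive number. */
	printf("gain from offset in s11: %.2f%%\n",
	       -relative_change_pct(AVG_FIXED, AVG_FIXED_OPT));
}
```

Calling print_analysis() from a main() prints 5.29% and 2.52%, matching
the analysis. Under the same assumptions, the optimized build is still
slower than the unmodified kernel, since 6.965993173 is greater than
6.786839305.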