From: Radim Krčmář
Date: Thu, 10 Jul 2025 08:35:15 +0200
Subject: Re: [External] [PATCH] RISC-V: store percpu offset in CSR_SCRATCH
To: "yunhui cui"
Cc: "linux-riscv"
References: <20250704084500.62688-1-cuiyunhui@bytedance.com>

2025-07-10T11:45:06+08:00, yunhui cui :
> On Wed, Jul 9, 2025 at 10:20 PM Radim Krčmář wrote:
>> Is the overhead above with this patch?  And when we then use the
>> CSR_SCRATCH for percpu, does it degrade even further?
>
> We can see that the percpu optimization is around 2.5% through the
> method of fixing registers, and we can consider that the percpu
> optimization can bring a 2.5% gain. Is there no need to add the percpu
> optimization logic on the basis of the scratch patch for testing?
>
> Reference: https://lists.riscv.org/g/tech-privileged/message/2485

That is when the value is in a GPR, though, and we don't know the
performance of a CSR_SCRATCH access.
We can hope that it's not much worse than a GPR, but an implementation
might choose to be very slow with CSR_SCRATCH.

I have in mind another method where we can use the current CSR_SCRATCH
without changing CSR_TVAL, but I don't really want to spend time on it
if reading the CSR doesn't give any benefit.
It would be to store the percpu offset in CSR_SCRATCH permanently, do
the early exception register shuffling with a percpu area storage, and
load the thread pointer from there as well.
That method would also eliminate writing CSR_SCRATCH on every exception
entry+exit, so maybe it makes sense to try it even if CSRs are slow...

Thanks.
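
For illustration only, here is a rough user-space model of the data
layout the alternative above implies: a per-hart value that is set once
(standing in for CSR_SCRATCH holding the percpu offset), a percpu area
that has spill slots for the early exception register shuffling, and the
thread pointer kept in that same area.  This is not the patch and not
kernel code; struct percpu_area, sim_scratch, boot_hart(),
this_cpu_area() and exception_entry() are all made-up names, and the
model says nothing about how fast a real CSR read is.

```c
#include <assert.h>
#include <stdint.h>

#define NR_HARTS 2

/* Hypothetical per-CPU area layout; field names are illustrative only. */
struct percpu_area {
	uintptr_t entry_save[2];	/* slots for registers spilled at exception entry */
	uintptr_t kernel_tp;		/* thread pointer to load on entry */
	unsigned long counter;		/* an ordinary per-CPU variable */
};

static struct percpu_area areas[NR_HARTS];

/* Stand-in for each hart's CSR_SCRATCH: written once at boot, then only read. */
static uintptr_t sim_scratch[NR_HARTS];

static void boot_hart(int hart, uintptr_t tp)
{
	assert(hart >= 0 && hart < NR_HARTS);
	/* Offset of this hart's copy relative to the "template" copy (hart 0). */
	sim_scratch[hart] = (uintptr_t)&areas[hart] - (uintptr_t)&areas[0];
	areas[hart].kernel_tp = tp;
}

/* Models a percpu pointer lookup: template address plus the stored offset. */
static struct percpu_area *this_cpu_area(int hart)
{
	return (struct percpu_area *)((uintptr_t)&areas[0] + sim_scratch[hart]);
}

/*
 * Models exception entry: spill one register into the percpu area and
 * fetch the thread pointer from it.  sim_scratch is never written here.
 */
static uintptr_t exception_entry(int hart, uintptr_t spilled_reg)
{
	struct percpu_area *pa = this_cpu_area(hart);

	pa->entry_save[0] = spilled_reg;
	pa->counter++;
	return pa->kernel_tp;
}
```

After boot_hart(1, tp), a call to exception_entry(1, reg) returns that
tp and leaves reg in hart 1's spill slot, without touching the simulated
scratch value again.  How a real entry path frees up its first register
is exactly the part this model leaves out.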