From mboxrd@z Thu Jan 1 00:00:00 1970
From: yunhui cui
Date: Fri, 30 Aug 2024 18:01:31 +0800
Subject: Re: [External] Re: [PATCH RFC] riscv: use gp to save percpu offset
To: "Christoph Lameter (Ampere)"
Cc: punit.agrawal@bytedance.com, paul.walmsley@sifive.com,
 palmer@dabbelt.com, aou@eecs.berkeley.edu, dennis@kernel.org,
 tj@kernel.org, samitolvanen@google.com, guoren@kernel.org,
 debug@rivosinc.com, charlie@rivosinc.com, cleger@rivosinc.com,
 puranjay@kernel.org, antonb@tenstorrent.com, namcaov@gmail.com,
 andy.chiu@sifive.com, ajones@ventanamicro.com,
 samuel.holland@sifive.com, haxel@fzi.de, yang.zhang@hexintek.com,
 conor.dooley@microchip.com, evan@rivosinc.com,
 yang.lee@linux.alibaba.com, tglx@linutronix.de, haibo1.xu@intel.com,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20240824004920.35877-1-cuiyunhui@bytedance.com>

Hi Christoph,

On Sat, Aug 24, 2024 at 9:57 AM Christoph Lameter (Ampere) wrote:
>
> On Sat, 24 Aug 2024, Yunhui Cui wrote:
>
> > Compared to directly fetching the per-CPU offset from memory (or cache),
> > using the global pointer (gp) to store the per-CPU offset can save one
> > memory access.
>
> Yes! That is a step in the right direction.
>
> Is there something like gp relative addressing so that we can do loads
> and stores relative to gp as well?
>
> Are there atomics that can do read modify write relative to GP? That would
> get you to comparable per cpu efficiency to x86. x86 can do relative
> addressing and RMW in one instruction which allows one to drop the preempt
> enable/disable since one instruction cannot be interrupted.

Your suggestion is excellent. If conditions permit, we can indeed move
closer to the x86 architecture.

Thanks,
Yunhui
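
To make the "save one memory access" point concrete, here is a minimal
userspace sketch in C. It is not kernel code and not the patch itself:
counters, percpu_offset_table, ptr_via_table, ptr_via_reg and
bump_both_ways are hypothetical names chosen for illustration, and the
register that would hold the offset (gp in the RFC) is modelled as a plain
function argument. The sketch only contrasts the two ways of forming a
per-CPU address; it does not model preemption or atomic read-modify-write.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-ins, not the kernel's real data structures. */
static long counters[NR_CPUS];                 /* one copy of a variable per CPU */
static ptrdiff_t percpu_offset_table[NR_CPUS]; /* models an in-memory offset table */

/* Fill the table: the copy for CPU n lives n * sizeof(long) bytes in. */
static void setup_offsets(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        percpu_offset_table[cpu] = (ptrdiff_t)(cpu * sizeof(long));
}

/* Path 1: fetch the offset from memory on every access (one extra load). */
static long *ptr_via_table(int cpu)
{
    ptrdiff_t off = percpu_offset_table[cpu];
    return (long *)((char *)counters + off);
}

/* Path 2: the offset is already in a register (modelled as an argument). */
static long *ptr_via_reg(ptrdiff_t reg_offset)
{
    return (long *)((char *)counters + reg_offset);
}

/* Increment the counter of `cpu` once through each path; return its value. */
long bump_both_ways(int cpu)
{
    setup_offsets();

    /* Loaded once, as a register would be when the CPU is set up. */
    ptrdiff_t reg_offset = percpu_offset_table[cpu];

    *ptr_via_table(cpu) += 1;
    *ptr_via_reg(reg_offset) += 1;

    /* Both paths must resolve to the same per-CPU slot. */
    assert(ptr_via_table(cpu) == ptr_via_reg(reg_offset));
    return counters[cpu];
}
```

Calling bump_both_ways(2) on a fresh process returns 2: both paths hit the
same slot, and the only difference is where the offset comes from. Even
with the offset in a register, the increment is still a separate load, add
and store here, which is why the question about gp-relative atomics
matters for dropping the preempt disable/enable pair.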