Message-ID: <29b154c679adeab912f8f5770344126264a698b9.camel@surriel.com>
Subject: Re: [RFC PATCH 7/9] x86/mm: Introduce Remote Action Request
From: Rik van Riel
To: Nadav Amit
Cc: Linux Kernel Mailing List, "open list:MEMORY MANAGEMENT", the arch/x86 maintainers, kernel-team@meta.com, Dave Hansen, luto@kernel.org, peterz@infradead.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Yu-cheng Yu
Date: Tue, 06 May 2025 12:00:04 -0400
References: <20250506003811.92405-1-riel@surriel.com> <20250506003811.92405-8-riel@surriel.com> <03E5F4D7-3E3F-4809-87FE-6BD0B792E90F@gmail.com> <09b6eb12ede47b2e2be69bdd68a8732104b26eb0.camel@surriel.com>

On Tue, 2025-05-06 at 18:50 +0300, Nadav Amit wrote:
>
>
> > On 6 May 2025, at 18:16, Rik van Riel wrote:
> >
> > It gets better. Page 8 of the RAR whitepaper tells
> > us that we can simply use RAR to have a CPU send
> > itself TLB flush instructions, and the microcode
> > will do the flush at the same time the other CPUs
> > handle theirs.
> >
> > "At this point, the ILP may invalidate its own TLB by
> > signaling RAR to itself in order to invoke the RAR handler
> > locally as well"
> >
> > I tried this, but things blew up very early in
> > boot, presumably due to the CPU trying to send
> > itself a RAR before it was fully configured to
> > handle them.
> >
> > The code may need a better decision point than
> > cpu_feature_enabled(X86_FEATURE_RAR) to decide
> > whether or not to use RAR.
> >
> > Probably something that indicates RAR is actually
> > ready to use on all CPUs.
> >
>
> Once you get something working (perhaps with a branch for
> now) you can take the static-key/static-call path, presumably.
> I would first try to get something working properly.
>

The static-key code is implemented with alternatives,
which call flush_tlb_mm_range().

I've not spent the time digging into whether that
creates any chicken-and-egg scenarios yet :)

> > I think we have 3 cases here:
> >
> > 1) Only the local TLB needs to be flushed.
> >    In this case we can INVPCID locally, and skip any
> >    potential contention on the RAR payload table.
>
> More like INVLPG (and INVPCID to the user PTI). AFAIK, Andy said
> INVLPG performs better than INVPCID for a single entry. But yes,
> this is a simple and hot scenario that should have a separate
> code-path.

I think this can probably be handled in flush_tlb_mm_range(),
so the RAR code is only called for cases (2) and (3) to
begin with.

>
> >
> > 2) Only one remote CPU needs to be flushed (no local).
> >    This can use the arch_rar_send_single_ipi() thing.
> >
> > 3) Multiple CPUs need to be flushed. This could include
> >    the local CPU, or be only multiple remote CPUs.
> >    For this case we could just use arch_send_rar_ipi_mask(),
> >    including sending a RAR request to the local CPU, which
> >    should handle it concurrently with the other CPUs.
> >
> > Does that seem like a reasonable way to handle things?
>
> It is. It is just that code-wise, I think the 2nd and 3rd cases
> are similar, and it can be better to distinguish the differences
> between them without creating two completely separate code-paths.
> This makes maintenance and reasoning simpler, I think.
>
> Consider having a look at smp_call_function_many_cond(). I think
> it handles the 2nd and 3rd cases nicely in the manner I just
> described. Admittedly, I am a bit biased…

I need to use smp_call_function_many_cond() anyway, to
prevent sending RARs to CPUs that are in lazy TLB mode
(and possibly in a power saving idle state).

IPI TLB flushing and RAR can probably both use the same
should_flush_tlb() helper function.

-- 
All Rights Reversed.
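
For illustration only, the three-case split discussed above could be sketched in plain userspace C roughly as follows. Nothing here is from the patch series: pick_flush_path(), the enum, the 64-bit mask standing in for a cpumask, and the lazy-CPU filter are all hypothetical names, and the sketch assumes lazy TLB filtering applies to remote CPUs only. In the kernel the decision would presumably live in flush_tlb_mm_range() and the smp_call_function_many_cond() path instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: bit N set means CPU N. */
typedef uint64_t cpu_set64;

enum flush_path {
	FLUSH_NONE,		/* no CPU left to flush after filtering */
	FLUSH_LOCAL_ONLY,	/* case 1: local flush, no RAR involved */
	FLUSH_RAR_SINGLE,	/* case 2: exactly one remote CPU, no local */
	FLUSH_RAR_MASK,		/* case 3: multiple CPUs, may include local */
};

/* Count set bits by clearing the lowest set bit until none remain. */
static int popcount64(uint64_t v)
{
	int n = 0;

	while (v) {
		v &= v - 1;
		n++;
	}
	return n;
}

/*
 * Pick a flush path for the CPUs in @targets, as seen from @this_cpu.
 * @lazy marks remote CPUs assumed to be in lazy TLB mode; they are
 * skipped, loosely mirroring what a should_flush_tlb()-style filter
 * would do. This is an assumption of the sketch, not kernel behaviour.
 */
static enum flush_path pick_flush_path(cpu_set64 targets, cpu_set64 lazy,
				       int this_cpu)
{
	cpu_set64 self, remote;
	int local, nr_remote;

	assert(this_cpu >= 0 && this_cpu < 64);

	self = (cpu_set64)1 << this_cpu;
	remote = targets & ~self & ~lazy;
	local = (targets & self) != 0;
	nr_remote = popcount64(remote);

	if (nr_remote == 0)
		return local ? FLUSH_LOCAL_ONLY : FLUSH_NONE;
	if (nr_remote == 1 && !local)
		return FLUSH_RAR_SINGLE;
	return FLUSH_RAR_MASK;
}
```

With CPU 0 as the caller, a target mask of 0x1 selects the local-only path, 0x4 the single-remote path, and 0x5 or 0x6 the mask path. Keeping cases (2) and (3) behind one helper is meant to reflect the suggestion not to grow two completely separate code-paths.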