Date: Sun, 22 Jun 2025 13:57:06 -0500
From: Segher Boessenkool
To: David Laight
Cc: Christophe Leroy, Michael Ellerman, Nicholas Piggin, Naveen N Rao,
    Madhavan Srinivasan, Alexander Viro, Christian Brauner, Jan Kara,
    Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
    Davidlohr Bueso, Andre Almeida, Andrew Morton, Dave Hansen,
    Linus Torvalds, linux-kernel@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH 5/5] powerpc: Implement masked user access
Message-ID: <20250622185706.GB17294@gate.crashing.org>
References: <9dfb66c94941e8f778c4cabbf046af2a301dd963.1750585239.git.christophe.leroy@csgroup.eu>
    <20250622181351.08141b50@pumpkin>
In-Reply-To: <20250622181351.08141b50@pumpkin>

Hi!

On Sun, Jun 22, 2025 at 06:13:51PM +0100, David Laight wrote:
> On Sun, 22 Jun 2025 11:52:43 +0200
> Christophe Leroy wrote:
> > e500 has the isel instruction which allows selecting one value or
> > the other without branch and that instruction is not speculative, so
> > use it. Allthough GCC usually generates code using that instruction,
> > it is safer to use inline assembly to be sure.
> > The result is:

The instruction (which is a standard Power instruction since
architecture version 2.03, published in 2006) can in principle be
speculative, but there exist no Power implementations that do any data
speculation like this at all.

If you want any particular machine instructions to be generated you have
to write them manually, sure, in inline asm or preferably in actual asm.
But you can be sure that GCC will generate isel or similar (like the
v3.1 set[n]bc[r] insns, best instructions ever!), whenever appropriate,
i.e. when it is a) allowed at all, and b) advantageous.

> >   14:   3d 20 bf fe     lis     r9,-16386
> >   18:   7c 03 48 40     cmplw   r3,r9
> >   1c:   7c 69 18 5e     iselgt  r3,r9,r3
> >
> > On other ones, when kernel space is over 0x80000000 and user space
> > is below, the logic in mask_user_address_simple() leads to a
> > 3 instruction sequence:
> >
> >   14:   7c 69 fe 70     srawi   r9,r3,31
> >   18:   7c 63 48 78     andc    r3,r3,r9
> >   1c:   51 23 00 00     rlwimi  r3,r9,0,0,0
> >
> > This is the default on powerpc 8xx.
> >
> > When the limit between user space and kernel space is not 0x80000000,
> > mask_user_address_32() is used and a 6 instructions sequence is
> > generated:
> >
> >   24:   54 69 7c 7e     srwi    r9,r3,17
> >   28:   21 29 57 ff     subfic  r9,r9,22527
> >   2c:   7d 29 fe 70     srawi   r9,r9,31
> >   30:   75 2a b0 00     andis.  r10,r9,45056
> >   34:   7c 63 48 78     andc    r3,r3,r9
> >   38:   7c 63 53 78     or      r3,r3,r10
> >
> > The constraint is that TASK_SIZE be aligned to 128K in order to get
> > the most optimal number of instructions.
> >
> > When CONFIG_PPC_BARRIER_NOSPEC is not defined, fallback on the
> > test-based masking as it is quicker than the 6 instructions sequence
> > but not necessarily quicker than the 3 instructions sequences above.
>
> Doesn't that depend on whether the branch is predicted correctly?
>
> I can't read ppc asm well enough to check the above.

[ PowerPC or Power (or Power Architecture, or Power ISA) ]

> And the C is also a bit tortuous.

I can read the code ;-)  All those instructions are normal simple
integer instructions.  Shifts, adds, logicals.

In general, correctly predicted non-taken branches cost absolutely
nothing.  Correctly predicted taken branches cost the same as any taken
branch, so a refetch, maybe resulting in a cycle or so of decode bubble.
And a mispredicted branch can be very expensive, say on the order of a
hundred cycles (but usually more like ten, which is still a lot of insns
worth).

So branches are great for predictable stuff, and "not so great" for not
so predictable stuff.


Segher
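
For readers who do not read Power asm fluently, the three quoted
sequences can be paraphrased in C roughly as below.  This is only a
sketch reconstructed from the disassembly: the function names are made
up for illustration and are not the helpers from the patch, and the
constants (0xbffe0000, 0x80000000, 0xb0000000) are simply read off the
lis, rlwimi and andis. operands.

```c
#include <stdint.h>

/* Hypothetical names; 32-bit addresses assumed throughout. */

/* lis r9,-16386 ; cmplw r3,r9 ; iselgt r3,r9,r3 */
static uint32_t clamp_isel(uint32_t addr)
{
	const uint32_t limit = 0xbffe0000u;	/* -16386 << 16 */

	/* isel picks one of two registers without a branch. */
	return addr > limit ? limit : addr;
}

/* srawi r9,r3,31 ; andc r3,r3,r9 ; rlwimi r3,r9,0,0,0 */
static uint32_t mask_simple(uint32_t addr)
{
	/* All ones if the top bit is set (kernel address), else zero. */
	uint32_t mask = 0u - (addr >> 31);

	addr &= ~mask;				/* kernel address becomes 0 */
	return addr | (mask & 0x80000000u);	/* then becomes 0x80000000 */
}

/* srwi ; subfic ; srawi ; andis. ; andc ; or */
static uint32_t mask_32(uint32_t addr)
{
	/* 22527 - (addr >> 17) goes negative once addr >= 0xb0000000. */
	uint32_t t = 22527u - (addr >> 17);
	uint32_t mask = 0u - (t >> 31);		/* sign bit spread to all bits */

	return (addr & ~mask) | (mask & 0xb0000000u);
}
```

In all three cases a user address comes back unchanged and anything at
or above the limit collapses to one fixed address, with no conditional
branch involved.  The shift by 17 in the last variant is where the
128K alignment constraint on TASK_SIZE comes from.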