From: Linus Torvalds
Date: Tue, 17 Jan 2023 09:18:01 -0800
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user
To: Peter Zijlstra
Cc: "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski, x86@kernel.org,
    Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov,
    Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu",
    Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sami Tolvanen,
    ndesaulniers@google.com, joao@overdrivepizza.com
References: <20230111123736.20025-1-kirill.shutemov@linux.intel.com>
    <20230111123736.20025-9-kirill.shutemov@linux.intel.com>
    <20230117135703.voaumisreld7crfb@box>

On Tue, Jan 17, 2023 at 7:02 AM
Peter Zijlstra wrote:
>
> On Tue, Jan 17, 2023 at 04:57:03PM +0300, Kirill A. Shutemov wrote:
> > On Tue, Jan 17, 2023 at 02:05:22PM +0100, Peter Zijlstra wrote:
> > > On Wed, Jan 11, 2023 at 03:37:27PM +0300, Kirill A. Shutemov wrote:
> > >
> > > >  #define __untagged_addr(untag_mask, addr)
> > > >     u64 __addr = (__force u64)(addr);                       \
> > > > -   s64 sign = (s64)__addr >> 63;                           \
> > > > -   __addr &= untag_mask | sign;                            \
> > > > +   if (static_branch_likely(&tagged_addr_key)) {           \
> > > > +           s64 sign = (s64)__addr >> 63;                   \
> > > > +           __addr &= untag_mask | sign;                    \
> > > > +   }                                                       \
> > > >     (__force __typeof__(addr))__addr;                       \
> > > >  })
> > > >
> > > >  #define untagged_addr(addr) __untagged_addr(current_untag_mask(), addr)
> > >
> > > Is the compiler clever enough to put the memop inside the branch?
> >
> > Hm. You mean current_untag_mask() inside static_branch_likely()?
> >
> > But it is preprocessor who does this, not compiler. So, yes, the memop is
> > inside the branch.
> >
> > Or I didn't understand your question.
>
> Nah, call it a pre-lunch dip, I overlooked the whole CPP angle -- d'0h.
>
> That said, I did just put it through a compiler to see wth it did and it
> is pretty gross:

Yeah, I think the static branch likely just makes things worse.

And if we really want to make the "no untag mask exists" case better,
I think the code should probably use static_branch_unlikely() rather
than *_likely(). That should make it jump to the masking code, and
leave the unmasked code as a fallthrough, no?

The reason clang seems to generate saner code is that clang seems to
largely ignore the whole "__builtin_expect()", at least not to the
point where it tries to make the unlikely case be out-of-line.

But on the whole, I think we'd be better off without this whole static
branch.

The cost of "untagged_addr()" generally shouldn't be worth this.
There are few performance-critical users - the most common case is, I
think, just mmap() and friends, and the single load is going to be a
non-issue there.

Looking around, I think the only situation where we may care is
strnlen_user() and strncpy_from_user(). Those *can* be
performance-critical. They're used for paths and for execve() strings,
and can be a bit hot.

And both of those cases actually just use it because of the whole
"maximum address" calculation to avoid traversing into kernel
addresses, so I wonder if we could use alternatives there, kind of
like the get_user/put_user cases did. Except it's generic code, so ..

But maybe even those aren't worth worrying about. At least they do the
unmasking outside the loop - although then in the case of execve(),
the string copies themselves are obviously done in a loop anyway.

Kirill, do you have clear numbers for that static key being a noticeable win?

                Linus
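
For readers following the thread, the masking and the branch hint discussed above can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not the kernel code: every name ending in `_sketch` is made up for illustration, the mask value assumes a LAM_U57-style layout with tag bits 62:57, and an ordinary flag with `__builtin_expect()` stands in for the static key, which in the kernel is a patched jump rather than a tested variable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: LAM_U57-style mask that clears tag bits 62:57. */
#define UNTAG_MASK_SKETCH (~(UINT64_C(0x3f) << 57))

/* Hypothetical stand-in for the static key (a real key patches code). */
static bool tagging_enabled_sketch;

/*
 * Same arithmetic as the quoted macro: the shift replicates bit 63, so
 * "sign" is all ones for kernel addresses (mask becomes a no-op) and
 * zero for user addresses (tag bits are cleared). Relies on arithmetic
 * right shift of signed values, as GCC and clang implement it.
 */
static uint64_t untag_sketch(uint64_t untag_mask, uint64_t addr)
{
    int64_t sign = (int64_t)addr >> 63;

    return addr & (untag_mask | (uint64_t)sign);
}

/*
 * The "unlikely" variant suggested in the mail: the hint asks the
 * compiler to keep the unmasked path as the fallthrough and move the
 * masking out of line. How strongly it is honored is up to the compiler.
 */
static uint64_t untag_gated_sketch(uint64_t untag_mask, uint64_t addr)
{
    if (__builtin_expect(tagging_enabled_sketch, 0))
        return untag_sketch(untag_mask, addr);

    return addr;
}
```

With the flag clear, `untag_gated_sketch()` returns the pointer untouched; with it set, a user pointer loses its tag bits while a kernel address passes through unchanged. Whether the gating is worth having at all is the open question in the mail, and this sketch says nothing about the actual cost.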