From: Linus Torvalds
Date: Tue, 17 Jan 2023 10:33:41 -0800
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user
To: Nick Desaulniers
Cc: Peter Zijlstra, "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski,
 x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov,
 Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen,
 Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Sami Tolvanen, joao@overdrivepizza.com
References: <20230111123736.20025-1-kirill.shutemov@linux.intel.com>
 <20230111123736.20025-9-kirill.shutemov@linux.intel.com>
 <20230117135703.voaumisreld7crfb@box>

On Tue, Jan 17, 2023 at 10:26 AM Nick Desaulniers wrote:
>
> On Tue, Jan 17, 2023 at 9:29 AM Linus Torvalds wrote:
> >
> > Side note: that's not something new or unusual. It's been the case
> > since I started testing clang - we have several code-paths where we
> > use "unlikely()" to try to get very unlikely cases to be out-of-line,
> > and clang just mostly ignores it, or treats it as a very weak hint. I
> > think the only way to get clang to treat it as a *strong* hint is to
> > use PGO.
>
> I'd be surprised if that were intentional or by design.
>
> Do you guys have a bug report we could look at?

Heh. I actually sent you an example long ago. Let me go fish it out of
my mail archives and quote some of it below so that you can find it in
yours..

                  Linus

[ Time passes. Found this email to you and Bill Wendling from Feb 16,
2020, Message-ID
CAHk-=wigVshsByCMjkUiZyQSR5N5zi2aAeQc+VJCzQV=nm8E7g@mail.gmail.com ]:

   Anyway, I'm looking at clang code generation, and comparing it with
   gcc on one of my "this has been optimized to hell and back"
   functions: __d_lookup_rcu().

   It looks like clang does frame pointers, and ignores our
   likely/unlikely annotations.

   So this code:

                if (unlikely(parent->d_flags & DCACHE_OP_COMPARE)) {
                        int tlen;
                        const char *tname;
                        ......

   doesn't actually jump out of line, but instead generates the unlikely
   case as the fallthrough:

        testb   $2, (%r12)
        je      .LBB50_9
        ... unlikely code goes here...

   and then the likely case ends up having unfortunate reloads inside
   the hot loop. Possibly because it has one fewer free registers than
   gcc because of the frame pointer.

   I didn't look into _why_ clang generates frame pointers but gcc
   doesn't. It may be just a compiler default, I think we don't end up
   explicitly asking either way.

[ And then Bill replied with this ]

   It's not a no-op. We add branch probabilities to the IR, whether
   they're honored or not depends on the transformation. But they
   *should* be honored when available.
   I've seen in the past that instead of moving unlikely blocks out of
   the way (like gcc, which moves them below the function's "ret"
   instruction), LLVM will do something like this:

   <...>

   I.e. the loop is rotated and the unlikely code is first and the hotter
   code is closer together but between the unlikely and conditional test.
   Could this be what's going on? Otherwise, maybe clang decided that
   it's not beneficial to move the code out-of-line because the benefit
   was minimal? (I'm guessing here.)
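
For anyone who wants to poke at this outside the kernel, here is a minimal
userspace sketch of the pattern being discussed. It is not the
__d_lookup_rcu() code: "struct entry", "FLAG_SLOW_COMPARE", "slow_compare()"
and "lookup()" are hypothetical stand-ins invented for illustration. The
unlikely() macro has the same shape as the kernel's (a wrapper around
__builtin_expect), and the rare path is additionally pushed into a function
marked cold and noinline, which is one way to ask for out-of-line placement
more forcefully than the branch hint alone. Whether a given compiler actually
lays the code out that way depends on the compiler, its version and the
optimization level.

```c
#include <stddef.h>

/* Same shape as the kernel's hint macros (simplified). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flag, standing in for DCACHE_OP_COMPARE. */
#define FLAG_SLOW_COMPARE 0x2u

/* Hypothetical stand-in for a dentry. */
struct entry {
	unsigned int flags;
	unsigned int hash;
};

/*
 * Rare path. "cold" tells the compiler this function is unlikely to
 * run, and "noinline" keeps it from being merged back into the loop.
 */
static __attribute__((cold, noinline))
int slow_compare(const struct entry *e, unsigned int hash)
{
	return e->hash == hash;
}

/* Hot loop: returns the index of the first match, or -1 if none. */
int lookup(const struct entry *tab, size_t n, unsigned int hash)
{
	for (size_t i = 0; i < n; i++) {
		const struct entry *e = &tab[i];

		/* Hinted as rare: ideally not the fallthrough path. */
		if (unlikely(e->flags & FLAG_SLOW_COMPARE)) {
			if (slow_compare(e, hash))
				return (int)i;
			continue;
		}
		if (e->hash == hash)
			return (int)i;
	}
	return -1;
}
```

Building this with "-O2 -S" under both gcc and clang and comparing where the
flag test branches to is a quick way to see whether the hint changed the
block layout for a particular compiler version.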