From: Linus Torvalds <torvalds@linux-foundation.org>
To: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Andy Lutomirski <luto@kernel.org>,
x86@kernel.org, Kostya Serebryany <kcc@google.com>,
Andrey Ryabinin <ryabinin.a.a@gmail.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Alexander Potapenko <glider@google.com>,
Taras Madan <tarasmadan@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
"H . J . Lu" <hjl.tools@gmail.com>,
Andi Kleen <ak@linux.intel.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Bharata B Rao <bharata@amd.com>,
Jacob Pan <jacob.jun.pan@linux.intel.com>,
Ashok Raj <ashok.raj@intel.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Sami Tolvanen <samitolvanen@google.com>,
joao@overdrivepizza.com
Subject: Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user
Date: Tue, 17 Jan 2023 10:33:41 -0800 [thread overview]
Message-ID: <CAHk-=wjfmmYPw0wX1BW6gi_KAhdZi+81or024JFcRYHiQh-jpw@mail.gmail.com> (raw)
In-Reply-To: <CAKwvOdnCJmcGurUpHcdO44vVazz67jGDTXzug9LGv6C84xGmPw@mail.gmail.com>
On Tue, Jan 17, 2023 at 10:26 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Jan 17, 2023 at 9:29 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Side note: that's not something new or unusual. It's been the case
> > since I started testing clang - we have several code-paths where we
> > use "unlikely()" to try to get very unlikely cases to be out-of-line,
> > and clang just mostly ignores it, or treats it as a very weak hint. I
> > think the only way to get clang to treat it as a *strong* hint is to
> > use PGO.
>
> I'd be surprised if that were intentional or by design.
>
> Do you guys have a bug report we could look at?
Heh. I actually sent you an example long ago. Let me go fish it out of
my mail archives and quote some of it below so that you can find it in
yours..
Linus
[ Time passes. Found this email to you and Bill Wendling from Feb 16,
2020, Message-ID
CAHk-=wigVshsByCMjkUiZyQSR5N5zi2aAeQc+VJCzQV=nm8E7g@mail.gmail.com ]:
Anyway, I'm looking at clang code generation, and comparing it with
gcc on one of my "this has been optimized to hell and back" functions:
__d_lookup_rcu().
It looks like clang does frame pointers, and ignores our
likely/unlikely annotations.
So this code:
if (unlikely(parent->d_flags & DCACHE_OP_COMPARE)) {
int tlen;
const char *tname;
......
doesn't actually jump out of line, but instead generates the unlikely
case as the fallthrough:
testb $2, (%r12)
je .LBB50_9
... unlikely code goes here...
and then the likely case ends up having unfortunate reloads inside the
hot loop. Possibly because it has one fewer free registers than gcc
because of the frame pointer.
I didn't look into _why_ clang generates frame pointers but gcc
doesn't. It may be just a compiler default, I think we don't end up
explicitly asking either way.
[ And then Bill replied with this ]
It's not a no-op. We add branch probabilities to the IR, whether
they're honored or not depends on the transformation. But they
*should* be honored when available. I've seen in the past that instead
of moving unlikely blocks out of the way (like gcc, which moves them
below the function's "ret" instruction), LLVM will do something like
this:
<normal code>
<jmp to loop conditional test>
<unlikely code>
<more likely code>
<loop conditional test>
<...>
I.e. the loop is rotated and the unlikely code is first and the hotter
code is closer together but between the unlikely and conditional test.
Could this be what's going on? Otherwise, maybe clang decided that
it's not beneficial to move the code out-of-line because the benefit
was minimal? (I'm guessing here.)
next prev parent reply other threads:[~2023-01-17 18:34 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-11 12:37 [PATCHv14 00/17] Linear Address Masking enabling Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 01/17] x86/mm: Rework address range check in get_user() and put_user() Kirill A. Shutemov
2023-01-18 15:49 ` Peter Zijlstra
2023-01-18 15:59 ` Linus Torvalds
2023-01-18 16:48 ` Peter Zijlstra
2023-01-18 17:01 ` Linus Torvalds
2023-01-11 12:37 ` [PATCHv14 02/17] x86: Allow atomic MM_CONTEXT flags setting Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 03/17] x86: CPUID and CR3/CR4 flags for Linear Address Masking Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 04/17] x86/mm: Handle LAM on context switch Kirill A. Shutemov
2023-01-11 13:49 ` Linus Torvalds
2023-01-11 14:14 ` Kirill A. Shutemov
2023-01-11 14:37 ` [PATCHv14.1 " Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 05/17] mm: Introduce untagged_addr_remote() Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 06/17] x86/uaccess: Provide untagged_addr() and remove tags before address check Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 07/17] x86/mm: Provide arch_prctl() interface for LAM Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user Kirill A. Shutemov
2023-01-17 13:05 ` Peter Zijlstra
2023-01-17 13:57 ` Kirill A. Shutemov
2023-01-17 15:02 ` Peter Zijlstra
2023-01-17 17:18 ` Linus Torvalds
2023-01-17 17:28 ` Linus Torvalds
2023-01-17 18:26 ` Nick Desaulniers
2023-01-17 18:33 ` Linus Torvalds [this message]
2023-01-17 19:17 ` Nick Desaulniers
2023-01-17 20:10 ` Linus Torvalds
2023-01-17 20:43 ` Linus Torvalds
2023-01-17 18:14 ` Peter Zijlstra
2023-01-17 18:21 ` Peter Zijlstra
2023-01-19 23:06 ` Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 09/17] mm: Expose untagging mask in /proc/$PID/status Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 10/17] iommu/sva: Replace pasid_valid() helper with mm_valid_pasid() Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 11/17] x86/mm/iommu/sva: Make LAM and SVA mutually exclusive Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 12/17] selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 13/17] selftests/x86/lam: Add mmap and SYSCALL " Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 14/17] selftests/x86/lam: Add io_uring " Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 15/17] selftests/x86/lam: Add inherit " Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 16/17] selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVA " Kirill A. Shutemov
2023-01-11 12:37 ` [PATCHv14 17/17] selftests/x86/lam: Add test cases for LAM vs thread creation Kirill A. Shutemov
2023-01-18 16:49 ` [PATCHv14 00/17] Linear Address Masking enabling Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHk-=wjfmmYPw0wX1BW6gi_KAhdZi+81or024JFcRYHiQh-jpw@mail.gmail.com' \
--to=torvalds@linux-foundation.org \
--cc=ak@linux.intel.com \
--cc=andreyknvl@gmail.com \
--cc=ashok.raj@intel.com \
--cc=bharata@amd.com \
--cc=dave.hansen@linux.intel.com \
--cc=dvyukov@google.com \
--cc=glider@google.com \
--cc=hjl.tools@gmail.com \
--cc=jacob.jun.pan@linux.intel.com \
--cc=joao@overdrivepizza.com \
--cc=kcc@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=ndesaulniers@google.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=ryabinin.a.a@gmail.com \
--cc=samitolvanen@google.com \
--cc=tarasmadan@google.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox