From: Nadav Amit <nadav.amit@gmail.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Rik van Riel <riel@surriel.com>,
the arch/x86 maintainers <x86@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
kernel-team@meta.com, Dave Hansen <dave.hansen@linux.intel.com>,
luto@kernel.org, peterz@infradead.org,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>,
zhengqi.arch@bytedance.com,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>
Subject: Re: [PATCH 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Date: Fri, 10 Jan 2025 08:07:56 +0200 [thread overview]
Message-ID: <A893C53E-F15B-4519-9B1B-ED8F48397D6C@gmail.com> (raw)
In-Reply-To: <426011a9-1fbc-415c-bac7-df5d67417df3@intel.com>
> On 9 Jan 2025, at 23:18, Dave Hansen <dave.hansen@intel.com> wrote:
>
> But actually I think INVLPGB is *WAY* better than INVLPG here. INVLPG
> doesn't have ranged invalidation. It will only architecturally
> invalidate multiple 4K entries when the hardware fractured them in the
> first place. I think we should probably take advantage of what INVLPGB
> can do instead of following the INVLPG approach.
>
> INVLPGB will invalidate a range no matter where the underlying entries
> came from. Its "increment the virtual address at the 2M boundary" mode
> will invalidate entries of any size. That's my reading of the docs at
> least. Is that everyone else's reading too?
This is not my reading. I think that this reading assumes that besides
the broadcast, some new “range flush” was added to the TLB. My guess
is that this not the case, since presumably it would require a different
TLB structure (and who does 2 changes at once ;-) ).
My understanding is therefore that it’s all in microcode. There is a
“stride” and “number” which are used by the microcode for iterating
and on every iteration the microcode issues a TLB invalidation. This
invalidation is similar to INVLPG, just as it was always done (putting
aside the variants that do not invalidate the PWC). IOW, the page-size
is not given as part of the INVLPG and not as part of INVLPGB (regardless
of the stride) for whatever entries used to invalidate a given address.
I think my understanding is backed by the wording of "regardless of the
page size” appearing for INVLPG as well in AMD’s manual.
My guess is that invalidating more entries will take longer time, maybe
not on the sender, but at least on the receiver. I also guess that in
certain setups - big NUMA machines - INVLPGB might perform worse. I
remember vaguely ARM guys writing something about such behavior.
next prev parent reply other threads:[~2025-01-10 6:08 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-30 17:53 [PATCH v3 00/12] AMD broadcast TLB invalidation Rik van Riel
2024-12-30 17:53 ` [PATCH 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional Rik van Riel
2024-12-30 18:41 ` Borislav Petkov
2024-12-31 16:11 ` Rik van Riel
2024-12-31 16:19 ` Borislav Petkov
2024-12-31 16:30 ` Rik van Riel
2025-01-02 11:52 ` Borislav Petkov
2025-01-02 19:56 ` Peter Zijlstra
2025-01-03 12:18 ` Borislav Petkov
2025-01-04 16:27 ` Peter Zijlstra
2025-01-06 15:54 ` Dave Hansen
2025-01-06 15:47 ` Rik van Riel
2024-12-30 17:53 ` [PATCH 02/12] x86/mm: remove pv_ops.mmu.tlb_remove_table call Rik van Riel
2024-12-31 3:18 ` Qi Zheng
2024-12-30 17:53 ` [PATCH 03/12] x86/mm: add X86_FEATURE_INVLPGB definition Rik van Riel
2025-01-02 12:04 ` Borislav Petkov
2025-01-03 18:27 ` Rik van Riel
2025-01-03 21:07 ` Borislav Petkov
2024-12-30 17:53 ` [PATCH 04/12] x86/mm: get INVLPGB count max from CPUID Rik van Riel
2025-01-02 12:15 ` Borislav Petkov
2025-01-10 18:44 ` Tom Lendacky
2025-01-10 20:27 ` Rik van Riel
2025-01-10 20:31 ` Tom Lendacky
2025-01-10 20:34 ` Borislav Petkov
2024-12-30 17:53 ` [PATCH 05/12] x86/mm: add INVLPGB support code Rik van Riel
2025-01-02 12:42 ` Borislav Petkov
2025-01-06 16:50 ` Dave Hansen
2025-01-06 17:32 ` Rik van Riel
2025-01-06 18:14 ` Borislav Petkov
2025-01-14 19:50 ` Rik van Riel
2025-01-03 12:44 ` Borislav Petkov
2024-12-30 17:53 ` [PATCH 06/12] x86/mm: use INVLPGB for kernel TLB flushes Rik van Riel
2025-01-03 12:39 ` Borislav Petkov
2025-01-06 17:21 ` Dave Hansen
2025-01-09 20:16 ` Rik van Riel
2025-01-09 21:18 ` Dave Hansen
2025-01-10 5:31 ` Rik van Riel
2025-01-10 6:07 ` Nadav Amit [this message]
2025-01-10 15:14 ` Dave Hansen
2025-01-10 16:08 ` Rik van Riel
2025-01-10 16:29 ` Dave Hansen
2025-01-10 16:36 ` Rik van Riel
2025-01-10 18:53 ` Tom Lendacky
2025-01-10 20:29 ` Rik van Riel
2024-12-30 17:53 ` [PATCH 07/12] x86/tlb: use INVLPGB in flush_tlb_all Rik van Riel
2025-01-06 17:29 ` Dave Hansen
2025-01-06 17:35 ` Rik van Riel
2025-01-06 17:54 ` Dave Hansen
2024-12-30 17:53 ` [PATCH 08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing Rik van Riel
2024-12-30 17:53 ` [PATCH 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Rik van Riel
2024-12-30 19:24 ` Nadav Amit
2025-01-01 4:42 ` Rik van Riel
2025-01-01 15:20 ` Nadav Amit
2025-01-01 16:15 ` Karim Manaouil
2025-01-01 16:23 ` Rik van Riel
2025-01-02 0:06 ` Nadav Amit
2025-01-03 17:36 ` Jann Horn
2025-01-04 2:55 ` Rik van Riel
2025-01-06 13:04 ` Jann Horn
2025-01-06 14:26 ` Rik van Riel
2025-01-06 14:52 ` Nadav Amit
2025-01-06 16:03 ` Rik van Riel
2025-01-06 18:40 ` Dave Hansen
2025-01-12 2:36 ` Rik van Riel
2024-12-30 17:53 ` [PATCH 10/12] x86,tlb: do targeted broadcast flushing from tlbbatch code Rik van Riel
2024-12-30 17:53 ` [PATCH 11/12] x86/mm: enable AMD translation cache extensions Rik van Riel
2024-12-30 18:25 ` Nadav Amit
2024-12-30 18:27 ` Rik van Riel
2025-01-03 17:49 ` Jann Horn
2025-01-04 3:08 ` Rik van Riel
2025-01-06 13:10 ` Jann Horn
2025-01-06 18:29 ` Sean Christopherson
2025-01-10 19:34 ` Tom Lendacky
2025-01-10 19:45 ` Rik van Riel
2025-01-10 19:58 ` Borislav Petkov
2025-01-10 20:43 ` Rik van Riel
2024-12-30 17:53 ` [PATCH 12/12] x86/mm: only invalidate final translations with INVLPGB Rik van Riel
2025-01-03 18:40 ` Jann Horn
2025-01-12 2:39 ` Rik van Riel
2025-01-06 19:03 ` [PATCH v3 00/12] AMD broadcast TLB invalidation Dave Hansen
2025-01-12 2:46 ` Rik van Riel
2025-01-06 22:49 ` Yosry Ahmed
2025-01-07 3:25 ` Rik van Riel
2025-01-08 1:36 ` Yosry Ahmed
2025-01-09 2:25 ` Andrew Cooper
2025-01-09 2:47 ` Andrew Cooper
2025-01-09 21:32 ` Yosry Ahmed
2025-01-09 23:00 ` Andrew Cooper
2025-01-09 23:26 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=A893C53E-F15B-4519-9B1B-ED8F48397D6C@gmail.com \
--to=nadav.amit@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=dave.hansen@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=riel@surriel.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=zhengqi.arch@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox