From: Yosry Ahmed <yosryahmed@google.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: akpm@linux-foundation.org, bp@alien8.de,
	dave.hansen@linux.intel.com,  hpa@zytor.com, jackmanb@google.com,
	kernel-team@meta.com,  linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, luto@kernel.org,  mingo@redhat.com,
	nadav.amit@gmail.com, peterz@infradead.org,  reijiw@google.com,
	riel@surriel.com, tglx@linutronix.de, x86@kernel.org,
	 zhengqi.arch@bytedance.com
Subject: Re: [PATCH v3 00/12] AMD broadcast TLB invalidation
Date: Thu, 9 Jan 2025 13:32:55 -0800
Message-ID: <CAJD7tkbnJGdJhyYkMJB2EFUDALoCh93pwsdQVBmm=a10anyTkg@mail.gmail.com>
In-Reply-To: <ed486ebe-d3ba-42fb-afdc-485b3f2504f0@citrix.com>

On Wed, Jan 8, 2025 at 6:47 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> >> I suspect AMD wouldn't tell us exactly ;)
> >
> > Well, ideally they would just tell us the conditions under which CPUs
> > respond to the broadcast TLB flush or the expectations around latency.
>
> [Resend, complete this time]
>
> Disclaimer.  I'm not at AMD; I don't know how they implement it; I'm
> just a random person on the internet.  But, here are a few things that
> might be relevant to know.
>
> AMD's SEV-SNP whitepaper [1] states that RMP permissions "are cached in
> the CPU TLB and related structures" and also "When required, hardware
> automatically performs TLB invalidations to ensure that all processors
> in the system see the updated RMP entry information."
>
> That sentence doesn't use "broadcast" or "remote", but "all processors"
> is a pretty clear clue.  Broadcast TLB invalidations are a building
> block of all the RMP-manipulation instructions.
>
> Furthermore, to be useful in this context, they need to be ordered with
> memory.  Specifically, a new pagewalk mustn't start after an
> invalidation, yet observe the stale RMP entry.
>
>
> x86 CPUs do have reasonable forward-progress guarantees, but in order to
> achieve forward progress, they need to e.g. guarantee that one memory
> access doesn't displace the TLB entry backing a different memory access
> from the same instruction, or you could livelock while trying to
> complete a single instruction.
>
> A consequence is that you can't safely invalidate a TLB entry of an
> in-progress instruction (although this means only the oldest instruction
> in the pipeline, because everything else is speculative and potentially
> transient).
>
>
> INVLPGB invalidations are interrupt-like from the point of view of the
> remote core, but are microarchitectural and can be taken irrespective of
> the architectural Interrupt and Global Interrupt Flags.  As a
> consequence, they'll need to wait until an instruction boundary to be
> processed.  While not AMD, the Intel RAR whitepaper [2] discusses the
> handling of RARs on the remote processor, and they share a number of
> constraints in common with INVLPGB.
>
>
> Overall, I'd expect the INVLPGB instructions to be pretty quick in and
> of themselves; interestingly, they're not identified as architecturally
> serialising.  The broadcast is probably posted, and will be dealt with
> by remote processors on the subsequent instruction boundary.  TLBSYNC is
> the barrier to wait until the invalidations have been processed, and
> this will block for an unspecified length of time, probably bounded by
> the "longest" instruction in progress on a remote CPU.  e.g. I expect it
> probably will suck if you have to wait for a WBINVD instruction to
> complete on a remote CPU.
>
> That said, architectural IPIs have the same conditions too, except on
> top of that you've got to run a whole interrupt handler.  So, with
> reasonable confidence, however slow TLBSYNC might be in the worst case,
> it's got absolutely nothing on the overhead of doing invalidations the
> old-fashioned way.

Generally speaking, I am not arguing that TLB flush IPIs are better
than INVLPGB/TLBSYNC; I think we should expect the latter to perform
better in most cases.
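
To put rough shape on that comparison, here is a toy model of the
reasoning quoted above. It is only a sketch: the function names and all
cycle counts are made up for illustration, and nothing here reflects
documented hardware behaviour. It assumes a remote CPU handles a flush
only at its next instruction boundary, and that the IPI path
additionally pays for an interrupt handler.

```python
# Toy latency model. All cycle counts are hypothetical, not measurements.

def tlbsync_wait(remaining_cycles):
    """Initiator waits until every remote CPU reaches an instruction boundary."""
    return max(remaining_cycles, default=0)


def ipi_wait(remaining_cycles, handler_cycles):
    """Same boundary wait, plus each remote CPU runs a flush handler."""
    return max((r + handler_cycles for r in remaining_cycles), default=0)


# Hypothetical cycles left in each remote CPU's current instruction.
remote = [3, 40, 12]

print(tlbsync_wait(remote))    # 40
print(ipi_wait(remote, 500))   # 540
```

In this model the TLBSYNC wait is bounded by the slowest in-flight
remote instruction, while the IPI path pays the same bound plus the
handler cost, which is why I'd expect the former to win in most cases.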

But there is a difference here because the processor executing TLBSYNC
cannot serve interrupts or NMIs while waiting for remote CPUs, because
they have to be served at an instruction boundary, right? Unless
TLBSYNC is an exception to that rule, or its execution is considered
completed before remote CPUs respond (i.e. the CPU executes it quickly
then enters into a wait doing "nothing").

There are also intriguing corner cases that are not documented. For
example, you mention that it's reasonable to expect that a remote CPU
does not serve TLBSYNC except at the instruction boundary. What if
that CPU is executing TLBSYNC? Do we have to wait for its execution to
complete? Is it possible to end up in a deadlock? This goes back to my
previous point about whether TLBSYNC is a special case or when it's
considered to have finished executing.
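
To make the question concrete, here is a toy model of the hypothetical.
It is purely illustrative: the function and its parameters are invented
for this sketch and say nothing about what the hardware actually does.
Every CPU has posted an invalidation to all others and is now executing
TLBSYNC. The flag serve_while_waiting selects between the two
possibilities: TLBSYNC blocks incoming invalidations until it retires,
or a CPU waiting in TLBSYNC can still process them.

```python
# Toy model only: assumes every CPU broadcast one invalidation and is
# now sitting in TLBSYNC. Not a description of real hardware.

def all_tlbsyncs_complete(n_cpus, serve_while_waiting, max_rounds=10):
    # acked[i][j] is True once CPU j has processed CPU i's invalidation.
    acked = [[i == j for j in range(n_cpus)] for i in range(n_cpus)]
    done = [False] * n_cpus  # done[i]: CPU i's TLBSYNC has retired

    for _ in range(max_rounds):
        for j in range(n_cpus):
            # CPU j processes incoming invalidations only once its own
            # TLBSYNC has retired, unless TLBSYNC is assumed to be special.
            if done[j] or serve_while_waiting:
                for i in range(n_cpus):
                    acked[i][j] = True
        for i in range(n_cpus):
            if all(acked[i]):
                done[i] = True
        if all(done):
            return True
    return False  # no progress possible: the CPUs wait on each other


print(all_tlbsyncs_complete(2, serve_while_waiting=False))  # False
print(all_tlbsyncs_complete(2, serve_while_waiting=True))   # True
```

Under the first assumption two CPUs in TLBSYNC wait on each other
forever in this model, so presumably the real behaviour is closer to
the second one, but that is exactly the part that is not documented.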

I am sure people thought about that and I am probably worried over
nothing, but there are few details here, so one has to speculate.

Again, sorry if I am making a fuss over nothing and it's all in my head.

>
>
> ~Andrew
>
> [1]
> https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
> [2]
> https://www.intel.com/content/dam/develop/external/us/en/documents/341431-remote-action-request-white-paper.pdf

