linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Rik van Riel <riel@surriel.com>
To: Brendan Jackman <jackmanb@google.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de,
	 peterz@infradead.org, dave.hansen@linux.intel.com,
	zhengqi.arch@bytedance.com, 	nadav.amit@gmail.com,
	thomas.lendacky@amd.com, kernel-team@meta.com,
		linux-mm@kvack.org, akpm@linux-foundation.org, jannh@google.com,
		mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Manali Shukla	 <Manali.Shukla@amd.com>
Subject: Re: [PATCH v9 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
Date: Mon, 10 Feb 2025 22:07:14 -0500	[thread overview]
Message-ID: <c8068d0c8042e8f4e5de0e8af9cb3457ee795211.camel@surriel.com> (raw)
In-Reply-To: <CA+i-1C37LM62cvM33rNCfejA4fSumh_OSVkuuDU91iur_ZxCtQ@mail.gmail.com>

On Mon, 2025-02-10 at 15:15 +0100, Brendan Jackman wrote:
> On Thu, 6 Feb 2025 at 05:47, Rik van Riel <riel@surriel.com> wrote:
> > 
> > +       if (asid >= MAX_ASID_AVAILABLE) {
> > +               /* This should never happen. */
> > +               VM_WARN_ONCE(1, "Unable to allocate global ASID
> > despite %d available\n", global_asid_available);
> 
> If you'll forgive the nitpicking, please put the last arg on a new
> line or otherwise break this up, the rest of this file keeps below
> 100
> chars (this is 113).
> 

Nitpicks are great! Chances are I'll have to look at
this code again several times over the coming years,
so getting it in the best possible shape is in my
interest as much as anybody else's ;)

> > 
> > +static bool needs_global_asid_reload(struct mm_struct *next, u16
> > prev_asid)
> > +{
> > +       u16 global_asid = mm_global_asid(next);
> > +
> > +       if (global_asid && prev_asid != global_asid)
> > +               return true;
> > +
> > +       if (!global_asid && is_global_asid(prev_asid))
> > +               return true;
> 
> I think this needs clarification around when switches from
> global->nonglobal happen. Maybe commentary or maybe there's a way to
> just express the code that makes it obvious. Here's what I currently
> understand, please correct me if I'm wrong:
> 
> - Once a process gets a global ASID it keeps it forever. So within a
> process we never switch global->nonglobal.
> 
> - In flush_tlb_func() we are just calling this to check if the
> process
> has just been given a global ASID - there's no way loaded_mm_asid can
> be global yet !mm_global_asid(loaded_mm).
> 
> - When we call this from switch_mm_irqs_off() we are in the
> prev==next
> case. Is there something about lazy TLB that can cause the case above
> to happen here?
> 
In the current implementation, we never transition
from global->local ASID.

In a previous implementation, the code did do those
transitions, and they appeared to survive the testing
thrown at it.

If we implement more aggressive ASID reuse (which we
may need to), we may need to support that transition
again.

In short, while we do not need to support that
transition right now, I don't really want to remove
the two lines of code that make it work :)

I'll add comments.

> > +static bool meets_global_asid_threshold(struct mm_struct *mm)
> > +{
> > +       if (!global_asid_available)
> 
> I think we need READ_ONCE here.
> 
> Also - this doesn't really make sense in this function as it's
> currently named.
> 
> I think we could just inline this whole function into
> consider_global_asid(), it would still be nice and readable IMO.
> 
Done and done.

> > 
> > @@ -1058,9 +1375,12 @@ void flush_tlb_mm_range(struct mm_struct
> > *mm, unsigned long start,
> >          * a local TLB flush is needed. Optimize this use-case by
> > calling
> >          * flush_tlb_func_local() directly in this case.
> >          */
> > -       if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
> > +       if (mm_global_asid(mm)) {
> > +               broadcast_tlb_flush(info);
> > +       } else if (cpumask_any_but(mm_cpumask(mm), cpu) <
> > nr_cpu_ids) {
> >                 info->trim_cpumask = should_trim_cpumask(mm);
> >                 flush_tlb_multi(mm_cpumask(mm), info);
> > +               consider_global_asid(mm);
> 
> Why do we do this here instead of when the CPU enters the mm? Is the
> idea that in combination with the jiffies thing in
> consider_global_asid() we get a probability of getting a global ASID
> (within some time period) that scales with the amount of TLB flushing
> the process does? So then we avoid using up ASID space on processes
> that are multithreaded but just sit around with stable VMAs etc?
> 
You guessed right.

In the current x86 hardware, a global ASID is a scarce
resource, with about 4k available ASIDs (2k in a kernel
compiled with support for the KPTI mitigation), while
the largest available x86 systems have at least 8k CPUs.

We can either implement the much more aggressive ASID
reuse that ARM64 and RISC-V implement, though it is not
clear how to scale that to thousands of CPUs, or reserve
global ASIDs for the processes that are most likely to
benefit from them, continuing to use IPI-based flushing
for the processes that need it less.

I've added a comment to document that.

-- 
All Rights Reversed.


  reply	other threads:[~2025-02-11  3:11 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-06  4:43 [PATCH v9 00/12] AMD broadcast TLB invalidation Rik van Riel
2025-02-06  4:43 ` [PATCH v9 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional Rik van Riel
2025-02-07 14:28   ` Brendan Jackman
2025-02-11 11:07     ` Peter Zijlstra
2025-02-11 12:10       ` Brendan Jackman
2025-02-11 20:23         ` Rik van Riel
2025-02-06  4:43 ` [PATCH v9 02/12] x86/mm: remove pv_ops.mmu.tlb_remove_table call Rik van Riel
2025-02-06  4:43 ` [PATCH v9 03/12] x86/mm: consolidate full flush threshold decision Rik van Riel
2025-02-07 14:50   ` Brendan Jackman
2025-02-07 20:22     ` Rik van Riel
2025-02-10 11:15       ` Brendan Jackman
2025-02-10 19:12     ` Rik van Riel
2025-02-06  4:43 ` [PATCH v9 04/12] x86/mm: get INVLPGB count max from CPUID Rik van Riel
2025-02-07 15:10   ` Brendan Jackman
2025-02-07 17:34     ` Brendan Jackman
2025-02-10  7:30   ` Vern Hao
2025-02-10 16:48     ` Rik van Riel
2025-02-12  1:18       ` Vern Hao
2025-02-12  1:57       ` Vern Hao
2025-02-12 15:56         ` Tom Lendacky
2025-02-13  8:16           ` Vern Hao
2025-02-06  4:43 ` [PATCH v9 05/12] x86/mm: add INVLPGB support code Rik van Riel
2025-02-06  4:43 ` [PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes Rik van Riel
2025-02-07 16:03   ` Brendan Jackman
2025-02-07 20:50     ` Rik van Riel
2025-02-10 11:22       ` Brendan Jackman
2025-02-11  2:01     ` Rik van Riel
2025-02-06  4:43 ` [PATCH v9 07/12] x86/mm: use INVLPGB in flush_tlb_all Rik van Riel
2025-02-06  4:43 ` [PATCH v9 08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing Rik van Riel
2025-02-06  4:43 ` [PATCH v9 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Rik van Riel
2025-02-10 14:15   ` Brendan Jackman
2025-02-11  3:07     ` Rik van Riel [this message]
2025-02-06  4:43 ` [PATCH v9 10/12] x86/mm: do targeted broadcast flushing from tlbbatch code Rik van Riel
2025-02-10 15:27   ` Brendan Jackman
2025-02-11  3:45     ` Rik van Riel
2025-02-11 10:02       ` Brendan Jackman
2025-02-11 20:21         ` Rik van Riel
2025-02-12 10:38           ` Brendan Jackman
2025-02-06  4:43 ` [PATCH v9 11/12] x86/mm: enable AMD translation cache extensions Rik van Riel
2025-02-06  4:43 ` [PATCH v9 12/12] x86/mm: only invalidate final translations with INVLPGB Rik van Riel
2025-02-06 10:16 ` [PATCH v9 00/12] AMD broadcast TLB invalidation Oleksandr Natalenko
2025-02-06 14:16   ` Rik van Riel
2025-02-06 14:23     ` Peter Zijlstra
2025-02-06 14:48       ` Rik van Riel
2025-02-07  8:16         ` Peter Zijlstra
2025-02-07 17:46           ` Rik van Riel
2025-02-07 18:23 ` Brendan Jackman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c8068d0c8042e8f4e5de0e8af9cb3457ee795211.camel@surriel.com \
    --to=riel@surriel.com \
    --cc=Manali.Shukla@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=jackmanb@google.com \
    --cc=jannh@google.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhklinux@outlook.com \
    --cc=nadav.amit@gmail.com \
    --cc=peterz@infradead.org \
    --cc=thomas.lendacky@amd.com \
    --cc=x86@kernel.org \
    --cc=zhengqi.arch@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox