Message-ID: <96d1b007b3b88c43feac58d2646d675b94daf1fb.camel@surriel.com>
Subject: Re: [PATCH 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
From: Rik van Riel
To: Jann Horn
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com,
    dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    akpm@linux-foundation.org, nadav.amit@gmail.com,
    zhengqi.arch@bytedance.com, linux-mm@kvack.org
Date: Fri, 03 Jan 2025 21:55:21 -0500
In-Reply-To:
References: <20241230175550.4046587-1-riel@surriel.com>
    <20241230175550.4046587-10-riel@surriel.com>

On Fri, 2025-01-03 at 18:36 +0100, Jann Horn wrote:
> 
> > +++ b/arch/x86/include/asm/mmu.h
> > @@ -48,6 +48,12 @@ typedef struct {
> >         unsigned long flags;
> >  #endif
> > 
> > +#ifdef CONFIG_CPU_SUP_AMD
> > +       struct list_head broadcast_asid_list;
> > +       u16 broadcast_asid;
> > +       bool asid_transition;
> 
> Please add a comment on the semantics of the "asid_transition" field
> here after addressing the comments below.

Will do.

> 
> > +#endif
> > +
> >  #ifdef CONFIG_ADDRESS_MASKING
> >         /* Active LAM mode:  X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or
> > 0 (disabled) */
> >         unsigned long lam_cr3_mask;
> [...]
> > +#ifdef CONFIG_CPU_SUP_AMD
> > +/*
> > + * Logic for AMD INVLPGB support.
> > + */
> > +static DEFINE_RAW_SPINLOCK(broadcast_asid_lock);
> > +static u16 last_broadcast_asid = TLB_NR_DYN_ASIDS;
> 
> I wonder if this should be set to MAX_ASID_AVAILABLE or such to
> ensure that we do a flush before we start using the broadcast ASID
> space the first time... Or is there something else that already
> guarantees that all ASIDs of the TLB are flushed during kernel boot?

That is a good idea. I don't know whether the TLBs always get flushed
on every kexec, for example, and exercising the wraparound code early
on every boot will double as a self-test and help future-proof the
code.

I'll do that in the next version.
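
Probably something as simple as this, on top of the hunk quoted above
(sketch only, untested; it assumes reset_broadcast_asid_space() is
where the broadcast ASID range gets flushed on wraparound, as
discussed):

        /*
         * Start the search at the very end of the ASID space, so the
         * first call to get_broadcast_asid() takes the wraparound path
         * and calls reset_broadcast_asid_space(). That exercises the
         * reset code on every boot, and means we never rely on the
         * previous kernel (kexec) having flushed the TLBs.
         */
        static u16 last_broadcast_asid = MAX_ASID_AVAILABLE;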
> 
> > +static u16 get_broadcast_asid(void)
> > +{
> > +       lockdep_assert_held(&broadcast_asid_lock);
> > +
> > +       do {
> > +               u16 start = last_broadcast_asid;
> > +               u16 asid = find_next_zero_bit(broadcast_asid_used,
> > MAX_ASID_AVAILABLE, start);
> > +
> > +               if (asid >= MAX_ASID_AVAILABLE) {
> > +                       reset_broadcast_asid_space();
> > +                       continue;
> 
> Can this loop endlessly without making forward progress if we have a
> few thousand processes on the system that are multi-threaded (or used
> to be multi-threaded) and race the wrong way?
> meets_broadcast_asid_threshold() checks if we have free IDs
> remaining, but that check happens before broadcast_asid_lock is held,
> so we could theoretically race such that no free IDs are available,
> right?

You are right, I need to duplicate that check under the spinlock!
I'll get that done in the next version.

> 
> > +       mm->context.broadcast_asid = get_broadcast_asid();
> > +       mm->context.asid_transition = true;
> 
> This looks buggy to me:

Nadav found the same issue. I've already fixed it locally for the
next version.

> Maybe change how mm->context.asid_transition works such that it is
> immediately set on mm creation and cleared when the transition is
> done, so that you don't have to touch it here?
> 

If we want to document the ordering, wouldn't it be better to keep
both assignments close to each other (with WRITE_ONCE), so the code
stays easier to understand for future maintenance?

> Also, please use at least WRITE_ONCE() for writes here, and add
> comments documenting ordering requirements.

I'll add a comment.

> 
> > +static void finish_asid_transition(struct flush_tlb_info *info)
> > +{
> > +       struct mm_struct *mm = info->mm;
> > +       int bc_asid = mm_broadcast_asid(mm);
> > +       int cpu;
> > +
> > +       if (!mm->context.asid_transition)
> 
> AFAIU this can be accessed concurrently - please use at least
> READ_ONCE(). (I think in the current version of the patch, this needs
> to be ordered against the preceding mm_broadcast_asid() read, but
> that's implicit on x86, so I guess writing a barrier here would be
> superfluous.)

I'll add a READ_ONCE() here. Good point.
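
To make the WRITE_ONCE()/READ_ONCE() pairing explicit, what I have in
mind is roughly this (sketch, not the final patch; the helper that
hands out the ASID is simplified here):

        /* Where the broadcast ASID is assigned, under broadcast_asid_lock: */
        mm->context.broadcast_asid = get_broadcast_asid();
        /*
         * Pairs with the READ_ONCE() in finish_asid_transition(); CPUs
         * may keep running this mm with a dynamic ASID until the
         * transition has finished.
         */
        WRITE_ONCE(mm->context.asid_transition, true);

and on the flush side:

        static void finish_asid_transition(struct flush_tlb_info *info)
        {
                struct mm_struct *mm = info->mm;

                /* Pairs with the WRITE_ONCE() above. */
                if (!READ_ONCE(mm->context.asid_transition))
                        return;

                /* ... the for_each_cpu() loop quoted below ... */
        }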
> 
> > +               return;
> > +
> > +       for_each_cpu(cpu, mm_cpumask(mm)) {
> > +               if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu))
> > != mm)
> > +                       continue;
> 
> switch_mm_irqs_off() picks an ASID and writes CR3 before writing
> loaded_mm:
> "/* Make sure we write CR3 before loaded_mm. */"
> 
> Can we race with a concurrent switch_mm_irqs_off() on the other CPU
> such that the other CPU has already switched CR3 to our MM using the
> old ASID, but has not yet written loaded_mm, such that we skip it
> here? And then we'll think we finished the ASID transition, and the
> next time we do a flush, we'll wrongly omit the flush for that other
> CPU even though it's still using the old ASID?

That is a very good question. I suppose we need to check against
LOADED_MM_SWITCHING too, and possibly wait to see what mm shows up
on that CPU before proceeding?

Maybe as simple as this?

        for_each_cpu(cpu, mm_cpumask(mm)) {
                while (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) ==
                                LOADED_MM_SWITCHING)
                        cpu_relax();

                if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) != mm)
                        continue;

                /*
                 * If at least one CPU is not using the broadcast ASID yet,
                 * send a TLB flush IPI. The IPI should cause stragglers
                 * to transition soon.
                 */
                if (per_cpu(cpu_tlbstate.loaded_mm_asid, cpu) != bc_asid) {
                        flush_tlb_multi(mm_cpumask(info->mm), info);
                        return;
                }
        }

Then the only change needed in switch_mm_irqs_off() would be to move
the LOADED_MM_SWITCHING assignment to before choose_new_asid(), to
fully close the window.

Am I overlooking anything here?

> 
> > +
> > +               /*
> > +                * If at least one CPU is not using the broadcast
> > ASID yet,
> > +                * send a TLB flush IPI. The IPI should cause
> > stragglers
> > +                * to transition soon.
> > +                */
> > +               if (per_cpu(cpu_tlbstate.loaded_mm_asid, cpu) !=
> > bc_asid) {
> 
> READ_ONCE()? Also, I think this needs a comment explaining that this
> can race with concurrent MM switches such that we wrongly think that
> there's a straggler (because we're not reading the loaded_mm and the
> loaded_mm_asid as one atomic combination).

I'll add the READ_ONCE().

Will the race still exist if we wait on LOADED_MM_SWITCHING as
proposed above?
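
For completeness, the switch_mm_irqs_off() side of that idea would be
roughly the following (sketch against the current ordering only; the
rest of the function stays as it is today):

        /*
         * Let nmi_uaccess_okay() and finish_asid_transition() know that
         * we are changing CR3, *before* an ASID is chosen, so that a
         * remote finish_asid_transition() cannot sample a stale ASID
         * for this CPU without noticing that a switch is in progress.
         */
        this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
        barrier();

        choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);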
> 
> > +                       flush_tlb_multi(mm_cpumask(info->mm),
> > info);
> > +                       return;
> > +               }
> > +       }
> > +
> > +       /* All the CPUs running this process are using the
> > broadcast ASID. */
> > +       mm->context.asid_transition = 0;
> 
> WRITE_ONCE()?
> Also: This is a bool, please use "false".

Will do.

> 
> > +}
> > +
> > +static void broadcast_tlb_flush(struct flush_tlb_info *info)
> > +{
> > +       bool pmd = info->stride_shift == PMD_SHIFT;
> > +       unsigned long maxnr = invlpgb_count_max;
> > +       unsigned long asid = info->mm->context.broadcast_asid;
> > +       unsigned long addr = info->start;
> > +       unsigned long nr;
> > +
> > +       /* Flushing multiple pages at once is not supported with
> > 1GB pages. */
> > +       if (info->stride_shift > PMD_SHIFT)
> > +               maxnr = 1;
> > +
> > +       if (info->end == TLB_FLUSH_ALL) {
> > +               invlpgb_flush_single_pcid(kern_pcid(asid));
> 
> What orders this flush with the preceding page table update? Does the
> instruction implicitly get ordered after preceding memory writes, or
> do we get that ordering from inc_mm_tlb_gen() or something like that?

I believe inc_mm_tlb_gen() should provide the ordering. You are right
that it should be documented.

> 
> > +       /* Broadcast ASIDs are always kept up to date with INVLPGB.
> > */
> > +       if (is_broadcast_asid(loaded_mm_asid))
> > +               return;
> 
> This relies on the mm_broadcast_asid() read in flush_tlb_mm_range()
> being ordered after the page table update, correct? And we get that
> required ordering from the inc_mm_tlb_gen(), which implies a full
> barrier? It might be nice if there were some more comments on this.

I will add some comments, and I hope you can review those in the next
series, because I'll no doubt forget to explain something important.

Thank you for the thorough review!

-- 
All Rights Reversed.