From: Jann Horn <jannh@google.com>
Date: Fri, 3 Jan 2025 18:36:06 +0100
Subject: Re: [PATCH 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, akpm@linux-foundation.org, nadav.amit@gmail.com, zhengqi.arch@bytedance.com, linux-mm@kvack.org
In-Reply-To: <20241230175550.4046587-10-riel@surriel.com>
References: <20241230175550.4046587-1-riel@surriel.com> <20241230175550.4046587-10-riel@surriel.com>

On Mon, Dec 30, 2024 at 6:53 PM Rik van Riel wrote:
> Use broadcast TLB invalidation, using the INVLPGB instruction, on AMD EPYC 3
> and newer CPUs.
>
> In order to not exhaust PCID space, and keep TLB flushes local for single
> threaded processes, we only hand out broadcast ASIDs to processes active on
> 3 or more CPUs, and gradually increase the threshold as broadcast ASID space
> is depleted.
[...]
> ---
>  arch/x86/include/asm/mmu.h         |   6 +
>  arch/x86/include/asm/mmu_context.h |  12 ++
>  arch/x86/include/asm/tlbflush.h    |  17 ++
>  arch/x86/mm/tlb.c                  | 310 ++++++++++++++++++++++++++++-
>  4 files changed, 336 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> index 3b496cdcb74b..a8e8dfa5a520 100644
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -48,6 +48,12 @@ typedef struct {
>         unsigned long flags;
>  #endif
>
> +#ifdef CONFIG_CPU_SUP_AMD
> +       struct list_head broadcast_asid_list;
> +       u16 broadcast_asid;
> +       bool asid_transition;

Please add a comment on the semantics of the "asid_transition" field
here after addressing the comments below.
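(Based on how the rest of this patch uses the field, my understanding
is roughly the following - an untested sketch of what such a comment
could say, to be adjusted once the semantics are settled:)

	/*
	 * asid_transition: when true, this mm has been assigned a
	 * broadcast ASID, but some CPUs may still be running the mm
	 * with a per-CPU dynamic ASID; those CPUs still need IPI-based
	 * flushes until finish_asid_transition() observes that every
	 * CPU running this mm uses the broadcast ASID.
	 */
	bool asid_transition;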
> +#endif
> +
>  #ifdef CONFIG_ADDRESS_MASKING
>         /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
>         unsigned long lam_cr3_mask;
[...]
> +#ifdef CONFIG_CPU_SUP_AMD
> +/*
> + * Logic for AMD INVLPGB support.
> + */
> +static DEFINE_RAW_SPINLOCK(broadcast_asid_lock);
> +static u16 last_broadcast_asid = TLB_NR_DYN_ASIDS;

I wonder if this should be set to MAX_ASID_AVAILABLE or such to
ensure that we do a flush before we start using the broadcast ASID
space the first time... Or is there something else that already
guarantees that all ASIDs of the TLB are flushed during kernel boot?

> +static DECLARE_BITMAP(broadcast_asid_used, MAX_ASID_AVAILABLE) = { 0 };
> +static LIST_HEAD(broadcast_asid_list);
> +static int broadcast_asid_available = MAX_ASID_AVAILABLE - TLB_NR_DYN_ASIDS - 1;
> +
> +static void reset_broadcast_asid_space(void)
> +{
> +       mm_context_t *context;
> +
> +       lockdep_assert_held(&broadcast_asid_lock);
> +
> +       /*
> +        * Flush once when we wrap around the ASID space, so we won't need
> +        * to flush every time we allocate an ASID for boradcast flushing.

nit: typoed "broadcast"

> +        */
> +       invlpgb_flush_all_nonglobals();
> +       tlbsync();
> +
> +       /*
> +        * Leave the currently used broadcast ASIDs set in the bitmap, since
> +        * those cannot be reused before the next wraparound and flush..
> +        */
> +       bitmap_clear(broadcast_asid_used, 0, MAX_ASID_AVAILABLE);
> +       list_for_each_entry(context, &broadcast_asid_list, broadcast_asid_list)
> +               __set_bit(context->broadcast_asid, broadcast_asid_used);
> +
> +       last_broadcast_asid = TLB_NR_DYN_ASIDS;
> +}
> +
> +static u16 get_broadcast_asid(void)
> +{
> +       lockdep_assert_held(&broadcast_asid_lock);
> +
> +       do {
> +               u16 start = last_broadcast_asid;
> +               u16 asid = find_next_zero_bit(broadcast_asid_used, MAX_ASID_AVAILABLE, start);
> +
> +               if (asid >= MAX_ASID_AVAILABLE) {
> +                       reset_broadcast_asid_space();
> +                       continue;

Can this loop endlessly without making forward progress if we have a
few thousand processes on the system that are multi-threaded (or used
to be multi-threaded) and race the wrong way?
meets_broadcast_asid_threshold() checks if we have free IDs remaining,
but that check happens before broadcast_asid_lock is held, so we could
theoretically race such that no free IDs are available, right?
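If that can happen, get_broadcast_asid() probably has to be able to
fail, e.g. along these lines (untested sketch; the caller would then
have to leave the mm on per-CPU ASIDs when it gets 0 back, which this
patch never hands out as a broadcast ASID):

	static u16 get_broadcast_asid(void)
	{
		lockdep_assert_held(&broadcast_asid_lock);

		/*
		 * Re-check availability now that the lock is held, so
		 * a racing allocation can't make us spin forever below.
		 */
		if (broadcast_asid_available <= 0)
			return 0;

		do {
		...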
> +               }
> +
> +               /* Try claiming this broadcast ASID. */
> +               if (!test_and_set_bit(asid, broadcast_asid_used)) {
> +                       last_broadcast_asid = asid;
> +                       return asid;
> +               }
> +       } while (1);
> +}
[...]
> +/*
> + * Assign a broadcast ASID to the current process, protecting against
> + * races between multiple threads in the process.
> + */
> +static void use_broadcast_asid(struct mm_struct *mm)
> +{
> +       guard(raw_spinlock_irqsave)(&broadcast_asid_lock);
> +
> +       /* This process is already using broadcast TLB invalidation. */
> +       if (mm->context.broadcast_asid)
> +               return;
> +
> +       mm->context.broadcast_asid = get_broadcast_asid();
> +       mm->context.asid_transition = true;

This looks buggy to me: If we first set mm->context.broadcast_asid
and then later set mm->context.asid_transition, then a
flush_tlb_mm_range() that happens in between will observe
mm_broadcast_asid() being true (meaning broadcast invalidation should
be used) while mm->context.asid_transition is false (meaning broadcast
invalidation alone is sufficient); but actually we haven't even
started transitioning CPUs over to the new ASID yet, so I think the
flush does nothing?

Maybe change how mm->context.asid_transition works such that it is
immediately set on mm creation and cleared when the transition is
done, so that you don't have to touch it here?

Also, please use at least WRITE_ONCE() for writes here, and add
comments documenting ordering requirements.
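If you keep the current scheme, I think the minimal fix is to publish
the two fields in the opposite order, something like (untested sketch;
the read side in flush_tlb_mm_range()/finish_asid_transition() would
need matching ordering, which I think x86 gives us for free):

	/*
	 * Make sure anyone who observes mm_broadcast_asid() != 0
	 * also observes asid_transition == true.
	 */
	WRITE_ONCE(mm->context.asid_transition, true);
	smp_wmb();
	WRITE_ONCE(mm->context.broadcast_asid, get_broadcast_asid());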
> +       list_add(&mm->context.broadcast_asid_list, &broadcast_asid_list);
> +       broadcast_asid_available--;
> +}
[...]
> +static void finish_asid_transition(struct flush_tlb_info *info)
> +{
> +       struct mm_struct *mm = info->mm;
> +       int bc_asid = mm_broadcast_asid(mm);
> +       int cpu;
> +
> +       if (!mm->context.asid_transition)

AFAIU this can be accessed concurrently - please use at least
READ_ONCE(). (I think in the current version of the patch, this needs
to be ordered against the preceding mm_broadcast_asid() read, but
that's implicit on x86, so I guess writing a barrier here would be
superfluous.)

> +               return;
> +
> +       for_each_cpu(cpu, mm_cpumask(mm)) {
> +               if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) != mm)
> +                       continue;

switch_mm_irqs_off() picks an ASID and writes CR3 before writing
loaded_mm: "/* Make sure we write CR3 before loaded_mm. */"
Can we race with a concurrent switch_mm_irqs_off() on the other CPU
such that the other CPU has already switched CR3 to our MM using the
old ASID, but has not yet written loaded_mm, such that we skip it
here? And then we'll think we finished the ASID transition, and the
next time we do a flush, we'll wrongly omit the flush for that other
CPU even though it's still using the old ASID?

> +
> +               /*
> +                * If at least one CPU is not using the broadcast ASID yet,
> +                * send a TLB flush IPI. The IPI should cause stragglers
> +                * to transition soon.
> +                */
> +               if (per_cpu(cpu_tlbstate.loaded_mm_asid, cpu) != bc_asid) {

READ_ONCE()? Also, I think this needs a comment explaining that this
can race with concurrent MM switches such that we wrongly think that
there's a straggler (because we're not reading the loaded_mm and the
loaded_mm_asid as one atomic combination).

> +                       flush_tlb_multi(mm_cpumask(info->mm), info);
> +                       return;
> +               }
> +       }
> +
> +       /* All the CPUs running this process are using the broadcast ASID. */
> +       mm->context.asid_transition = 0;

WRITE_ONCE()? Also: This is a bool, please use "false".

> +}
> +
> +static void broadcast_tlb_flush(struct flush_tlb_info *info)
> +{
> +       bool pmd = info->stride_shift == PMD_SHIFT;
> +       unsigned long maxnr = invlpgb_count_max;
> +       unsigned long asid = info->mm->context.broadcast_asid;
> +       unsigned long addr = info->start;
> +       unsigned long nr;
> +
> +       /* Flushing multiple pages at once is not supported with 1GB pages. */
> +       if (info->stride_shift > PMD_SHIFT)
> +               maxnr = 1;
> +
> +       if (info->end == TLB_FLUSH_ALL) {
> +               invlpgb_flush_single_pcid(kern_pcid(asid));

What orders this flush with the preceding page table update? Does the
instruction implicitly get ordered after preceding memory writes, or
do we get that ordering from inc_mm_tlb_gen() or something like that?

> +               /* Do any CPUs supporting INVLPGB need PTI? */
> +               if (static_cpu_has(X86_FEATURE_PTI))
> +                       invlpgb_flush_single_pcid(user_pcid(asid));
> +       } else do {
> +               /*
> +                * Calculate how many pages can be flushed at once; if the
> +                * remainder of the range is less than one page, flush one.
> +                */
> +               nr = min(maxnr, (info->end - addr) >> info->stride_shift);
> +               nr = max(nr, 1);
> +
> +               invlpgb_flush_user_nr(kern_pcid(asid), addr, nr, pmd);
> +               /* Do any CPUs supporting INVLPGB need PTI? */
> +               if (static_cpu_has(X86_FEATURE_PTI))
> +                       invlpgb_flush_user_nr(user_pcid(asid), addr, nr, pmd);
> +               addr += nr << info->stride_shift;
> +       } while (addr < info->end);
> +
> +       finish_asid_transition(info);
> +
> +       /* Wait for the INVLPGBs kicked off above to finish. */
> +       tlbsync();
> +}
> +#endif /* CONFIG_CPU_SUP_AMD */
[...]
> @@ -769,6 +1042,16 @@ static void flush_tlb_func(void *info)
>         if (unlikely(loaded_mm == &init_mm))
>                 return;
>
> +       /* Reload the ASID if transitioning into or out of a broadcast ASID */
> +       if (needs_broadcast_asid_reload(loaded_mm, loaded_mm_asid)) {
> +               switch_mm_irqs_off(NULL, loaded_mm, NULL);
> +               loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
> +       }
> +
> +       /* Broadcast ASIDs are always kept up to date with INVLPGB. */
> +       if (is_broadcast_asid(loaded_mm_asid))
> +               return;

This relies on the mm_broadcast_asid() read in flush_tlb_mm_range()
being ordered after the page table update, correct? And we get that
required ordering from the inc_mm_tlb_gen(), which implies a full
barrier? It might be nice if there were some more comments on this.

>         VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) !=
>                    loaded_mm->context.ctx_id);
>
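(Re that last point: if my understanding of the ordering is right, the
kind of comment I'm imagining above the is_broadcast_asid() check
would be something like:)

	/*
	 * Ordering: the mm_broadcast_asid() read in flush_tlb_mm_range()
	 * happens after the page table update thanks to the full barrier
	 * implied by the atomic64_inc_return() in inc_mm_tlb_gen().
	 */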