From: Brendan Jackman <jackmanb@google.com>
Date: Mon, 10 Feb 2025 15:15:15 +0100
Subject: Re: [PATCH v9 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali Shukla
In-Reply-To: <20250206044346.3810242-10-riel@surriel.com>
References: <20250206044346.3810242-1-riel@surriel.com> <20250206044346.3810242-10-riel@surriel.com>
On Thu, 6 Feb 2025 at 05:47, Rik van Riel wrote:
> +static u16 get_global_asid(void)
> +{
> +
> +        u16 asid;
> +
> +        lockdep_assert_held(&global_asid_lock);
> +
> +        /* The previous allocated ASID is at the top of the address space. */
> +        if (last_global_asid >= MAX_ASID_AVAILABLE - 1)
> +                reset_global_asid_space();
> +
> +        asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, last_global_asid);
> +
> +        if (asid >= MAX_ASID_AVAILABLE) {
> +                /* This should never happen. */
> +                VM_WARN_ONCE(1, "Unable to allocate global ASID despite %d available\n", global_asid_available);

If you'll forgive the nitpicking, please put the last arg on a new line
or otherwise break this up; the rest of this file stays below 100 chars
per line (this one is 113).

> +                return 0;
> +        }
> +
> +        /* Claim this global ASID. */
> +        __set_bit(asid, global_asid_used);
> +        last_global_asid = asid;
> +        global_asid_available--;
> +        return asid;
> +}
> +
> +/*
> + * Returns true if the mm is transitioning from a CPU-local ASID to a global
> + * (INVLPGB) ASID, or the other way around.
> + */
> +static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid)
> +{
> +        u16 global_asid = mm_global_asid(next);
> +
> +        if (global_asid && prev_asid != global_asid)
> +                return true;
> +
> +        if (!global_asid && is_global_asid(prev_asid))
> +                return true;

I think this needs clarification around when switches from
global->nonglobal happen. Maybe a comment, or maybe there's a way to
express the code that makes it obvious. Here's what I currently
understand, please correct me if I'm wrong:

- Once a process gets a global ASID it keeps it forever, so within a
  process we never switch global->nonglobal.

- In flush_tlb_func() we are just calling this to check whether the
  process has just been given a global ASID - there's no way
  loaded_mm_asid can be global while !mm_global_asid(loaded_mm).

- When we call this from switch_mm_irqs_off() we are in the prev==next
  case. Is there something about lazy TLB that can cause the case above
  to happen here?

> +static bool meets_global_asid_threshold(struct mm_struct *mm)
> +{
> +        if (!global_asid_available)

I think we need READ_ONCE() here.

Also, this check doesn't really make sense in this function as it's
currently named. I think we could just inline this whole function into
consider_global_asid(); it would still be nice and readable IMO.
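Roughly what I had in mind - just a sketch, and I'm guessing at
consider_global_asid()'s signature and hand-waving the rest of its body,
since it isn't quoted here:

        static void consider_global_asid(struct mm_struct *mm)
        {
                /*
                 * Racy read of a counter that other CPUs update;
                 * READ_ONCE() documents that and stops the compiler
                 * from re-reading it.
                 */
                if (!READ_ONCE(global_asid_available))
                        return;

                /* ... existing threshold / rate-limiting checks ... */
        }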
> @@ -786,6 +1101,8 @@ static void flush_tlb_func(void *info)
>                  return;
>          }
>
> +        local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen);
> +
>          if (unlikely(f->new_tlb_gen != TLB_GENERATION_INVALID &&
>                       f->new_tlb_gen <= local_tlb_gen)) {
>                  /*
> @@ -953,7 +1270,7 @@ STATIC_NOPV void native_flush_tlb_multi(const struct cpumask *cpumask,
>           * up on the new contents of what used to be page tables, while
>           * doing a speculative memory access.
>           */
> -        if (info->freed_tables)
> +        if (info->freed_tables || in_asid_transition(info->mm))
>                  on_each_cpu_mask(cpumask, flush_tlb_func, (void *)info, true);
>          else
>                  on_each_cpu_cond_mask(should_flush_tlb, flush_tlb_func,
> @@ -1058,9 +1375,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
>           * a local TLB flush is needed. Optimize this use-case by calling
>           * flush_tlb_func_local() directly in this case.
>           */
> -        if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
> +        if (mm_global_asid(mm)) {
> +                broadcast_tlb_flush(info);
> +        } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
>                  info->trim_cpumask = should_trim_cpumask(mm);
>                  flush_tlb_multi(mm_cpumask(mm), info);
> +                consider_global_asid(mm);

Why do we do this here instead of when the CPU enters the mm? Is the
idea that, combined with the jiffies check in consider_global_asid(),
the probability of getting a global ASID (within some time period)
scales with how much TLB flushing the process does? So we avoid using
up ASID space on processes that are multi-threaded but just sit around
with stable VMAs etc.?

I guess another reason would be the bizarre case where we run out of
global ASIDs and then enter one big workload that effectively owns all
the CPUs: that big workload can still get a global ASID later, once the
old processes free them up, even if we enter it before
reset_global_asid_space().

Just to be clear, this isn't an objection - I just want to check
whether I've actually understood the design. I guess it would be worth
having a comment about it, especially if I'm missing something or
totally wrong.
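Maybe something along these lines at the flush_tlb_mm_range() call site
- the wording is only right if I've actually understood the design
above, so treat it as a sketch:

        } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
                info->trim_cpumask = should_trim_cpumask(mm);
                flush_tlb_multi(mm_cpumask(mm), info);
                /*
                 * Deciding on the global ASID upgrade at flush time,
                 * plus the rate limiting inside consider_global_asid(),
                 * means the chance of getting a global ASID roughly
                 * tracks how much TLB flushing this mm actually does,
                 * so ASID space is not spent on multi-threaded
                 * processes that rarely flush.
                 */
                consider_global_asid(mm);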