From: Jann Horn <jannh@google.com>
Date: Mon, 13 Jan 2025 18:48:36 +0100
Subject: Re: [PATCH v4 10/12] x86,tlb: do targeted broadcast flushing from tlbbatch code
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org
References: <20250112155453.1104139-1-riel@surriel.com> <20250112155453.1104139-11-riel@surriel.com>
On Mon, Jan 13, 2025 at 6:05 PM Jann Horn wrote:
> On Sun, Jan 12, 2025 at 4:55 PM Rik van Riel wrote:
> > Instead of doing a system-wide TLB flush from arch_tlbbatch_flush,
> > queue up asynchronous, targeted flushes from arch_tlbbatch_add_pending.
> >
> > This also allows us to avoid adding the CPUs of processes using broadcast
> > flushing to the batch->cpumask, and will hopefully further reduce TLB
> > flushing from the reclaim and compaction paths.
> [...]
> > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > index 80375ef186d5..532911fbb12a 100644
> > --- a/arch/x86/mm/tlb.c
> > +++ b/arch/x86/mm/tlb.c
> > @@ -1658,9 +1658,7 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> >          * a local TLB flush is needed. Optimize this use-case by calling
> >          * flush_tlb_func_local() directly in this case.
> >          */
> > -       if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
> > -               invlpgb_flush_all_nonglobals();
> > -       } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
> > +       if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
> >                 flush_tlb_multi(&batch->cpumask, info);
> >         } else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
> >                 lockdep_assert_irqs_enabled();
> > @@ -1669,12 +1667,49 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> >                 local_irq_enable();
> >         }
> >
> > +       /*
> > +        * If we issued (asynchronous) INVLPGB flushes, wait for them here.
> > +        * The cpumask above contains only CPUs that were running tasks
> > +        * not using broadcast TLB flushing.
> > +        */
> > +       if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && batch->used_invlpgb) {
> > +               tlbsync();
> > +               migrate_enable();
> > +               batch->used_invlpgb = false;
> > +       }
> > +
> >         cpumask_clear(&batch->cpumask);
> >
> >         put_flush_tlb_info();
> >         put_cpu();
> >  }
> >
> > +void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
> > +                              struct mm_struct *mm,
> > +                              unsigned long uaddr)
> > +{
> > +       if (static_cpu_has(X86_FEATURE_INVLPGB) && mm_global_asid(mm)) {
> > +               u16 asid = mm_global_asid(mm);
> > +               /*
> > +                * Queue up an asynchronous invalidation. The corresponding
> > +                * TLBSYNC is done in arch_tlbbatch_flush(), and must be done
> > +                * on the same CPU.
> > +                */
> > +               if (!batch->used_invlpgb) {
> > +                       batch->used_invlpgb = true;
> > +                       migrate_disable();
> > +               }
> > +               invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false);
> > +               /* Do any CPUs supporting INVLPGB need PTI? */
> > +               if (static_cpu_has(X86_FEATURE_PTI))
> > +                       invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false);
> > +       } else {
> > +               inc_mm_tlb_gen(mm);
> > +               cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
> > +       }
> > +       mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
> > +}
>
> How does this work if the MM is currently transitioning to a global
> ASID? Should the "mm_global_asid(mm)" check maybe be replaced with
> something that checks if the MM has fully transitioned to a global
> ASID, so that we keep using the classic path if there might be holdout
> CPUs?

Ah, but if we did that, we'd also have to ensure that the MM switching
path keeps invalidating the TLB when the MM's TLB generation count
increments, even if the CPU has already switched to the global ASID.