Message-ID: <2d20c333400b890f4983cf799576435abf1d8824.camel@surriel.com>
Subject: Re: [PATCH v9 10/12] x86/mm: do targeted broadcast flushing from tlbbatch code
From: Rik van Riel <riel@surriel.com>
To: Brendan Jackman
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de,
	peterz@infradead.org, dave.hansen@linux.intel.com,
	zhengqi.arch@bytedance.com, nadav.amit@gmail.com,
	thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, jannh@google.com, mhklinux@outlook.com,
	andrew.cooper3@citrix.com, Manali Shukla
Date: Mon, 10 Feb 2025 22:45:24 -0500
References: <20250206044346.3810242-1-riel@surriel.com>
	<20250206044346.3810242-11-riel@surriel.com>

On Mon, 2025-02-10 at 16:27 +0100, Brendan Jackman wrote:
> On Thu, 6 Feb 2025 at 05:46, Rik van Riel wrote:
> >  /* Wait for INVLPGB originated by this CPU to complete. */
> > -static inline void tlbsync(void)
> > +static inline void __tlbsync(void)
> >  {
> > -       cant_migrate();
>
> Why does this have to go away?

I'm not sure the current task in sched_init() has all the correct bits
set to prevent the warning from firing, but on the flip side it won't
have called INVLPGB yet at that point, so the call to enter_lazy_tlb()
won't actually end up here.

I'll put it back.

>
> > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> > index 234277a5ef89..bf167e215e8e 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -106,6 +106,7 @@ struct tlb_state {
> >          * need to be invalidated.
> >          */
> >         bool invalidate_other;
> > +       bool need_tlbsync;
>
> The ifdeffery is missing here.

Added.

>
> > @@ -794,6 +825,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> >         if (IS_ENABLED(CONFIG_PROVE_LOCKING))
> >                 WARN_ON_ONCE(!irqs_disabled());
> >
> > +       tlbsync();
> > +
> >         /*
> >          * Verify that CR3 is what we think it is.  This will catch
> >          * hypothetical buggy code that directly switches to swapper_pg_dir
> > @@ -973,6 +1006,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> >  */
> >  void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
> >  {
> > +       tlbsync();
> > +
>
> I have a feeling I'll look stupid for asking this, but why do we need
> this and the one in switch_mm_irqs_off()?

This is an architectural thing: TLBSYNC waits for the INVLPGB flushes
issued from the same CPU to finish. That means if we have pending
flushes (from the pageout code), we need to wait for them at context
switch time, before the task can be migrated to another CPU.
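To make that concrete, here is a rough sketch of how the pieces fit
together as I understand them. This is illustrative only, not the
literal patch code, and broadcast_flush_queued() is a made-up name
standing in for the tlbbatch/pageout side:

/*
 * Sketch only: wait for this CPU's outstanding INVLPGB flushes,
 * but only if any were actually issued.
 */
static inline void tlbsync(void)
{
	if (!this_cpu_read(cpu_tlbstate.need_tlbsync))
		return;
	__tlbsync();
	this_cpu_write(cpu_tlbstate.need_tlbsync, false);
}

/*
 * Sketch only: the batched-flush side issues an asynchronous
 * INVLPGB and records that a TLBSYNC is still owed by this CPU.
 */
static inline void broadcast_flush_queued(void)
{
	/* INVLPGB for the batched pages has been issued here ... */
	this_cpu_write(cpu_tlbstate.need_tlbsync, true);
}

Since need_tlbsync can only ever be set on INVLPGB-capable CPUs, the
unconditional tlbsync() calls in switch_mm_irqs_off() and
enter_lazy_tlb() are effectively free everywhere else, and the wait
sites do not need their own cpu_feature_enabled() check.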
>
> > @@ -1661,12 +1694,53 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> >                 local_irq_enable();
> >         }
> >
> > +       /*
> > +        * If we issued (asynchronous) INVLPGB flushes, wait for them here.
> > +        * The cpumask above contains only CPUs that were running tasks
> > +        * not using broadcast TLB flushing.
> > +        */
> > +       if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
>
> It feels wrong that we check the cpufeature here but not in e.g.
> enter_lazy_tlb().
>
> > +               tlbsync();
> > +

We no longer need to check it here, with the change to tlbsync().
Good catch.

-- 
All Rights Reversed.