Date: Tue, 14 Jan 2025 13:13:57 -0800
In-Reply-To: <20250114175143.81438-26-vschneid@redhat.com>
References: <20250114175143.81438-1-vschneid@redhat.com>
 <20250114175143.81438-26-vschneid@redhat.com>
Subject: Re: [PATCH v4 25/30] context_tracking,x86: Defer kernel text patching IPIs
From: Sean Christopherson
To: Valentin Schneider
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
 virtualization@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 loongarch@lists.linux.dev, linux-riscv@lists.infradead.org,
 linux-perf-users@vger.kernel.org, xen-devel@lists.xenproject.org,
 kvm@vger.kernel.org, linux-arch@vger.kernel.org, rcu@vger.kernel.org,
 linux-hardening@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
 bcm-kernel-feedback-list@broadcom.com, Peter Zijlstra,
 Nicolas Saenz Julienne, Juergen Gross, Ajay Kaher, Alexey Makhalov,
 Russell King, Catalin Marinas, Will Deacon, Huacai Chen, WANG Xuerui,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Kan Liang,
 Boris Ostrovsky, Josh Poimboeuf, Pawan Gupta, Paolo Bonzini,
 Andy Lutomirski, Arnd Bergmann, Frederic Weisbecker,
 "Paul E. McKenney", Jason Baron, Steven Rostedt, Ard Biesheuvel,
 Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
 Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang, Juri Lelli,
 Clark Williams, Yair Podemsky, Tomas Glozar, Vincent Guittot,
 Dietmar Eggemann, Ben Segall, Mel Gorman, Kees Cook, Andrew Morton,
 Christoph Hellwig, Shuah Khan, Sami Tolvanen, Miguel Ojeda, Alice Ryhl,
 "Mike Rapoport (Microsoft)", Samuel Holland, Rong Xu,
 Geert Uytterhoeven, Yosry Ahmed, "Kirill A. Shutemov",
 "Masami Hiramatsu (Google)", Jinghao Jia, Luis Chamberlain,
 Randy Dunlap, Tiezhu Yang

On Tue, Jan 14, 2025, Valentin Schneider wrote:
> text_poke_bp_batch() sends IPIs to all online CPUs to synchronize
> them vs the newly patched instruction. CPUs that are executing in userspace
> do not need this synchronization to happen immediately, and this is
> actually harmful interference for NOHZ_FULL CPUs.

...

> This leaves us with static keys and static calls.

...

> @@ -2317,11 +2334,20 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
>  	 * First step: add a int3 trap to the address that will be patched.
>  	 */
>  	for (i = 0; i < nr_entries; i++) {
> -		tp[i].old = *(u8 *)text_poke_addr(&tp[i]);
> -		text_poke(text_poke_addr(&tp[i]), &int3, INT3_INSN_SIZE);
> +		void *addr = text_poke_addr(&tp[i]);
> +
> +		/*
> +		 * There's no safe way to defer IPIs for patching text in
> +		 * .noinstr, record whether there is at least one such poke.
> +		 */
> +		if (is_kernel_noinstr_text((unsigned long)addr))
> +			cond = NULL;

Maybe pre-check "cond", especially if multiple ranges need to be checked?
I.e.

		if (cond && is_kernel_noinstr_text(...))

> +
> +		tp[i].old = *((u8 *)addr);
> +		text_poke(addr, &int3, INT3_INSN_SIZE);
>  	}
>
> -	text_poke_sync();
> +	__text_poke_sync(cond);
>
>  	/*
>  	 * Second step: update all but the first byte of the patched range.

...

> +/**
> + * is_kernel_noinstr_text - checks if the pointer address is located in the
> + *                    .noinstr section
> + *
> + * @addr: address to check
> + *
> + * Returns: true if the address is located in .noinstr, false otherwise.
> + */
> +static inline bool is_kernel_noinstr_text(unsigned long addr)
> +{
> +	return addr >= (unsigned long)__noinstr_text_start &&
> +	       addr < (unsigned long)__noinstr_text_end;
> +}

This doesn't do the right thing for modules, which matters because KVM can
be built as a module on x86, and because context tracking understands
transitions to GUEST mode, i.e. CPUs that are running in a KVM guest will be
treated as not being in the kernel, and thus will have IPIs deferred.  If KVM
uses a static key or branch between guest_state_enter_irqoff() and
guest_state_exit_irqoff(), the patching code won't wait for CPUs to exit
guest mode, i.e. KVM could theoretically use the wrong static path.

I don't expect this to ever cause problems in practice, because patching
code in KVM's VM-Enter/VM-Exit path that has *functional* implications,
while CPUs are actively running guest code, would be all kinds of crazy.
But I do think we should plug the hole.

If this issue is unique to KVM, i.e. is not a generic problem for all
modules (I assume module code generally isn't allowed in the entry path,
even via NMI?), one idea would be to let KVM register its noinstr section
for text poking.
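
To make the two suggestions above concrete, here is a rough userspace model,
not kernel code.  Everything in it is hypothetical: register_noinstr_range(),
the fixed-size table, and the "defer" bool (standing in for a non-NULL "cond")
are assumptions for illustration only, and a real implementation would need
locking and module-unload handling that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one registered [start, end) range of noinstr text. */
struct noinstr_range {
	unsigned long start;
	unsigned long end;
};

#define MAX_NOINSTR_RANGES 8

/* Hypothetical table: slot 0 could be core .noinstr, later slots modules. */
static struct noinstr_range noinstr_ranges[MAX_NOINSTR_RANGES];
static size_t nr_noinstr_ranges;

/* Hypothetical hook a module such as KVM might call at load time. */
static bool register_noinstr_range(unsigned long start, unsigned long end)
{
	if (start >= end || nr_noinstr_ranges == MAX_NOINSTR_RANGES)
		return false;

	noinstr_ranges[nr_noinstr_ranges].start = start;
	noinstr_ranges[nr_noinstr_ranges].end = end;
	nr_noinstr_ranges++;
	return true;
}

/* Multi-range version of the check: true if addr is in any range. */
static bool is_noinstr_text(unsigned long addr)
{
	for (size_t i = 0; i < nr_noinstr_ranges; i++) {
		if (addr >= noinstr_ranges[i].start &&
		    addr < noinstr_ranges[i].end)
			return true;
	}
	return false;
}

/*
 * Models the first loop of the batch: "defer" plays the role of "cond".
 * Once one poke lands in noinstr text, deferral is off for the whole
 * batch, so the pre-check skips the range walk for remaining entries.
 */
static bool can_defer_ipis(const unsigned long *addrs, size_t nr)
{
	bool defer = true;

	assert(addrs || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (defer && is_noinstr_text(addrs[i]))
			defer = false;
		/* The int3 poke for addrs[i] would happen here. */
	}
	return defer;
}
```

With more than one range to walk, the pre-check matters a bit more than in
the single-range case, since each lookup is a loop rather than two compares.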