From: Brendan Jackman
Date: Tue, 11 Feb 2025 11:02:27 +0100
Subject: Re: [PATCH v9 10/12] x86/mm: do targeted broadcast flushing from tlbbatch code
To: Rik van Riel
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Manali Shukla
References: <20250206044346.3810242-1-riel@surriel.com> <20250206044346.3810242-11-riel@surriel.com> <2d20c333400b890f4983cf799576435abf1d8824.camel@surriel.com>
In-Reply-To: <2d20c333400b890f4983cf799576435abf1d8824.camel@surriel.com>

On Tue, 11 Feb 2025 at 04:50, Rik van Riel wrote:
>
> On Mon, 2025-02-10 at 16:27 +0100, Brendan Jackman wrote:
> > On Thu, 6 Feb 2025 at 05:46, Rik van Riel wrote:
> > >  /* Wait for INVLPGB originated by this CPU to complete. */
> > > -static inline void tlbsync(void)
> > > +static inline void __tlbsync(void)
> > >  {
> > > -	cant_migrate();
> >
> > Why does this have to go away?
>
> I'm not sure the current task in sched_init() has
> all the correct bits set to prevent the warning
> from firing, but on the flip side it won't have
> called INVLPGB yet at that point, so the call to
> enter_lazy_tlb() won't actually end up here.
>
> I'll put it back.

Sounds good. FWIW I think if we do run into early-boot code hitting
false DEBUG_ATOMIC_SLEEP warnings, the best response might be to
update the DEBUG_ATOMIC_SLEEP code. Like maybe there's a more targeted
solution but something roughly equivalent to checking
if (system_state == SYSTEM_SCHEDULING) before the warning.

> > > @@ -794,6 +825,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> > >  	if (IS_ENABLED(CONFIG_PROVE_LOCKING))
> > >  		WARN_ON_ONCE(!irqs_disabled());
> > >
> > > +	tlbsync();
> > > +
> > >  	/*
> > >  	 * Verify that CR3 is what we think it is. This will catch
> > >  	 * hypothetical buggy code that directly switches to swapper_pg_dir
> > > @@ -973,6 +1006,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> > >   */
> > >  void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
> > >  {
> > > +	tlbsync();
> > > +
> >
> > I have a feeling I'll look stupid for asking this, but why do we need
> > this and the one in switch_mm_irqs_off()?
>
> This is an architectural thing: TLBSYNC waits for
> the INVLPGB flushes to finish that were issued
> from the same CPU.
>
> That means if we have pending flushes (from the
> pageout code), we need to wait for them at context
> switch time, before the task could potentially be
> migrated to another CPU.

Oh right thanks, that makes sense.
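
The constraint Rik describes can be sketched as a toy userspace model. This is not kernel code: the model_* helpers and the pending[] counter are made up for illustration, and the only assumption carried over from the thread is that TLBSYNC waits solely for INVLPGB flushes issued from the same CPU.

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Broadcast flushes issued by each CPU that nobody has waited for yet. */
static int pending[NR_CPUS];

/* Model of INVLPGB: asynchronous, tracked against the issuing CPU. */
static void model_invlpgb(int cpu)
{
	pending[cpu]++;
}

/* Model of TLBSYNC: only waits for flushes issued by this same CPU. */
static void model_tlbsync(int cpu)
{
	pending[cpu] = 0;
}

/*
 * A task issues a flush on CPU 0, is switched out and migrated to
 * CPU 1, then does its final TLBSYNC there. Returns true if every
 * flush has completed by the time the task believes it is done.
 */
static bool flush_then_migrate(bool sync_at_context_switch)
{
	pending[0] = pending[1] = 0;

	model_invlpgb(0);
	if (sync_at_context_switch)
		model_tlbsync(0);	/* still on the issuing CPU */

	/* ... migration to CPU 1 would happen here ... */
	model_tlbsync(1);		/* cannot see CPU 0's flushes */

	return pending[0] == 0 && pending[1] == 0;
}
```

In this model, flush_then_migrate(true) reports that everything completed, while flush_then_migrate(false) leaves CPU 0's flush outstanding, which is the case the tlbsync() at context switch time is meant to rule out.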
So I think here we're encoding the assumption that context_switch()
always calls either enter_lazy_tlb() or switch_mm_irqs_off(), which is
a little awkward, plus the job of these functions is already kinda
hazy and this makes it even hazier. What about doing it in
arch_start_context_switch() instead?

That would mean a bit of plumbing since we'd still wanna have the
tlbsync() in tlb.c, but that seems worth it to me. Plus, having it in
one place would give us a spot to add a comment. Now that you point it
out it does indeed seem obvious but it didn't seem so yesterday.

Now I think about it... if we always tlbsync() before a context
switch, is the cant_migrate() above actually required? I think with
that, even if we migrated in the middle of e.g.
broadcast_kernel_range_flush(), we'd be fine? (At least, from the
specific perspective of the invlpgb code, presumably having preemption
on there would break things horribly in other ways).
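
A rough userspace sketch of the arch_start_context_switch() idea, to show the shape rather than the actual plumbing. Everything here is an assumption made for illustration: the model_* functions only borrow names from the thread, the flush_pending flag stands in for whatever state tlb.c would really track, and the real hook takes arguments that are omitted here.

```c
/* Toy userspace model, NOT kernel code: names mirror the thread but are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm { int id; };

static int tlbsync_calls;	/* how many times the model TLBSYNC ran */
static bool flush_pending;	/* an INVLPGB was issued and not yet synced */

/* Stand-in for tlbsync() in tlb.c: only does work if a flush is pending. */
static void model_tlbsync(void)
{
	if (flush_pending) {
		flush_pending = false;
		tlbsync_calls++;
	}
}

/*
 * The single place where pending broadcast flushes are waited for,
 * before the task can be migrated to another CPU.
 */
static void model_arch_start_context_switch(void)
{
	model_tlbsync();
}

/* With the hook above, neither path needs its own sync. */
static void model_enter_lazy_tlb(void) { }
static void model_switch_mm_irqs_off(struct mm *next) { (void)next; }

/* Stand-in for context_switch(): next_mm == NULL means a kernel thread. */
static void model_context_switch(struct mm *next_mm)
{
	model_arch_start_context_switch();

	if (next_mm == NULL)
		model_enter_lazy_tlb();
	else
		model_switch_mm_irqs_off(next_mm);
}

/* Issue a flush, switch, and report whether it was synced exactly once. */
static bool synced_once_on_switch(struct mm *next_mm)
{
	tlbsync_calls = 0;
	flush_pending = true;
	model_context_switch(next_mm);
	return !flush_pending && tlbsync_calls == 1;
}
```

Both the lazy-TLB path and the mm-switch path go through the same hook in this model, so the sync no longer depends on context_switch() always reaching one of those two functions.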