From: Kevin Brodsky <kevin.brodsky@arm.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
Kevin Brodsky <kevin.brodsky@arm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Andreas Larsson <andreas@gaisler.com>,
Andrew Morton <akpm@linux-foundation.org>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jann Horn <jannh@google.com>, Juergen Gross <jgross@suse.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Ryan Roberts <ryan.roberts@arm.com>,
Suren Baghdasaryan <surenb@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
Yeoreum Yun <yeoreum.yun@arm.com>,
linux-arm-kernel@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
xen-devel@lists.xenproject.org, x86@kernel.org
Subject: [PATCH v3 00/13] Nesting support for lazy MMU mode
Date: Wed, 15 Oct 2025 09:27:14 +0100
Message-ID: <20251015082727.2395128-1-kevin.brodsky@arm.com>
When the lazy MMU mode was introduced eons ago, it wasn't made clear
whether such a sequence was legal:
  arch_enter_lazy_mmu_mode()
  ...
    arch_enter_lazy_mmu_mode()
    ...
    arch_leave_lazy_mmu_mode()
  ...
  arch_leave_lazy_mmu_mode()
It seems fair to say that nested calls to
arch_{enter,leave}_lazy_mmu_mode() were not expected, and most
architectures never explicitly supported them.
Ryan Roberts' series from March [1] attempted to prevent nesting from
ever occurring, and mostly succeeded. Unfortunately, a corner case
(DEBUG_PAGEALLOC) may still cause nesting to occur on arm64. Ryan
proposed [2] to address that corner case at the generic level but this
approach received pushback; [3] then attempted to solve the issue on
arm64 only, but it was deemed too fragile.
It feels generally difficult to guarantee that lazy_mmu sections don't
nest, because callers of various standard mm functions do not know if
the function uses lazy_mmu itself. This series therefore performs a
U-turn and adds support for nested lazy_mmu sections, on all
architectures.
v3 is a full rewrite of the series based on the feedback from David
Hildenbrand on v2. Nesting is now handled using a counter in task_struct
(patch 7), like other APIs such as pagefault_{disable,enable}().
This is fully handled in a new generic layer in <linux/pgtable.h>; the
existing arch_* API remains unchanged. A new pair of calls,
lazy_mmu_mode_{pause,resume}(), is also introduced to allow functions
that are called with the lazy MMU mode enabled to temporarily pause it,
regardless of nesting.
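
As a rough illustration of the counter-based scheme described above, here
is a user-space model; it is a sketch, not the code from patch 7. The
names lazy_mmu_mode_enable()/lazy_mmu_mode_disable(), the struct layout,
the global `state` variable and the call counters are all assumptions made
for the example, as is the choice to flush when a nested section ends.
Sections opened while the mode is paused are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state added to task_struct. */
struct lazy_mmu_state {
	unsigned int nesting;	/* number of currently open sections */
	bool paused;		/* true between pause() and resume() */
};

static struct lazy_mmu_state state;	/* models the current task's state */
static bool arch_active;		/* true while the arch is batching */
static int arch_enter_calls, arch_leave_calls, arch_flush_calls;

/* Stubs for the arch callbacks: they only record that they were called. */
static void arch_enter_lazy_mmu_mode(void) { arch_active = true; arch_enter_calls++; }
static void arch_leave_lazy_mmu_mode(void) { arch_active = false; arch_leave_calls++; }
static void arch_flush_lazy_mmu_mode(void) { arch_flush_calls++; }

/* Only the outermost enable reaches the arch callback. */
static void lazy_mmu_mode_enable(void)
{
	if (state.nesting++ == 0)
		arch_enter_lazy_mmu_mode();
}

/* Only the outermost disable leaves the mode; assumed: nested ones flush. */
static void lazy_mmu_mode_disable(void)
{
	assert(state.nesting > 0);
	if (--state.nesting == 0)
		arch_leave_lazy_mmu_mode();
	else
		arch_flush_lazy_mmu_mode();
}

/* Temporarily leave the mode without touching the nesting count. */
static void lazy_mmu_mode_pause(void)
{
	if (state.nesting > 0 && !state.paused) {
		state.paused = true;
		arch_leave_lazy_mmu_mode();
	}
}

/* Undo a previous pause, whatever the nesting depth was. */
static void lazy_mmu_mode_resume(void)
{
	if (state.paused) {
		state.paused = false;
		arch_enter_lazy_mmu_mode();
	}
}

static bool in_lazy_mmu_mode(void)
{
	return state.nesting > 0 && !state.paused;
}

/* Replays the nested sequence from the top of this letter. */
static int demo_nested_sequence(void)
{
	lazy_mmu_mode_enable();		/* outer: arch_enter */
	lazy_mmu_mode_enable();		/* inner: counter only */
	lazy_mmu_mode_disable();	/* inner: flush, mode stays on */
	lazy_mmu_mode_disable();	/* outer: arch_leave */
	return arch_enter_calls;
}
```

In this model the nested sequence results in exactly one arch_enter and one
arch_leave call, and pause/resume toggle the arch state while leaving the
nesting count alone, so the enclosing sections still unwind normally.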
An arch now opts in to using the lazy MMU mode by selecting
CONFIG_ARCH_LAZY_MMU; this is more appropriate now that we have a
generic API, especially with state conditionally added to task_struct.
The overall approach is very close to what David proposed on v2 [4].
Unlike in v1/v2, no special provision is made for architectures to
save/restore extra state when entering/leaving the mode. Based on the
discussions so far, this does not seem to be required - an arch can
store any relevant state in thread_struct during arch_enter() and
restore it in arch_leave(). Nesting is not a concern as these functions
are only called at the top level, not in nested sections.
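
To make the last point concrete, the following user-space sketch shows why
a single save slot is enough when the arch callbacks run only at the top
level. Everything arch-specific here is hypothetical: the `hw_mode`
variable, the `saved_hw_mode` field, HW_MODE_BATCHING and the simplified
`struct task` exist only for the example, and the generic helpers are
reduced to a bare counter.

```c
#include <assert.h>

#define HW_MODE_BATCHING 0xbUL	/* hypothetical "batching" hardware mode */

/* Hypothetical arch state kept per thread. */
struct thread_struct {
	unsigned long saved_hw_mode;
};

/* Simplified model of a task: nesting counter plus arch state. */
struct task {
	unsigned int lazy_mmu_nesting;
	struct thread_struct thread;
};

static struct task current_task;
static unsigned long hw_mode = 0x7;	/* arbitrary initial hardware mode */

/* Called for the outermost section only, so the slot is written once. */
static void arch_enter_lazy_mmu_mode(void)
{
	current_task.thread.saved_hw_mode = hw_mode;
	hw_mode = HW_MODE_BATCHING;
}

/* Called when the outermost section ends: restore what was saved. */
static void arch_leave_lazy_mmu_mode(void)
{
	hw_mode = current_task.thread.saved_hw_mode;
}

static void lazy_mmu_mode_enable(void)
{
	if (current_task.lazy_mmu_nesting++ == 0)
		arch_enter_lazy_mmu_mode();
}

static void lazy_mmu_mode_disable(void)
{
	assert(current_task.lazy_mmu_nesting > 0);
	if (--current_task.lazy_mmu_nesting == 0)
		arch_leave_lazy_mmu_mode();
}

/* Opens two nested sections, closes both, and returns the final mode. */
static unsigned long demo_nested_save_restore(void)
{
	lazy_mmu_mode_enable();
	lazy_mmu_mode_enable();
	lazy_mmu_mode_disable();
	lazy_mmu_mode_disable();
	return hw_mode;
}
```

If arch_enter_lazy_mmu_mode() were also invoked for the inner section, it
would overwrite the slot with HW_MODE_BATCHING and the original mode would
be lost; gating the callbacks on the counter avoids that.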
The introduction of a generic layer, and tracking of the lazy MMU state
in task_struct, also makes it possible to streamline the arch callbacks -
this series removes 72 lines from arch/.
Patch overview:
* Patch 1: cleanup - avoids having to deal with the powerpc
  context-switching code
* Patches 2-4: prepare arch_flush_lazy_mmu_mode() to be called from the
  generic layer (patch 7)
* Patches 5-6: new API + CONFIG_ARCH_LAZY_MMU
* Patch 7: nesting support
* Patches 8-13: move as much handling as possible to the generic layer
This series has been tested by running the mm kselftests on arm64 with
DEBUG_VM, DEBUG_PAGEALLOC and KFENCE. It was also build-tested on other
architectures (with and without XEN_PV on x86).
- Kevin
[1] https://lore.kernel.org/all/20250303141542.3371656-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/all/20250530140446.2387131-1-ryan.roberts@arm.com/
[3] https://lore.kernel.org/all/20250606135654.178300-1-ryan.roberts@arm.com/
[4] https://lore.kernel.org/all/ef343405-c394-4763-a79f-21381f217b6c@redhat.com/
---
Changelog
v2..v3:
- Full rewrite; dropped all Acked-by/Reviewed-by.
- Rebased on v6.18-rc1.
v2: https://lore.kernel.org/all/20250908073931.4159362-1-kevin.brodsky@arm.com/
v1..v2:
- Rebased on mm-unstable.
- Patch 2: handled new calls to enter()/leave(), clarified how the "flush"
pattern (leave() followed by enter()) is handled.
- Patch 5,6: removed unnecessary local variable [Alexander Gordeev's
suggestion].
- Added Mike Rapoport's Acked-by.
v1: https://lore.kernel.org/all/20250904125736.3918646-1-kevin.brodsky@arm.com/
---
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yeoreum Yun <yeoreum.yun@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: x86@kernel.org
---
Alexander Gordeev (1):
powerpc/64s: Do not re-activate batched TLB flush
Kevin Brodsky (12):
x86/xen: simplify flush_lazy_mmu()
powerpc/mm: implement arch_flush_lazy_mmu_mode()
sparc/mm: implement arch_flush_lazy_mmu_mode()
mm: introduce CONFIG_ARCH_LAZY_MMU
mm: introduce generic lazy_mmu helpers
mm: enable lazy_mmu sections to nest
arm64: mm: replace TIF_LAZY_MMU with in_lazy_mmu_mode()
powerpc/mm: replace batch->active with in_lazy_mmu_mode()
sparc/mm: replace batch->active with in_lazy_mmu_mode()
x86/xen: use lazy_mmu_state when context-switching
mm: bail out of lazy_mmu_mode_* in interrupt context
mm: introduce arch_wants_lazy_mmu_mode()
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/pgtable.h | 46 +------
arch/arm64/include/asm/thread_info.h | 3 +-
arch/arm64/mm/mmu.c | 4 +-
arch/arm64/mm/pageattr.c | 4 +-
.../include/asm/book3s/64/tlbflush-hash.h | 25 ++--
arch/powerpc/include/asm/thread_info.h | 2 -
arch/powerpc/kernel/process.c | 25 ----
arch/powerpc/mm/book3s64/hash_tlb.c | 10 +-
arch/powerpc/mm/book3s64/subpage_prot.c | 4 +-
arch/powerpc/platforms/Kconfig.cputype | 1 +
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/tlbflush_64.h | 5 +-
arch/sparc/mm/tlb.c | 14 +--
arch/x86/Kconfig | 1 +
arch/x86/boot/compressed/misc.h | 1 +
arch/x86/boot/startup/sme.c | 1 +
arch/x86/include/asm/paravirt.h | 1 -
arch/x86/include/asm/pgtable.h | 3 +-
arch/x86/include/asm/thread_info.h | 4 +-
arch/x86/xen/enlighten_pv.c | 3 +-
arch/x86/xen/mmu_pv.c | 9 +-
fs/proc/task_mmu.c | 4 +-
include/linux/mm_types_task.h | 5 +
include/linux/pgtable.h | 114 +++++++++++++++++-
include/linux/sched.h | 19 +++
mm/Kconfig | 3 +
mm/kasan/shadow.c | 8 +-
mm/madvise.c | 18 +--
mm/memory.c | 16 +--
mm/migrate_device.c | 4 +-
mm/mprotect.c | 4 +-
mm/mremap.c | 4 +-
mm/userfaultfd.c | 4 +-
mm/vmalloc.c | 12 +-
mm/vmscan.c | 12 +-
36 files changed, 226 insertions(+), 169 deletions(-)
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
--
2.47.0