From: Alexander Gordeev <agordeev@linux.ibm.com>
To: Kevin Brodsky <kevin.brodsky@arm.com>,
David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/6] mm: Make lazy MMU mode context-aware
Date: Wed, 15 Apr 2026 17:01:19 +0200
Message-ID: <8809412aaed8a515fe2e149c822543d640060936.1776264097.git.agordeev@linux.ibm.com>
In-Reply-To: <cover.1776264097.git.agordeev@linux.ibm.com>

Lazy MMU mode is assumed to be context-independent, in the sense
that it does not require any additional information to operate.
The s390 architecture, however, benefits from knowing the exact
page table entries being modified.

Introduce lazy_mmu_mode_enable_for_pte_range(), which takes the
process address space and the page table range being operated on.
This information is required to enable s390-specific optimizations.
The parameters match those typically passed to page-table level
walkers, which guarantees that the span of PTE entries never
crosses a page table boundary.

Architectures that do not need this information simply do not
define the arch_enter_lazy_mmu_mode_for_pte_range() callback, in
which case it falls back to arch_enter_lazy_mmu_mode().
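
For illustration, a minimal sketch of how an architecture could
override the new hook; the real s390 implementation comes later in
this series, and apart from arch_enter_lazy_mmu_mode_for_pte_range()
and arch_enter_lazy_mmu_mode() every name below is hypothetical:

	/* arch/foo/include/asm/pgtable.h (hypothetical) */
	struct foo_lazy_mmu_state {
		struct mm_struct *mm;
		unsigned long addr;
		unsigned long end;
		pte_t *ptep;
	};
	DECLARE_PER_CPU(struct foo_lazy_mmu_state, foo_lazy_mmu_state);

	static inline void arch_enter_lazy_mmu_mode_for_pte_range(
			struct mm_struct *mm, unsigned long addr,
			unsigned long end, pte_t *ptep)
	{
		/* Record the PTE range so later updates can be batched */
		struct foo_lazy_mmu_state *state = this_cpu_ptr(&foo_lazy_mmu_state);

		state->mm = mm;
		state->addr = addr;
		state->end = end;
		state->ptep = ptep;

		arch_enter_lazy_mmu_mode();
	}
	#define arch_enter_lazy_mmu_mode_for_pte_range \
		arch_enter_lazy_mmu_mode_for_pte_range

The generic lazy_mmu_mode_enable_for_pte_range() invokes the hook
only on the outermost enable, so nested sections need no special
handling on the architecture side.
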
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
fs/proc/task_mmu.c | 2 +-
include/linux/pgtable.h | 47 +++++++++++++++++++++++++++++++++++++++++
mm/madvise.c | 8 +++----
mm/memory.c | 8 +++----
mm/mprotect.c | 2 +-
mm/mremap.c | 2 +-
mm/vmalloc.c | 6 +++---
7 files changed, 61 insertions(+), 14 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e091931d7ca1..799db0d7ec8b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2752,7 +2752,7 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
return 0;
}
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(vma->vm_mm, start, end, start_pte);
if ((p->arg.flags & PM_SCAN_WP_MATCHING) && !p->vec_out) {
/* Fast path for performing exclusive WP */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index a50df42a893f..9ff7b78d65b1 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -271,6 +271,51 @@ static inline void lazy_mmu_mode_enable(void)
arch_enter_lazy_mmu_mode();
}
+#ifndef arch_enter_lazy_mmu_mode_for_pte_range
+static inline void arch_enter_lazy_mmu_mode_for_pte_range(struct mm_struct *mm,
+ unsigned long addr, unsigned long end, pte_t *ptep)
+{
+ arch_enter_lazy_mmu_mode();
+}
+#endif
+
+/**
+ * lazy_mmu_mode_enable_for_pte_range() - Enable the lazy MMU mode with a PTE range hint.
+ * @mm: Address space the pages are mapped into.
+ * @addr: Start address of the range.
+ * @end: End address of the range (exclusive).
+ * @ptep: Page table pointer for the first entry.
+ *
+ * Enters a new lazy MMU mode section; if the mode was not already enabled,
+ * enables it and calls arch_enter_lazy_mmu_mode_for_pte_range().
+ *
+ * Updates to PTEs within the specified range may be accelerated by the
+ * architecture. The range must belong to the specified address space and
+ * must not cross a page table boundary.
+ *
+ * PTE updates within the range may happen in any order and do not need
+ * to cover the entire range.
+ *
+ * Must be paired with a call to lazy_mmu_mode_disable().
+ *
+ * Has no effect if called:
+ * - While paused - see lazy_mmu_mode_pause()
+ * - In interrupt context
+ */
+static inline void lazy_mmu_mode_enable_for_pte_range(struct mm_struct *mm,
+ unsigned long addr, unsigned long end, pte_t *ptep)
+{
+ struct lazy_mmu_state *state = &current->lazy_mmu_state;
+
+ if (in_interrupt() || state->pause_count > 0)
+ return;
+
+ VM_WARN_ON_ONCE(state->enable_count == U8_MAX);
+
+ if (state->enable_count++ == 0)
+ arch_enter_lazy_mmu_mode_for_pte_range(mm, addr, end, ptep);
+}
+
/**
* lazy_mmu_mode_disable() - Disable the lazy MMU mode.
*
@@ -353,6 +398,8 @@ static inline void lazy_mmu_mode_resume(void)
}
#else
static inline void lazy_mmu_mode_enable(void) {}
+static inline void lazy_mmu_mode_enable_for_pte_range(struct mm_struct *mm,
+ unsigned long addr, unsigned long end, pte_t *ptep) {}
static inline void lazy_mmu_mode_disable(void) {}
static inline void lazy_mmu_mode_pause(void) {}
static inline void lazy_mmu_mode_resume(void) {}
diff --git a/mm/madvise.c b/mm/madvise.c
index dbb69400786d..7faac3a627ff 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -451,7 +451,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
if (!start_pte)
return 0;
flush_tlb_batched_pending(mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, start_pte);
for (; addr < end; pte += nr, addr += nr * PAGE_SIZE) {
nr = 1;
ptent = ptep_get(pte);
@@ -506,7 +506,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
if (!start_pte)
break;
flush_tlb_batched_pending(mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, start_pte);
if (!err)
nr = 0;
continue;
@@ -673,7 +673,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
if (!start_pte)
return 0;
flush_tlb_batched_pending(mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, start_pte);
for (; addr != end; pte += nr, addr += PAGE_SIZE * nr) {
nr = 1;
ptent = ptep_get(pte);
@@ -733,7 +733,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
if (!start_pte)
break;
flush_tlb_batched_pending(mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, start_pte);
if (!err)
nr = 0;
continue;
diff --git a/mm/memory.c b/mm/memory.c
index c65e82c86fed..4c0f266df92a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1269,7 +1269,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
orig_src_pte = src_pte;
orig_dst_pte = dst_pte;
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(src_mm, addr, end, src_pte);
do {
nr = 1;
@@ -1917,7 +1917,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
return addr;
flush_tlb_batched_pending(mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, start_pte);
do {
bool any_skipped = false;
@@ -2875,7 +2875,7 @@ static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
mapped_pte = pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
if (!pte)
return -ENOMEM;
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, mapped_pte);
do {
BUG_ON(!pte_none(ptep_get(pte)));
if (!pfn_modify_allowed(pfn, prot)) {
@@ -3235,7 +3235,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
return -EINVAL;
}
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, addr, end, mapped_pte);
if (fn) {
do {
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c0571445bef7..a7bfb4516dc5 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -233,7 +233,7 @@ static long change_pte_range(struct mmu_gather *tlb,
is_private_single_threaded = vma_is_single_threaded_private(vma);
flush_tlb_batched_pending(vma->vm_mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(vma->vm_mm, addr, end, pte);
do {
nr_ptes = 1;
oldpte = ptep_get(pte);
diff --git a/mm/mremap.c b/mm/mremap.c
index 2be876a70cc0..16320242da51 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -260,7 +260,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
if (new_ptl != old_ptl)
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
flush_tlb_batched_pending(vma->vm_mm);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(mm, old_addr, old_end, old_ptep);
for (; old_addr < old_end; old_ptep += nr_ptes, old_addr += nr_ptes * PAGE_SIZE,
new_ptep += nr_ptes, new_addr += nr_ptes * PAGE_SIZE) {
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 61caa55a4402..35a23044a969 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -108,7 +108,7 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
if (!pte)
return -ENOMEM;
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(&init_mm, addr, end, pte);
do {
if (unlikely(!pte_none(ptep_get(pte)))) {
@@ -371,7 +371,7 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
unsigned long size = PAGE_SIZE;
pte = pte_offset_kernel(pmd, addr);
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(&init_mm, addr, end, pte);
do {
#ifdef CONFIG_HUGETLB_PAGE
@@ -538,7 +538,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (!pte)
return -ENOMEM;
- lazy_mmu_mode_enable();
+ lazy_mmu_mode_enable_for_pte_range(&init_mm, addr, end, pte);
do {
struct page *page = pages[*nr];
--
2.51.0