From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Zi Yan <ziy@nvidia.com>,
Matthew Brost <matthew.brost@intel.com>,
Joshua Hahn <joshua.hahnjy@gmail.com>,
Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
Gregory Price <gourry@gourry.net>,
Ying Huang <ying.huang@linux.alibaba.com>,
Alistair Popple <apopple@nvidia.com>,
Pedro Falcato <pfalcato@suse.de>, Rik van Riel <riel@surriel.com>,
Harry Yoo <harry.yoo@oracle.com>,
Lance Yang <ioworker0@gmail.com>,
Oscar Salvador <osalvador@suse.de>,
Lance Yang <lance.yang@linux.dev>
Subject: [PATCH v2 1/4] mm: convert FPB_IGNORE_* into FPB_RESPECT_*
Date: Wed, 2 Jul 2025 12:49:23 +0200 [thread overview]
Message-ID: <20250702104926.212243-2-david@redhat.com> (raw)
In-Reply-To: <20250702104926.212243-1-david@redhat.com>
Respecting these PTE bits is the exception, so let's invert the meaning.
With this change, most callers don't have to pass any flags. This is
a preparation for splitting folio_pte_batch() into a non-inlined
variant that doesn't consume any flags.
Long-term, we want folio_pte_batch() to probably ignore most common
PTE bits (e.g., write/dirty/young/soft-dirty) that are not relevant for
most page table walkers: uffd-wp and protnone might be bits to consider in
the future. Only walkers that care about them can opt-in to respect them.
No functional change intended.
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/internal.h | 16 ++++++++--------
mm/madvise.c | 3 +--
mm/memory.c | 11 +++++------
mm/mempolicy.c | 4 +---
mm/mlock.c | 3 +--
mm/mremap.c | 3 +--
mm/rmap.c | 3 +--
7 files changed, 18 insertions(+), 25 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index e84217e27778d..170d55b6851ff 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -202,17 +202,17 @@ static inline void vma_close(struct vm_area_struct *vma)
/* Flags for folio_pte_batch(). */
typedef int __bitwise fpb_t;
-/* Compare PTEs after pte_mkclean(), ignoring the dirty bit. */
-#define FPB_IGNORE_DIRTY ((__force fpb_t)BIT(0))
+/* Compare PTEs respecting the dirty bit. */
+#define FPB_RESPECT_DIRTY ((__force fpb_t)BIT(0))
-/* Compare PTEs after pte_clear_soft_dirty(), ignoring the soft-dirty bit. */
-#define FPB_IGNORE_SOFT_DIRTY ((__force fpb_t)BIT(1))
+/* Compare PTEs respecting the soft-dirty bit. */
+#define FPB_RESPECT_SOFT_DIRTY ((__force fpb_t)BIT(1))
static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
{
- if (flags & FPB_IGNORE_DIRTY)
+ if (!(flags & FPB_RESPECT_DIRTY))
pte = pte_mkclean(pte);
- if (likely(flags & FPB_IGNORE_SOFT_DIRTY))
+ if (likely(!(flags & FPB_RESPECT_SOFT_DIRTY)))
pte = pte_clear_soft_dirty(pte);
return pte_wrprotect(pte_mkold(pte));
}
@@ -236,8 +236,8 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
* pages of the same large folio.
*
* All PTEs inside a PTE batch have the same PTE bits set, excluding the PFN,
- * the accessed bit, writable bit, dirty bit (with FPB_IGNORE_DIRTY) and
- * soft-dirty bit (with FPB_IGNORE_SOFT_DIRTY).
+ * the accessed bit, writable bit, dirty bit (unless FPB_RESPECT_DIRTY is set)
+ * and soft-dirty bit (unless FPB_RESPECT_SOFT_DIRTY is set).
*
* start_ptep must map any page of the folio. max_nr must be at least one and
* must be limited by the caller so scanning cannot exceed a single page table.
diff --git a/mm/madvise.c b/mm/madvise.c
index e61e32b2cd91f..661bb743d2216 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -347,10 +347,9 @@ static inline int madvise_folio_pte_batch(unsigned long addr, unsigned long end,
pte_t pte, bool *any_young,
bool *any_dirty)
{
- const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
int max_nr = (end - addr) / PAGE_SIZE;
- return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
+ return folio_pte_batch(folio, addr, ptep, pte, max_nr, 0, NULL,
any_young, any_dirty);
}
diff --git a/mm/memory.c b/mm/memory.c
index 0f9b32a20e5b7..0a47583ca9937 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -990,10 +990,10 @@ copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
* by keeping the batching logic separate.
*/
if (unlikely(!*prealloc && folio_test_large(folio) && max_nr != 1)) {
- if (src_vma->vm_flags & VM_SHARED)
- flags |= FPB_IGNORE_DIRTY;
- if (!vma_soft_dirty_enabled(src_vma))
- flags |= FPB_IGNORE_SOFT_DIRTY;
+ if (!(src_vma->vm_flags & VM_SHARED))
+ flags |= FPB_RESPECT_DIRTY;
+ if (vma_soft_dirty_enabled(src_vma))
+ flags |= FPB_RESPECT_SOFT_DIRTY;
nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags,
&any_writable, NULL, NULL);
@@ -1535,7 +1535,6 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
struct zap_details *details, int *rss, bool *force_flush,
bool *force_break, bool *any_skipped)
{
- const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
struct mm_struct *mm = tlb->mm;
struct folio *folio;
struct page *page;
@@ -1565,7 +1564,7 @@ static inline int zap_present_ptes(struct mmu_gather *tlb,
* by keeping the batching logic separate.
*/
if (unlikely(folio_test_large(folio) && max_nr != 1)) {
- nr = folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags,
+ nr = folio_pte_batch(folio, addr, pte, ptent, max_nr, 0,
NULL, NULL, NULL);
zap_present_folio_ptes(tlb, vma, folio, page, pte, ptent, nr,
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1ff7b2174eb77..2a25eedc3b1c0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -675,7 +675,6 @@ static void queue_folios_pmd(pmd_t *pmd, struct mm_walk *walk)
static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, struct mm_walk *walk)
{
- const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
struct vm_area_struct *vma = walk->vma;
struct folio *folio;
struct queue_pages *qp = walk->private;
@@ -713,8 +712,7 @@ static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
continue;
if (folio_test_large(folio) && max_nr != 1)
nr = folio_pte_batch(folio, addr, pte, ptent,
- max_nr, fpb_flags,
- NULL, NULL, NULL);
+ max_nr, 0, NULL, NULL, NULL);
/*
* vm_normal_folio() filters out zero pages, but there might
* still be reserved folios to skip, perhaps in a VDSO.
diff --git a/mm/mlock.c b/mm/mlock.c
index 3cb72b579ffd3..2238cdc5eb1c1 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -307,14 +307,13 @@ void munlock_folio(struct folio *folio)
static inline unsigned int folio_mlock_step(struct folio *folio,
pte_t *pte, unsigned long addr, unsigned long end)
{
- const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
unsigned int count = (end - addr) >> PAGE_SHIFT;
pte_t ptent = ptep_get(pte);
if (!folio_test_large(folio))
return 1;
- return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
+ return folio_pte_batch(folio, addr, pte, ptent, count, 0, NULL,
NULL, NULL);
}
diff --git a/mm/mremap.c b/mm/mremap.c
index 36585041c760d..d4d3ffc931502 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -173,7 +173,6 @@ static pte_t move_soft_dirty_pte(pte_t pte)
static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep, pte_t pte, int max_nr)
{
- const fpb_t flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
struct folio *folio;
if (max_nr == 1)
@@ -183,7 +182,7 @@ static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned long addr
if (!folio || !folio_test_large(folio))
return 1;
- return folio_pte_batch(folio, addr, ptep, pte, max_nr, flags, NULL,
+ return folio_pte_batch(folio, addr, ptep, pte, max_nr, 0, NULL,
NULL, NULL);
}
diff --git a/mm/rmap.c b/mm/rmap.c
index 34311f654d0c2..98ed3e49e8bbe 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1849,7 +1849,6 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
struct page_vma_mapped_walk *pvmw,
enum ttu_flags flags, pte_t pte)
{
- const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
unsigned long end_addr, addr = pvmw->address;
struct vm_area_struct *vma = pvmw->vma;
unsigned int max_nr;
@@ -1869,7 +1868,7 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
if (pte_unused(pte))
return 1;
- return folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, fpb_flags,
+ return folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, 0,
NULL, NULL, NULL);
}
--
2.49.0
next prev parent reply other threads:[~2025-07-02 10:49 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-02 10:49 [PATCH v2 0/4] mm: folio_pte_batch() improvements David Hildenbrand
2025-07-02 10:49 ` David Hildenbrand [this message]
2025-07-03 10:05 ` [PATCH v2 1/4] mm: convert FPB_IGNORE_* into FPB_RESPECT_* Dev Jain
2025-07-02 10:49 ` [PATCH v2 2/4] mm: smaller folio_pte_batch() improvements David Hildenbrand
2025-07-02 14:24 ` Zi Yan
2025-07-03 10:18 ` Dev Jain
2025-07-02 10:49 ` [PATCH v2 3/4] mm: split folio_pte_batch() into folio_pte_batch() and folio_pte_batch_flags() David Hildenbrand
2025-07-02 14:25 ` Zi Yan
2025-07-03 10:45 ` Dev Jain
2025-07-02 10:49 ` [PATCH v2 4/4] mm: remove boolean output parameters from folio_pte_batch_ext() David Hildenbrand
2025-07-02 14:00 ` Oscar Salvador
2025-07-02 14:40 ` David Hildenbrand
2025-07-02 17:53 ` Oscar Salvador
2025-07-03 9:16 ` Oscar Salvador
2025-07-03 12:56 ` Dev Jain
2025-08-04 8:22 ` Wei Yang
2025-08-04 8:23 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250702104926.212243-2-david@redhat.com \
--to=david@redhat.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=apopple@nvidia.com \
--cc=byungchul@sk.com \
--cc=gourry@gourry.net \
--cc=harry.yoo@oracle.com \
--cc=ioworker0@gmail.com \
--cc=jannh@google.com \
--cc=joshua.hahnjy@gmail.com \
--cc=lance.yang@linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=matthew.brost@intel.com \
--cc=mhocko@suse.com \
--cc=osalvador@suse.de \
--cc=pfalcato@suse.de \
--cc=rakie.kim@sk.com \
--cc=riel@surriel.com \
--cc=rppt@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=ying.huang@linux.alibaba.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox