From: Wenchao Hao <haowenchao22@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Wenchao Hao <haowenchao22@gmail.com>
Subject: [RFC PATCH] mm: only set fault address' access bit in do_anonymous_page
Date: Tue, 10 Feb 2026 12:34:56 +0800
Message-ID: <20260210043456.2137482-1-haowenchao22@gmail.com>

When do_anonymous_page() creates mappings for huge pages, it currently sets
the access bit for all mapped PTEs (Page Table Entries) by default.

As a result, the Referenced field in /proc/pid/smaps cannot distinguish
pages that were actually accessed from pages that were only mapped.

This patch introduces a new helper, set_anon_ptes(), which sets the access
bit only for the PTE that maps the faulting address. This lets
/proc/pid/smaps report page access status accurately until memory reclaim
scans the folio.
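
As a rough illustration only, the intended behaviour can be modelled in
userspace. Everything named toy_* below is hypothetical and is not kernel
API; struct toy_pte is a stand-in for pte_t that holds only a frame number
and an access bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL

/* Hypothetical stand-in for pte_t: a frame number and an access bit. */
struct toy_pte {
	unsigned long pfn;
	bool young;
};

/*
 * Userspace model of the proposed loop: install nr consecutive entries
 * starting at addr, keeping the access bit only on the entry that maps
 * fault_addr, and only if the template entry was young to begin with.
 */
static void toy_set_anon_ptes(struct toy_pte *ptep, unsigned long addr,
			      unsigned long fault_addr, struct toy_pte pte,
			      unsigned int nr)
{
	bool young = pte.young;
	unsigned int i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		pte.young = young && addr == fault_addr;
		ptep[i] = pte;
		addr += TOY_PAGE_SIZE;
		pte.pfn++;	/* models pte_next_pfn() */
	}
}

/* Count how many of the nr entries have the access bit set. */
static unsigned int toy_count_young(const struct toy_pte *ptep,
				    unsigned int nr)
{
	unsigned int i, n = 0;

	for (i = 0; i < nr; i++)
		n += ptep[i].young;
	return n;
}
```

In this model, mapping 16 entries with a fault on the sixth page leaves
exactly one young entry, where set_ptes() would have left all 16 young.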

During memory reclaim, folio_referenced() walks the rmap, checking and
clearing the access bit of every PTE that maps the folio. If any PTE that
maps a subpage of the folio has the access bit set, the folio is retained.
Setting the access bit only for the faulting PTE in do_anonymous_page() is
therefore safe: it does not interfere with reclaim decisions.
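
The argument above can be sketched as a toy userspace model. This is a
deliberate simplification and an assumption on my side: the real reclaim
path weighs more inputs than the access bits, and the toy_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model: one bool per PTE of a folio stands in for the access bit.
 * Returns how many entries were young and clears each one on the way,
 * loosely mirroring the test-and-clear done during the rmap walk.
 */
static unsigned int toy_folio_referenced(bool *young, unsigned int nr)
{
	unsigned int i, referenced = 0;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		if (young[i]) {
			young[i] = false;
			referenced++;
		}
	}
	return referenced;
}

/* Simplified decision: keep the folio if any entry was young. */
static bool toy_keep_folio(bool *young, unsigned int nr)
{
	return toy_folio_referenced(young, nr) > 0;
}
```

With a single young entry out of 16, the first scan keeps the folio and the
second scan does not, which is the same outcome as with all 16 young.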

The patch only takes effect on architectures without a custom set_ptes()
implementation (e.g., x86). On ARM64 and other architectures that define
their own, set_anon_ptes() falls back to set_ptes(), so they are not yet
supported.

Additionally, I have some questions regarding the contiguous page tables
for 64K huge pages on the ARM64 architecture.

Commit 4602e5757bcc ("arm64/mm: wire up PTE_CONT for user mappings")
describes the semantics as follows:

> Since a contpte block only has a single access and dirty bit, the semantic
> here changes slightly; when getting a pte (e.g.  ptep_get()) that is part
> of a contpte mapping, the access and dirty information are pulled from the
> block (so all ptes in the block return the same access/dirty info).

However, the ARM64 manual states:

> If hardware updates a translation table entry, and if the Contiguous bit in
> that entry is 1, then the members in a group of contiguous translation table
> entries can have different AF, AP[2], and S2AP[1] values.

Does this mean that the 16 PTEs of a contiguous block are not required to
share the same AF on ARM64?

Currently, on ARM64 with contiguous page tables enabled, the access and
dirty bits of 64K huge pages are folded in software.
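
My understanding of that folding, as a userspace sketch: reading any entry
of a block reports the OR of the access and dirty bits of all 16 entries.
The toy_* names are hypothetical and this is not the arm64 implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CONT_PTES 16

/* Hypothetical per-entry state: only the access and dirty bits. */
struct toy_hw_pte {
	bool young;
	bool dirty;
};

/*
 * Model of a folded read: return entry idx with its access and dirty
 * bits replaced by the OR over the whole block. The stored entries are
 * left unchanged, so they may still differ from one another.
 */
static struct toy_hw_pte toy_contpte_get(const struct toy_hw_pte *block,
					 unsigned int idx)
{
	struct toy_hw_pte pte;
	unsigned int i;

	assert(idx < TOY_CONT_PTES);
	pte = block[idx];
	for (i = 0; i < TOY_CONT_PTES; i++) {
		pte.young |= block[i].young;
		pte.dirty |= block[i].dirty;
	}
	return pte;
}
```

In this model the per-entry bits can differ in the table while every read
through the folded accessor returns the same access and dirty information.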

However, I have not been able to determine whether these access and dirty
bits affect TLB coalescing for contiguous page tables. If they do not, I
think ARM64 could also set the access bit only for the PTE that maps the
actual fault address in do_anonymous_page().

Signed-off-by: Wenchao Hao <haowenchao22@gmail.com>
---
 include/linux/pgtable.h | 28 ++++++++++++++++++++++++++++
 mm/memory.c             |  2 +-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 652f287c1ef6..e2f3c932d672 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -302,6 +302,34 @@ static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
 #endif
 #define set_pte_at(mm, addr, ptep, pte) set_ptes(mm, addr, ptep, pte, 1)
 
+#ifndef set_ptes
+static inline void set_anon_ptes(struct mm_struct *mm, unsigned long addr,
+		unsigned long fault_addr, pte_t *ptep, pte_t pte, unsigned int nr)
+{
+	bool young = pte_young(pte);
+
+	page_table_check_ptes_set(mm, ptep, pte, nr);
+
+	for (;;) {
+		if (young && addr == fault_addr)
+			pte = pte_mkyoung(pte);
+		else
+			pte = pte_mkold(pte);
+
+		set_pte(ptep, pte);
+		if (--nr == 0)
+			break;
+
+		addr += PAGE_SIZE;
+		ptep++;
+		pte = pte_next_pfn(pte);
+	}
+}
+#else
+#define set_anon_ptes(mm, addr, fault_addr, ptep, pte, nr) \
+		set_ptes(mm, addr, ptep, pte, nr)
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 extern int ptep_set_access_flags(struct vm_area_struct *vma,
 				 unsigned long address, pte_t *ptep,
diff --git a/mm/memory.c b/mm/memory.c
index da360a6eb8a4..65c69c7116a7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5273,7 +5273,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 setpte:
 	if (vmf_orig_pte_uffd_wp(vmf))
 		entry = pte_mkuffd_wp(entry);
-	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
+	set_anon_ptes(vma->vm_mm, addr, vmf->address, vmf->pte, entry, nr_pages);
 
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
-- 
2.45.0


