From: Nico Pache <npache@redhat.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 akpm@linux-foundation.org, anshuman.khandual@arm.com,
	apopple@nvidia.com,  baohua@kernel.org,
	baolin.wang@linux.alibaba.com, byungchul@sk.com,
	 catalin.marinas@arm.com, cl@gentwo.org, corbet@lwn.net,
	 dave.hansen@linux.intel.com, david@kernel.org, dev.jain@arm.com,
	 gourry@gourry.net, hannes@cmpxchg.org, hughd@google.com,
	jackmanb@google.com,  jack@suse.cz, jannh@google.com,
	jglisse@google.com, kas@kernel.org,  lance.yang@linux.dev,
	Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com,
	 mathieu.desnoyers@efficios.com, matthew.brost@intel.com,
	mhiramat@kernel.org,  mhocko@suse.com, peterx@redhat.com,
	pfalcato@suse.de, rakie.kim@sk.com,  raquini@redhat.com,
	rdunlap@infradead.org, richard.weiyang@gmail.com,
	 rientjes@google.com, rostedt@goodmis.org, rppt@kernel.org,
	 ryan.roberts@arm.com, shivankg@amd.com, sunnanyong@huawei.com,
	 surenb@google.com, thomas.hellstrom@linux.intel.com,
	tiwai@suse.de,  usamaarif642@gmail.com, vbabka@suse.cz,
	vishal.moola@gmail.com,  wangkefeng.wang@huawei.com,
	will@kernel.org, willy@infradead.org,
	 yang@os.amperecomputing.com, ying.huang@linux.alibaba.com,
	ziy@nvidia.com,  zokeefe@google.com
Subject: Re: [PATCH mm-unstable v1 1/5] mm: consolidate anonymous folio PTE mapping into helpers
Date: Thu, 12 Feb 2026 12:33:32 -0700
Message-ID: <CAA1CXcC7SPkehNLT8FUnnUf6m0vm3GbhZuDZyMMW4X0DHDs6Mg@mail.gmail.com>
In-Reply-To: <20260212155539.2083102-1-joshua.hahnjy@gmail.com>

On Thu, Feb 12, 2026 at 8:55 AM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> On Wed, 11 Feb 2026 19:18:31 -0700 Nico Pache <npache@redhat.com> wrote:
>
> Hello Nico,
>
> Thank you for the patch! I hope you are having a good day.

Hey Joshua!

Thank you for reviewing! I hope you have a good day too :)

>
> > The anonymous page fault handler in do_anonymous_page() open-codes the
> > sequence to map a newly allocated anonymous folio at the PTE level:
> >       - construct the PTE entry
> >       - add rmap
> >       - add to LRU
> >       - set the PTEs
> >       - update the MMU cache.
> >
> > Introduce two helpers to consolidate this duplicated logic, mirroring the
> > existing map_anon_folio_pmd_nopf() pattern for PMD-level mappings:
> >
> >       map_anon_folio_pte_nopf(): constructs the PTE entry, takes folio
> >       references, and adds the anon rmap and LRU entries. This function
> >       also handles the uffd_wp marking that can occur in the pf variant.
> >
> >       map_anon_folio_pte_pf(): extends the nopf variant with MM_ANONPAGES
> >       counter updates and mTHP fault allocation statistics for the page
> >       fault path.
> >
> > The zero-page read path in do_anonymous_page() is also untangled from the
> > shared setpte label, since it does not allocate a folio and should not
> > share the same mapping sequence as the write path. Pass nr_pages = 1
> > explicitly rather than relying on the variable; this makes it clearer
> > that we are operating on the zero page only.
> >
> > This refactoring will also help reduce code duplication between mm/memory.c
> > and mm/khugepaged.c, and provides a clean API for PTE-level anonymous folio
> > mapping that can be reused by future callers.
>
> It seems that based on this description, there should be no functional change
> in the code below. Is that correct?

Correct, but as you and others have pointed out, I believe I missed a
pte_sw_mkyoung() call.

On closer inspection, I may also have inadvertently changed some
behavior around pte_mkdirty().

In the previous implementation we only called pte_mkdirty() if
VM_WRITE was set. After switching over to maybe_mkwrite(),
pte_mkdirty() is no longer called conditionally, while pte_mkwrite()
still is.

Nothing showed up in my testing, but some of these things can be
tricky to trigger. Other callers make the same "mistake" (if it even
is one), but since I'm aiming for no functional change, I appreciate
the thoroughness here! I will clean up both of these issues.
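
To make the difference concrete, here is roughly what changed (just
illustrating the semantics; both snippets come from the hunks below):

    /* Old open-coded path: dirty + writable only when VM_WRITE is set. */
    entry = folio_mk_pte(folio, vma->vm_page_prot);
    entry = pte_sw_mkyoung(entry);
    if (vma->vm_flags & VM_WRITE)
            entry = pte_mkwrite(pte_mkdirty(entry), vma);

    /* New helper as posted: dirties unconditionally; only the write
     * bit remains conditional via maybe_mkwrite(). */
    entry = folio_mk_pte(folio, vma->vm_page_prot);
    entry = maybe_mkwrite(pte_mkdirty(entry), vma);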

>
> > Signed-off-by: Nico Pache <npache@redhat.com>
> > ---
> >  include/linux/mm.h |  4 ++++
> >  mm/memory.c        | 56 ++++++++++++++++++++++++++++++----------------
> >  2 files changed, 41 insertions(+), 19 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index f8a8fd47399c..c3aa1f51e020 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4916,4 +4916,8 @@ static inline bool snapshot_page_is_faithful(const struct page_snapshot *ps)
> >
> >  void snapshot_page(struct page_snapshot *ps, const struct page *page);
> >
> > +void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
> > +             struct vm_area_struct *vma, unsigned long addr,
> > +             bool uffd_wp);
> > +
> >  #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 8c19af97f0a0..61c2277c9d9f 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5211,6 +5211,35 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
> >       return folio_prealloc(vma->vm_mm, vma, vmf->address, true);
> >  }
> >
> > +
>
> ^^^ extra newline?

oops yes! thanks

>
> > +void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
> > +             struct vm_area_struct *vma, unsigned long addr,
> > +             bool uffd_wp)
> > +{
> > +     pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
> > +     unsigned int nr_pages = folio_nr_pages(folio);
>
> Just reading through the code below to see what was deleted and what was
> added, maybe we are missing a pte_sw_mkyoung(entry) here? Seems like this
> would matter for MIPS systems, but I couldn't find this change in the changelog.

I think you are correct. In my khugepaged implementation this was not
present. I will add it back and run some tests! Thank you.
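
The fix I have in mind is just restoring the call at the top of the
helper, something like (untested sketch, not the final v2):

    pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
    /*
     * Mark the PTE software-young so architectures that maintain the
     * accessed bit in software (e.g. MIPS) don't take an immediate
     * follow-up fault.
     */
    entry = pte_sw_mkyoung(entry);
    entry = maybe_mkwrite(pte_mkdirty(entry), vma);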

>
> > +     entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> > +     if (uffd_wp)
> > +             entry = pte_mkuffd_wp(entry);
>
> The ordering here was also changed, but it wasn't immediately obvious to me
> why it was changed.

I don't see any data dependencies between the folio refcount/rmap/LRU
changes and the pte changes. The old order was most likely an artifact
of the setpte label, which this also cleans up. I believe we can
reorder this, but if you see an issue with it please let me know!
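
For clarity, here is the helper body from the hunk above with my
reasoning as comments (my reading of it, happy to be corrected):

    /* Pure PTE computation; no dependency on folio state. */
    pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
    entry = maybe_mkwrite(pte_mkdirty(entry), vma);
    if (uffd_wp)
            entry = pte_mkuffd_wp(entry);

    /* Refcount, rmap and LRU state should be in place before the
     * PTEs are published, so rmap walkers see a fully set up folio. */
    folio_ref_add(folio, nr_pages - 1);
    folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
    folio_add_lru_vma(folio, vma);
    set_ptes(vma->vm_mm, addr, pte, entry, nr_pages);
    update_mmu_cache_range(NULL, vma, addr, pte, nr_pages);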

>
> > +
> > +     folio_ref_add(folio, nr_pages - 1);
> > +     folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
> > +     folio_add_lru_vma(folio, vma);
> > +     set_ptes(vma->vm_mm, addr, pte, entry, nr_pages);
> > +     update_mmu_cache_range(NULL, vma, addr, pte, nr_pages);
> > +}
> > +
> > +static void map_anon_folio_pte_pf(struct folio *folio, pte_t *pte,
> > +             struct vm_area_struct *vma, unsigned long addr,
> > +             unsigned int nr_pages, bool uffd_wp)
> > +{
> > +     map_anon_folio_pte_nopf(folio, pte, vma, addr, uffd_wp);
> > +     add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
> > +     count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
> > +}
> > +
> > +
>
> ^^^ extra newline?

whoops yes thank you!

>
> >  /*
> >   * We enter with non-exclusive mmap_lock (to exclude vma changes,
> >   * but allow concurrent faults), and pte mapped but not yet locked.
> > @@ -5257,7 +5286,13 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >                       pte_unmap_unlock(vmf->pte, vmf->ptl);
> >                       return handle_userfault(vmf, VM_UFFD_MISSING);
> >               }
> > -             goto setpte;
> > +             if (vmf_orig_pte_uffd_wp(vmf))
> > +                     entry = pte_mkuffd_wp(entry);
> > +             set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
> > +
> > +             /* No need to invalidate - it was non-present before */
> > +             update_mmu_cache_range(vmf, vma, addr, vmf->pte, /*nr_pages=*/ 1);
>
> NIT: Should we try to keep the line under 80 columns? ; -)

Ah yes, I still don't fully understand this rule, as it's broken in a
lot of places. I'll move nr_pages to a new line.
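
For reference, I would wrap it as:

    /* No need to invalidate - it was non-present before */
    update_mmu_cache_range(vmf, vma, addr, vmf->pte,
                           /*nr_pages=*/ 1);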

>
> > +             goto unlock;
> >       }
> >
> >       /* Allocate our own private page. */
> > @@ -5281,11 +5316,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >        */
> >       __folio_mark_uptodate(folio);
> >
> > -     entry = folio_mk_pte(folio, vma->vm_page_prot);
> > -     entry = pte_sw_mkyoung(entry);
> > -     if (vma->vm_flags & VM_WRITE)
> > -             entry = pte_mkwrite(pte_mkdirty(entry), vma);
> > -
> >       vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
> >       if (!vmf->pte)
> >               goto release;
> > @@ -5307,19 +5337,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> >               folio_put(folio);
> >               return handle_userfault(vmf, VM_UFFD_MISSING);
> >       }
> > -
> > -     folio_ref_add(folio, nr_pages - 1);
> > -     add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
> > -     count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
> > -     folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
> > -     folio_add_lru_vma(folio, vma);
> > -setpte:
> > -     if (vmf_orig_pte_uffd_wp(vmf))
> > -             entry = pte_mkuffd_wp(entry);
> > -     set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
> > -
> > -     /* No need to invalidate - it was non-present before */
> > -     update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
> > +     map_anon_folio_pte_pf(folio, vmf->pte, vma, addr, nr_pages, vmf_orig_pte_uffd_wp(vmf));
>
> NIT: Maybe here as well?

ack
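
i.e. something like:

    map_anon_folio_pte_pf(folio, vmf->pte, vma, addr, nr_pages,
                          vmf_orig_pte_uffd_wp(vmf));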

>
> >  unlock:
> >       if (vmf->pte)
> >               pte_unmap_unlock(vmf->pte, vmf->ptl);
> > --
> > 2.53.0
>
> Thank you for the patch again. I hope you have a great day!
> Joshua

Thank you! You as well.
-- Nico

Thread overview: 32+ messages
2026-02-12  2:18 [PATCH mm-unstable v1 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
2026-02-12  2:18 ` [PATCH mm-unstable v1 1/5] mm: consolidate anonymous folio PTE mapping into helpers Nico Pache
2026-02-12 14:38   ` Pedro Falcato
2026-02-12 15:55   ` Joshua Hahn
2026-02-12 19:33     ` Nico Pache [this message]
2026-02-12 16:09   ` Zi Yan
2026-02-12 19:45     ` Nico Pache
2026-02-12 20:06       ` Zi Yan
2026-02-12 20:13         ` David Hildenbrand (Arm)
2026-02-12  2:18 ` [PATCH mm-unstable v1 2/5] mm: introduce is_pmd_order helper Nico Pache
2026-02-12 14:40   ` Pedro Falcato
2026-02-12 16:11   ` Zi Yan
2026-02-12 19:45   ` David Hildenbrand (Arm)
2026-02-13  3:51   ` Barry Song
2026-02-14  7:24   ` Lance Yang
2026-02-20 10:38   ` Dev Jain
2026-02-20 10:42     ` David Hildenbrand (Arm)
2026-02-12  2:18 ` [PATCH mm-unstable v1 3/5] mm/khugepaged: define COLLAPSE_MAX_PTES_LIMIT as HPAGE_PMD_NR - 1 Nico Pache
2026-02-12  6:56   ` Vernon Yang
2026-02-12 14:45   ` Pedro Falcato
2026-02-12 16:21   ` Zi Yan
2026-02-12 19:51   ` David Hildenbrand (Arm)
2026-02-12  2:23 ` [PATCH mm-unstable v1 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
2026-02-12  2:23   ` [PATCH mm-unstable v1 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd() Nico Pache
2026-02-12 19:52   ` [PATCH mm-unstable v1 4/5] mm/khugepaged: rename hpage_collapse_* to collapse_* David Hildenbrand (Arm)
2026-02-12  2:25 ` [PATCH mm-unstable v1 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd() Nico Pache
2026-02-12 15:33   ` Pedro Falcato
2026-02-12 17:34   ` Zi Yan
2026-02-12 20:03   ` David Hildenbrand (Arm)
2026-02-12 20:26     ` Nico Pache
2026-02-14  8:24       ` Lance Yang
2026-02-12  2:26 ` [PATCH mm-unstable v1 0/5] mm: khugepaged cleanups and mTHP prerequisites Nico Pache
