From: Yafang Shao <laoar.shao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, ziy@nvidia.com,
    baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
    Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
    dev.jain@arm.com, hannes@cmpxchg.org, usamaarif642@gmail.com,
    gutierrez.asier@huawei-partners.com, willy@infradead.org,
    ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    ameryhung@gmail.com, rientjes@google.com, corbet@lwn.net,
    21cnbao@gmail.com, shakeel.butt@linux.dev, tj@kernel.org,
    lance.yang@linux.dev
Cc: bpf@vger.kernel.org, linux-mm@kvack.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH v8 mm-new 06/12] mm: thp: enable THP allocation exclusively through khugepaged
Date: Fri, 26 Sep 2025 17:33:37 +0800
Message-ID: <20250926093343.1000-7-laoar.shao@gmail.com>
In-Reply-To: <20250926093343.1000-1-laoar.shao@gmail.com>

khugepaged_enter_vma() ultimately invokes any attached BPF function with
the TVA_KHUGEPAGED flag set when determining whether or not to enable
khugepaged THP for a freshly faulted in VMA.

Currently, on fault, we invoke this in do_huge_pmd_anonymous_page(), as
invoked by create_huge_pmd() and only when we have already checked to
see if an allowable TVA_PAGEFAULT order is specified.

Since we might want to disallow THP on fault-in but allow it via
khugepaged, we move things around so we always attempt to enter
khugepaged upon fault.
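To make the reordering concrete, here is a minimal userspace sketch, not kernel code. The names thp_policy_model, fault_outcome_model, fault_before() and fault_after() are hypothetical. The model assumes pmd_none() is true and ignores the other early exits in do_huge_pmd_anonymous_page(). It only records which calls are reached under each ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy outcome on one VMA; not a kernel type. */
struct thp_policy_model {
	bool allow_pagefault;	/* models thp_vma_allowable_order(TVA_PAGEFAULT) */
	bool anonymous;		/* models vma_is_anonymous() */
};

/* Records which of the two calls a fault would reach. */
struct fault_outcome_model {
	bool entered_khugepaged;	/* khugepaged_enter_vma() was reached */
	bool tried_huge_pmd;		/* create_huge_pmd() was reached */
};

/* Ordering before the patch: khugepaged entry sits behind the fault check. */
static struct fault_outcome_model fault_before(const struct thp_policy_model *p)
{
	struct fault_outcome_model out = { false, false };

	assert(p != NULL);
	if (p->allow_pagefault) {
		out.tried_huge_pmd = true;
		/* Only the anonymous path goes through do_huge_pmd_anonymous_page(). */
		if (p->anonymous)
			out.entered_khugepaged = true;
	}
	return out;
}

/* Ordering after the patch: khugepaged entry no longer depends on the fault check. */
static struct fault_outcome_model fault_after(const struct thp_policy_model *p)
{
	struct fault_outcome_model out = { false, false };

	assert(p != NULL);
	if (p->anonymous)
		out.entered_khugepaged = true;
	if (p->allow_pagefault)
		out.tried_huge_pmd = true;
	return out;
}
```

For an anonymous VMA with allow_pagefault set to false, fault_before() reaches neither call, while fault_after() reaches khugepaged_enter_vma() and still skips create_huge_pmd(). Whether the mm is then actually registered still depends on the TVA_KHUGEPAGED check inside khugepaged_enter_vma().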

This change is safe because:

- the checks for thp_vma_allowable_order(TVA_KHUGEPAGED) and
  thp_vma_allowable_order(TVA_PAGEFAULT) are functionally equivalent
- khugepaged operates at the MM level rather than per-VMA. Because THP
  allocation might fail during page faults due to transient conditions
  (e.g., memory pressure), it is safe to add this MM to khugepaged for
  subsequent defragmentation.
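The MM-level point can be illustrated with a small userspace model. This is a sketch under assumptions: mm_model, vma_model and khugepaged_enter_model() are hypothetical names, and the real khugepaged_enter_vma() performs additional checks (for example, whether PMD-sized THP is enabled at all) that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an mm as seen by khugepaged; not a kernel type. */
struct mm_model {
	bool registered;	/* models the mm already being known to khugepaged */
	unsigned int enqueues;	/* how many times the mm was actually queued */
};

/* Hypothetical model of a VMA that belongs to an mm. */
struct vma_model {
	struct mm_model *mm;
	bool allow_khugepaged;	/* models thp_vma_allowable_order(TVA_KHUGEPAGED) */
};

/*
 * Registration is per mm: once any eligible VMA has registered the mm,
 * later calls from other VMAs of the same mm do nothing.
 * Returns true only when this call newly registered the mm.
 */
static bool khugepaged_enter_model(struct vma_model *vma)
{
	assert(vma != NULL && vma->mm != NULL);

	if (vma->mm->registered)
		return false;	/* already queued, nothing to do */
	if (!vma->allow_khugepaged)
		return false;	/* policy denies khugepaged for this VMA */

	vma->mm->registered = true;
	vma->mm->enqueues++;
	return true;
}
```

Calling the model on every fault is therefore cheap after the first successful registration, and a VMA whose policy denies khugepaged never causes its mm to be queued on its own.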

While we could also extend prctl() to utilize this new policy, such a
change would require a uAPI modification to PR_SET_THP_DISABLE.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Lance Yang <lance.yang@linux.dev>
---
mm/huge_memory.c | 1 -
mm/memory.c | 13 ++++++++-----
2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 08372dfcb41a..2b155a734c78 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1346,7 +1346,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 	ret = vmf_anon_prepare(vmf);
 	if (ret)
 		return ret;
-	khugepaged_enter_vma(vma);
 
 	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
 			!mm_forbids_zeropage(vma->vm_mm) &&
diff --git a/mm/memory.c b/mm/memory.c
index 58ea0f93f79e..64f91191ffff 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6327,11 +6327,14 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	if (pud_trans_unstable(vmf.pud))
 		goto retry_pud;
 
-	if (pmd_none(*vmf.pmd) &&
-	    thp_vma_allowable_order(vma, TVA_PAGEFAULT, PMD_ORDER)) {
-		ret = create_huge_pmd(&vmf);
-		if (!(ret & VM_FAULT_FALLBACK))
-			return ret;
+	if (pmd_none(*vmf.pmd)) {
+		if (vma_is_anonymous(vma))
+			khugepaged_enter_vma(vma);
+		if (thp_vma_allowable_order(vma, TVA_PAGEFAULT, PMD_ORDER)) {
+			ret = create_huge_pmd(&vmf);
+			if (!(ret & VM_FAULT_FALLBACK))
+				return ret;
+		}
 	} else {
 		vmf.orig_pmd = pmdp_get_lockless(vmf.pmd);
--
2.47.3