From: Zi Yan <ziy@nvidia.com>
To: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
Song Liu <songliubraving@fb.com>
Cc: Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>, Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kselftest@vger.kernel.org
Subject: [PATCH 7.2 v2 00/12] Remove read-only THP support for FSes without large folio support
Date: Mon, 13 Apr 2026 15:20:18 -0400 [thread overview]
Message-ID: <20260413192030.3275825-1-ziy@nvidia.com> (raw)
Hi all,
This patchset removes READ_ONLY_THP_FOR_FS Kconfig and enables creating
read-only THPs for FSes with large folio support (the supported orders
need to include PMD_ORDER) by default.
Before the patchset, the status of creating read-only THPs is below:
| PF | MADV_COLLAPSE | khugepaged |
|-----------|---------------|------------|
large folio FSes only | ✓ | x | x |
READ_ONLY_THP_FOR_FS only | x | ✓ | ✓ |
both | ✓ | ✓ | ✓ |
where READ_ONLY_THP_FOR_FS implies no large folio FSes.
Now without READ_ONLY_THP_FOR_FS:
| PF | MADV_COLLAPSE | khugepaged |
|-----------|---------------|------------|
large folio FSes | ✓ | ✓ | ✓ |
no large folio FSes | x | x | x |
This means no large folio FSes need to add large folio support (the
supported orders need to include PMD_ORDER), so that they can leverage
read-only THP creation function.
To prevent breaking read-only THP support for large folio FSes,
1. first 3 patches enables the support, so that without READ_ONLY_THP_FOR_FS,
read-only THP still works for large folio FSes,
2. Patch 4 removes READ_ONLY_THP_FOR_FS Kconfig,
3. the rest of patches remove code related to READ_ONLY_THP_FOR_FS.
The overview of the changes is:
1. collapse_file() checks for to-be-collapsed folio dirtiness after they
are locked, unmapped, and corresponding mappings are flushed from
TLBs to prevent writes to candidate folios. Before, mapping->nr_thps
and inode->i_writecount are used to cause read-only THP truncation
before a fd becomes writable.
2. hugepage_pmd_enabled() no longer always returned true if
CONFIG_READ_ONLY_THP_FOR_FS was enabaled. This affects whether
khugepaged will be on or not. It now depends on anon and shmem
configurations. Namely, if a user who has set
/sys/kernel/mm/transparent_hugepage/enabled to always or madvise but
explicitly disabled anon PMD THP and shmem THP via the per-order
sysfs controls, read-only THP support will be disabled.
3. collapse_file() from mm/khugepaged.c, instead of checking
CONFIG_READ_ONLY_THP_FOR_FS, makes sure the mapping_max_folio_order()
of struct address_space of the file is at least PMD_ORDER.
4. file_thp_enabled() also checks mapping_max_folio_order() instead.
5. truncate_inode_partial_folio() calls folio_split() directly instead
of the removed try_folio_split_to_order(), since large folios can
only show up on a FS with large folio support.
6. nr_thps is removed from struct address_space, since it is no longer
needed to drop all read-only THPs from a FS without large folio
support when the fd becomes writable. Its related filemap_nr_thps*()
are removed too.
7. folio_check_splittable() no longer checks READ_ONLY_THP_FOR_FS.
8. Updated comments in various places.
Changelog
===
From V1[2]:
1. removed inode_is_open_for_write() check in collapse_file(), since the
added folio dirtiness check after try_to_unmap_flush() should be
sufficient to prevent writes to candidate folios.
2. removed READ_ONLY_THP_FOR_FS check in hugepage_pmd_enabled(), please
see Patch 5 and item 2 in the overview for more details.
3. moved the patch removing READ_ONLY_THP_FOR_FS Kconfig after enabling
khugepaged and MADV_COLLAPSE to create read-only THPs.
4. added mapping_pmd_thp_support() helper function.
5. used VM_WARN_ON_ONCE() in collapse_file() for mapping eligibility check
and address alignment check instead of if + return error code. Always
allow shmem, since MADV_COLLAPSE ignore shmem huge config.
6. added mapping eligibility check in collapse_scan_file().
7. removed trailing ; for folio_split() in the !CONFIG_TRANSPARENT_HUGEPAGE.
8. simplified code in folio_check_splittable() after removing
READ_ONLY_THP_FOR_FS code.
9. clarified that read-only THP works for FSes with PMD THP support by
default.
From RFC[1]:
1. instead of removing READ_ONLY_THP_FOR_FS function entirely, turn it
on by default for all FSes with large folio support and the supported
orders includes PMD_ORDER.
Suggestions and comments are welcome.
Link: https://lore.kernel.org/all/20260323190644.1714379-1-ziy@nvidia.com/ [1]
Link: https://lore.kernel.org/all/20260327014255.2058916-1-ziy@nvidia.com/ [2]
Zi Yan (12):
mm/khugepaged: remove READ_ONLY_THP_FOR_FS check
mm/khugepaged: add folio dirty check after try_to_unmap_flush()
mm/huge_memory: remove READ_ONLY_THP_FOR_FS from file_thp_enabled()
mm: remove READ_ONLY_THP_FOR_FS Kconfig option
mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in
hugepage_pmd_enabled()
mm: fs: remove filemap_nr_thps*() functions and their users
fs: remove nr_thps from struct address_space
mm/huge_memory: remove folio split check for READ_ONLY_THP_FOR_FS
mm/truncate: use folio_split() in truncate_inode_partial_folio()
fs/btrfs: remove a comment referring to READ_ONLY_THP_FOR_FS
selftests/mm: remove READ_ONLY_THP_FOR_FS in khugepaged
selftests/mm: remove READ_ONLY_THP_FOR_FS from comments in
guard-regions
fs/btrfs/defrag.c | 3 -
fs/inode.c | 3 -
fs/open.c | 27 ---------
include/linux/fs.h | 5 --
include/linux/huge_mm.h | 25 +--------
include/linux/pagemap.h | 29 ----------
mm/Kconfig | 11 ----
mm/filemap.c | 1 -
mm/huge_memory.c | 37 ++----------
mm/khugepaged.c | 65 ++++++++++------------
mm/truncate.c | 8 +--
tools/testing/selftests/mm/guard-regions.c | 9 +--
tools/testing/selftests/mm/khugepaged.c | 4 +-
13 files changed, 49 insertions(+), 178 deletions(-)
--
2.43.0
next reply other threads:[~2026-04-13 19:20 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-13 19:20 Zi Yan [this message]
2026-04-13 19:20 ` [PATCH 7.2 v2 01/12] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check Zi Yan
2026-04-13 20:20 ` Matthew Wilcox
2026-04-13 20:34 ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 02/12] mm/khugepaged: add folio dirty check after try_to_unmap_flush() Zi Yan
2026-04-13 20:23 ` Matthew Wilcox
2026-04-13 20:28 ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 03/12] mm/huge_memory: remove READ_ONLY_THP_FOR_FS from file_thp_enabled() Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 04/12] mm: remove READ_ONLY_THP_FOR_FS Kconfig option Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 05/12] mm/khugepaged: remove READ_ONLY_THP_FOR_FS check in hugepage_pmd_enabled() Zi Yan
2026-04-13 20:33 ` Matthew Wilcox
2026-04-13 20:42 ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 06/12] mm: fs: remove filemap_nr_thps*() functions and their users Zi Yan
2026-04-13 20:35 ` Matthew Wilcox
2026-04-13 19:20 ` [PATCH 7.2 v2 07/12] fs: remove nr_thps from struct address_space Zi Yan
2026-04-13 20:38 ` Matthew Wilcox
2026-04-13 19:20 ` [PATCH 7.2 v2 08/12] mm/huge_memory: remove folio split check for READ_ONLY_THP_FOR_FS Zi Yan
2026-04-13 20:41 ` Matthew Wilcox
2026-04-13 20:46 ` Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 09/12] mm/truncate: use folio_split() in truncate_inode_partial_folio() Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 10/12] fs/btrfs: remove a comment referring to READ_ONLY_THP_FOR_FS Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 11/12] selftests/mm: remove READ_ONLY_THP_FOR_FS in khugepaged Zi Yan
2026-04-13 19:20 ` [PATCH 7.2 v2 12/12] selftests/mm: remove READ_ONLY_THP_FOR_FS from comments in guard-regions Zi Yan
2026-04-13 20:47 ` Matthew Wilcox
2026-04-13 20:51 ` Zi Yan
2026-04-13 22:28 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260413192030.3275825-1-ziy@nvidia.com \
--to=ziy@nvidia.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=brauner@kernel.org \
--cc=clm@fb.com \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=dsterba@suse.com \
--cc=jack@suse.cz \
--cc=lance.yang@linux.dev \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=shuah@kernel.org \
--cc=songliubraving@fb.com \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox