From: Zi Yan <ziy@nvidia.com>
To: Pankaj Raghav <p.raghav@samsung.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
Mike Rapoport <rppt@kernel.org>,
David Hildenbrand <david@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Michal Hocko <mhocko@suse.com>, Lance Yang <lance.yang@linux.dev>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Dev Jain <dev.jain@arm.com>, Barry Song <baohua@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Nico Pache <npache@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Jens Axboe <axboe@kernel.dk>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
mcgrof@kernel.org, gost.dev@samsung.com, kernel@pankajraghav.com,
tytso@mit.edu
Subject: Re: [RFC v2 0/3] Decoupling large folios dependency on THP
Date: Tue, 09 Dec 2025 11:03:23 -0500
Message-ID: <64291696-C808-49D0-9F89-6B3B97F58717@nvidia.com>
In-Reply-To: <20251206030858.1418814-1-p.raghav@samsung.com>

On 5 Dec 2025, at 22:08, Pankaj Raghav wrote:
> File-backed large folios were initially implemented with dependencies on
> the Transparent Huge Pages (THP) infrastructure. As large folio adoption
> expanded across the kernel, CONFIG_TRANSPARENT_HUGEPAGE has become an
> overloaded configuration option, sometimes used as a proxy for large
> folio support [1][2][3].
>
> This series is a part of the LPC talk[4], and I am sending the RFC
> series to start the discussion.
>
> There are multiple ways to solve this problem, and this is one of
> them, with minimal changes. I plan on discussing other possible solutions
> at the talk.
>
> Based on my investigation, the only feature large folios depend on is
> the THP splitting infrastructure. When a large folio has to be split,
> either during truncation or under memory pressure, THP's splitting
> infrastructure is used to split it into min-order folio chunks.
>
> In this approach, we restrict the maximum order of the large folio to
> the minimum order to ensure we never use the splitting infrastructure
> when THP is disabled.
>
> I disabled THP and ran xfstests on XFS with 16k, 32k and 64k block sizes,
> and the changes seem to survive the tests without any issues.

But are large folios really created?

IIUC, in do_sync_mmap_readahead(), when THP is disabled, force_thp_readahead
is never set to true, and ra->order is later set to 0. Oh, page_cache_ra_order()
later bumps new_order to mapping_min_folio_order(), so large folios are
created there.

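To make the reasoning above concrete, here is a minimal user-space sketch of
the order clamping as I understand it. It is a model, not the kernel code:
struct mapping_model, model_max_order() and model_ra_order() are hypothetical
names made up for illustration, and the real page_cache_ra_order() takes more
inputs (for example the readahead window size) into account.

```c
#include <assert.h>

/*
 * Hypothetical user-space model of a mapping's folio order limits.
 * min_order stands in for mapping_min_folio_order() and max_order for
 * mapping_max_folio_order(); neither field is a real kernel structure member.
 */
struct mapping_model {
	unsigned int min_order;
	unsigned int max_order;
};

/*
 * Assumption taken from the cover letter: when THP is disabled, the
 * effective maximum order is restricted to the minimum order.
 */
static unsigned int model_max_order(const struct mapping_model *m,
				    int thp_enabled)
{
	return thp_enabled ? m->max_order : m->min_order;
}

/*
 * Clamp a requested readahead order into [min_order, effective max].
 * A request of 0 (as when ra->order is 0) is still bumped to min_order.
 */
static unsigned int model_ra_order(const struct mapping_model *m,
				   unsigned int requested, int thp_enabled)
{
	unsigned int max = model_max_order(m, thp_enabled);

	assert(m->min_order <= m->max_order);

	if (requested > max)
		requested = max;
	if (requested < m->min_order)
		requested = m->min_order;
	return requested;
}
```

With min_order = 2 (a 16k block size on 4k pages) and THP disabled, every
request in this model ends up at order 2, so the folios are still large but
never exceed the minimum order and never need splitting below it.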
I wonder if core-mm should move the mTHP code out from under
CONFIG_TRANSPARENT_HUGEPAGE, so that mTHP might just work. Hmm, folio split
might need to be moved out of mm/huge_memory.c in that case. khugepaged should
work for mTHP without CONFIG_TRANSPARENT_HUGEPAGE as well. OK, for anon folios,
the changes might be more involved.

>
> Looking forward to some productive discussion.
>
> P.S: Thanks to Zi, David and willy for all the ideas they provided to
> solve this problem.
>
> [1] https://lore.kernel.org/linux-mm/731d8b44-1a45-40bc-a274-8f39a7ae0f7f@lucifer.local/
> [2] https://lore.kernel.org/all/aGfNKGBz9lhuK1AF@casper.infradead.org/
> [3] https://lore.kernel.org/linux-ext4/20251110043226.GD2988753@mit.edu/
> [4] https://lpc.events/event/19/contributions/2139/
>
> Pankaj Raghav (3):
>   filemap: set max order to be min order if THP is disabled
>   huge_memory: skip warning if min order and folio order are same in
>     split
>   blkdev: remove CONFIG_TRANSPARENT_HUGEPAGES dependency for LBS devices
>
>  include/linux/blkdev.h  |  5 -----
>  include/linux/huge_mm.h | 40 ++++++++--------------------------------
>  include/linux/pagemap.h | 17 ++++++-----------
>  mm/memory.c             | 41 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 55 insertions(+), 48 deletions(-)
>
>
> base-commit: e4c4d9892021888be6d874ec1be307e80382f431
> --
> 2.50.1

Best Regards,
Yan, Zi