From: Barry Song <21cnbao@gmail.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: lsf-pc@lists.linux-foundation.org, Linux-MM <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>,
Dave Chinner <david@fromorbit.com>
Subject: Re: [LSF/MM/BPF TOPIC] Mapping text with large folios
Date: Thu, 20 Mar 2025 09:47:46 +1300
Message-ID: <CAGsJ_4yCP5ELP-jkuOay8zjbzJhw5f430b8JA5kGJD=PyTKB8A@mail.gmail.com>
In-Reply-To: <6201267f-6d3a-4942-9a61-371bd41d633d@arm.com>
On Thu, Mar 20, 2025 at 4:38 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Hi All,
>
> I know this is very last minute, but I was hoping that it might be possible to
> squeeze in a session to discuss the following?
>
> Summary/Background:
>
> On arm64, physically contiguous and naturally aligned regions can take advantage
> of contpte mappings (e.g. 64 KB) to reduce iTLB pressure. However, for file
> regions containing text, current readahead behaviour often yields small,
> misaligned folios, preventing this optimization. This proposal introduces a
> special-case path for executable mappings, performing synchronous reads of an
> architecture-chosen size into large folios (64 KB on arm64). Early performance
> tests on real-world workloads (e.g. nginx, redis, kernel compilation) show ~2-9%
> gains.
>
> I’ve previously posted attempts to enable this performance improvement ([1],
> [2]), but there were objections and conversation fizzled out. Now that I have
> more compelling performance data, I’m hoping there is now stronger
> justification, and we can find a path forwards.
>
> What I’d Like to Cover:
>
> - Describe how text memory should ideally be mapped and why it benefits
> performance.
>
> - Brief review of performance data.
>
> - Discuss options for the best way to encourage text into large folios:
> - Let the architecture request a preferred size
> - Extend VMA attributes to include preferred THP size hint
We might need this for a couple of other cases as well.

1. Native heaps. An allocator such as jemalloc can configure a base
"granularity" and then use MADV_DONTNEED/MADV_FREE at that granularity to
manage memory. The default granularity is currently PAGE_SIZE, which can
lead to excessive folio splitting. Even if we set jemalloc's granularity to
16KB, splitting can still occur when sysfs enables 16KB, 32KB, 64KB, etc.,
because folios larger than 16KB may still be allocated. So in some cases I
believe the kernel should be aware of how userspace is managing memory.

2. Java heap GC compaction, i.e. the userfaultfd_move() path. I am
considering adding support for batched PTE/folio moves in
userfaultfd_move(). If sysfs enables 16KB, 32KB, 64KB, 128KB, etc., but the
userspace Java heap moves memory at 16KB granularity, that could likewise
lead to excessive folio splitting.
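
To make the mismatch in both cases concrete, here is a minimal sketch in
plain C. It is not kernel code: the helper name is hypothetical, and it
assumes folios are naturally aligned in the virtual address range, which is
a simplification. It only checks whether a userspace operation of a given
length (an madvise() range or a move) ends partway through a folio of a
given size, which is the situation where the kernel may have to split or
partially unmap that folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not a kernel or libc API).
 *
 * Returns true if [start, start + len) begins or ends inside a folio of
 * folio_size bytes, assuming folios are naturally aligned to folio_size.
 * A range that cuts through a folio covers it only partially.
 */
static bool range_partially_covers_folio(uintptr_t start, size_t len,
                                         size_t folio_size)
{
    /* folio_size must be a non-zero power of two; len must be non-zero. */
    assert(folio_size != 0 && (folio_size & (folio_size - 1)) == 0);
    assert(len > 0);

    uintptr_t end = start + len;
    uintptr_t mask = (uintptr_t)folio_size - 1;

    /* Either boundary landing mid-folio means a partial cover. */
    return (start & mask) != 0 || (end & mask) != 0;
}
```

With a 16KB userspace granularity, a 16KB range at offset 0 is a full cover
of a 16KB folio but only a partial cover of a 64KB folio, so enabling the
larger sizes in sysfs reintroduces the problem.
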
For exec, it seems we need a userspace-transparent approach. Asking every
application to modify its code to madvise() its preferred exec folio size
seems cumbersome. Could we instead enable this for all executable mappings
by default, unless an application explicitly asks to disable it?
> - Provide a sysfs knob
> - Plug into the “mapping min folio order” infrastructure
> - Other approaches?
>
> [1] https://lore.kernel.org/all/20240215154059.2863126-1-ryan.roberts@arm.com/
> [2] https://lore.kernel.org/all/20240717071257.4141363-1-ryan.roberts@arm.com/
>
> Thanks,
> Ryan
Thanks
Barry