From: David Hildenbrand <david@redhat.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>,
lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org
Cc: linux-cxl@vger.kernel.org, Byungchul Park <byungchul@sk.com>,
Honggyu Kim <honggyu.kim@sk.com>
Subject: Re: [LSF/MM/BPF TOPIC] Restricting or migrating unmovable kernel allocations from slow tier
Date: Tue, 4 Feb 2025 10:59:55 +0100
Message-ID: <3e5cfe8b-9ca8-4625-b6ff-7f170f1579ff@redhat.com>
In-Reply-To: <Z54hUTXRsw0LYQ8b@localhost.localdomain>
On 01.02.25 14:29, Hyeonggon Yoo wrote:
> Hi,
>
> Byungchul and I would like to suggest a topic about the performance impact of
> kernel allocations on CXL memory.
>
> As CXL-enabled servers and memory devices are being developed, CXL-supported
> hardware is expected to continue emerging in the coming years.
>
> The Linux kernel supports hot-plugging CXL memory via dax/kmem functionality.
> The hot-plugged memory allows either unmovable kernel allocations
> (ZONE_NORMAL), or restricts them to movable allocations (ZONE_MOVABLE)
> depending on the hot-plug policy.
>
> Recently, Byungchul and I observed a measurable performance degradation with
> memhp_default_state=online compared to memhp_default_state=online_movable
> on a server where the ratio of memory capacity between DRAM and CXL is 1:2
> when running the llama.cpp workload with the default mempolicy.
> The workload performs LLM inference and pressures the memory subsystem
> due to its large working set size.
>
> Obviously, allowing kernel allocations from CXL memory degrades performance
> because kernel memory like page tables, kernel stacks, and slab allocations,
> is accessed frequently and may reside in physical memory with significantly
> higher access latency.
>
> However, as far as I can tell there are at least two reasons why we need to
> support ZONE_NORMAL for CXL memory (please add if there are more):
> 1. When hot-plugging a huge amount of CXL memory, the size of
> the struct page array might not fit into DRAM
> -> This could be relaxed with memmap_on_memory
There are some others, although most are less significant, and I tried
documenting them here:
https://www.kernel.org/doc/html/latest/admin-guide/mm/memory-hotplug.html#zone-movable-sizing-considerations
E.g., a 4 KiB page requires a single PTE (8 bytes) to be mapped into
user space, corresponding to 0.2 %. At least for anonymous memory,
PMD-sized THPs don't help, because we still have to allocate the page
table to be prepared for a PMD->PTE remapping. In the worst case, the
directmap requires another 0.2 % (but usually, we rely on PMD mappings).
So that usage depends on how you are intending to use the CXL memory
(e.g., pagecache vs. anonymous memory).
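
To put rough numbers on the percentages above, here is a minimal sketch of the arithmetic. It is not part of the original message: the 1 TiB capacity is a hypothetical figure, the function names are made up for illustration, and only last-level PTEs are counted (higher-level page tables are ignored because they are comparatively small).

```python
# Rough sizing of last-level page-table overhead when memory is mapped
# with 4 KiB pages. Illustrative only; higher-level tables are ignored.

PAGE_SIZE = 4096  # bytes, base page size used in the message
PTE_SIZE = 8      # bytes per PTE, as stated in the message


def pte_overhead_bytes(mapped_bytes: int) -> int:
    """Bytes of PTEs needed to map `mapped_bytes` with 4 KiB pages."""
    pages = -(-mapped_bytes // PAGE_SIZE)  # ceiling division
    return pages * PTE_SIZE


def overhead_percent(mapped_bytes: int) -> float:
    """PTE overhead as a percentage of the mapped memory."""
    return 100.0 * pte_overhead_bytes(mapped_bytes) / mapped_bytes


if __name__ == "__main__":
    cxl_bytes = 1 << 40  # hypothetical: 1 TiB of CXL memory
    user = pte_overhead_bytes(cxl_bytes)
    print(f"user-space PTEs: {user / 2**30:.1f} GiB "
          f"({overhead_percent(cxl_bytes):.2f} %)")
    # Worst case from the message: the direct map is also PTE-mapped,
    # which doubles the page-table cost.
    print(f"with PTE-mapped direct map: {2 * user / 2**30:.1f} GiB")
```

Under these assumptions, fully mapping 1 TiB into user space needs about 2 GiB of page tables, and those have to come from memory that permits unmovable allocations.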
> 2. To hot-unplug CXL memory, pages in CXL memory should be migrated to
>    DRAM, which means sometimes some portion of CXL memory should be
>    ZONE_NORMAL.
I don't quite understand that argument for ZONE_NORMAL.
--
Cheers,
David / dhildenb