linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org
Cc: linux-cxl@vger.kernel.org, Byungchul Park <byungchul@sk.com>,
	Honggyu Kim <honggyu.kim@sk.com>
Subject: Re: [LSF/MM/BPF TOPIC] Restricting or migrating unmovable kernel allocations from slow tier
Date: Tue, 4 Feb 2025 10:59:55 +0100
Message-ID: <3e5cfe8b-9ca8-4625-b6ff-7f170f1579ff@redhat.com>
In-Reply-To: <Z54hUTXRsw0LYQ8b@localhost.localdomain>

On 01.02.25 14:29, Hyeonggon Yoo wrote:
> Hi,
> 
> Byungchul and I would like to suggest a topic about the performance impact of
> kernel allocations on CXL memory.
> 
> As CXL-enabled servers and memory devices are being developed, CXL-supported
> hardware is expected to continue emerging in the coming years.
> 
> The Linux kernel supports hot-plugging CXL memory via dax/kmem functionality.
> The hot-plugged memory allows either unmovable kernel allocations
> (ZONE_NORMAL), or restricts them to movable allocations (ZONE_MOVABLE)
> depending on the hot-plug policy.
> 
> Recently, Byungchul and I observed a measurable performance degradation with
> memhp_default_state=online compared to memhp_default_state=online_movable
> on a server where the ratio of memory capacity between DRAM and CXL is 1:2
> when running the llama.cpp workload with the default mempolicy.
> The workload performs LLM inference and pressures the memory subsystem
> due to its large working set size.
> 
> Obviously, allowing kernel allocations from CXL memory degrades performance
> because kernel memory, such as page tables, kernel stacks, and slab
> allocations, is accessed frequently and may reside in physical memory with
> significantly higher access latency.
> 
> However, as far as I can tell there are at least two reasons why we need to
> support ZONE_NORMAL for CXL memory (please add if there are more):
>    1. When hot-plugging a huge amount of CXL memory, the size of
>       the struct page array might not fit into DRAM
>       -> This could be relaxed with memmap_on_memory

There are some others, although most are less significant, and I tried 
documenting them here:

https://www.kernel.org/doc/html/latest/admin-guide/mm/memory-hotplug.html#zone-movable-sizing-considerations


E.g., a 4 KiB page requires a single PTE (8 bytes) to be mapped into
user space, corresponding to ~0.2 % overhead. At least for anonymous memory,
PMD-sized THPs don't help, because we still have to allocate the page
table to be prepared for a PMD->PTE remapping. In the worst case, the
directmap requires another ~0.2 % (but usually, we rely on PMD mappings).
So the actual overhead depends on how you intend to use the CXL memory
(e.g., pagecache vs. anonymous memory).
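
To make the 0.2 % figure concrete, here is a rough back-of-the-envelope
sketch of the arithmetic above. It assumes 4 KiB base pages and 8-byte
PTEs (as on x86-64); the 512 GiB capacity and the helper names are
hypothetical and only serve as an illustration, not a measurement.

```python
PAGE_SIZE = 4096  # bytes per base page (assumption: 4 KiB pages)
PTE_SIZE = 8      # bytes per page-table entry (assumption: 64-bit PTEs)
GiB = 1 << 30


def overhead_percent(metadata_bytes, page_bytes=PAGE_SIZE):
    """Metadata cost per page, as a percentage of the page size."""
    return 100.0 * metadata_bytes / page_bytes


def metadata_bytes_for(capacity_bytes, per_page_bytes, page_bytes=PAGE_SIZE):
    """Total metadata needed if every base page costs per_page_bytes."""
    return (capacity_bytes // page_bytes) * per_page_bytes


# One PTE per 4 KiB page mapped into user space.
user_pte_pct = overhead_percent(PTE_SIZE)

# Worst case: the direct map is also mapped with PTEs instead of PMDs.
worst_case_pct = user_pte_pct + overhead_percent(PTE_SIZE)

# Hypothetical example: 512 GiB of CXL memory, fully mapped via PTEs.
capacity = 512 * GiB
user_pte_bytes = metadata_bytes_for(capacity, PTE_SIZE)

print(f"user-space PTEs: {user_pte_pct:.3f} % of mapped memory")
print(f"with PTE-mapped direct map: {worst_case_pct:.3f} %")
print(f"512 GiB fully mapped: {user_pte_bytes / GiB:.1f} GiB of page tables")
```

Under these assumptions, the user-space page tables alone need unmovable
memory equal to about 1/512 of whatever they map, and that memory has to
come from a zone that permits kernel allocations.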


>    2. To hot-unplug CXL memory, pages in CXL memory should be migrated to DRAM,
>       which means sometimes some portion of CXL memory should be ZONE_NORMAL.

I don't quite understand that argument for ZONE_NORMAL.

-- 
Cheers,

David / dhildenb



