From: Shivank Garg <shivankg@amd.com>
To: akpm@linux-foundation.org, lsf-pc@lists.linux-foundation.org,
linux-mm@kvack.org, ziy@nvidia.com
Cc: AneeshKumar.KizhakeVeetil@arm.com, baolin.wang@linux.alibaba.com,
bharata@amd.com, david@redhat.com, gregory.price@memverge.com,
honggyu.kim@sk.com, jane.chu@oracle.com, jhubbard@nvidia.com,
jon.grimm@amd.com, k.shutemov@gmail.com, leesuyeon0506@gmail.com,
leillc@google.com, liam.howlett@oracle.com,
linux-kernel@vger.kernel.org, mel.gorman@gmail.com,
Michael.Day@amd.com, Raghavendra.KodsaraThimmappa@amd.com,
riel@surriel.com, rientjes@google.com, santosh.shukla@amd.com,
shivankg@amd.com, shy828301@gmail.com, sj@kernel.org,
wangkefeng.wang@huawei.com, weixugc@google.com,
willy@infradead.org, ying.huang@linux.alibaba.com
Subject: [LSF/MM/BPF TOPIC] Enhancements to Page Migration with Multi-threading and Batch Offloading to DMA
Date: Thu, 23 Jan 2025 11:25:05 +0530
Message-ID: <cf6fc05d-c0b0-4de3-985e-5403977aa3aa@amd.com>
Hi all,
Zi Yan and I would like to propose the topic: Enhancements to Page
Migration with Multi-threading and Batch Offloading to DMA.
Page migration is a critical operation on NUMA systems that can incur
significant overhead, affecting memory management performance across a
wide range of workloads. For example, copying folios between DRAM NUMA
nodes can account for ~25% of the total cost of migrating 256MB of data.
Modern systems ship with high CPU core counts, GPUs, and powerful DMA
engines for bulk data copying. Leveraging these hardware capabilities
becomes essential for systems where pages are frequently promoted and
demoted - from large-scale tiered-memory systems with CXL nodes to
CPU-GPU coherent systems where GPU memory is exposed as NUMA nodes.
The existing page migration path, however, copies pages sequentially,
one folio at a time, underutilizing modern CPU architectures and
high-bandwidth memory subsystems.
We have proposed and posted RFCs that enhance page migration through
three key techniques:
1. Batching migration operations for bulk data copying [1] (a minimal
sketch of this idea follows the list)
2. Multi-threaded folio copying [2]
3. DMA offloading to hardware accelerators [1]
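The sketch below shows the batching idea only, not the actual RFC code:
rather than copying each folio inline as it is migrated, the (src, dst)
pairs are deferred onto a list and copied in a single pass at the end.
folio_copy() and the list helpers are existing kernel APIs; struct
copy_item and migrate_folios_batch_copy() are hypothetical names used
purely for illustration.

	#include <linux/list.h>
	#include <linux/mm.h>

	/* One deferred folio copy; illustrative only. */
	struct copy_item {
		struct folio *src;
		struct folio *dst;
		struct list_head list;
	};

	static void migrate_folios_batch_copy(struct list_head *copy_list)
	{
		struct copy_item *item;

		/*
		 * A single pass over all deferred copies: this loop is
		 * the unit that can be split across worker threads or
		 * handed to a DMA engine as one batch.
		 */
		list_for_each_entry(item, copy_list, list)
			folio_copy(item->dst, item->src);
	}

Deferring the copies is what creates the batch in the first place; the
two techniques below are then different ways of draining it faster.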
By combining batching with multi-threaded folio copying, we achieve
significant improvements in page migration throughput for large pages.
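As a rough illustration of the multi-threaded copy, the sketch below
splits one large folio into per-worker chunks and pushes each onto the
unbound workqueue, assuming lowmem (directly mapped) folios for brevity.
work_struct, queue_work(), flush_work(), folio_size() and
folio_address() are existing kernel APIs; the chunking policy, the cap
of 8 workers, and all struct/function names are illustrative and are
not the code from [2].

	#include <linux/workqueue.h>
	#include <linux/minmax.h>
	#include <linux/mm.h>

	struct folio_copy_chunk {
		struct work_struct work;
		void *dst;
		const void *src;
		size_t len;
	};

	static void folio_copy_chunk_fn(struct work_struct *w)
	{
		struct folio_copy_chunk *c =
			container_of(w, struct folio_copy_chunk, work);

		memcpy(c->dst, c->src, c->len);
	}

	static void folio_copy_mt(struct folio *dst, struct folio *src,
				  int nr_threads)
	{
		struct folio_copy_chunk c[8];	/* illustrative cap */
		size_t size = folio_size(src), chunk;
		int i;

		nr_threads = clamp(nr_threads, 1, 8);
		chunk = size / nr_threads;

		for (i = 0; i < nr_threads; i++) {
			c[i].dst = folio_address(dst) + i * chunk;
			c[i].src = folio_address(src) + i * chunk;
			/* the last worker picks up any remainder */
			c[i].len = (i == nr_threads - 1) ?
					size - i * chunk : chunk;
			INIT_WORK_ONSTACK(&c[i].work, folio_copy_chunk_fn);
			queue_work(system_unbound_wq, &c[i].work);
		}

		for (i = 0; i < nr_threads; i++) {
			flush_work(&c[i].work);
			destroy_work_on_stack(&c[i].work);
		}
	}

How many workers to use, and on which cores, is exactly the policy
question raised under the discussion points below.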
Discussion points:
1. Performance:
a. Policy decision for DMA and CPU selection
b. Platform-specific scheduling of folio-copy worker threads for better
bandwidth utilization
c. Using non-temporal instructions for CPU-based memcpy
d. Upscaling/downscaling worker threads based on migration size, CPU
availability (system load), bandwidth saturation, etc.
2. Interface requirements with DMA hardware:
a. Standardizing APIs for DMA drivers and supporting different DMA
drivers (see the sketch after this list)
b. Enhancing DMA drivers for bulk copying (e.g., the SDXI engine)
3. Resource accounting:
a. CPU cgroup accounting and fairness [3]
b. Who bears the migration cost? (migration cost attribution)
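On point 2a, one possible shape for such an interface, hypothetical and
for discussion only: DMA drivers register a batch-level copy_folios()
callback that the migration path invokes, falling back to the CPU loop
when no engine is registered or the offload fails. folio_copy() is the
existing CPU path; every other name below is made up for illustration.

	#include <linux/mm.h>

	struct migrate_copy_ops {
		/*
		 * Copy nr folio pairs; return 0 on success or -errno,
		 * on which the caller falls back to the CPU loop.
		 */
		int (*copy_folios)(struct folio **dst, struct folio **src,
				   int nr);
	};

	/* Set by a DMA driver (e.g. an SDXI driver) at probe time. */
	static const struct migrate_copy_ops *migrate_copy_ops;

	static void migrate_folios_copy(struct folio **dst,
					struct folio **src, int nr)
	{
		int i;

		if (migrate_copy_ops &&
		    !migrate_copy_ops->copy_folios(dst, src, nr))
			return;

		for (i = 0; i < nr; i++)	/* CPU fallback */
			folio_copy(dst[i], src[i]);
	}

Making the callback batch-level rather than per-folio lets a driver
amortize descriptor setup and completion handling across the whole
batch, which is where engines like SDXI should win over per-page
submission.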
References:
[1] https://lore.kernel.org/all/20240614221525.19170-1-shivankg@amd.com
[2] https://lore.kernel.org/all/20250103172419.4148674-1-ziy@nvidia.com
[3] https://lore.kernel.org/all/CAHbLzkpoKP0fVZP5b10wdzAMDLWysDy7oH0qaUssiUXj80R6bw@mail.gmail.com
Best Regards,
Shivank