From: David Rientjes <rientjes@google.com>
To: Shivank Garg <shivankg@amd.com>
Cc: Zi Yan <ziy@nvidia.com>,
	Aneesh Kumar <AneeshKumar.KizhakeVeetil@arm.com>,
	 David Hildenbrand <david@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	 Kirill Shutemov <k.shutemov@gmail.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mel.gorman@gmail.com>,
	 "Rao, Bharata Bhasker" <bharata@amd.com>,
	Rik van Riel <riel@surriel.com>,
	 RaghavendraKT <Raghavendra.KodsaraThimmappa@amd.com>,
	 Wei Xu <weixugc@google.com>,
	Suyeon Lee <leesuyeon0506@gmail.com>,
	 Lei Chen <leillc@google.com>,
	"Shukla, Santosh" <santosh.shukla@amd.com>,
	 "Grimm, Jon" <jon.grimm@amd.com>,
	sj@kernel.org, shy828301@gmail.com,
	 Liam Howlett <liam.howlett@oracle.com>,
	 Gregory Price <gregory.price@memverge.com>,
	linux-mm@kvack.org
Subject: Re: Slow-tier Page Promotion discussion recap and open questions
Date: Sun, 29 Dec 2024 21:30:51 -0800 (PST)
Message-ID: <edfcb05e-090c-bdef-88f2-00a87aff6a9b@google.com>
In-Reply-To: <1c424899-d394-452f-9e13-d8cf77660c4a@amd.com>

On Thu, 19 Dec 2024, Shivank Garg wrote:

> On 12/18/2024 8:20 PM, Zi Yan wrote:
> > On 17 Dec 2024, at 23:19, David Rientjes wrote:
> > 
> >> Hi everybody,
> >>
> >> We had a very interactive discussion last week led by RaghavendraKT on
> >> slow-tier page promotion intended for memory tiering platforms, thank
> >> you!  Thanks as well to everybody who attended and provided great
> >> questions, suggestions, and feedback.
> >>
> >> The RFC patch series "mm: slowtier page promotion based on PTE A bit"[1]
> >> is a proposal to allow for asynchronous page promotion based on memory
> >> accesses as an alternative to NUMA Balancing based promotions.  There was
> >> widespread interest in this topic and the discussion surfaced multiple
> >> use cases and requirements, very focused on CXL use cases.
> >>
> > <snip>
> >> ----->o-----
> >> I asked about offloading the migration to a data mover, such as the PSP
> >> for AMD, DMA engine, etc and whether that should be treated entirely
> >> separately as a topic.  Bharata said there was a proof-of-concept
> >> available from AMD that does just that but the initial results were not
> >> that encouraging.
> >>
> >> Zi asked if the DMA engine saturated the link between the slow and fast
> >> tiers.  If we want to offload to a copy engine, we need to verify that
> >> the throughput is sufficient or we may be better off using idle cpus to
> >> perform the migration for us.
> > 
> > <snip>
> >>
> >>  - we likely want to reconsider the single threaded nature of the kthread
> >>    even if only for NUMA purposes
> >>
> > 
> > Related to using a DMA engine and/or multiple threads for page migration,
> > I had a patchset accelerating page migration[1] back in 2019. It showed a
> > good throughput speedup, ~4x when using 16 threads to copy multiple 2MB
> > THPs. I think it is time to revisit the topic.
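> > 
> > As a rough, untested illustration of the idea (not the 2019 patchset
> > itself), the copy of one 2MB THP can be split into chunks that are handed
> > to workqueue workers in parallel:
> > 
> > /*
> >  * Toy sketch only: split one 2MB huge page copy into chunks and let
> >  * the unbound workqueue copy them concurrently.  Assumes the pages
> >  * are reachable through the direct map (64-bit).
> >  */
> > #define NR_COPY_CHUNKS	16
> > 
> > struct copy_chunk {
> > 	struct work_struct work;
> > 	void *dst, *src;
> > 	size_t len;
> > };
> > 
> > static void copy_chunk_fn(struct work_struct *work)
> > {
> > 	struct copy_chunk *c = container_of(work, struct copy_chunk, work);
> > 
> > 	memcpy(c->dst, c->src, c->len);
> > }
> > 
> > static void copy_huge_page_parallel(struct page *dst, struct page *src)
> > {
> > 	struct copy_chunk chunks[NR_COPY_CHUNKS];
> > 	size_t chunk = HPAGE_PMD_SIZE / NR_COPY_CHUNKS;
> > 	int i;
> > 
> > 	for (i = 0; i < NR_COPY_CHUNKS; i++) {
> > 		chunks[i].dst = page_address(dst) + i * chunk;
> > 		chunks[i].src = page_address(src) + i * chunk;
> > 		chunks[i].len = chunk;
> > 		INIT_WORK_ONSTACK(&chunks[i].work, copy_chunk_fn);
> > 		queue_work(system_unbound_wq, &chunks[i].work);
> > 	}
> > 	/* Wait for all chunk copies to complete. */
> > 	for (i = 0; i < NR_COPY_CHUNKS; i++)
> > 		flush_work(&chunks[i].work);
> > }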
> > 
> > 
> > [1] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/
> 
> Hi All,
> 
> I wanted to provide some additional context regarding the AMD DMA offloading
> POC mentioned by Bharata:
> https://lore.kernel.org/linux-mm/20240614221525.19170-1-shivankg@amd.com
> 
> While the initial results weren't as encouraging as hoped, I plan to improve
> this in the next versions of the patchset.
> 
> The core idea in my RFC patchset is restructuring the folio move operation
> to better leverage DMA hardware. Instead of the current folio-by-folio approach:
> 
> for_each_folio() {
>     copy metadata + content + update PTEs
> }
> 
> We batch the operations to minimize overhead:
> 
> for_each_folio() {
>     copy metadata
> }
> DMA batch copy all content
> for_each_folio() {
>     update PTEs
> }
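> 
> As a slightly more concrete (but still simplified and untested) sketch of
> that flow, where the helper names are illustrative placeholders rather than
> the functions from the RFC:
> 
> static int migrate_folios_batched(struct list_head *src_folios,
> 				  struct list_head *dst_folios)
> {
> 	struct folio *src, *dst;
> 	int ret;
> 
> 	/* Phase 1: copy metadata (flags, mapping, etc.) folio by folio. */
> 	dst = list_first_entry(dst_folios, struct folio, lru);
> 	list_for_each_entry(src, src_folios, lru) {
> 		copy_folio_metadata(dst, src);
> 		dst = list_next_entry(dst, lru);
> 	}
> 
> 	/* Phase 2: one batched content copy, offloaded to the DMA engine. */
> 	ret = dma_copy_folio_batch(dst_folios, src_folios);
> 	if (ret)
> 		return ret;	/* the real series would fall back to a CPU copy */
> 
> 	/* Phase 3: update the PTEs to point at the new folios. */
> 	dst = list_first_entry(dst_folios, struct folio, lru);
> 	list_for_each_entry(src, src_folios, lru) {
> 		remap_folio_ptes(src, dst);
> 		dst = list_next_entry(dst, lru);
> 	}
> 
> 	return 0;
> }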
> 
> My experiment showed that folio copy can consume up to 26.6% of total migration
> cost when moving data between NUMA nodes. This suggests significant room for
> improvement through DMA offloading, particularly for the larger transfers expected
> in CXL scenarios.
> 
> It would be interesting to work on combining these approaches for optimized
> page promotion.
> 

This is very exciting, thanks Shivank and Zi!  The reason I brought this
topic up during the session on asynchronous page promotion for memory
tiering was that page migration is likely going to become *much* more
popular and will be in the critical path under system-wide memory
pressure.  Hardware assist and any software optimizations that can go
along with it would certainly be very interesting to discuss.

Shivank, do you have an estimated timeline for when that patch series will 
be refreshed?  Any planned integration with TMPM?

Zi, are you looking to refresh your series and continue discussing page 
migration offload?  We could set up another Linux MM Alignment Session 
topic focused exactly on this and get representatives from the vendors 
involved.

Thanks!

