From: Shivank Garg <shivankg@amd.com>
To: David Rientjes <rientjes@google.com>
Cc: Zi Yan <ziy@nvidia.com>,
Aneesh Kumar <AneeshKumar.KizhakeVeetil@arm.com>,
David Hildenbrand <david@redhat.com>,
John Hubbard <jhubbard@nvidia.com>,
Kirill Shutemov <k.shutemov@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
Mel Gorman <mel.gorman@gmail.com>,
"Rao, Bharata Bhasker" <bharata@amd.com>,
Rik van Riel <riel@surriel.com>,
RaghavendraKT <Raghavendra.KodsaraThimmappa@amd.com>,
Wei Xu <weixugc@google.com>, Suyeon Lee <leesuyeon0506@gmail.com>,
Lei Chen <leillc@google.com>,
"Shukla, Santosh" <santosh.shukla@amd.com>,
"Grimm, Jon" <jon.grimm@amd.com>,
sj@kernel.org, shy828301@gmail.com,
Liam Howlett <liam.howlett@oracle.com>,
Gregory Price <gregory.price@memverge.com>,
linux-mm@kvack.org
Subject: Re: Slow-tier Page Promotion discussion recap and open questions
Date: Mon, 6 Jan 2025 14:44:51 +0530
Message-ID: <78e27ff3-2cb2-4007-ac6f-b4ac82cdc6da@amd.com>
In-Reply-To: <edfcb05e-090c-bdef-88f2-00a87aff6a9b@google.com>

On 12/30/2024 11:00 AM, David Rientjes wrote:
> On Thu, 19 Dec 2024, Shivank Garg wrote:
>
>> On 12/18/2024 8:20 PM, Zi Yan wrote:
>>> On 17 Dec 2024, at 23:19, David Rientjes wrote:
>>>
>>>> Hi everybody,
>>>>
>>>> We had a very interactive discussion last week led by RaghavendraKT on
>>>> slow-tier page promotion intended for memory tiering platforms, thank
>>>> you! Thanks as well to everybody who attended and provided great
>>>> questions, suggestions, and feedback.
>>>>
>>>> The RFC patch series "mm: slowtier page promotion based on PTE A bit"[1]
>>>> is a proposal to allow for asynchronous page promotion based on memory
>>>> accesses as an alternative to NUMA Balancing based promotions. There was
>>>> widespread interest in this topic and the discussion surfaced multiple
>>>> use cases and requirements, very focused on CXL use cases.
>>>>
>>> <snip>
>>>> ----->o-----
>>>> I asked about offloading the migration to a data mover, such as the PSP
>>>> for AMD, DMA engine, etc and whether that should be treated entirely
>>>> separately as a topic. Bharata said there was a proof-of-concept
>>>> available from AMD that does just that but the initial results were not
>>>> that encouraging.
>>>>
>>>> Zi asked if the DMA engine saturated the link between the slow and fast
>>>> tiers. If we want to offload to a copy engine, we need to verify that
>>>> the throughput is sufficient or we may be better off using idle cpus to
>>>> perform the migration for us.
>>>
>>> <snip>
>>>>
>>>> - we likely want to reconsider the single threaded nature of the kthread
>>>> even if only for NUMA purposes
>>>>
>>>
>>> Related to using DMA engine and/or multi threads for page migration, I had
>>> a patchset accelerating page migration[1] back in 2019. It showed good
>>> throughput speedup, ~4x using 16 threads to copy multiple 2MB THP. I think
>>> it is time to revisit the topic.
>>>
>>>
>>> [1] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/
>>
>> Hi All,
>>
>> I wanted to provide some additional context regarding the AMD DMA offloading
>> POC mentioned by Bharata:
>> https://lore.kernel.org/linux-mm/20240614221525.19170-1-shivankg@amd.com
>>
>> While the initial results weren't as encouraging as hoped, I plan to improve this
>> in the next versions of the patchset.
>>
>> The core idea in my RFC patchset is restructuring the folio move operation
>> to better leverage DMA hardware. Instead of the current folio-by-folio approach:
>>
>> for_each_folio() {
>>     copy metadata + content + update PTEs
>> }
>>
>> We batch the operations to minimize overhead:
>>
>> for_each_folio() {
>>     copy metadata
>> }
>> DMA batch copy all content
>> for_each_folio() {
>>     update PTEs
>> }
>>
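
The three-pass structure quoted above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the RFC's kernel code: `struct sim_folio`, `batch_copy_fn`, `cpu_batch_copy()`, `migrate_batched()` and `demo_batched_move()` are hypothetical names, the 4096-byte size is assumed, and a plain `memcpy()` loop stands in for the DMA engine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIM_FOLIO_SIZE 4096 /* assumed size for the sketch; real folios vary */

/* Hypothetical user-space stand-in for a folio. */
struct sim_folio {
    unsigned long flags;                /* stands in for folio metadata */
    int mapped;                         /* 1 if the simulated PTEs point here */
    unsigned char data[SIM_FOLIO_SIZE]; /* folio content */
};

/* Hypothetical copy backend: it receives the whole batch in one call. */
typedef void (*batch_copy_fn)(struct sim_folio *dst,
                              const struct sim_folio *src, size_t n);

/* CPU stand-in; a DMA-backed version would submit all n copies together. */
static void cpu_batch_copy(struct sim_folio *dst,
                           const struct sim_folio *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy(dst[i].data, src[i].data, SIM_FOLIO_SIZE);
}

/* Batched move: metadata pass, one content copy, then the PTE pass. */
static void migrate_batched(struct sim_folio *dst, struct sim_folio *src,
                            size_t n, batch_copy_fn copy)
{
    assert(copy != NULL);

    for (size_t i = 0; i < n; i++)      /* pass 1: copy metadata */
        dst[i].flags = src[i].flags;

    copy(dst, src, n);                  /* pass 2: batch copy all content */

    for (size_t i = 0; i < n; i++) {    /* pass 3: update simulated PTEs */
        dst[i].mapped = 1;
        src[i].mapped = 0;
    }
}

/* Builds n folios, migrates them, and returns 0 if every folio checks out. */
int demo_batched_move(size_t n)
{
    struct sim_folio *src = calloc(n, sizeof(*src));
    struct sim_folio *dst = calloc(n, sizeof(*dst));
    int bad = 0;

    if (!src || !dst) {
        free(src);
        free(dst);
        return -1;
    }

    for (size_t i = 0; i < n; i++) {
        src[i].flags = i + 1;
        src[i].mapped = 1;
        memset(src[i].data, (int)(i & 0xff), SIM_FOLIO_SIZE);
    }

    migrate_batched(dst, src, n, cpu_batch_copy);

    for (size_t i = 0; i < n; i++) {
        bad |= dst[i].flags != src[i].flags;
        bad |= !dst[i].mapped || src[i].mapped;
        bad |= memcmp(dst[i].data, src[i].data, SIM_FOLIO_SIZE) != 0;
    }

    free(src);
    free(dst);
    return bad;
}
```

Passing the copy step as a function pointer keeps the metadata and PTE passes unchanged whichever engine performs the copy; calling `demo_batched_move(8)` from a `main()` exercises the sketch with the CPU stand-in.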
>> My experiment showed that folio copy can consume up to 26.6% of total migration
>> cost when moving data between NUMA nodes. This suggests significant room for
>> improvement through DMA offloading, particularly for the larger transfers expected
>> in CXL scenarios.
>>
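
As a rough bound on what the 26.6% figure allows, Amdahl's law can be applied under the assumption that offloading speeds up only the copy share and leaves every other migration cost unchanged (batching may well change those other costs too). Here f is the copy fraction taken at its quoted upper value, s is a hypothetical speedup of the copy itself, and s = 4 is purely illustrative:

```latex
\[
S(s) = \frac{1}{(1 - f) + \dfrac{f}{s}}, \qquad f = 0.266
\]
\[
\lim_{s \to \infty} S(s) = \frac{1}{1 - 0.266} = \frac{1}{0.734} \approx 1.36
\]
\[
S(4) = \frac{1}{0.734 + 0.0665} = \frac{1}{0.8005} \approx 1.25
\]
```

Under these assumptions, a copy that became free would cut total migration cost by at most about 1.36x, so gains beyond that would have to come from the remaining 73.4% of the cost.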
>> It would be interesting to work on combining these approaches for optimized
>> page promotion.
>>
>
> This is very exciting, thanks Shivank and Zi! The reason I brought this
> topic up during the session on asynchronous page promotion for memory
> tiering was because page migration is likely going to become *much* more
> popular and will be in the critical path under system-wide memory
> pressure. Hardware assist and any software optimizations that can go
> along with it would certainly be very interesting to discuss.
>
> Shivank, do you have an estimated timeline for when that patch series will
> be refreshed? Any planned integration with TMPM?

Hi David,

It's definitely interesting for us to get it working with SDXI.
I'm going to try it out.

Thanks,
Shivank

>
> Zi, are you looking to refresh your series and continue discussing page
> migration offload? We could set up another Linux MM Alignment Session
> topic focused exactly on this and get representatives from the vendors
> involved.
>
> Thanks!