From: Raghavendra K T <raghavendra.kt@amd.com>
To: Hillf Danton <hdanton@sina.com>
Cc: dave.hansen@intel.com, david@redhat.com, hannes@cmpxchg.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, ziy@nvidia.com
Subject: Re: [RFC PATCH V1 09/13] mm: Add heuristic to calculate target node
Date: Sun, 23 Mar 2025 23:44:02 +0530
Message-ID: <584d3ace-ca64-424d-b8ce-c2cd54cec8a6@amd.com>
In-Reply-To: <20250321105309.3521-1-hdanton@sina.com>

On 3/21/2025 4:23 PM, Hillf Danton wrote:
> On Wed, 19 Mar 2025 19:30:24 +0000 Raghavendra K T wrote
>> One of the key challenges in PTE A bit based scanning is to find right
>> target node to promote to.
>>
>> Here is a simple heuristic based approach:
>> While scanning pages of any mm we also scan toptier pages that belong
>> to that mm. We get an insight on the distribution of pages that potentially
>> belonging to particular toptier node and also its recent access.
>>
>> Current logic walks all the toptier node, and picks the one with highest
>> accesses.
>>
> My $.02 for selecting promotion target node given a simple multi tier system.
>
> Tk /* top Tierk (k > 0) has K (K > 0) nodes */
> ...
> Tj /* Tierj (j > 0) has J (J > 0) nodes */
> ...
> T0 /* bottom Tier0 has O (O > 0) nodes */
>
> Unless config comes from user space (sysfs window for example should be opened),
>
> 1, adopt the data flow pattern of L3 cache <--> DRAM <--> SSD, to only
> select Tj+1 when promoting pages in Tj.
>

Hello Hillf,

Thanks for giving this some thought. It looks like a good idea in
general. Could it mostly be implemented as the reverse of the preferred
demotion target?

Thinking out loud: can there be exception cases, similar to non-temporal
copy operations, where we don't want to pollute the cache?
I mean cases where we don't want to hop via a middle-tier node.

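A rough sketch of what the reverse lookup could look like, in plain
userspace C. The demotion table layout and the promotion_candidates()
helper are hypothetical, made up for illustration; they are not existing
kernel interfaces. Since several upper-tier nodes can share one demotion
target, the reverse lookup yields a set of candidates rather than a
single node:

```c
#define NO_NODE (-1)

/*
 * Hypothetical helper, not a kernel API.
 *
 * demotion_target[n] is assumed to hold the node that pages on node n
 * are demoted to, or NO_NODE if node n is in the bottom tier.
 *
 * Returns a bitmask of the nodes whose demotion target is @node, i.e.
 * the candidate promotion targets one tier above @node.
 * Assumes nr_nodes <= 32 so that every node fits in the mask.
 */
static unsigned int promotion_candidates(const int *demotion_target,
					 int nr_nodes, int node)
{
	unsigned int mask = 0;
	int n;

	for (n = 0; n < nr_nodes; n++)
		if (demotion_target[n] == node)
			mask |= 1u << n;

	return mask;
}
```

With nodes 0 and 1 both demoting to node 2, i.e. a table of
{ 2, 2, NO_NODE }, promotion_candidates(table, 3, 2) gives the mask 0x3,
so some further rule is still needed to choose between node 0 and node 1.
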
> 2, select the node in Tj+1 that has the most free pages for promotion
> by default.

I am not sure this is always productive.

For example:
  node 0-1 toptier (100GB)
  node2 slowtier

Suppose a workload that occupies 80GB in total is running on the CPUs of
node1, where 40GB is already in node1 and the remaining 40GB is in node2.
Wouldn't it be preferable to consolidate the workload on node1 when the
slowtier data becomes hot?
(This assumes that node1's channel has enough bandwidth to cater to
the requirements of the workload.)

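To illustrate the difference on the example above, here is a small
userspace sketch comparing "most free pages" with "highest accesses"
(the current logic described in the patch). The struct node_stat layout,
its field names, and both helpers are hypothetical and are not the code
from the patch:

```c
#include <stddef.h>

/* Hypothetical per-node summary; the fields are illustrative only. */
struct node_stat {
	int nid;			/* node id */
	int toptier;			/* non-zero if the node is toptier */
	unsigned long free_gb;		/* free memory on the node */
	unsigned long accessed_gb;	/* recently accessed memory of this mm */
};

/* Pick the toptier node with the most free memory; -1 if there is none. */
static int pick_most_free(const struct node_stat *s, size_t n)
{
	unsigned long best_val = 0;
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!s[i].toptier)
			continue;
		if (best < 0 || s[i].free_gb > best_val) {
			best = s[i].nid;
			best_val = s[i].free_gb;
		}
	}
	return best;
}

/* Pick the toptier node with the most accesses by this mm; -1 if none. */
static int pick_most_accessed(const struct node_stat *s, size_t n)
{
	unsigned long best_val = 0;
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!s[i].toptier)
			continue;
		if (best < 0 || s[i].accessed_gb > best_val) {
			best = s[i].nid;
			best_val = s[i].accessed_gb;
		}
	}
	return best;
}
```

Assuming 100GB per toptier node and nothing else running, node0 has
100GB free and node1 has 60GB free. pick_most_free() then returns node0,
while pick_most_accessed() returns node1, where the workload's CPUs and
the other 40GB already are.
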
> 3, nothing more.