From: Raghavendra K T <raghavendra.kt@amd.com>
To: Hillf Danton <hdanton@sina.com>
Cc: dave.hansen@intel.com, david@redhat.com, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, ziy@nvidia.com
Subject: Re: [RFC PATCH V1 09/13] mm: Add heuristic to calculate target node
Date: Mon, 24 Mar 2025 20:24:49 +0530
Message-ID: <17b5d869-d1f7-4427-a293-aef42a37d639@amd.com>
In-Reply-To: <20250324110543.3599-1-hdanton@sina.com>



On 3/24/2025 4:35 PM, Hillf Danton wrote:
> On Sun, 23 Mar 2025 23:44:02 +0530 Raghavendra K T wrote
>> On 3/21/2025 4:23 PM, Hillf Danton wrote:
>>> On Wed, 19 Mar 2025 19:30:24 +0000 Raghavendra K T wrote
>>>> One of the key challenges in PTE A bit based scanning is to find the right
>>>> target node to promote to.
>>>>
>>>> Here is a simple heuristic based approach:
>>>>      While scanning pages of any mm we also scan toptier pages that belong
>>>> to that mm. We get an insight into the distribution of pages that potentially
>>>> belong to a particular toptier node and also their recent access.
>>>>
>>>> Current logic walks all the toptier nodes and picks the one with the most
>>>> accesses.
>>>>
>>> My $.02 for selecting promotion target node given a simple multi tier system.
>>>
>>> 	Tk /* top Tierk (k > 0) has K (K > 0) nodes */
>>> 	...
>>> 	Tj /* Tierj (j > 0) has J (J > 0) nodes */
>>> 	...
>>> 	T0 /* bottom Tier0 has O (O > 0) nodes */
>>>
>>> Unless config comes from user space (sysfs window for example should be opened),
>>>
>>> 1, adopt the data flow pattern of L3 cache <--> DRAM <--> SSD, to only
>>> select Tj+1 when promoting pages in Tj.
>>>
>>
>> Hello Hillf,
>> Thanks for giving this some thought. This looks like a good idea in
>> general. Could it mostly be implemented as the reverse of the preferred
>> demotion target?
>>
>> Thinking out loud: can there be exception cases, similar to non-temporal
>> copy operations, where we don't want to pollute the cache?
>> I mean cases where we don't want to hop via a middle tier node..?
>>
> Given page cache, direct IO and coherent DMA have their roles to play.
>

Agreed.

>>> 2, select the node in Tj+1 that has the most free pages for promotion
>>> by default.
>>
>> Not sure this is always productive.
>>
> Trying to cure all pains with ONE pill wastes minutes I think.
> 

Very much true.

> To achieve reliable high order pages, the page allocator cannot work well in
> combination with kswapd and kcompactd without clear boundaries drawn
> between the three parties, for example.
> 
>> For example:
>> node 0-1 toptier (100GB)
>> node2 slowtier
>>
>> Suppose a workload (that occupies 80GB in total) is running on a CPU of
>> node1, where 40GB is already in node1 and the remaining 40GB is in node2.
>>
>> Now, is it preferred to consolidate the workload on node1 when slowtier
>> data becomes hot?
>>
> Yes and no (say, a couple seconds later mm pressure rises in node0).
> 
> In case of yes, I would like to turn on autonuma in the toptier instead
> without bothering to select the target node. You see a line is drawn
> between autonuma and slowtier promotion now.

Yes, the goal has been slow tier promotion without much overhead to the
system, while working cooperatively with NUMAB1 for top-tier balancing
(e.g., by providing hints of hot VMAs).
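
To make the two target-selection policies discussed above concrete, here
is a rough userspace sketch. It is not kernel code and not what the
patch implements: struct node_info, pick_most_free() and
pick_most_accessed() are hypothetical names, the per-node numbers are
made up to mirror the node0/node1/node2 example, and it assumes each
toptier node has 100GB. Both policies only consider nodes in the tier
directly above the source tier (Tj+1).

```c
#include <stddef.h>

/* Hypothetical per-node summary; not a kernel structure. */
struct node_info {
	int nid;
	int tier;               /* 0 = bottom tier, larger = faster tier */
	unsigned long free_gb;  /* free memory on the node */
	unsigned long accesses; /* accessed pages of this mm seen on the node */
};

/*
 * Made-up numbers for the example in this thread: the workload runs on
 * node1 with 40GB there (60GB free) and 40GB on slowtier node2.
 * The node2 free size and the access count are placeholders.
 */
static const struct node_info example_nodes[] = {
	{ .nid = 0, .tier = 1, .free_gb = 100, .accesses = 0 },
	{ .nid = 1, .tier = 1, .free_gb = 60,  .accesses = 5000 },
	{ .nid = 2, .tier = 0, .free_gb = 0,   .accesses = 0 },
};

/* Policy A: node in tier (src_tier + 1) with the most free memory. */
static int pick_most_free(const struct node_info *n, size_t cnt, int src_tier)
{
	int best = -1;
	unsigned long best_free = 0;

	for (size_t i = 0; i < cnt; i++) {
		if (n[i].tier != src_tier + 1)
			continue;
		if (best < 0 || n[i].free_gb > best_free) {
			best = n[i].nid;
			best_free = n[i].free_gb;
		}
	}
	return best; /* -1 when there is no node in the next tier */
}

/* Policy B: node in tier (src_tier + 1) with the most recorded accesses. */
static int pick_most_accessed(const struct node_info *n, size_t cnt,
			      int src_tier)
{
	int best = -1;
	unsigned long best_acc = 0;

	for (size_t i = 0; i < cnt; i++) {
		if (n[i].tier != src_tier + 1)
			continue;
		if (best < 0 || n[i].accesses > best_acc) {
			best = n[i].nid;
			best_acc = n[i].accesses;
		}
	}
	return best; /* -1 when there is no node in the next tier */
}
```

With these numbers, promoting from node2 (tier 0) gives node0 under the
free-memory policy and node1 under the access-based policy, which is the
consolidation question raised above.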




