[PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY

* [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
@ 2026-03-23  9:48 Donet Tom
  2026-04-02  0:22 ` Andrew Morton
  2026-04-02  3:27 ` Huang, Ying
  0 siblings, 2 replies; 10+ messages in thread
From: Donet Tom @ 2026-03-23  9:48 UTC (permalink / raw)
  To: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra
  Cc: Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman, Donet Tom

In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
disabled and the pages are on the lower tier, the pages may still be
promoted.

This happens because task_numa_work() updates the last_cpupid field to
record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
enabled and the folio is on the lower tier. If
NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
can retain a valid last CPU id.

In should_numa_migrate_memory(), the decision checks whether
NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
tier, and last_cpupid is invalid. However, last_cpupid can be valid
when NUMA_BALANCING_MEMORY_TIERING is disabled, in which case the
condition evaluates to false and migration is allowed.

This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
disabled and the folio is on the lower tier.

Behavior before this change:
============================
  - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
    nodes within the same memory tier, and promotion from lower
    tier to higher tier may also happen.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
    lower tier to higher tier nodes is allowed.

Behavior after this change:
===========================
  - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
    between nodes within the same memory tier.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
    tier to higher tier nodes will be allowed.

  - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
    enabled, both migration (same tier) and promotion (cross tier) are
    allowed.

Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
v1 -> v2
========
1. Dropped changes in task_numa_fault() since the original changes
   already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.

v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
---
 kernel/sched/fair.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bf948db905ed..4b43809a3fb1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
 	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
 
+	/*
+	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
+	 * and the pages are on the lower tier.
+	 */
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
-	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
+	    !node_is_toptier(src_nid))
 		return false;
 
 	/*
-- 
2.47.1




* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-03-23  9:48 [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled Donet Tom
@ 2026-04-02  0:22 ` Andrew Morton
  2026-04-02  3:31   ` Huang, Ying
  2026-04-02  3:27 ` Huang, Ying
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2026-04-02  0:22 UTC (permalink / raw)
  To: Donet Tom
  Cc: David Hildenbrand, Ingo Molnar, Peter Zijlstra, Ritesh Harjani,
	linux-mm, linux-kernel, Baolin Wang, Ying Huang, Juri Lelli,
	Mel Gorman, Rik van Riel, ying.huang, ying.huang

On Mon, 23 Mar 2026 04:48:49 -0500 Donet Tom <donettom@linux.ibm.com> wrote:

> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
> disabled and the pages are on the lower tier, the pages may still be
> promoted.
> 
> This happens because task_numa_work() updates the last_cpupid field to
> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
> enabled and the folio is on the lower tier. If
> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
> can retain a valid last CPU id.
> 
> In should_numa_migrate_memory(), the decision checks whether
> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
> tier, and last_cpupid is invalid. However, the last_cpupid can be
> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
> evaluates to false and migration is allowed.
> 
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
> 
> Behavior before this change:
> ============================
>   - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>     nodes within the same memory tier, and promotion from lower
>     tier to higher tier may also happen.
> 
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>     lower tier to higher tier nodes is allowed.
> 
> Behavior after this change:
> ===========================
>   - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>     between nodes within the same memory tier.
> 
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>     tier to higher tier nodes will be allowed.
> 
>   - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>     enabled, both migration (same tier) and promotion (cross tier) are
>     allowed.

There was no feedback on this, nor on your v1.

> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")

Ying Huang seems to have moved around a bit - let me add a couple more
email addresses.  Apologies if we have multiple Ying Huangs!

Rik, Mel?  It's a bugfix.

Thanks.



From: Donet Tom <donettom@linux.ibm.com>
Subject: memory tiering: do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
Date: Mon, 23 Mar 2026 04:48:49 -0500

In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
disabled and the pages are on the lower tier, the pages may still be
promoted.

This happens because task_numa_work() updates the last_cpupid field to
record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
enabled and the folio is on the lower tier.  If
NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field can
retain a valid last CPU id.

In should_numa_migrate_memory(), the decision checks whether
NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower tier,
and last_cpupid is invalid.  However, last_cpupid can be valid when
NUMA_BALANCING_MEMORY_TIERING is disabled, in which case the condition
evaluates to false and migration is allowed.

This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
disabled and the folio is on the lower tier.

Behavior before this change:
============================
  - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
    nodes within the same memory tier, and promotion from lower
    tier to higher tier may also happen.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
    lower tier to higher tier nodes is allowed.

Behavior after this change:
===========================
  - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
    between nodes within the same memory tier.

  - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
    tier to higher tier nodes will be allowed.

  - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
    enabled, both migration (same tier) and promotion (cross tier) are
    allowed.

Link: https://lkml.kernel.org/r/20260323094849.3903-1-donettom@linux.ibm.com
Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/sched/fair.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/kernel/sched/fair.c~memory-tiering-do-not-allow-promotion-if-numa_balancing_memory_tiering-is-disabled
+++ a/kernel/sched/fair.c
@@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct t
 	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
 	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
 
+	/*
+	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
+	 * and the pages are on the lower tier.
+	 */
 	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
-	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
+	    !node_is_toptier(src_nid))
 		return false;
 
 	/*
_




* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-03-23  9:48 [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled Donet Tom
  2026-04-02  0:22 ` Andrew Morton
@ 2026-04-02  3:27 ` Huang, Ying
  2026-04-02  4:59   ` Donet Tom
  1 sibling, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2026-04-02  3:27 UTC (permalink / raw)
  To: Donet Tom
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman

Donet Tom <donettom@linux.ibm.com> writes:

> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
> disabled and the pages are on the lower tier, the pages may still be
> promoted.
>
> This happens because task_numa_work() updates the last_cpupid field to
> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
> enabled and the folio is on the lower tier. If
> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
> can retain a valid last CPU id.
>
> In should_numa_migrate_memory(), the decision checks whether
> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
> tier, and last_cpupid is invalid. However, the last_cpupid can be
> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
> evaluates to false and migration is allowed.
>
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
>
> Behavior before this change:
> ============================
>   - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>     nodes within the same memory tier, and promotion from lower
>     tier to higher tier may also happen.
>
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>     lower tier to higher tier nodes is allowed.
>
> Behavior after this change:
> ===========================
>   - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>     between nodes within the same memory tier.
>
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>     tier to higher tier nodes will be allowed.
>
>   - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>     enabled, both migration (same tier) and promotion (cross tier) are
>     allowed.
>
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
> ---
> v1 -> v2
> ========
> 1. Dropped changes in task_numa_fault() since the original changes
>    already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>
> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
> ---
>  kernel/sched/fair.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bf948db905ed..4b43809a3fb1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>  	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>  	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>  
> +	/*
> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
> +	 * and the pages are on the lower tier.
> +	 */
>  	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
> +	    !node_is_toptier(src_nid))
>  		return false;
>  
>  	/*

No.  Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
allow migrating pages from lower tier to higher tier via
NUMA_BALANCING_NORMAL.  If we have precious DDR, why waste it?  This
follows the semantics of NUMA_BALANCING_NORMAL before introducing
NUMA_BALANCING_MEMORY_TIERING.

---
Best Regards,
Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-02  0:22 ` Andrew Morton
@ 2026-04-02  3:31   ` Huang, Ying
  0 siblings, 0 replies; 10+ messages in thread
From: Huang, Ying @ 2026-04-02  3:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Donet Tom, David Hildenbrand, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman, Rik van Riel, ying.huang

Hi, Andrew,

Andrew Morton <akpm@linux-foundation.org> writes:

> On Mon, 23 Mar 2026 04:48:49 -0500 Donet Tom <donettom@linux.ibm.com> wrote:
>
>> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
>> disabled and the pages are on the lower tier, the pages may still be
>> promoted.
>> 
>> This happens because task_numa_work() updates the last_cpupid field to
>> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
>> enabled and the folio is on the lower tier. If
>> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
>> can retain a valid last CPU id.
>> 
>> In should_numa_migrate_memory(), the decision checks whether
>> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
>> tier, and last_cpupid is invalid. However, the last_cpupid can be
>> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
>> evaluates to false and migration is allowed.
>> 
>> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
>> disabled and the folio is on the lower tier.
>> 
>> Behavior before this change:
>> ============================
>>   - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>>     nodes within the same memory tier, and promotion from lower
>>     tier to higher tier may also happen.
>> 
>>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>>     lower tier to higher tier nodes is allowed.
>> 
>> Behavior after this change:
>> ===========================
>>   - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>>     between nodes within the same memory tier.
>> 
>>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>>     tier to higher tier nodes will be allowed.
>> 
>>   - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>>     enabled, both migration (same tier) and promotion (cross tier) are
>>     allowed.
>
> There was no feedback on this, nor on your v1.
>
>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>
> Ying Huang seems to have moved around a bit - let me add a couple more
> email addresses.  Apologies if we have multiple Ying Huangs!

Thanks!  I haven't found any other Ying Huang in the mm community yet.

I now use the following email addresses:

"Huang, Ying" <ying.huang@linux.alibaba.com>
Ying Huang <huang.ying.caritas@gmail.com>

and have stopped using the following email address:

ying.huang@intel.com

> Rik, Mel?  It's a bugfix.
>
> Thanks.
>
>
>
> From: Donet Tom <donettom@linux.ibm.com>
> Subject: memory tiering: do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
> Date: Mon, 23 Mar 2026 04:48:49 -0500
>
> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
> disabled and the pages are on the lower tier, the pages may still be
> promoted.
>
> This happens because task_numa_work() updates the last_cpupid field to
> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
> enabled and the folio is on the lower tier.  If
> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field can
> retain a valid last CPU id.
>
> In should_numa_migrate_memory(), the decision checks whether
> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower tier,
> and last_cpupid is invalid.  However, the last_cpupid can be valid when
> NUMA_BALANCING_MEMORY_TIERING is disabled, the condition evaluates to
> false and migration is allowed.
>
> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
> disabled and the folio is on the lower tier.
>
> Behavior before this change:
> ============================
>   - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>     nodes within the same memory tier, and promotion from lower
>     tier to higher tier may also happen.
>
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>     lower tier to higher tier nodes is allowed.
>
> Behavior after this change:
> ===========================
>   - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>     between nodes within the same memory tier.
>
>   - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>     tier to higher tier nodes will be allowed.
>
>   - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>     enabled, both migration (same tier) and promotion (cross tier) are
>     allowed.
>
> Link: https://lkml.kernel.org/r/20260323094849.3903-1-donettom@linux.ibm.com
> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Ben Segall <bsegall@google.com>
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Valentin Schneider <vschneid@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  kernel/sched/fair.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> --- a/kernel/sched/fair.c~memory-tiering-do-not-allow-promotion-if-numa_balancing_memory_tiering-is-disabled
> +++ a/kernel/sched/fair.c
> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct t
>  	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>  	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>  
> +	/*
> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
> +	 * and the pages are on the lower tier.
> +	 */
>  	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
> +	    !node_is_toptier(src_nid))
>  		return false;
>  
>  	/*
> _

---
Best Regards,
Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-02  3:27 ` Huang, Ying
@ 2026-04-02  4:59   ` Donet Tom
  2026-04-02  6:24     ` Huang, Ying
  0 siblings, 1 reply; 10+ messages in thread
From: Donet Tom @ 2026-04-02  4:59 UTC (permalink / raw)
  To: Huang, Ying
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman

Hi

On 4/2/26 8:57 AM, Huang, Ying wrote:
> Donet Tom <donettom@linux.ibm.com> writes:
>
>> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
>> disabled and the pages are on the lower tier, the pages may still be
>> promoted.
>>
>> This happens because task_numa_work() updates the last_cpupid field to
>> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
>> enabled and the folio is on the lower tier. If
>> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
>> can retain a valid last CPU id.
>>
>> In should_numa_migrate_memory(), the decision checks whether
>> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
>> tier, and last_cpupid is invalid. However, the last_cpupid can be
>> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
>> evaluates to false and migration is allowed.
>>
>> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
>> disabled and the folio is on the lower tier.
>>
>> Behavior before this change:
>> ============================
>>    - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>>      nodes within the same memory tier, and promotion from lower
>>      tier to higher tier may also happen.
>>
>>    - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>>      lower tier to higher tier nodes is allowed.
>>
>> Behavior after this change:
>> ===========================
>>    - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>>      between nodes within the same memory tier.
>>
>>    - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>>      tier to higher tier nodes will be allowed.
>>
>>    - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>>      enabled, both migration (same tier) and promotion (cross tier) are
>>      allowed.
>>
>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>> ---
>> v1 -> v2
>> ========
>> 1. Dropped changes in task_numa_fault() since the original changes
>>     already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>>
>> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
>> ---
>>   kernel/sched/fair.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index bf948db905ed..4b43809a3fb1 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>>   	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>>   	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>>   
>> +	/*
>> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
>> +	 * and the pages are on the lower tier.
>> +	 */
>>   	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
>> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
>> +	    !node_is_toptier(src_nid))
>>   		return false;
>>   
>>   	/*
> No.  Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
> allow migrating pages from lower tier to higher tier via
> NUMA_BALANCING_NORMAL.  If we have precious DDR, why waste it?  This
> follows the semantics of NUMA_BALANCING_NORMAL before introducing
> NUMA_BALANCING_MEMORY_TIERING.

Thank you for the review comments.

One thing I am trying to understand is that page promotion
appears to happen regardless of whether
NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
case, what is the specific role of
NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
when it is enabled?

My initial understanding was that disabling
NUMA_BALANCING_MEMORY_TIERING could be used to turn off
promotion. However, it seems that currently we cannot control
promotion independently. If NUMA_BALANCING_NORMAL is disabled,
neither migration nor promotion happens, and if it is enabled,
both migration and promotion can occur.

I was under the impression that:
- NUMA_BALANCING_NORMAL would handle migration within the same tier,
- NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
- and enabling both would allow both migration and promotion.

This would provide more fine-grained control. Is my
understanding correct, or am I missing something here?


>
> ---
> Best Regards,
> Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-02  4:59   ` Donet Tom
@ 2026-04-02  6:24     ` Huang, Ying
  2026-04-08 13:20       ` Donet Tom
  0 siblings, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2026-04-02  6:24 UTC (permalink / raw)
  To: Donet Tom
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman

Donet Tom <donettom@linux.ibm.com> writes:

> Hi

Hi, Donet,

> On 4/2/26 8:57 AM, Huang, Ying wrote:
>> Donet Tom <donettom@linux.ibm.com> writes:
>>
>>> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
>>> disabled and the pages are on the lower tier, the pages may still be
>>> promoted.
>>>
>>> This happens because task_numa_work() updates the last_cpupid field to
>>> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
>>> enabled and the folio is on the lower tier. If
>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
>>> can retain a valid last CPU id.
>>>
>>> In should_numa_migrate_memory(), the decision checks whether
>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
>>> tier, and last_cpupid is invalid. However, the last_cpupid can be
>>> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
>>> evaluates to false and migration is allowed.
>>>
>>> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
>>> disabled and the folio is on the lower tier.
>>>
>>> Behavior before this change:
>>> ============================
>>>    - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>>>      nodes within the same memory tier, and promotion from lower
>>>      tier to higher tier may also happen.
>>>
>>>    - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>>>      lower tier to higher tier nodes is allowed.
>>>
>>> Behavior after this change:
>>> ===========================
>>>    - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>>>      between nodes within the same memory tier.
>>>
>>>    - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>>>      tier to higher tier nodes will be allowed.
>>>
>>>    - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>>>      enabled, both migration (same tier) and promotion (cross tier) are
>>>      allowed.
>>>
>>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>>> ---
>>> v1 -> v2
>>> ========
>>> 1. Dropped changes in task_numa_fault() since the original changes
>>>     already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>>>
>>> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
>>> ---
>>>   kernel/sched/fair.c | 6 +++++-
>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index bf948db905ed..4b43809a3fb1 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>>>   	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>>>   	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>>>   
>>> +	/*
>>> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
>>> +	 * and the pages are on the lower tier.
>>> +	 */
>>>   	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
>>> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
>>> +	    !node_is_toptier(src_nid))
>>>   		return false;
>>>   
>>>   	/*
>> No.  Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
>> allow migrating pages from lower tier to higher tier via
>> NUMA_BALANCING_NORMAL.  If we have precious DDR, why waste it?  This
>> follows the semantics of NUMA_BALANCING_NORMAL before introducing
>> NUMA_BALANCING_MEMORY_TIERING.
>
> Thank you for the review comments.
>
> One thing I am trying to understand is that page promotion
> appears to happen regardless of whether
> NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
> case, what is the specific role of
> NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
> when it is enabled?

You can search for NUMA_BALANCING_MEMORY_TIERING to find out what it does.
We can get better performance, as the original commit message says.

When NUMA_BALANCING_MEMORY_TIERING was introduced, we didn't change the
original behavior of NUMA_BALANCING_NORMAL because we had no good
reason to do that.  Your patch does change its behavior, so you should
provide some supporting data or a bug report to justify the change.

> My initial understanding was that disabling
> NUMA_BALANCING_MEMORY_TIERING could be used to turn off
> promotion. However, it seems that currently we cannot control
> promotion independently. If NUMA_BALANCING_NORMAL is disabled,
> neither migration nor promotion happens, and if it is enabled,
> both migration and promotion can occur.
>
> I was under the impression that:
> - NUMA_BALANCING_NORMAL would handle migration within the same tier,
> - NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
> - and enabling both would allow both migration and promotion.
>
> This would provide more fine-grained control. Is my
> understanding correct, or am I missing something here?

You can change this, if you have some supporting data or bug report.

---
Best Regards,
Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-02  6:24     ` Huang, Ying
@ 2026-04-08 13:20       ` Donet Tom
  2026-04-09  1:28         ` Huang, Ying
  0 siblings, 1 reply; 10+ messages in thread
From: Donet Tom @ 2026-04-08 13:20 UTC (permalink / raw)
  To: Huang, Ying, David Hildenbrand
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman


On 4/2/26 11:54 AM, Huang, Ying wrote:
> Donet Tom <donettom@linux.ibm.com> writes:
>
>> Hi
> Hi, Donet,
>
>> On 4/2/26 8:57 AM, Huang, Ying wrote:
>>> Donet Tom <donettom@linux.ibm.com> writes:
>>>
>>>> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
>>>> disabled and the pages are on the lower tier, the pages may still be
>>>> promoted.
>>>>
>>>> This happens because task_numa_work() updates the last_cpupid field to
>>>> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
>>>> enabled and the folio is on the lower tier. If
>>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
>>>> can retain a valid last CPU id.
>>>>
>>>> In should_numa_migrate_memory(), the decision checks whether
>>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
>>>> tier, and last_cpupid is invalid. However, the last_cpupid can be
>>>> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
>>>> evaluates to false and migration is allowed.
>>>>
>>>> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
>>>> disabled and the folio is on the lower tier.
>>>>
>>>> Behavior before this change:
>>>> ============================
>>>>     - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>>>>       nodes within the same memory tier, and promotion from lower
>>>>       tier to higher tier may also happen.
>>>>
>>>>     - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>>>>       lower tier to higher tier nodes is allowed.
>>>>
>>>> Behavior after this change:
>>>> ===========================
>>>>     - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>>>>       between nodes within the same memory tier.
>>>>
>>>>     - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>>>>       tier to higher tier nodes will be allowed.
>>>>
>>>>     - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>>>>       enabled, both migration (same tier) and promotion (cross tier) are
>>>>       allowed.
>>>>
>>>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>>>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>>>> ---
>>>> v1 -> v2
>>>> ========
>>>> 1. Dropped changes in task_numa_fault() since the original changes
>>>>      already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>>>>
>>>> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
>>>> ---
>>>>    kernel/sched/fair.c | 6 +++++-
>>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index bf948db905ed..4b43809a3fb1 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>>>>    	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>>>>    	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>>>>
>>>> +	/*
>>>> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
>>>> +	 * and the pages are on the lower tier.
>>>> +	 */
>>>>    	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
>>>> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
>>>> +	    !node_is_toptier(src_nid))
>>>>    		return false;
>>>>
>>>>    	/*
>>> No.  Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
>>> allow migrate pages from lower tier to higher tier via
>>> NUMA_BALANCING_NORMAL.  If we have precious DDR, why waste it?  This
>>> follows the semantics of NUMA_BALANCING_NORMAL before introducing
>>> NUMA_BALANCING_MEMORY_TIERING.
>> Thank you for the review comments.
>>
>> One thing I am trying to understand is that page promotion
>> appears to happen regardless of whether
>> NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
>> case, what is the specific role of
>> NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
>> when it is enabled?
> You can search NUMA_BALANCING_MEMORY_TIERING to find out what it does.
> We can get better performance as the original commit message says.
>
> When NUMA_BALANCING_MEMORY_TIERING was introduced, we didn't change the
> original behavior of NUMA_BALANCING_NORMAL because we had no good
> reason to do that.  In fact, you change its behavior, so you should
> provide some supporting data or bug report to justify the change.
>
>> My initial understanding was that disabling
>> NUMA_BALANCING_MEMORY_TIERING could be used to turn off
>> promotion. However, it seems that currently we cannot control
>> promotion independently. If NUMA_BALANCING_NORMAL is disabled,
>> neither migration nor promotion happens, and if it is enabled,
>> both migration and promotion can occur.
>>
>> I was under the impression that:
>> - NUMA_BALANCING_NORMAL would handle migration within the same tier,
>> - NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
>> - and enabling both would allow both migration and promotion.
>>
>> This would provide more fine-grained control. Is my
>> understanding correct, or am I missing something here?
> You can change this, if you have some supporting data or bug report.


Thanks for the clarification. I was running some experiments where I 
only required migration, not promotion. However, I observed that 
promotion was still occurring even when NUMA_BALANCING_MEMORY_TIERING 
was disabled, which led me to believe it might be a bug, so I reported it.

As I understand it, enabling both NUMA_BALANCING_MEMORY_TIERING and 
NUMA_BALANCING_NORMAL results in both promotion and migration. Given 
this, do you see any concerns with modifying the behavior of 
NUMA_BALANCING_NORMAL?

With this patch, we would have better control over enabling and 
disabling promotion independently. I would appreciate your thoughts on this.
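
For illustration, here is a minimal userspace sketch of the early-return
check in should_numa_migrate_memory() before and after the patch. The
mode bit values are the ones from include/linux/sched/sysctl.h; the
struct fault_ctx type and the two rejects_*() helpers are hypothetical
stand-ins invented for this sketch and are not kernel code. It models
only the early rejection: a fault that is not rejected here still goes
through the rest of the function's heuristics.

```c
#include <stdbool.h>

/* Mode bits of sysctl_numa_balancing_mode (include/linux/sched/sysctl.h). */
#define NUMA_BALANCING_DISABLED		0x0
#define NUMA_BALANCING_NORMAL		0x1
#define NUMA_BALANCING_MEMORY_TIERING	0x2

/* Hypothetical container for the inputs of the check (not a kernel type). */
struct fault_ctx {
	unsigned int mode;	/* models sysctl_numa_balancing_mode */
	bool src_is_toptier;	/* models node_is_toptier(src_nid) */
	bool last_cpupid_valid;	/* models cpupid_valid(last_cpupid) */
};

/*
 * Check before the patch. Returns true when the function would
 * "return false" early, i.e. refuse to migrate the folio.
 */
static bool rejects_before_patch(const struct fault_ctx *c)
{
	return !(c->mode & NUMA_BALANCING_MEMORY_TIERING) &&
	       !c->src_is_toptier && !c->last_cpupid_valid;
}

/*
 * Check after the patch: the cpupid test is dropped, so every
 * lower-tier folio is rejected while tiering mode is off.
 */
static bool rejects_after_patch(const struct fault_ctx *c)
{
	return !(c->mode & NUMA_BALANCING_MEMORY_TIERING) &&
	       !c->src_is_toptier;
}
```

With mode = NUMA_BALANCING_NORMAL, a lower-tier source node and a valid
last_cpupid, rejects_before_patch() is false (promotion remains
possible) while rejects_after_patch() is true.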


-Donet

>
> ---
> Best Regards,
> Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-08 13:20       ` Donet Tom
@ 2026-04-09  1:28         ` Huang, Ying
  2026-04-09  3:42           ` Ritesh Harjani
  0 siblings, 1 reply; 10+ messages in thread
From: Huang, Ying @ 2026-04-09  1:28 UTC (permalink / raw)
  To: Donet Tom
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	Ritesh Harjani, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman

Donet Tom <donettom@linux.ibm.com> writes:

> On 4/2/26 11:54 AM, Huang, Ying wrote:
>> Donet Tom <donettom@linux.ibm.com> writes:
>>
>>> Hi
>> Hi, Donet,
>>
>>> On 4/2/26 8:57 AM, Huang, Ying wrote:
>>>> Donet Tom <donettom@linux.ibm.com> writes:
>>>>
>>>>> In the current implementation, if NUMA_BALANCING_MEMORY_TIERING is
>>>>> disabled and the pages are on the lower tier, the pages may still be
>>>>> promoted.
>>>>>
>>>>> This happens because task_numa_work() updates the last_cpupid field to
>>>>> record the last access time only when NUMA_BALANCING_MEMORY_TIERING is
>>>>> enabled and the folio is on the lower tier. If
>>>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the last_cpupid field
>>>>> can retain a valid last CPU id.
>>>>>
>>>>> In should_numa_migrate_memory(), the decision checks whether
>>>>> NUMA_BALANCING_MEMORY_TIERING is disabled, the folio is on the lower
>>>>> tier, and last_cpupid is invalid. However, the last_cpupid can be
>>>>> valid when NUMA_BALANCING_MEMORY_TIERING is disabled, the condition
>>>>> evaluates to false and migration is allowed.
>>>>>
>>>>> This patch prevents promotion when NUMA_BALANCING_MEMORY_TIERING is
>>>>> disabled and the folio is on the lower tier.
>>>>>
>>>>> Behavior before this change:
>>>>> ============================
>>>>>     - If NUMA_BALANCING_NORMAL is enabled, migration occurs between
>>>>>       nodes within the same memory tier, and promotion from lower
>>>>>       tier to higher tier may also happen.
>>>>>
>>>>>     - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from
>>>>>       lower tier to higher tier nodes is allowed.
>>>>>
>>>>> Behavior after this change:
>>>>> ===========================
>>>>>     - If NUMA_BALANCING_NORMAL is enabled, migration will occur only
>>>>>       between nodes within the same memory tier.
>>>>>
>>>>>     - If NUMA_BALANCING_MEMORY_TIERING is enabled, promotion from lower
>>>>>       tier to higher tier nodes will be allowed.
>>>>>
>>>>>     - If both NUMA_BALANCING_MEMORY_TIERING and NUMA_BALANCING_NORMAL are
>>>>>       enabled, both migration (same tier) and promotion (cross tier) are
>>>>>       allowed.
>>>>>
>>>>> Fixes: 33024536bafd ("memory tiering: hot page selection with hint page fault latency")
>>>>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>>>>> ---
>>>>> v1 -> v2
>>>>> ========
>>>>> 1. Dropped changes in task_numa_fault() since the original changes
>>>>>      already handle runtime disabling of NUMA_BALANCING_MEMORY_TIERING.
>>>>>
>>>>> v1 -> https://lore.kernel.org/all/20260320092251.1290207-1-donettom@linux.ibm.com/
>>>>> ---
>>>>>    kernel/sched/fair.c | 6 +++++-
>>>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index bf948db905ed..4b43809a3fb1 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -2024,8 +2024,12 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
>>>>>    	this_cpupid = cpu_pid_to_cpupid(dst_cpu, current->pid);
>>>>>    	last_cpupid = folio_xchg_last_cpupid(folio, this_cpupid);
>>>>>
>>>>> +	/*
>>>>> +	 * Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
>>>>> +	 * and the pages are on the lower tier.
>>>>> +	 */
>>>>>    	if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
>>>>> -	    !node_is_toptier(src_nid) && !cpupid_valid(last_cpupid))
>>>>> +	    !node_is_toptier(src_nid))
>>>>>    		return false;
>>>>>
>>>>>    	/*
>>>> No.  Even if NUMA_BALANCING_MEMORY_TIERING is disabled, we should still
>>>> allow migrate pages from lower tier to higher tier via
>>>> NUMA_BALANCING_NORMAL.  If we have precious DDR, why waste it?  This
>>>> follows the semantics of NUMA_BALANCING_NORMAL before introducing
>>>> NUMA_BALANCING_MEMORY_TIERING.
>>> Thank you for the review comments.
>>>
>>> One thing I am trying to understand is that page promotion
>>> appears to happen regardless of whether
>>> NUMA_BALANCING_MEMORY_TIERING is enabled or disabled. In that
>>> case, what is the specific role of
>>> NUMA_BALANCING_MEMORY_TIERING? Do we get better performance
>>> when it is enabled?
>> You can search NUMA_BALANCING_MEMORY_TIERING to find out what it does.
>> We can get better performance as the original commit message says.
>>
>> When NUMA_BALANCING_MEMORY_TIERING was introduced, we didn't change the
>> original behavior of NUMA_BALANCING_NORMAL because we had no good
>> reason to do that.  In fact, you change its behavior, so you should
>> provide some supporting data or bug report to justify the change.
>>
>>> My initial understanding was that disabling
>>> NUMA_BALANCING_MEMORY_TIERING could be used to turn off
>>> promotion. However, it seems that currently we cannot control
>>> promotion independently. If NUMA_BALANCING_NORMAL is disabled,
>>> neither migration nor promotion happens, and if it is enabled,
>>> both migration and promotion can occur.
>>>
>>> I was under the impression that:
>>> - NUMA_BALANCING_NORMAL would handle migration within the same tier,
>>> - NUMA_BALANCING_MEMORY_TIERING would handle promotion across tiers,
>>> - and enabling both would allow both migration and promotion.
>>>
>>> This would provide more fine-grained control. Is my
>>> understanding correct, or am I missing something here?
>> You can change this, if you have some supporting data or bug report.
>
>
> Thanks for the clarification. I was running some experiments where I
> only required migration, not promotion. However, I observed that
> promotion was still occurring even when NUMA_BALANCING_MEMORY_TIERING
> was disabled, which led me to believe it might be a bug, so I reported
> it.
>
> As I understand it, enabling both NUMA_BALANCING_MEMORY_TIERING and
> NUMA_BALANCING_NORMAL results in both promotion and migration. Given
> this, do you see any concerns with modifying the behavior of
> NUMA_BALANCING_NORMAL?
>
> With this patch, we would have better control over enabling and
> disabling promotion independently. I would appreciate your thoughts on
> this.

IIUC, we change the existing user-visible behavior only with a strong
enough practical reason.  If so, making something conceptually better
isn't enough for that.

---
Best Regards,
Huang, Ying



* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-09  1:28         ` Huang, Ying
@ 2026-04-09  3:42           ` Ritesh Harjani
  2026-04-09  6:39             ` Huang, Ying
  0 siblings, 1 reply; 10+ messages in thread
From: Ritesh Harjani @ 2026-04-09  3:42 UTC (permalink / raw)
  To: Huang, Ying, Donet Tom
  Cc: David Hildenbrand, Andrew Morton, Ingo Molnar, Peter Zijlstra,
	linux-mm, linux-kernel, Baolin Wang, Ying Huang, Juri Lelli,
	Mel Gorman

"Huang, Ying" <ying.huang@linux.alibaba.com> writes:

>>>>> Donet Tom <donettom@linux.ibm.com> writes:
>>
>>
>> Thanks for the clarification. I was running some experiments where I
>> only required migration, not promotion. However, I observed that
>> promotion was still occurring even when NUMA_BALANCING_MEMORY_TIERING
>> was disabled, which led me to believe it might be a bug, so I reported
>> it.
>>
>> As I understand it, enabling both NUMA_BALANCING_MEMORY_TIERING and
>> NUMA_BALANCING_NORMAL results in both promotion and migration. Given
>> this, do you see any concerns with modifying the behavior of
>> NUMA_BALANCING_NORMAL?
>>
>> With this patch, we would have better control over enabling and
>> disabling promotion independently. I would appreciate your thoughts on
>> this.
>
> IIUC, we change the existing user visible behavior only with strong
> enough practical reason.

So what I understood from this discussion so far is, we don't have any
mechanism to do auto-NUMA based page migration from DRAM to DRAM without
also triggering promotions from lower tiers to higher tiers.

... This to me sounds more like a broken interface.

> If so, making something conceptually better isn't enough for that.
>

I think Donet's approach was more towards fixing the problem than
making it conceptually better. So, as of now most of us may not see this
as a problem, since not many systems have different memory tiers
attached. But with more widespread CXL adoption and more memory tiers in
the system, we might require finer control over auto-NUMA based
page migration.

But hey, I just wanted to voice my opinion here. If we think
changing user-visible behavior is going to break existing applications
and we don't want that, then the reasoning sounds ok to
me.

> ---
> Best Regards,
> Huang, Ying

Thanks for your feedback!

-ritesh


* Re: [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled
  2026-04-09  3:42           ` Ritesh Harjani
@ 2026-04-09  6:39             ` Huang, Ying
  0 siblings, 0 replies; 10+ messages in thread
From: Huang, Ying @ 2026-04-09  6:39 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Donet Tom, David Hildenbrand, Andrew Morton, Ingo Molnar,
	Peter Zijlstra, linux-mm, linux-kernel, Baolin Wang, Ying Huang,
	Juri Lelli, Mel Gorman

Ritesh Harjani (IBM) <ritesh.list@gmail.com> writes:

> "Huang, Ying" <ying.huang@linux.alibaba.com> writes:
>
>>>>>> Donet Tom <donettom@linux.ibm.com> writes:
>>>
>>>
>>> Thanks for the clarification. I was running some experiments where I
>>> only required migration, not promotion. However, I observed that
>>> promotion was still occurring even when NUMA_BALANCING_MEMORY_TIERING
>>> was disabled, which led me to believe it might be a bug, so I reported
>>> it.
>>>
>>> As I understand it, enabling both NUMA_BALANCING_MEMORY_TIERING and
>>> NUMA_BALANCING_NORMAL results in both promotion and migration. Given
>>> this, do you see any concerns with modifying the behavior of
>>> NUMA_BALANCING_NORMAL?
>>>
>>> With this patch, we would have better control over enabling and
>>> disabling promotion independently. I would appreciate your thoughts on
>>> this.
>>
>> IIUC, we change the existing user visible behavior only with strong
>> enough practical reason.
>
> So what I understood from this discussion so far is, we don't have any
> mechanism to do auto-NUMA based page migration from DRAM to DRAM without
> also triggering promotions from lower tiers to higher tiers.
>
> ... This to me sounds more like a broken interface.
>
>> If so, making something conceptually better isn't enough for that.
>>
>
> I think Donet's approach was more towards fixing the problem than
> making it conceptually better.

To fix a theoretical problem instead of a practical problem?

> So, as of now most of us may not see this
> as a problem, since not many systems have different memory tiers
> attached. But with more widespread CXL adoption and more memory tiers in
> the system, we might require finer control over auto-NUMA based
> page migration.

By design, normal NUMA balancing (not memory tiering) should migrate
pages between tiers too, because it migrates pages to the node near a
CPU regardless of memory tier, to optimize NUMA locality.

> But hey, I just wanted to voice out my opinion here. If we think
> changing user visible behavior is going to break existing applications
> and we don't want that - then in that case the reasoning sounds ok to
> me.

---
Best Regards,
Huang, Ying



end of thread, other threads:[~2026-04-09  6:39 UTC | newest]

Thread overview: 10+ messages
2026-03-23  9:48 [PATCH v2] memory tiering: Do not allow promotion if NUMA_BALANCING_MEMORY_TIERING is disabled Donet Tom
2026-04-02  0:22 ` Andrew Morton
2026-04-02  3:31   ` Huang, Ying
2026-04-02  3:27 ` Huang, Ying
2026-04-02  4:59   ` Donet Tom
2026-04-02  6:24     ` Huang, Ying
2026-04-08 13:20       ` Donet Tom
2026-04-09  1:28         ` Huang, Ying
2026-04-09  3:42           ` Ritesh Harjani
2026-04-09  6:39             ` Huang, Ying
