From: Bharata B Rao <bharata@amd.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
AneeshKumar.KizhakeVeetil@arm.com, Hasan.Maruf@amd.com,
Jonathan.Cameron@huawei.com, Michael.Day@amd.com,
akpm@linux-foundation.org, dave.hansen@intel.com,
david@redhat.com, feng.tang@intel.com, gourry@gourry.net,
hannes@cmpxchg.org, honggyu.kim@sk.com, hughd@google.com,
jhubbard@nvidia.com, k.shutemov@gmail.com, kbusch@meta.com,
kmanaouil.dev@gmail.com, leesuyeon0506@gmail.com,
leillc@google.com, liam.howlett@oracle.com,
mgorman@techsingularity.net, mingo@redhat.com,
nadav.amit@gmail.com, nphamcs@gmail.com, peterz@infradead.org,
raghavendra.kt@amd.com, riel@surriel.com, rientjes@google.com,
rppt@kernel.org, shivankg@amd.com, shy828301@gmail.com,
sj@kernel.org, vbabka@suse.cz, weixugc@google.com,
willy@infradead.org, ying.huang@linux.alibaba.com,
ziy@nvidia.com, yuanchu@google.com
Subject: Re: [RFC PATCH 2/4] mm: kpromoted: Hot page info collection and promotion daemon
Date: Mon, 17 Mar 2025 09:19:19 +0530
Message-ID: <4a37ec49-008e-47e3-8aa4-462bc6be6d49@amd.com>
In-Reply-To: <20250313203607.zod6lssjef37ynbf@offworld>

On 14-Mar-25 2:06 AM, Davidlohr Bueso wrote:
> On Thu, 06 Mar 2025, Bharata B Rao wrote:
>
>> +/*
>> + * Go thro' page hotness information and migrate pages if required.
>> + *
>> + * Promoted pages are not longer tracked in the hot list.
>> + * Cold pages are pruned from the list as well.
>> + *
>> + * TODO: Batching could be done
>> + */
>> +static void kpromoted_migrate(pg_data_t *pgdat)
>> +{
>> +	int nid = pgdat->node_id;
>> +	struct page_hotness_info *phi;
>> +	struct hlist_node *tmp;
>> +	int nr_bkts = HASH_SIZE(page_hotness_hash);
>> +	int bkt;
>> +
>> +	for (bkt = 0; bkt < nr_bkts; bkt++) {
>> +		mutex_lock(&page_hotness_lock[bkt]);
>> +		hlist_for_each_entry_safe(phi, tmp, &page_hotness_hash[bkt], hnode) {
>> +			if (phi->hot_node != nid)
>> +				continue;
>> +
>> +			if (page_should_be_promoted(phi)) {
>> +				count_vm_event(KPROMOTED_MIG_CANDIDATE);
>> +				if (!kpromote_page(phi)) {
>> +					count_vm_event(KPROMOTED_MIG_PROMOTED);
>> +					hlist_del_init(&phi->hnode);
>> +					kfree(phi);
>> +				}
>> +			} else {
>> +				/*
>> +				 * Not a suitable page or cold page, stop tracking it.
>> +				 * TODO: Identify cold pages and drive demotion?
>> +				 */
>
> I don't think kpromoted should drive demotion at all. No one is
> complaining about migrate in lieu of discard, and there is also
> proactive reclaim which users can trigger. All the in-kernel problems
> are wrt promotion. The simpler any of these kthreads are the better.

I was testing on the default kernel with NUMA balancing mode 2
(NUMA_BALANCING_MEMORY_TIERING).

The multi-threaded application allocates memory on DRAM, and the
allocation spills over to the CXL node. The threads keep accessing the
allocated pages in random order.

pgpromote_success            6
pgpromote_candidate     745387
pgdemote_kswapd          51085
pgdemote_direct          10481
pgdemote_khugepaged          0
numa_pte_updates      27249625
numa_huge_pte_updates        0
numa_hint_faults       9660745
numa_hint_faults_local       0
numa_pages_migrated          6
numa_node_full          745438
pgmigrate_success      2225458
pgmigrate_fail         1187349

I hardly see any promotion happening: only 6 of the 745387 candidates
were promoted.

To check how often the toptier node was found to be full when
attempting a promotion, I added a numa_node_full counter as shown below:

diff --git a/mm/migrate.c b/mm/migrate.c
index fb19a18892c8..4d049d896589 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2673,6 +2673,7 @@ int migrate_misplaced_folio_prepare(struct folio *folio,
 	if (!migrate_balanced_pgdat(pgdat, nr_pages)) {
 		int z;
+		count_vm_event(NUMA_NODE_FULL);
 		if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING))
 			return -EAGAIN;
 		for (z = pgdat->nr_zones - 1; z >= 0; z--) {

As seen above, numa_node_full is 745438, which closely matches the
pgpromote_candidate count (745387).
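
For reference, the arithmetic behind the two observations above can be
sketched in a few lines of userspace C. This is only an illustration:
the struct and function names are made up here, and the values to plug
in are the ones from the vmstat output above.

```c
#include <stdio.h>

/* Counter values as read from /proc/vmstat (hypothetical container). */
struct promo_stats {
	unsigned long long pgpromote_success;
	unsigned long long pgpromote_candidate;
	unsigned long long numa_node_full;	/* debug counter from the diff above */
};

/* Fraction of promotion candidates that were promoted, in [0.0, 1.0]. */
double promo_success_ratio(const struct promo_stats *s)
{
	if (s->pgpromote_candidate == 0)
		return 0.0;
	return (double)s->pgpromote_success / (double)s->pgpromote_candidate;
}

/* Node-full events minus promotion candidates; can be negative. */
long long node_full_minus_candidates(const struct promo_stats *s)
{
	return (long long)s->numa_node_full - (long long)s->pgpromote_candidate;
}

/* Print a one-line summary of the two figures. */
void promo_report(const struct promo_stats *s)
{
	printf("promoted %llu of %llu candidates (%.4f%%), node_full - candidates = %lld\n",
	       s->pgpromote_success, s->pgpromote_candidate,
	       100.0 * promo_success_ratio(s), node_full_minus_candidates(s));
}
```

With pgpromote_success = 6, pgpromote_candidate = 745387 and
numa_node_full = 745438, node_full_minus_candidates() returns 51, so
nearly every candidate coincides with a node-full event.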

I do see counters reporting kswapd-driven and direct demotion as well,
but does this mean that demotion isn't happening fast enough to keep up
with the promotion demand under this high toptier memory pressure?
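
One way to look into this would be to sample /proc/vmstat at intervals
and compare how fast the pgdemote_* counters grow relative to
pgpromote_candidate. A rough userspace sketch follows, assuming the
usual one "name value" pair per line layout of /proc/vmstat. The helper
names (vmstat_slurp, vmstat_lookup, vmstat_total_demoted) are
hypothetical and not part of any kernel or library API.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read a whole file into buf (NUL-terminated). Returns bytes read or -1. */
long vmstat_slurp(const char *path, char *buf, size_t size)
{
	FILE *f;
	size_t n;

	if (size == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, size - 1, f);
	fclose(f);
	buf[n] = '\0';
	return (long)n;
}

/*
 * Find the line "name value" in a buffer laid out like /proc/vmstat.
 * Returns 0 and stores the value in *out, or -1 if the name is absent.
 */
int vmstat_lookup(const char *buf, const char *name, unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = buf;

	while (line && *line) {
		/* Match the full name at line start, followed by a space. */
		if (!strncmp(line, name, len) && line[len] == ' ') {
			*out = strtoull(line + len + 1, NULL, 10);
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Sum of the pgdemote_* counters; counters that are absent count as 0. */
unsigned long long vmstat_total_demoted(const char *buf)
{
	static const char *const names[] = {
		"pgdemote_kswapd", "pgdemote_direct", "pgdemote_khugepaged",
	};
	unsigned long long total = 0, v;
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (!vmstat_lookup(buf, names[i], &v))
			total += v;
	return total;
}
```

Taking two snapshots with vmstat_slurp("/proc/vmstat", ...) some seconds
apart and comparing the delta of vmstat_total_demoted() against the
delta of pgpromote_candidate would give a rough idea of whether demotion
keeps pace with promotion attempts.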
Regards,
Bharata.