linux-mm.kvack.org archive mirror
From: Bharata B Rao <bharata@amd.com>
To: <bharata@amd.com>
Cc: <AneeshKumar.KizhakeVeetil@arm.com>, <Hasan.Maruf@amd.com>,
	<Jonathan.Cameron@huawei.com>, <Michael.Day@amd.com>,
	<akpm@linux-foundation.org>, <dave.hansen@intel.com>,
	<dave@stgolabs.net>, <david@redhat.com>, <feng.tang@intel.com>,
	<gourry@gourry.net>, <hannes@cmpxchg.org>, <honggyu.kim@sk.com>,
	<hughd@google.com>, <hyeonggon.yoo@sk.com>, <jhubbard@nvidia.com>,
	<k.shutemov@gmail.com>, <kbusch@meta.com>,
	<kmanaouil.dev@gmail.com>, <leesuyeon0506@gmail.com>,
	<leillc@google.com>, <liam.howlett@oracle.com>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<mgorman@techsingularity.net>, <mingo@redhat.com>,
	<nadav.amit@gmail.com>, <nphamcs@gmail.com>,
	<peterz@infradead.org>, <raghavendra.kt@amd.com>,
	<riel@surriel.com>, <rientjes@google.com>, <rppt@kernel.org>,
	<shivankg@amd.com>, <shy828301@gmail.com>, <sj@kernel.org>,
	<vbabka@suse.cz>, <weixugc@google.com>, <willy@infradead.org>,
	<ying.huang@linux.alibaba.com>, <yuanchu@google.com>,
	<ziy@nvidia.com>
Subject: Re: [RFC PATCH 0/4] Kernel daemon for detecting and promoting hot pages
Date: Tue, 25 Mar 2025 13:48:32 +0530
Message-ID: <20250325081832.209140-1-bharata@amd.com>
In-Reply-To: <20250306054532.221138-1-bharata@amd.com>

> Hi,
> 
> This is an attempt towards having a single subsystem that accumulates
> hot page information from lower memory tiers and does hot page
> promotion.
> 
> At the heart of this subsystem is a kernel daemon named kpromoted that
> does the following:
> 
> 1. Exposes an API that other subsystems which detect/generate memory
>    access information can use to inform the daemon about memory
>    accesses from lower memory tiers.
> 2. Maintains the list of hot pages and attempts to promote them to
>    toptiers.
> 
> Currently I have added AMD IBS driver as one source that provides
> page access information as an example. This driver feeds info to
> kpromoted in this RFC patchset.

FWIW, here are some numbers from kpromoted-driven hot page promotion with
IBS as the hotness source:

Test 1
======
Memory is allocated explicitly on DRAM and CXL nodes, and no demotion
activity is seen.

Benchmark details
-----------------
* Memory is allocated initially on DRAM and CXL nodes separately.
* Two threads: One accessing DRAM-allocated and other CXL-allocated memory.
* The benchmark divides the memory area into regions and accesses pages within
  each region randomly and repetitively. In the test config shown below, the
  allocated memory is divided into regions of 1GB size and each such region is
  accessed repetitively (512 times), with 21474836480 random accesses in each
  repetition.
* Benchmark score is the time taken for the accesses to complete; lower is better
* Data accesses from CXL node are expected to trigger promotion
* Test system has 2 DRAM nodes (128G each) and a CXL node (128G)
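
As a rough illustration of the access pattern described above, here is a
scaled-down sketch in Python. It is not the benchmark that produced these
numbers (its source is not part of this posting). The function names, the
bytearray standing in for the mmap'ed area, the loop ordering (all repetitions
of one region before moving to the next) and the tiny sizes are assumptions
made for illustration only.

```python
import random

PAGE_SIZE = 4096  # access granularity from the test config (4K)


def run_region(buf, region_start, region_size, nr_accesses, rng):
    """Touch random pages inside one region, about half loads and half stores."""
    nr_pages = region_size // PAGE_SIZE
    loads = stores = 0
    for _ in range(nr_accesses):
        # Pick a random page in the region and access its first byte.
        off = region_start + rng.randrange(nr_pages) * PAGE_SIZE
        if rng.random() < 0.5:
            _ = buf[off]      # load
            loads += 1
        else:
            buf[off] = 1      # store
            stores += 1
    return loads, stores


def run_benchmark(total_size, region_size, nr_accesses, nr_repetitions, seed=0):
    """Walk the area region by region, repeating the random accesses per region."""
    if total_size % region_size or region_size % PAGE_SIZE:
        raise ValueError("sizes must be multiples of the region/page size")
    rng = random.Random(seed)
    buf = bytearray(total_size)  # hypothetical stand-in for the mmap'ed memory
    loads = stores = 0
    for region_start in range(0, total_size, region_size):
        for _ in range(nr_repetitions):
            l, s = run_region(buf, region_start, region_size, nr_accesses, rng)
            loads += l
            stores += s
    return loads, stores


if __name__ == "__main__":
    # Toy sizes: 4 regions of 64 pages, 1000 accesses x 2 repetitions each.
    print(run_benchmark(4 * 64 * PAGE_SIZE, 64 * PAGE_SIZE, 1000, 2))
```

The real run uses 1G regions, 21474836480 accesses per repetition and 512
repetitions; the toy sizes here only keep the sketch quick to execute.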

kernel.numa_balancing		2 for base, 0 for kpromoted
demotion			true
Threads run on			Node 1
Memory allocated on		Node 1(DRAM) and Node 2(CXL)
Initial allocation ratio	75% on DRAM
Allocated memory size		160G (mmap, MAP_POPULATE)
Initial memory on DRAM node	120G
Initial memory on CXL node	40G
Hot region size			1G
Access pattern			random
Access granularity		4K
Load/store ratio		50% loads + 50% stores
Number of accesses		21474836480
Nr access repetitions		512
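
The sizes in this config can be cross-checked with a few lines of arithmetic.
The sketch below only restates the values from the table above; the variable
names are mine, and the per-page figure takes "Number of accesses" at face
value as the count per repetition over a single 1G region.

```python
GiB = 1 << 30

allocated = 160 * GiB            # "Allocated memory size"
region_size = 1 * GiB            # "Hot region size"
page_size = 4096                 # "Access granularity" (4K)
accesses_per_rep = 21474836480   # "Number of accesses"

# "Initial allocation ratio 75% on DRAM", done in integer math.
dram_bytes = allocated * 3 // 4
cxl_bytes = allocated - dram_bytes

pages_per_region = region_size // page_size
avg_accesses_per_page = accesses_per_rep // pages_per_region

print("DRAM:", dram_bytes // GiB, "G, CXL:", cxl_bytes // GiB, "G")
print("4K pages per 1G region:", pages_per_region)
print("average accesses per page per repetition:", avg_accesses_per_page)
```

This reproduces the 120G/40G initial split listed in the config.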

Benchmark completion time
-------------------------
Base, NUMAB=2		261s
kpromoted-ibs, NUMAB=0	281s

Stats comparison
----------------
				Base,NUMAB=2	kpromoted-IBS,NUMAB=0
pgdemote_kswapd			0		0
pgdemote_direct			0		0
numa_pte_updates		10485760	0
numa_hint_faults		4427809		0
numa_pages_migrated		388229		374765
kpromoted_recorded_accesses			1651130	/* nr accesses reported to kpromoted */
kpromoted_recorded_hwhints			1651130	/* nr accesses coming from IBS */
kpromoted_record_toptier			1269697	/* nr accesses from toptier/DRAM */
kpromoted_record_added				378090	/* nr accesses considered for promotion */
kpromoted_mig_promoted				374765	/* nr pages promoted */
hwhint_nr_events				1674227	/* nr events reported by IBS */
hwhint_dram_accesses				1269626	/* nr DRAM accesses reported by IBS */
hwhint_cxl_accesses				381435	/* nr Extmem (CXL) accesses reported by IBS */
hwhint_useful_samples				1651110	/* nr actionable samples as per IBS driver */
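
A few ratios derived from the Test 1 numbers may make them easier to read.
The snippet only does arithmetic on the values reported above; which ratios
are worth looking at is my own choice, not something the patchset defines.

```python
def pct(num, den):
    """Return num/den as a percentage."""
    return 100.0 * num / den


# Values copied from the Test 1 results above.
base_secs, kpromoted_secs = 261, 281
recorded = 1651130     # kpromoted_recorded_accesses
toptier = 1269697      # kpromoted_record_toptier
added = 378090         # kpromoted_record_added
promoted = 374765      # kpromoted_mig_promoted
nr_events = 1674227    # hwhint_nr_events
useful = 1651110       # hwhint_useful_samples

slowdown = pct(kpromoted_secs - base_secs, base_secs)
toptier_share = pct(toptier, recorded)
added_share = pct(added, recorded)
promote_rate = pct(promoted, added)
useful_share = pct(useful, nr_events)

print(f"kpromoted vs base completion time: +{slowdown:.1f}%")
print(f"recorded accesses already in toptier: {toptier_share:.1f}%")
print(f"recorded accesses considered for promotion: {added_share:.1f}%")
print(f"considered accesses that ended in a promotion: {promote_rate:.1f}%")
print(f"IBS events the driver treated as actionable: {useful_share:.1f}%")
```

In short: kpromoted is about 7.7% slower than the base in this test, roughly
77% of the recorded accesses were already in toptier, and about 99% of the
accesses considered for promotion resulted in a promoted page.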


Test 2
======
Memory is allocated with DRAM and CXL nodes in the affinity mask with
MPOL_BIND + MPOL_F_NUMA_BALANCING.
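
For readers unfamiliar with this policy, the sketch below shows one way such a
binding could be requested from user space. It is an assumption-laden
illustration: the posting does not say whether the benchmark used
set_mempolicy(2) or mbind(2), the helper names are hypothetical, the syscall
number is the x86_64 one, and the constants are copied from
include/uapi/linux/mempolicy.h.

```python
import ctypes
import os

MPOL_BIND = 2                      # from include/uapi/linux/mempolicy.h
MPOL_F_NUMA_BALANCING = 1 << 13    # from include/uapi/linux/mempolicy.h
SYS_SET_MEMPOLICY = 238            # x86_64 syscall number (assumption: x86_64)

MODE = MPOL_BIND | MPOL_F_NUMA_BALANCING


def nodemask(nodes):
    """Build a bitmask with one bit set per NUMA node id."""
    mask = 0
    for node in nodes:
        mask |= 1 << node
    return mask


def bind_with_numa_balancing(nodes):
    """Hypothetical helper: bind the calling thread's allocations to `nodes`."""
    mask = ctypes.c_ulong(nodemask(nodes))
    maxnode = ctypes.c_ulong(ctypes.sizeof(ctypes.c_ulong) * 8)
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.syscall(ctypes.c_long(SYS_SET_MEMPOLICY), ctypes.c_long(MODE),
                       ctypes.byref(mask), maxnode)
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


if __name__ == "__main__":
    # Node 1 (DRAM) and Node 2 (CXL), as in the config below.
    print(hex(MODE), bin(nodemask([1, 2])))
```

Calling bind_with_numa_balancing([1, 2]) only makes sense on a machine that
actually has those nodes; the main guard above just prints the mode and mask.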

Benchmark details
-----------------
* Initially, the allocated memory spills over from DRAM to CXL, which involves
  demotion
* A single thread accesses the memory
* The benchmark divides the memory area into regions and accesses pages within
  each region randomly and repetitively. In the test config shown below, the
  allocated memory is divided into regions of 1GB size and each such region is
  accessed repetitively (512 times), with 21474836480 random accesses in each
  repetition.
* Benchmark score is the time taken for the accesses to complete; lower is better
* Data accesses from CXL node are expected to trigger promotion
* Test system has 2 DRAM nodes (128G each) and a CXL node (128G)

kernel.numa_balancing		2 for base, 0 for kpromoted
demotion			true
Threads run on			Node 1
Memory allocated on		Node 1(DRAM) and Node 2(CXL)
Allocated memory size		192G (mmap, MAP_POPULATE)
Hot region size			1G
Access pattern			random
Access granularity		4K
Load/store ratio		50% loads + 50% stores
Number of accesses		21474836480
Nr access repetitions		512

Benchmark completion time
-------------------------
Base, NUMAB=2		628s
kpromoted-ibs, NUMAB=0	626s

Stats comparison
----------------
				Base,NUMAB=2	kpromoted-IBS,NUMAB=0
pgdemote_kswapd			73187		2196028
pgdemote_direct			0		0
numa_pte_updates		27511631	0
numa_hint_faults		10010852	0
numa_pages_migrated		14		611177	/* such a low number of promotions is unexpected in Base; need to recheck */
kpromoted_recorded_accesses			1883570
kpromoted_recorded_hwhints			1883570
kpromoted_record_toptier			1262088
kpromoted_record_added				616273
kpromoted_mig_promoted				611077
hwhint_nr_events				1904619
hwhint_dram_accesses				1261758
hwhint_cxl_accesses				621428
hwhint_useful_samples				1883543
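
The same kind of derived ratios for Test 2 are sketched below. Again this is
plain arithmetic on the numbers reported above, with variable names of my own
choosing; it offers no explanation for any of the values.

```python
def pct(num, den):
    """Return num/den as a percentage."""
    return 100.0 * num / den


# Values copied from the Test 2 results above.
base_secs, kpromoted_secs = 628, 626
base_demoted, kpromoted_demoted = 73187, 2196028   # pgdemote_kswapd
pages_migrated = 611177                            # numa_pages_migrated
recorded = 1883570                                 # kpromoted_recorded_accesses
toptier = 1262088                                  # kpromoted_record_toptier
added = 616273                                     # kpromoted_record_added
promoted = 611077                                  # kpromoted_mig_promoted

time_delta = pct(kpromoted_secs - base_secs, base_secs)
demotion_ratio = kpromoted_demoted / base_demoted
toptier_share = pct(toptier, recorded)
promote_rate = pct(promoted, added)
migrated_minus_promoted = pages_migrated - promoted

print(f"kpromoted vs base completion time: {time_delta:+.2f}%")
print(f"kswapd demotions, kpromoted relative to base: {demotion_ratio:.1f}x")
print(f"recorded accesses already in toptier: {toptier_share:.1f}%")
print(f"considered accesses that ended in a promotion: {promote_rate:.1f}%")
print(f"numa_pages_migrated - kpromoted_mig_promoted: {migrated_minus_promoted}")
```

Completion times are within about 0.3% of each other, while kswapd demotions
are roughly 30 times higher with kpromoted than in the base run.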

