linux-mm.kvack.org archive mirror
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Bharata B Rao <bharata@amd.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<AneeshKumar.KizhakeVeetil@arm.com>, <Hasan.Maruf@amd.com>,
	<Michael.Day@amd.com>, <akpm@linux-foundation.org>,
	<dave.hansen@intel.com>, <david@redhat.com>,
	<feng.tang@intel.com>, <gourry@gourry.net>, <hannes@cmpxchg.org>,
	<honggyu.kim@sk.com>, <hughd@google.com>, <jhubbard@nvidia.com>,
	<k.shutemov@gmail.com>, <kbusch@meta.com>,
	<kmanaouil.dev@gmail.com>, <leesuyeon0506@gmail.com>,
	<leillc@google.com>, <liam.howlett@oracle.com>,
	<mgorman@techsingularity.net>, <mingo@redhat.com>,
	<nadav.amit@gmail.com>, <nphamcs@gmail.com>,
	<peterz@infradead.org>, <raghavendra.kt@amd.com>,
	<riel@surriel.com>, <rientjes@google.com>, <rppt@kernel.org>,
	<shivankg@amd.com>, <shy828301@gmail.com>, <sj@kernel.org>,
	<vbabka@suse.cz>, <weixugc@google.com>, <willy@infradead.org>,
	<ying.huang@linux.alibaba.com>, <ziy@nvidia.com>,
	<dave@stgolabs.net>, <yuanchu@google.com>, <hyeonggon.yoo@sk.com>
Subject: Re: [RFC PATCH 3/4] x86: ibs: In-kernel IBS driver for memory access profiling
Date: Fri, 14 Mar 2025 15:38:41 +0000
Message-ID: <20250314153841.00006978@huawei.com>
In-Reply-To: <20250306054532.221138-4-bharata@amd.com>

On Thu, 6 Mar 2025 11:15:31 +0530
Bharata B Rao <bharata@amd.com> wrote:

> Use IBS (Instruction Based Sampling) feature present
> in AMD processors for memory access tracking. The access
> information obtained from IBS via NMI is fed to kpromoted
> daemon for further action.
> 
> In addition to many other information related to the memory
> access, IBS provides physical (and virtual) address of the access
> and indicates if the access came from slower tier. Only memory
> accesses originating from slower tiers are further acted upon
> by this driver.
> 
> The samples are initially accumulated in percpu buffers which
> are flushed to kpromoted using irq_work.
> 
> About IBS
> ---------
> IBS can be programmed to provide data about instruction
> execution periodically. This is done by programming a desired
> sample count (number of ops) in a control register. When the
> programmed number of ops are dispatched, a micro-op gets tagged,
> various information about the tagged micro-op's execution is
> populated in IBS execution MSRs and an interrupt is raised.
> While IBS provides a lot of data for each sample, for the
> purpose of  memory access profiling, we are interested in
> linear and physical address of the memory access that reached
> DRAM. Recent AMD processors provide further filtering where
> it is possible to limit the sampling to those ops that had
> an L3 miss which greatly reduces the non-useful samples.
> 
> While IBS provides capability to sample instruction fetch
> and execution, only IBS execution sampling is used here
> to collect data about memory accesses that occur during
> the instruction execution.
> 
> More information about IBS is available in Sec 13.3 of
> AMD64 Architecture Programmer's Manual, Volume 2:System
> Programming which is present at:
> https://bugzilla.kernel.org/attachment.cgi?id=288923
> 
> Information about MSRs used for programming IBS can be
> found in Sec 2.1.14.4 of PPR Vol 1 for AMD Family 19h
> Model 11h B1 which is currently present at:
> https://www.amd.com/system/files/TechDocs/55901_0.25.zip
> 
> Signed-off-by: Bharata B Rao <bharata@amd.com>
> ---

Trivial comments inline. I'd love to find a clean way to steal stuff
perf is using though.

>  arch/x86/events/amd/ibs.c        |  11 ++
>  arch/x86/include/asm/ibs.h       |   7 +
>  arch/x86/include/asm/msr-index.h |  16 ++
>  arch/x86/mm/Makefile             |   3 +-
>  arch/x86/mm/ibs.c                | 312 +++++++++++++++++++++++++++++++
>  include/linux/vm_event_item.h    |  17 ++
>  mm/vmstat.c                      |  17 ++
>  7 files changed, 382 insertions(+), 1 deletion(-)
>  create mode 100644 arch/x86/include/asm/ibs.h
>  create mode 100644 arch/x86/mm/ibs.c
> 
> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> index e7a8b8758e08..35497e8c0846 100644
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -13,8 +13,10 @@
>  #include <linux/ptrace.h>
>  #include <linux/syscore_ops.h>
>  #include <linux/sched/clock.h>
> +#include <linux/kpromoted.h>
>  
>  #include <asm/apic.h>
> +#include <asm/ibs.h>
>  
>  #include "../perf_event.h"
>  
> @@ -1539,6 +1541,15 @@ static __init int amd_ibs_init(void)
>  {
>  	u32 caps;
>  
> +	/*
> +	 * TODO: Find a clean way to disable perf IBS so that IBS
> +	 * can be used for memory access profiling.

Yeah.  That bit us in a number of similar cases.  Does anyone
have a good solution for this?  For my hammer (CXL HMU) the
perf case is probably the niche one so I'm less worried, but for
SPE, IBS, PEBS etc we need to figure out how to elegantly back off
on promotion if a user wants to use tracing.
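
To make the question concrete, one possible shape is a single owner word
that both users must claim before touching the sampling hardware. The
below is only a userspace sketch under that assumption: the enum, the
state variable and all three helpers are hypothetical names, none of them
exist in the kernel or in this series, and the claim-for-perf path has an
obvious window between release and claim that a real version would need
to close.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical owners of a shared sampling unit (IBS, SPE, PEBS, ...). */
enum sampler_owner {
	SAMPLER_FREE = 0,
	SAMPLER_PROMOTION,	/* in-kernel hot page detection */
	SAMPLER_PERF,		/* user requested tracing */
};

static _Atomic int sampler_owner_state = SAMPLER_FREE;

/* Claim the sampler for @who; fails if anyone already holds it. */
static bool sampler_claim(enum sampler_owner who)
{
	int expected = SAMPLER_FREE;

	return atomic_compare_exchange_strong(&sampler_owner_state,
					      &expected, who);
}

/* Release the sampler; only succeeds if @who is the current owner. */
static bool sampler_release(enum sampler_owner who)
{
	int expected = who;

	return atomic_compare_exchange_strong(&sampler_owner_state,
					      &expected, SAMPLER_FREE);
}

/*
 * perf asks for the sampler, so promotion backs off if it holds it.
 * Stopping the hardware and draining pending samples before the
 * hand over is omitted from this sketch.
 */
static bool sampler_claim_for_perf(void)
{
	sampler_release(SAMPLER_PROMOTION);	/* no-op unless promotion owns it */
	return sampler_claim(SAMPLER_PERF);
}
```

With something like this, promotion would claim the unit at init, and a
perf open would take it over rather than being refused at boot.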

> +	 */
> +	if (arch_hw_access_profiling) {
> +		pr_info("IBS isn't available for perf use\n");
> +		return 0;
> +	}
> +
>  	caps = __get_ibs_caps();
>  	if (!caps)
>  		return -ENODEV;	/* ibs not supported by the cpu */


> +
> +static void clear_APIC_ibs(void)
> +{
> +	int offset;
> +
> +	offset = get_ibs_lvt_offset();

Trivial, but I'd flip the condition and deal with the error
out of line.  Ah, I see this is cut and paste from existing
code, so I'll stop pointing this stuff out!

	if (offset < 0)
		return;

	setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1);

> +	if (offset >= 0)
> +		setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1);
> +}

> +
> +static int __init ibs_access_profiling_init(void)
> +{
> +	if (!boot_cpu_has(X86_FEATURE_IBS)) {
> +		pr_info("IBS capability is unavailable for access profiling\n");

Probably worth saying that is because the chip doesn't have it!
This reads too similar to the perf case above, where we just pinched it
for other use cases.
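
Something along these lines would keep the two cases apart. This is a
userspace sketch only: the helper name and both strings are made up for
illustration and are not from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return a message naming the actual reason IBS
 * cannot be used, or NULL if it is usable.
 */
static const char *ibs_unavailable_reason(bool cpu_has_ibs,
					  bool claimed_for_profiling)
{
	if (!cpu_has_ibs)
		return "IBS not supported by this CPU";
	if (claimed_for_profiling)
		return "IBS reserved for memory access profiling";
	return NULL;
}
```

The missing-hardware case is checked first, so a CPU without IBS never
reports that the feature was taken by another user.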

> +		return 0;
> +	}
> +
> +	ibs_s = alloc_percpu_gfp(struct ibs_sample_pcpu, __GFP_ZERO);
> +	if (!ibs_s)
> +		return 0;
> +
> +	INIT_WORK(&ibs_work, ibs_work_handler);
> +	init_irq_work(&ibs_irq_work, ibs_irq_handler);
> +
> +	/* Uses IBS Op sampling */
> +	ibs_config = IBS_OP_CNT_CTL | IBS_OP_ENABLE;
> +	ibs_caps = cpuid_eax(IBS_CPUID_FEATURES);
> +	if (ibs_caps & IBS_CAPS_ZEN4)
> +		ibs_config |= IBS_OP_L3MISSONLY;
> +
> +	register_nmi_handler(NMI_LOCAL, ibs_overflow_handler, 0, "ibs");
> +
> +	cpuhp_setup_state(CPUHP_AP_PERF_X86_AMD_IBS_STARTING,
> +			  "x86/amd/ibs_access_profile:starting",
> +			  x86_amd_ibs_access_profile_startup,
> +			  x86_amd_ibs_access_profile_teardown);
> +
> +	pr_info("IBS setup for memory access profiling\n");
> +	return 0;
> +}
> +
> +arch_initcall(ibs_access_profiling_init);


