linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Bharata B Rao <bharata@amd.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mgorman@suse.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org,
	akpm@linux-foundation.org, luto@kernel.org, tglx@linutronix.de,
	yue.li@memverge.com, Ravikumar.Bangoria@amd.com,
	ying.huang@intel.com
Subject: Re: [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing
Date: Thu, 9 Feb 2023 11:27:40 +0530	[thread overview]
Message-ID: <a0127c3c-22c9-ac46-1e9f-200978fec392@amd.com> (raw)
In-Reply-To: <Y+Pj+9bbBbHpf6xM@hirez.programming.kicks-ass.net>

On 2/8/2023 11:33 PM, Peter Zijlstra wrote:
> On Wed, Feb 08, 2023 at 01:05:28PM +0530, Bharata B Rao wrote:
> 
>> - Perf uses IBS and we are using the same IBS for access profiling here.
>>   There needs to be a proper way to make the use mutually exclusive.
> 
> No, IFF this lives it needs to use in-kernel perf.

In fact I started out with in-kernel perf by using the
perf_event_create_kernel_counter() API. However there are issues
with using in-kernel perf:

- We want to reprogram the counter potentially during every
context switch. The IBS hardware sample counter needs to be
reprogrammed based on the incoming thread's view of sample period.
Additionally sampling needs to be disabled for kernel threads.
So I wanted to use perf_event_enable/disable() and
perf_event_period(). However they take mutexes and hence it is
not possible to use them from the sched switch atomic context.

- In-kernel perf gives a per-cpu counter, but we want it to count
based on the task that is currently running. I,e., the period should
be modified on per-task basis. I don't see how an in-kernel perf
event counter can be associated with per-task like this.
	
Hence I didn't see an easy option other than making the use
of IBS in perf and NUMA balancing mutually exclusive.

> 
>> - Is tying this up with NUMA balancing a reasonable approach or
>>   should we look at a completely new approach?
> 
> Is it giving sufficient win to be worth it, afaict it doesn't come even
> close to justifying it.
> 
>> - Hardware provided access information could be very useful for driving
>>   hot page promotion in tiered memory systems. Need to check if this
>>   requires different tuning/heuristics apart from what NUMA balancing
>>   already does.
> 
> I think Huang Ying looked at that from the Intel POV and I think the
> conclusion was that it doesn't really work out. What you need is
> frequency information, but the PMU doesn't really give you that. You
> need to process a *ton* of PMU data in-kernel.

What I am doing here is to feed the access data into NUMA balancing which
already has the logic to aggregate that at task and numa group level and
decide if that access is actionable in terms of migrating the page. In this
context, I am not sure about the frequency information that you and Dave
are mentioning. AFAIU, existing NUMA balancing takes care of taking
action, IBS becomes an alternative source of access information to NUMA
hint faults.

Thanks for your inputs.

Regards,
Bharata.


  parent reply	other threads:[~2023-02-09  5:57 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-08  7:35 Bharata B Rao
2023-02-08  7:35 ` [RFC PATCH 1/5] x86/ibs: In-kernel IBS driver for page access profiling Bharata B Rao
2023-02-08  7:35 ` [RFC PATCH 2/5] x86/ibs: Drive NUMA balancing via IBS access data Bharata B Rao
2023-02-08  7:35 ` [RFC PATCH 3/5] x86/ibs: Enable per-process IBS from sched switch path Bharata B Rao
2023-02-08  7:35 ` [RFC PATCH 4/5] x86/ibs: Adjust access faults sampling period Bharata B Rao
2023-02-08  7:35 ` [RFC PATCH 5/5] x86/ibs: Delay the collection of HW-provided access info Bharata B Rao
2023-02-08 18:03 ` [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing Peter Zijlstra
2023-02-08 18:12   ` Dave Hansen
2023-02-09  6:04     ` Bharata B Rao
2023-02-09 14:28       ` Dave Hansen
2023-02-10  4:28         ` Bharata B Rao
2023-02-10  4:40           ` Dave Hansen
2023-02-10 15:10             ` Bharata B Rao
2023-02-09  5:57   ` Bharata B Rao [this message]
2023-02-13  2:56     ` Huang, Ying
2023-02-13  3:23       ` Bharata B Rao
2023-02-13  3:34         ` Huang, Ying
2023-02-13  3:26 ` Huang, Ying
2023-02-13  5:52   ` Bharata B Rao
2023-02-13  6:30     ` Huang, Ying
2023-02-14  4:55       ` Bharata B Rao
2023-02-15  6:07         ` Huang, Ying
2023-02-24  3:28           ` Bharata B Rao
2023-02-16  8:41         ` Bharata B Rao
2023-02-17  6:03           ` Huang, Ying
2023-02-24  3:36             ` Bharata B Rao
2023-02-27  7:54               ` Huang, Ying
2023-03-01 11:21                 ` Bharata B Rao
2023-03-02  8:10                   ` Huang, Ying
2023-03-03  5:25                     ` Bharata B Rao
2023-03-03  5:53                       ` Huang, Ying
2023-03-06 15:30                         ` Bharata B Rao
2023-03-07  2:33                           ` Huang, Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a0127c3c-22c9-ac46-1e9f-200978fec392@amd.com \
    --to=bharata@amd.com \
    --cc=Ravikumar.Bangoria@amd.com \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=ying.huang@intel.com \
    --cc=yue.li@memverge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox