From: "Huang, Ying" <ying.huang@intel.com>
To: Mel Gorman
Cc: Arjan Van De Ven, Sudeep Holla, Andrew Morton, Vlastimil Babka, David Hildenbrand, Johannes Weiner, Dave Hansen, Michal Hocko, Pavel Tatashin, Matthew Wilcox, Christoph Lameter
Subject: Re: [PATCH 02/10] cacheinfo: calculate per-CPU data cache size
References: <20230920061856.257597-1-ying.huang@intel.com> <20230920061856.257597-3-ying.huang@intel.com> <20231011122027.pw3uw32sdxxqjsrq@techsingularity.net> <87h6mwf3gf.fsf@yhuang6-desk2.ccr.corp.intel.com> <20231012125253.fpeehd6362c5v2sj@techsingularity.net>
Date: Thu, 12 Oct 2023 21:12:00 +0800
In-Reply-To: <20231012125253.fpeehd6362c5v2sj@techsingularity.net> (Mel Gorman's message of "Thu, 12 Oct 2023 13:52:53 +0100")
Message-ID: <87v8bcdly7.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Mel Gorman writes:

> On Thu, Oct 12, 2023 at 08:08:32PM +0800, Huang, Ying wrote:
>> Mel Gorman writes:
>>
>> > On Wed, Sep 20, 2023 at 02:18:48PM +0800, Huang Ying
>> > wrote:
>> >> Per-CPU data cache size is useful information. For example, it can be
>> >> used to determine per-CPU cache size. So, in this patch, the data
>> >> cache size for each CPU is calculated via data_cache_size /
>> >> shared_cpu_weight.
>> >>
>> >> A brute-force algorithm to iterate all online CPUs is used to avoid
>> >> to allocate an extra cpumask, especially in offline callback.
>> >>
>> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> >
>> > It's not necessarily relevant to the patch, but at least the scheduler
>> > also stores some per-cpu topology information such as sd_llc_size -- the
>> > number of CPUs sharing the same last-level-cache as this CPU. It may be
>> > worth unifying this at some point if it's common that per-cpu
>> > information is too fine and per-zone or per-node information is too
>> > coarse. This would be particularly true when considering locking
>> > granularity,
>> >
>> >> Cc: Sudeep Holla
>> >> Cc: Andrew Morton
>> >> Cc: Mel Gorman
>> >> Cc: Vlastimil Babka
>> >> Cc: David Hildenbrand
>> >> Cc: Johannes Weiner
>> >> Cc: Dave Hansen
>> >> Cc: Michal Hocko
>> >> Cc: Pavel Tatashin
>> >> Cc: Matthew Wilcox
>> >> Cc: Christoph Lameter
>> >> ---
>> >>  drivers/base/cacheinfo.c  | 42 ++++++++++++++++++++++++++++++++++++++-
>> >>  include/linux/cacheinfo.h |  1 +
>> >>  2 files changed, 42 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>> >> index cbae8be1fe52..3e8951a3fbab 100644
>> >> --- a/drivers/base/cacheinfo.c
>> >> +++ b/drivers/base/cacheinfo.c
>> >> @@ -898,6 +898,41 @@ static int cache_add_dev(unsigned int cpu)
>> >>  	return rc;
>> >>  }
>> >>
>> >> +static void update_data_cache_size_cpu(unsigned int cpu)
>> >> +{
>> >> +	struct cpu_cacheinfo *ci;
>> >> +	struct cacheinfo *leaf;
>> >> +	unsigned int i, nr_shared;
>> >> +	unsigned int size_data = 0;
>> >> +
>> >> +	if (!per_cpu_cacheinfo(cpu))
>> >> +		return;
>> >> +
>> >> +	ci = ci_cacheinfo(cpu);
>> >> +	for (i = 0; i < cache_leaves(cpu); i++) {
>> >> +		leaf = per_cpu_cacheinfo_idx(cpu, i);
>> >> +		if (leaf->type != CACHE_TYPE_DATA &&
>> >> +		    leaf->type != CACHE_TYPE_UNIFIED)
>> >> +			continue;
>> >> +		nr_shared = cpumask_weight(&leaf->shared_cpu_map);
>> >> +		if (!nr_shared)
>> >> +			continue;
>> >> +		size_data += leaf->size / nr_shared;
>> >> +	}
>> >> +	ci->size_data = size_data;
>> >> +}
>> >
>> > This needs comments.
>> >
>> > It would be nice to add a comment on top describing the limitation of
>> > CACHE_TYPE_UNIFIED here in the context of
>> > update_data_cache_size_cpu().
>>
>> Sure.  Will do that.
>>
>
> Thanks.
>
>> > The L2 cache could be unified but much smaller than a L3 or other
>> > last-level-cache. It's not clear from the code what level of cache is being
>> > used due to a lack of familiarity of the cpu_cacheinfo code but size_data
>> > is not the size of a cache, it appears to be the share of a cache a CPU
>> > would have under ideal circumstances.
>>
>> Yes.  And it isn't for one specific level of cache.  It's the sum of the
>> per-CPU shares of all levels of cache.  But the calculation is
>> inaccurate.  More details are in the below reply.
>>
>> > However, as it appears to also be
>> > iterating hierarchy then this may not be accurate. Caches may or may not
>> > allow data to be duplicated between levels so the value may be inaccurate.
>>
>> Thank you very much for pointing this out!  The cache can be inclusive
>> or not.  So, we cannot calculate the per-CPU slice of all-level caches
>> via adding them together blindly.  I will change this in a follow-on
>> patch.
>>
>
> Please do, I would strongly suggest basing this on LLC only because it's
> the only value you can be sure of. This change is the only change that may
> warrant a respin of the series as the history will be somewhat confusing
> otherwise.

I am still checking whether it's possible to get cache inclusivity
information via cpuid.  If there's no reliable way to do that.
We can use the max of the per-CPU share of each level of cache instead.
For an inclusive cache hierarchy, that is exactly the per-CPU share of
the LLC.  For a non-inclusive hierarchy, the value is more accurate than
the LLC-only one.  For example, on Intel Sapphire Rapids, the L2 cache
is 2 MB per core, while the LLC is 1.875 MB per core according to [1].

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/fourth-generation-xeon-scalable-family-overview.html

I will respin the series.  Thanks a lot for review!

--
Best Regards,
Huang, Ying