From: "Huang, Ying"
To: Aneesh Kumar K V, Wei Xu, Johannes Weiner
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, Yang Shi,
	Davidlohr Bueso, Tim C Chen, Michal Hocko,
	Linux Kernel Mailing List, Hesham Almatary, Dave Hansen,
	Jonathan Cameron, Alistair Popple, Dan Williams,
	jvgediya.oss@gmail.com, Jagdish Gediya
Subject: Re: [PATCH v10 1/8] mm/demotion: Add support for explicit memory tiers
References: <20220720025920.1373558-1-aneesh.kumar@linux.ibm.com>
	<20220720025920.1373558-2-aneesh.kumar@linux.ibm.com>
	<87k080wmvb.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<9e9ba2e4-3a87-3a79-e336-8849dad4856a@linux.ibm.com>
Date: Wed, 27 Jul 2022 09:16:08 +0800
In-Reply-To: <9e9ba2e4-3a87-3a79-e336-8849dad4856a@linux.ibm.com> (Aneesh Kumar K. V.'s message of "Tue, 26 Jul 2022 17:29:56 +0530")
Message-ID: <87lesfuzhj.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii

Aneesh Kumar K V writes:

>>> diff --git a/include/linux/node.h b/include/linux/node.h
>>> index 40d641a8bfb0..a2a16d4104fd 100644
>>> --- a/include/linux/node.h
>>> +++ b/include/linux/node.h
>>> @@ -92,6 +92,12 @@ struct node {
>>>  	struct list_head cache_attrs;
>>>  	struct device *cache_dev;
>>>  #endif
>>> +	/*
>>> +	 * For memory devices, perf_level describes
>>> +	 * the device performance and how it should be used
>>> +	 * while building a memory hierarchy.
>>> +	 */
>>> +	int perf_level;
>>
>> Think again, I found that "perf_level" may be not the best abstraction
>> of the performance of memory devices. In concept, it's an abstraction of the memory
>> bandwidth. But it will not reflect the memory latency.
>>
>> Instead, the previous proposed "abstract_distance" is an abstraction of
>> the memory latency. Per my understanding, the memory latency has more
>> direct influence on system performance. And because the latency of the
>> memory device will increase if the memory accessing throughput nears its
>> max bandwidth, so the memory bandwidth can be reflected in the "abstract
>> distance" too. That is, the "abstract distance" is an abstraction of
>> the memory latency under the expected memory accessing throughput. The
>> "offset" to the default "abstract distance" reflects the different
>> expected memory accessing throughput.
>>
>> So, I think we need some kind of abstraction of the memory latency
>> instead of memory bandwidth, e.g., "abstract distance".
>>
>
> I am reworking other parts of the patch set based on your feedback.

Thanks!

> This part I guess we need to reach some consensus.

Yes.  Let's do that.

> IMHO perf_level (performance level) can indicate a combination of both latency
> and bandwidth.

"abstract distance" is based on latency, and bandwidth is reflected
through "latency under the expected memory access throughput".  How
does perf_level indicate the combination?  As I understand it,
perf_level is bandwidth based.

> It is an abstract concept that indicates the performance of the
> device. As we learn more about which device attribute makes more impact in
> defining hierarchy, performance level will give more weightage to that specific
> attribute. It could be write latency or bandwidth. For me, distance has a direct
> linkage to latency because that is how we define numa distance now. Adding
> abstract to the name is not making it more abstract than perf_level.
>
> I am open to suggestions from others. Wei Xu has also suggested perf_level name.
> I can rename this to abstract_distance if that indicates the goal better.

I'm open on the naming.  But I think it's better to define its meaning
to some degree than to leave it completely opaque.  If it's latency
based, then a low value corresponds to high performance.  If it's
bandwidth based, then a low value corresponds to low performance.

Hi, Wei and Johannes,

What do you think about this?

Best Regards,
Huang, Ying
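
P.S. To make the point about ordering direction concrete, here is a
minimal sketch in plain userspace C.  It is not kernel code and not
part of the patch.  The struct, its two fields, and both helper names
are hypothetical and exist only for illustration; the only thing taken
from the discussion above is the direction of each comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node metrics, for illustration only. */
struct mem_node {
	const char *name;
	int abstract_distance;	/* latency-like: lower means faster */
	int bandwidth_level;	/* bandwidth-like: higher means faster */
};

/* Latency-based metric: the smallest value marks the fastest node. */
static const struct mem_node *
fastest_by_distance(const struct mem_node *nodes, size_t n)
{
	const struct mem_node *best;
	size_t i;

	assert(n > 0);
	best = &nodes[0];
	for (i = 1; i < n; i++)
		if (nodes[i].abstract_distance < best->abstract_distance)
			best = &nodes[i];
	return best;
}

/* Bandwidth-based metric: the largest value marks the fastest node. */
static const struct mem_node *
fastest_by_bandwidth(const struct mem_node *nodes, size_t n)
{
	const struct mem_node *best;
	size_t i;

	assert(n > 0);
	best = &nodes[0];
	for (i = 1; i < n; i++)
		if (nodes[i].bandwidth_level > best->bandwidth_level)
			best = &nodes[i];
	return best;
}
```

Given made-up nodes { "pmem", 300, 10 }, { "dram", 100, 40 } and
{ "hbm", 50, 80 }, both helpers select "hbm", but only because each
one compares in the direction its metric requires.  A consumer that
does not know which kind of value it holds cannot choose the right
comparison, which is why the meaning should be defined to some degree.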