From: "Huang, Ying"
To: "Yasunori Gotou (Fujitsu)"
Cc: Andrew Morton, Greg Kroah-Hartman, "rafael@kernel.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "Zhijian Li (Fujitsu)"
Subject: Re: [PATCH RFC 3/4] mm/vmstat: rename pgdemote_* to pgdemote_dst_* and add pgdemote_src_*
In-Reply-To: (Yasunori Gotou's message of "Thu, 2 Nov 2023 07:38:19 +0000")
References: <20231102025648.1285477-1-lizhijian@fujitsu.com> <20231102025648.1285477-4-lizhijian@fujitsu.com> <87r0l81zfd.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Thu, 02 Nov 2023 15:46:13 +0800
Message-ID: <871qd81ttm.fsf@yhuang6-desk2.ccr.corp.intel.com>

"Yasunori Gotou (Fujitsu)" writes:

> Hello,
>
>> On 02/11/2023 13:45, Huang, Ying wrote:
>> > Li Zhijian writes:
>> >
>> >> pgdemote_src_*: pages demoted from this node.
>> >> pgdemote_dst_*: pages demoted to this node.
>> >>
>> >> So that we are able to know their demotion per-node stats by checking this.
>> >>
>> >> In the environment, node0 and node1 are DRAM, node3 is PMEM.
>> >>
>> >> Global stats:
>> >> $ grep -E 'demote' /proc/vmstat
>> >> pgdemote_src_kswapd 130155
>> >> pgdemote_src_direct 113497
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 130155
>> >> pgdemote_dst_direct 113497
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> Per-node stats:
>> >> $ grep demote /sys/devices/system/node/node0/vmstat
>> >> pgdemote_src_kswapd 68454
>> >> pgdemote_src_direct 83431
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 0
>> >> pgdemote_dst_direct 0
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> $ grep demote /sys/devices/system/node/node1/vmstat
>> >> pgdemote_src_kswapd 185834
>> >> pgdemote_src_direct 30066
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 0
>> >> pgdemote_dst_direct 0
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> $ grep demote /sys/devices/system/node/node3/vmstat
>> >> pgdemote_src_kswapd 0
>> >> pgdemote_src_direct 0
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 254288
>> >> pgdemote_dst_direct 113497
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> From above stats, we know node3 is the demotion destination which one
>> >> the node0 and node1 will demote to.
>> >
>> > Why do we need these information?  Do you have some use case?
>>
>> I recall our customers have mentioned that they want to know how much the
>> memory is demoted
>> to the CXL memory device in a specific period.
>
> I'll mention about it more.
>
> I had a conversation with one of our customers. He expressed a desire for more detailed
> profile information to analyze the behavior of demotion (and promotion) when
> his workloads are executed.
> If the results are not satisfactory for his workloads, he wants to tune his servers for his workloads
> with these profiles.
> Additionally, depending on the results, he may want to change his server configuration.
> For example, he may want to buy more expensive DDR memories rather than cheaper CXL memory.
>
> In my impression, our customers seems to think that CXL memory is NOT as reliable as DDR memory yet.
> Therefore, they want to prepare for the new world that CXL will bring, and want to have a method
> for the preparation by profiling information as much as possible.
>
> it this enough for your question?

I would like more detailed information about how these stats are used.
Why isn't the per-node pgdemote_xxx counter enough?

--
Best Regards,
Huang, Ying

> Thanks,
>
>> >>
>> >>> 	mod_node_page_state(NODE_DATA(target_nid),
>> >>> -		    PGDEMOTE_KSWAPD + reclaimer_offset(),
>> nr_succeeded);
>> >>> +		    PGDEMOTE_DST_KSWAPD + reclaimer_offset(),
>> nr_succeeded);
>>
>> But if the *target_nid* is only indicate the preferred node, this accounting
>> maybe not accurate.
>>
>>
>> Thanks
>> Zhijian
>>
>> >
>> > --
>> > Best Regards,
>> > Huang, Ying
>> >
>> >> Signed-off-by: Li Zhijian
>> >> ---
>> >> RFC: their names are open to discussion, maybe pgdemote_from/to_*
>> >> Another defect of this patch is that, SUM(pgdemote_src_*) is always same
>> >> as SUM(pgdemote_dst_*) in the global stats, shall we hide one of them.
>> >> ---
>> >>  include/linux/mmzone.h |  9 ++++++---
>> >>  mm/vmscan.c            | 13 ++++++++++---
>> >>  mm/vmstat.c            |  9 ++++++---
>> >>  3 files changed, 22 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> >> index ad0309eea850..a6140d894bec 100644
>> >> --- a/include/linux/mmzone.h
>> >> +++ b/include/linux/mmzone.h
>> >> @@ -207,9 +207,12 @@ enum node_stat_item {
>> >>  	PGPROMOTE_SUCCESS,	/* promote successfully */
>> >>  	PGPROMOTE_CANDIDATE,	/* candidate pages to promote */
>> >>  	/* PGDEMOTE_*: pages demoted */
>> >> -	PGDEMOTE_KSWAPD,
>> >> -	PGDEMOTE_DIRECT,
>> >> -	PGDEMOTE_KHUGEPAGED,
>> >> +	PGDEMOTE_SRC_KSWAPD,
>> >> +	PGDEMOTE_SRC_DIRECT,
>> >> +	PGDEMOTE_SRC_KHUGEPAGED,
>> >> +	PGDEMOTE_DST_KSWAPD,
>> >> +	PGDEMOTE_DST_DIRECT,
>> >> +	PGDEMOTE_DST_KHUGEPAGED,
>> >>  #endif
>> >>  	NR_VM_NODE_STAT_ITEMS
>> >>  };
>> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> index 2f1fb4ec3235..55d2287d7150 100644
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -1111,13 +1111,18 @@ void drop_slab(void)
>> >>  static int reclaimer_offset(void)
>> >>  {
>> >>  	BUILD_BUG_ON(PGSTEAL_DIRECT - PGSTEAL_KSWAPD !=
>> >> -			PGDEMOTE_DIRECT - PGDEMOTE_KSWAPD);
>> >> +			PGDEMOTE_SRC_DIRECT -
>> PGDEMOTE_SRC_KSWAPD);
>> >>  	BUILD_BUG_ON(PGSTEAL_DIRECT - PGSTEAL_KSWAPD !=
>> >>  			PGSCAN_DIRECT - PGSCAN_KSWAPD);
>> >>  	BUILD_BUG_ON(PGSTEAL_KHUGEPAGED - PGSTEAL_KSWAPD !=
>> >> -			PGDEMOTE_KHUGEPAGED -
>> PGDEMOTE_KSWAPD);
>> >> +			PGDEMOTE_SRC_KHUGEPAGED -
>> PGDEMOTE_SRC_KSWAPD);
>> >>  	BUILD_BUG_ON(PGSTEAL_KHUGEPAGED - PGSTEAL_KSWAPD !=
>> >>  			PGSCAN_KHUGEPAGED - PGSCAN_KSWAPD);
>> >> +	BUILD_BUG_ON(PGDEMOTE_SRC_DIRECT -
>> PGDEMOTE_SRC_KSWAPD !=
>> >> +			PGDEMOTE_DST_DIRECT -
>> PGDEMOTE_DST_KSWAPD);
>> >> +	BUILD_BUG_ON(PGDEMOTE_SRC_KHUGEPAGED -
>> PGDEMOTE_SRC_KSWAPD !=
>> >> +			PGDEMOTE_DST_KHUGEPAGED -
>> PGDEMOTE_DST_KSWAPD);
>> >> +
>> >>
>> >>  	if (current_is_kswapd())
>> >>  		return 0;
>> >> @@ -1678,8 +1683,10 @@ static unsigned int demote_folio_list(struct
>> list_head *demote_folios,
>> >>  		      (unsigned long)&mtc, MIGRATE_ASYNC,
>> MR_DEMOTION,
>> >>  		      &nr_succeeded);
>> >>
>> >> +	mod_node_page_state(pgdat,
>> >> +		    PGDEMOTE_SRC_KSWAPD + reclaimer_offset(),
>> nr_succeeded);
>> >>  	mod_node_page_state(NODE_DATA(target_nid),
>> >> -		    PGDEMOTE_KSWAPD + reclaimer_offset(),
>> nr_succeeded);
>> >> +		    PGDEMOTE_DST_KSWAPD + reclaimer_offset(),
>> nr_succeeded);
>> >>
>> >>  	return nr_succeeded;
>> >>  }
>> >> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> >> index f141c48c39e4..63f106a5e008 100644
>> >> --- a/mm/vmstat.c
>> >> +++ b/mm/vmstat.c
>> >> @@ -1244,9 +1244,12 @@ const char * const vmstat_text[] = {
>> >>  #ifdef CONFIG_NUMA_BALANCING
>> >>  	"pgpromote_success",
>> >>  	"pgpromote_candidate",
>> >> -	"pgdemote_kswapd",
>> >> -	"pgdemote_direct",
>> >> -	"pgdemote_khugepaged",
>> >> +	"pgdemote_src_kswapd",
>> >> +	"pgdemote_src_direct",
>> >> +	"pgdemote_src_khugepaged",
>> >> +	"pgdemote_dst_kswapd",
>> >> +	"pgdemote_dst_direct",
>> >> +	"pgdemote_dst_khugepaged",
>> >>  #endif
>> >>
>> >>  	/* enum writeback_stat_item counters */
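
The quoted RFC notes that SUM(pgdemote_src_*) is always the same as
SUM(pgdemote_dst_*) in the global stats. A minimal user-space sketch of
the accounting in the quoted demote_folio_list() hunk may help show why.
It is not kernel code: account_demotion(), global_stat(), node_stat,
MAX_NODES and the enum names are hypothetical stand-ins for this
illustration, and _Static_assert stands in for BUILD_BUG_ON().

```c
/* User-space model only, not kernel code. All names here are hypothetical. */
#include <assert.h>

#define MAX_NODES 4

/* Plays the role of reclaimer_offset(): 0 for kswapd, 1 direct, 2 khugepaged. */
enum reclaimer { RECLAIM_KSWAPD, RECLAIM_DIRECT, RECLAIM_KHUGEPAGED };

/* Same ordering as the patch: three SRC items followed by three DST items. */
enum demote_stat {
	DEMOTE_SRC_KSWAPD,
	DEMOTE_SRC_DIRECT,
	DEMOTE_SRC_KHUGEPAGED,
	DEMOTE_DST_KSWAPD,
	DEMOTE_DST_DIRECT,
	DEMOTE_DST_KHUGEPAGED,
	NR_DEMOTE_STATS
};

/* Both groups must share one layout so a single offset indexes either. */
_Static_assert(DEMOTE_SRC_DIRECT - DEMOTE_SRC_KSWAPD ==
	       DEMOTE_DST_DIRECT - DEMOTE_DST_KSWAPD,
	       "direct offset differs between SRC and DST");
_Static_assert(DEMOTE_SRC_KHUGEPAGED - DEMOTE_SRC_KSWAPD ==
	       DEMOTE_DST_KHUGEPAGED - DEMOTE_DST_KSWAPD,
	       "khugepaged offset differs between SRC and DST");

/* Per-node counters, zero-initialized. */
unsigned long node_stat[MAX_NODES][NR_DEMOTE_STATS];

/* Charge the same page count to the source node and the destination node. */
void account_demotion(int src_nid, int dst_nid, enum reclaimer who,
		      unsigned long nr_succeeded)
{
	assert(src_nid >= 0 && src_nid < MAX_NODES);
	assert(dst_nid >= 0 && dst_nid < MAX_NODES);
	node_stat[src_nid][DEMOTE_SRC_KSWAPD + who] += nr_succeeded;
	node_stat[dst_nid][DEMOTE_DST_KSWAPD + who] += nr_succeeded;
}

/* Global value of one item: the sum over all nodes. */
unsigned long global_stat(enum demote_stat item)
{
	unsigned long sum = 0;

	for (int nid = 0; nid < MAX_NODES; nid++)
		sum += node_stat[nid][item];
	return sum;
}
```

Because every demotion adds the same nr_succeeded to one SRC counter and
one DST counter, the global sums cannot differ in this model. Feeding in
the quoted per-node kswapd figures (68454 from node0 and 185834 from
node1, both to node3) yields 254288 for node3's DST counter, matching
the quoted node3 output.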