Date: Tue, 27 May 2025 11:15:33 -0700
From: Shakeel Butt
To: "Chen, Yu C"
Cc: Michal Koutný, peterz@infradead.org, akpm@linux-foundation.org,
    mingo@redhat.com, tj@kernel.org, hannes@cmpxchg.org, corbet@lwn.net,
    mgorman@suse.de, mhocko@kernel.org, muchun.song@linux.dev,
    roman.gushchin@linux.dev, tim.c.chen@intel.com, aubrey.li@intel.com,
    libo.chen@oracle.com, kprateek.nayak@amd.com, vineethr@linux.ibm.com,
    venkat88@linux.ibm.com, ayushjai@amd.com, cgroups@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, yu.chen.surf@foxmail.com
Subject: Re: [PATCH v5 2/2] sched/numa: add statistics of numa balance task
References: <7ef90a88602ed536be46eba7152ed0d33bad5790.1748002400.git.yu.c.chen@intel.com>

On Tue, May 27, 2025 at 05:20:54PM +0800, Chen, Yu C wrote:
> On 5/26/2025 9:35 PM, Michal Koutný wrote:
> > On Fri, May 23, 2025 at 04:42:50PM -0700, Shakeel Butt wrote:
> > > Hmm these are scheduler events, how are these relevant to memory cgroup
> > > or vmstat? Any reason to not expose these in cpu.stat?
> >
> > Good point.
> > If I take it further -- this functionality needs neither
> > memory controller (CONFIG_MEMCG) nor CPU controller
> > (CONFIG_CGROUP_SCHED), so it might be technically calculated and exposed
> > in _any_ cgroup (which would be same technical solution how cpu time is
> > counted in cpu.stat regardless of CPU controller, cpu_stat_show()).
> >
>
> Yes, we can add it to cpu.stat. However, this might make it more difficult
> for users to locate related events. Some statistics about NUMA page
> migrations/faults are recorded in memory.stat, while others about NUMA task
> migrations (triggered by NUMA faults periodically) are stored in cpu.stat.
>
> Do you recommend extending the struct cgroup_base_stat to include counters
> for task_migrate/task_swap? Additionally, should we enhance
> cgroup_base_stat_cputime_show() to parse task_migrate/task_swap in a manner
> similar to cputime?
>
> Alternatively, as Shakeel previously mentioned, could we reuse
> "count_memcg_event_mm()" and related infrastructure while exposing these
> statistics/events in cpu.stat? I assume Shakeel was referring to the
> following approach:
>
> 1. Skip task migration/swap in memory.stat:
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index cdaab8a957f3..b8eea3eca46f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1529,6 +1529,11 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
>  		if (memcg_vm_event_stat[i] == PGPGIN ||
>  		    memcg_vm_event_stat[i] == PGPGOUT)
>  			continue;
> +#endif
> +#ifdef CONFIG_NUMA_BALANCING
> +		if (memcg_vm_event_stat[i] == NUMA_TASK_MIGRATE ||
> +		    memcg_vm_event_stat[i] == NUMA_TASK_SWAP)
> +			continue;
>  #endif
>
> 2. Skip task migration/swap in /proc/vmstat:
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index ed08bb384ae4..ea8a8ae1cdac 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1912,6 +1912,10 @@ static void *vmstat_next(struct seq_file *m, void *arg, loff_t *pos)
>  	(*pos)++;
>  	if (*pos >= NR_VMSTAT_ITEMS)
>  		return NULL;
> +#ifdef CONFIG_NUMA_BALANCING
> +	if (*pos == NUMA_TASK_MIGRATE || *pos == NUMA_TASK_SWAP)
> +		return NULL;
> +#endif
>
> 3. Display task migration/swap events in cpu.stat:
>  seq_buf_printf(&s, "%s %lu\n",
> +		vm_event_name(memcg_vm_event_stat[NUMA_TASK_MIGRATE]),
> +		memcg_events(memcg, memcg_vm_event_stat[NUMA_TASK_MIGRATE]));
>

You would need to use memcg_events(), and you will need to flush the memcg
rstat trees as well.

>
> It looks like more code is needed. Michal, Shakeel, could you please advise
> which strategy is preferred, or should we keep the current version?

I am now more inclined to keep these new stats in memory.stat, as the
current version does, because:

1. Relevant stats are exposed through the same interface, and we already
   have NUMA balancing stats in memory.stat.

2. There is no single good home for these new stats. Exposing them in
   cpu.stat would require more code, and even if we reuse the memcg infra,
   we would still need to flush the memcg stats, so why not just expose
   them in memory.stat.

3. Though a bit far-fetched, I think we may add more stats which sit at the
   boundary of sched and mm in the future. NUMA balancing is one concrete
   example of such stats. I envision that there might be some useful events
   for reliable memory reclaim or overcommit as well. Anyway, it is still
   unbaked at the moment.

Michal, let me know your thoughts on this.
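
To make point 1 a bit more concrete from the consumer side, here is a rough
userspace sketch (not kernel code and not part of the patch). It assumes the
usual "<name> <value>" per-line layout of memory.stat and that NUMA balancing
counters share the "numa_" prefix, as the existing numa_pages_migrated and
numa_hint_faults entries do. The helper name sum_numa_counters() is made up
for illustration, and the names of the new task migrate/swap entries are
deliberately not assumed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch only: walk text formatted like memory.stat
 * ("<name> <value>" per line) and add up every counter whose name
 * starts with "numa_". Returns how many such counters were found;
 * *total receives their sum.
 */
static int sum_numa_counters(const char *text, unsigned long long *total)
{
	int matched = 0;

	assert(text != NULL && total != NULL);
	*total = 0;

	while (*text) {
		char line[128], name[64];
		unsigned long long value;
		/* Length of the current line, excluding the newline. */
		size_t len = strcspn(text, "\n");
		size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

		/* Parse one line at a time so sscanf cannot run into the next. */
		memcpy(line, text, copy);
		line[copy] = '\0';

		if (sscanf(line, "%63s %llu", name, &value) == 2 &&
		    strncmp(name, "numa_", 5) == 0) {
			*total += value;
			matched++;
		}

		text += len;
		if (*text == '\n')
			text++;
	}
	return matched;
}
```

With everything in one file, a tool that reads memory.stat into a buffer can
pick up all NUMA balancing counters with a single prefix scan like the one
above. If the task-level events lived in cpu.stat instead, the same tool
would have to read and merge two files.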