From: Ryan Roberts
Date: Fri, 3 May 2024 10:16:01 +0100
Subject: Re: [PATCH v2] mm/vmstat: sum up all possible CPUs instead of using vm_events_fold_cpu
To: Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, linux-kernel@vger.kernel.org, david@redhat.com, v-songbaohua@oppo.com, vbabka@suse.cz, willy@infradead.org
In-Reply-To: <20240503020924.208431-1-21cnbao@gmail.com>
References: <20240503020924.208431-1-21cnbao@gmail.com>
On 03/05/2024 03:09, Barry Song wrote:
> From: Barry Song
>
> When unplugging a CPU, the current code merges its vm_events
> into an online CPU, because summation only considers online
> CPUs. This is a crude workaround. By transitioning to summing
> over all possible CPUs, we can eliminate the need for
> vm_events_fold_cpu.
>
> Suggested-by: Ryan Roberts
> Signed-off-by: Barry Song
> ---
> originally suggested by Ryan while he reviewed mTHP counters
> patchset[1]; I am also applying this suggestion to vm_events
>
> -v2:
> also drop cpus_read_lock() as we don't care about cpu hotplug any more;
> -v1:
> https://lore.kernel.org/linux-mm/20240412123039.442743-1-21cnbao@gmail.com/
>
> [1] https://lore.kernel.org/linux-mm/ca73cbf1-8304-4790-a721-3c3a42f9d293@arm.com/
>
>  include/linux/vmstat.h |  5 -----
>  mm/page_alloc.c        |  8 --------
>  mm/vmstat.c            | 21 +--------------------
>  3 files changed, 1 insertion(+), 33 deletions(-)
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index 735eae6e272c..f7eaeb8bfa47 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -83,8 +83,6 @@ static inline void count_vm_events(enum vm_event_item item, long delta)
>  
>  extern void all_vm_events(unsigned long *);
>  
> -extern void vm_events_fold_cpu(int cpu);
> -
>  #else
>  
>  /* Disable counters */
> @@ -103,9 +101,6 @@ static inline void __count_vm_events(enum vm_event_item item, long delta)
>  static inline void all_vm_events(unsigned long *ret)
>  {
>  }
> -static inline void vm_events_fold_cpu(int cpu)
> -{
> -}
>  
>  #endif /* CONFIG_VM_EVENT_COUNTERS */
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cd584aace6bf..8b56d785d587 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5826,14 +5826,6 @@ static int page_alloc_cpu_dead(unsigned int cpu)
>  	mlock_drain_remote(cpu);
>  	drain_pages(cpu);
>  
> -	/*
> -	 * Spill the event counters of the dead processor
> -	 * into the current processors event counters.
> -	 * This artificially elevates the count of the current
> -	 * processor.
> -	 */
> -	vm_events_fold_cpu(cpu);
> -
>  	/*
>  	 * Zero the differential counters of the dead processor
>  	 * so that the vm statistics are consistent.
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index db79935e4a54..aaa32418652e 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -114,7 +114,7 @@ static void sum_vm_events(unsigned long *ret)
>  
>  	memset(ret, 0, NR_VM_EVENT_ITEMS * sizeof(unsigned long));
>  
> -	for_each_online_cpu(cpu) {
> +	for_each_possible_cpu(cpu) {

One thought comes to mind (due to my lack of understanding of exactly what
"possible" means): Linux is compiled with a maximum number of CPUs, NR_CPUS,
which is 512 for arm64's defconfig. Does "all possible CPUs" include all 512?
On an 8-CPU system that would increase the number of loop iterations by 64
times. Or perhaps "possible" just means CPUs that have ever been online?

Either way, I guess it's not considered a performance bottleneck because,
from memory, the scheduler and many other places iterate over all possible
CPUs.

>  		struct vm_event_state *this = &per_cpu(vm_event_states, cpu);
>  
>  		for (i = 0; i < NR_VM_EVENT_ITEMS; i++)
> @@ -129,29 +129,10 @@ static void sum_vm_events(unsigned long *ret)
>   */
>  void all_vm_events(unsigned long *ret)
>  {
> -	cpus_read_lock();
>  	sum_vm_events(ret);
> -	cpus_read_unlock();
>  }
>  EXPORT_SYMBOL_GPL(all_vm_events);
>  
> -/*
> - * Fold the foreign cpu events into our own.
> - *
> - * This is adding to the events on one processor
> - * but keeps the global counts constant.
> - */
> -void vm_events_fold_cpu(int cpu)
> -{
> -	struct vm_event_state *fold_state = &per_cpu(vm_event_states, cpu);
> -	int i;
> -
> -	for (i = 0; i < NR_VM_EVENT_ITEMS; i++) {
> -		count_vm_events(i, fold_state->event[i]);
> -		fold_state->event[i] = 0;
> -	}
> -}
> -
>  #endif /* CONFIG_VM_EVENT_COUNTERS */
>  
>  /*