From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 2 Oct 2024 12:02:09 -0400
From: Mathieu Desnoyers
Subject: Re: [RFC PATCH 0/4] sched+mm: Track lazy active mm existence with
 hazard pointers
To: Jens Axboe, Matthew Wilcox
Cc: paulmck@kernel.org, Linus Torvalds, Andrew Morton, Peter Zijlstra,
 linux-kernel@vger.kernel.org, Nicholas Piggin, Michael Ellerman,
 Greg Kroah-Hartman, Sebastian Andrzej Siewior, Will Deacon, Boqun Feng,
 Alan Stern, John Stultz, Neeraj Upadhyay, Frederic Weisbecker,
 Joel Fernandes, Josh Triplett, Uladzislau Rezki, Steven Rostedt,
 Lai Jiangshan, Zqiang, Ingo Molnar, Waiman Long, Mark Rutland,
 Thomas Gleixner, Vlastimil Babka, maged.michael@gmail.com, Mateusz Guzik,
 Jonas Oberhauser, rcu@vger.kernel.org, linux-mm@kvack.org,
 lkmm@lists.linux.dev
References: <20241002010205.1341915-1-mathieu.desnoyers@efficios.com>
 <8a627fc7-cc62-40e6-ad28-c730d4a8f7d6@efficios.com>
 <579bdbbf-82a7-4330-9a5e-495d89befbac@kernel.dk>
In-Reply-To: <579bdbbf-82a7-4330-9a5e-495d89befbac@kernel.dk>

On 2024-10-02 17:58, Jens Axboe wrote:
> On 10/2/24 9:53 AM, Mathieu Desnoyers wrote:
>> On 2024-10-02 17:36, Mathieu Desnoyers wrote:
>>> On 2024-10-02 17:33, Matthew Wilcox wrote:
>>>> On Wed, Oct 02, 2024 at 11:26:27AM -0400, Mathieu Desnoyers wrote:
>>>>> On 2024-10-02 16:09, Paul E. McKenney wrote:
>>>>>> On Tue, Oct 01, 2024 at 09:02:01PM -0400, Mathieu Desnoyers wrote:
>>>>>>> Hazard pointers appear to be a good fit for replacing refcount based lazy
>>>>>>> active mm tracking.
>>>>>>>
>>>>>>> Highlight:
>>>>>>>
>>>>>>> will-it-scale context_switch1_threads
>>>>>>>
>>>>>>> nr threads (-t)     speedup
>>>>>>>      24                +3%
>>>>>>>      48               +12%
>>>>>>>      96               +21%
>>>>>>>     192               +28%
>>>>>>
>>>>>> Impressive!!!
>>>>>>
>>>>>> I have to ask... Any data for smaller numbers of CPUs?
>>>>>
>>>>> Sure, but they are far less exciting ;-)
>>>>
>>>> How many CPUs in the system under test?
>>>
>>> 2 sockets, 96-core per socket:
>>>
>>> CPU(s):                   384
>>> On-line CPU(s) list:      0-383
>>> Vendor ID:                AuthenticAMD
>>> Model name:               AMD EPYC 9654 96-Core Processor
>>> CPU family:               25
>>> Model:                    17
>>> Thread(s) per core:       2
>>> Core(s) per socket:       96
>>> Socket(s):                2
>>> Stepping:                 1
>>> Frequency boost:          enabled
>>> CPU(s) scaling MHz:       68%
>>> CPU max MHz:              3709.0000
>>> CPU min MHz:              400.0000
>>> BogoMIPS:                 4800.00
>>>
>>> Note that Jens Axboe got even more impressive speedups testing this
>>> on his 512-hw-thread EPYC [1] (390% speedup for 192 threads). I've
>>> noticed I had schedstats and sched debug enabled in my config, so
>>> I'll have to re-run my tests.
>>
>> A quick re-run of the 128-thread case with schedstats and sched debug
>> disabled still show around 26% speedup, similar to my prior numbers.
>>
>> I'm not sure why Jens has much better speedups on a similar system.
>>
>> I'm attaching my config in case someone spots anything obvious. Note
>> that my BIOS is configured to show 24 NUMA nodes to the kernel (one
>> NUMA node per core complex).
>
> Here's my .config - note it's from the stock kernel run, which is why it
> still has:
>
> CONFIG_MMU_LAZY_TLB_REFCOUNT=y
>
> set. Have the same numa configuration as you, just end up with 32 nodes
> on this box.

Just to make sure: did you use any command line options other than
-t N when starting the test program?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com