From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 5 Sep 2022 19:54:35 +0800
Subject: Re: [PATCH] ipc/msg.c: mitigate the lock contention with percpu counter
To: Andrew Morton
Cc: vasily.averin@linux.dev, shakeelb@google.com, dennis@kernel.org,
 tj@kernel.org, cl@linux.com, ebiederm@xmission.com, legion@kernel.org,
 manfred@colorfullife.com, alexander.mikhalitsyn@virtuozzo.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, tim.c.chen@intel.com,
 feng.tang@intel.com, ying.huang@intel.com, tianyou.li@intel.com,
 wangyang.guo@intel.com
References: <20220902152243.479592-1-jiebin.sun@intel.com>
 <20220902090659.28829853543cac3f3f725df5@linux-foundation.org>
From: "Sun, Jiebin"
In-Reply-To: <20220902090659.28829853543cac3f3f725df5@linux-foundation.org>

On 9/3/2022 12:06 AM, Andrew Morton wrote:
> On Fri, 2 Sep 2022 23:22:43 +0800 Jiebin Sun wrote:
>
>> The msg_bytes and msg_hdrs atomic counters are frequently
>> updated when IPC msg queue is in heavy use, causing heavy
>> cache bounce and overhead. Change them to percpu_counters
>> greatly improve the performance. Since there is one unique
>> ipc namespace, additional memory cost is minimal. Reading
>> of the count done in msgctl call, which is infrequent. So
>> the need to sum up the counts in each CPU is infrequent.
>>
>> Apply the patch and test the pts/stress-ng-1.4.0
>> -- system v message passing (160 threads).
>>
>> Score gain: 3.38x
> So this test became 3x faster?

Yes.
It is from the Phoronix Test Suite, stress-ng-1.4.0 (System V message
passing), run on dual-socket ICX servers. The benchmark runs 160 pairs
of threads that call msgsnd and msgrcv. The patch helps more as the
number of workload threads increases.

>
>> CPU: ICX 8380 x 2 sockets
>> Core number: 40 x 2 physical cores
>> Benchmark: pts/stress-ng-1.4.0
>> -- system v message passing (160 threads)
>>
>> ...
>>
>> @@ -138,6 +139,14 @@ percpu_counter_add(struct percpu_counter *fbc, s64 amount)
>>  	preempt_enable();
>>  }
>>
>> +static inline void
>> +percpu_counter_add_local(struct percpu_counter *fbc, s64 amount)
>> +{
>> +	preempt_disable();
>> +	fbc->count += amount;
>> +	preempt_enable();
>> +}
> What's this and why is it added?
>
> It would be best to propose this as a separate preparatory patch.
> Fully changelogged and perhaps even with a code comment explaining why
> and when it should be used.
>
> Thanks.

Since msgctl_info always sums the counters, there is no need to use
percpu_counter_add_batch, which updates the global count whenever the
local count reaches the batch size. So we add percpu_counter_add_local,
for both SMP and non-SMP builds, which only adds to the local per-CPU
counter. I have split the original patch into two patches.

Thanks.
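
To illustrate the difference between the two update paths described
above, here is a minimal user-space sketch. It is a model only, not the
kernel implementation: the sketch_* names, the fixed SKETCH_NR_CPUS and
the global_updates field are all hypothetical and exist just to make the
behaviour visible. It assumes a reader that always sums every per-CPU
slot, as msgctl_info does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this model */

/* User-space model only; this is not the kernel's struct percpu_counter. */
struct sketch_counter {
	int64_t count;				/* shared (global) part */
	int64_t local[SKETCH_NR_CPUS];		/* one slot per simulated CPU */
	unsigned long global_updates;		/* writes to the shared part */
};

/*
 * Batched add: accumulate in the caller's slot and fold the slot into
 * the shared count once its magnitude reaches the batch size.
 */
static void sketch_add_batch(struct sketch_counter *c, int cpu,
			     int64_t amount, int64_t batch)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	int64_t v = c->local[cpu] + amount;

	if (llabs(v) >= batch) {
		c->count += v;		/* the write that CPUs contend on */
		c->local[cpu] = 0;
		c->global_updates++;
	} else {
		c->local[cpu] = v;
	}
}

/* Local-only add: never touches the shared count. */
static void sketch_add_local(struct sketch_counter *c, int cpu,
			     int64_t amount)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	c->local[cpu] += amount;
}

/* Exact read: shared count plus every per-CPU slot. */
static int64_t sketch_sum(const struct sketch_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model both variants give the same result from sketch_sum(), but
only the batched one ever writes the shared count. With adds of 8 and a
batch of 32, for example, it does so on every fourth add per CPU. When
every reader takes the full sum anyway, those shared writes buy nothing.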