Date: Wed, 7 Sep 2022 17:39:47 +0800
Subject: Re: [PATCH] ipc/msg.c: mitigate the lock contention with percpu counter
From: "Sun, Jiebin"
To: Tim Chen, Shakeel Butt
Cc: Andrew Morton, vasily.averin@linux.dev, Dennis Zhou, Tejun Heo, Christoph Lameter, "Eric W. Biederman", Alexey Gladkov, Manfred Spraul, alexander.mikhalitsyn@virtuozzo.com, Linux MM, LKML, "Chen, Tim C", Feng Tang, Huang Ying, tianyou.li@intel.com, wangyang.guo@intel.com, jiebin.sun@intel.com
References: <20220902152243.479592-1-jiebin.sun@intel.com> <048517e7f95aa8460cd47a169f3dfbd8e9b70d5c.camel@linux.intel.com>
In-Reply-To: <048517e7f95aa8460cd47a169f3dfbd8e9b70d5c.camel@linux.intel.com>

On 9/7/2022 2:44 AM, Tim Chen wrote:
> On Fri, 2022-09-02 at 09:27 -0700, Shakeel Butt wrote:
>> On Fri, Sep 2, 2022 at 12:04 AM Jiebin Sun wrote:
>>> The msg_bytes and msg_hdrs atomic counters are frequently
>>> updated when IPC msg queue is in heavy use, causing heavy
>>> cache bounce and overhead. Change them to percpu_counters
>>> greatly improve the performance. Since there is one unique
>>> ipc namespace, additional memory cost is minimal. Reading
>>> of the count done in msgctl call, which is infrequent. So
>>> the need to sum up the counts in each CPU is infrequent.
>>>
>>> Apply the patch and test the pts/stress-ng-1.4.0
>>> -- system v message passing (160 threads).
>>>
>>> Score gain: 3.38x
>>>
>>> CPU: ICX 8380 x 2 sockets
>>> Core number: 40 x 2 physical cores
>>> Benchmark: pts/stress-ng-1.4.0
>>> -- system v message passing (160 threads)
>>>
>>> Signed-off-by: Jiebin Sun
>> [...]
>>> +void percpu_counter_add_local(struct percpu_counter *fbc, s64 amount)
>>> +{
>>> +	this_cpu_add(*fbc->counters, amount);
>>> +}
>>> +EXPORT_SYMBOL(percpu_counter_add_local);
>> Why not percpu_counter_add()? This may drift the fbc->count more than
>> batch*nr_cpus. I am assuming that is not the issue for you as you
>> always do an expensive sum in the slow path. As Andrew asked, this
>> should be a separate patch.
> In the IPC case, the read is always done with the accurate read using
> percpu_counter_sum() gathering all the counts and
> never with percpu_counter_read() that only read global count.
> So Jiebin was not worry about accuracy.
>
> However, the counter is s64 and the local per cpu counter is S32.
> So the counter size has shrunk if we only keep the count in local per
> cpu counter, which can overflow a lot sooner and is not okay.
>
> Jiebin, can you try to use percpu_counter_add_batch, but using a large
> batch size. That should achieve what you want without needing
> to create a percpu_counter_add_local() function, and also the overflow
> problem.
>
> Tim
>

I have sent out patch v4, which uses percpu_counter_add_batch(). With a
tuned large batch size (1024), the performance gain in stress-ng --
message is 3.17x (patch v4), versus 3.38x with patch v3. That is still a
significant improvement, and it strikes a good balance between the
performance gain and the overflow issue.

Jiebin
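
For readers following the thread, the batching behaviour discussed above can be sketched in user space. This is only an illustration, not the kernel code and not the v4 patch: the struct, the sim_* function names, and NR_CPUS = 4 are all hypothetical, and the spinlock the kernel takes around the shared count is left out. It shows why a large batch keeps most updates CPU-local while the s32 per-CPU delta stays bounded by the batch size.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this simulation */

/* User-space stand-in for a percpu counter: one shared 64-bit total
 * plus a 32-bit delta per simulated CPU. */
struct sim_percpu_counter {
	int64_t count;             /* shared total (lock-protected in the kernel) */
	int32_t counters[NR_CPUS]; /* per-CPU deltas not yet folded in */
};

/* Add amount on the given CPU. The local delta is folded into the
 * shared total once its magnitude reaches batch, so it never holds
 * a value of batch or more between calls. */
static void sim_add_batch(struct sim_percpu_counter *c, int cpu,
			  int64_t amount, int32_t batch)
{
	assert(batch > 0 && cpu >= 0 && cpu < NR_CPUS);

	int64_t local = (int64_t)c->counters[cpu] + amount;

	if (local >= batch || local <= -batch) {
		c->count += local;      /* slow path: touch shared state */
		c->counters[cpu] = 0;
	} else {
		c->counters[cpu] = (int32_t)local; /* fast path: CPU-local */
	}
}

/* Approximate read: shared total only, like percpu_counter_read(). */
static int64_t sim_read(const struct sim_percpu_counter *c)
{
	return c->count;
}

/* Accurate read: shared total plus every per-CPU delta,
 * like percpu_counter_sum(). */
static int64_t sim_sum(const struct sim_percpu_counter *c)
{
	int64_t total = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		total += c->counters[cpu];
	return total;
}
```

With a batch of 1024, a thousand single-unit adds spread over the four simulated CPUs never touch the shared total, so sim_read() lags behind while sim_sum() stays exact. That matches the IPC case, where the value is only read through the accurate sum.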