From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D241DC6FA82 for ; Tue, 13 Sep 2022 11:06:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 035516B0072; Tue, 13 Sep 2022 07:06:45 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F002D6B0073; Tue, 13 Sep 2022 07:06:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D9FD68D0001; Tue, 13 Sep 2022 07:06:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id C6ED36B0072 for ; Tue, 13 Sep 2022 07:06:44 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 9AD3A1A11AE for ; Tue, 13 Sep 2022 11:06:44 +0000 (UTC) X-FDA: 79906784328.30.159A6EE Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by imf09.hostedemail.com (Postfix) with ESMTP id 186041400B5 for ; Tue, 13 Sep 2022 11:06:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1663067203; x=1694603203; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=S+rR3hEdjXVGudyxEjdeHvn4EsXZj3Ige9BWWN070iA=; b=ljoVZ2z5XkNdmEj9NkaPHMcxFNZVVi3pnDEe9lkw+xr7L/JUKWUWqYU8 TgmJscqz7aK/BMZDPJr8FZFP2MzqML/qOWLwNg9Z0eO3GLwUJlaAxWmy0 GGpGluoCRokgiHT1Ko/vxe9MtgeAUKNgdogM+urudNcOjgS6/Vwv2uJqE vPe1wK8hovK3jPzXYt2dStpPUn1rwcb2kzxcHl/QcO2fPKNAK0O+Ly0bP 2QZN2/6lof53D20L+Gi6uT+X7i10N/9Nb9vNCUlh7OVIH6m10LshvSkbX /RUP04cMdEqGB/rFbPoriyYUEn/96TfQdfv2EZdAFDtReFYrgti95a6Bf A==; X-IronPort-AV: E=McAfee;i="6500,9779,10468"; a="278501198" X-IronPort-AV: 
E=Sophos;i="5.93,312,1654585200"; d="scan'208";a="278501198" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2022 04:06:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,312,1654585200"; d="scan'208";a="678522118" Received: from linux-pnp-server-13.sh.intel.com ([10.239.176.176]) by fmsmga008.fm.intel.com with ESMTP; 13 Sep 2022 04:06:37 -0700 From: Jiebin Sun To: akpm@linux-foundation.org, vasily.averin@linux.dev, shakeelb@google.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, ebiederm@xmission.com, legion@kernel.org, manfred@colorfullife.com, alexander.mikhalitsyn@virtuozzo.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: tim.c.chen@intel.com, feng.tang@intel.com, ying.huang@intel.com, tianyou.li@intel.com, wangyang.guo@intel.com, jiebin.sun@intel.com Subject: [PATCH v6 0/2] ipc/msg: mitigate the lock contention in ipc/msg Date: Wed, 14 Sep 2022 03:25:36 +0800 Message-Id: <20220913192538.3023708-1-jiebin.sun@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220902152243.479592-1-jiebin.sun@intel.com> References: <20220902152243.479592-1-jiebin.sun@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1663067203; a=rsa-sha256; cv=none; b=3wik6+zHEsBBNmd5pmP//ziNNx32SVOE3ZWAU6fY8B+F6mSByY9guGfhBkWF5qcx4zETse Nj8VDoYlCb7aMhaO93rMOOy7vIUZOeD/dBIQD7flkDpFs9UUUxDpmW6xBnIFQy/x7wkGuz y5fmp7B0n2mHdKiHpmab6Cxesdj92nI= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ljoVZ2z5; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf09.hostedemail.com: domain of jiebin.sun@intel.com designates 192.55.52.151 as permitted sender) smtp.mailfrom=jiebin.sun@intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663067203; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=S+rR3hEdjXVGudyxEjdeHvn4EsXZj3Ige9BWWN070iA=; b=Qj6+W0BxALR9AfRCsIVKC3YU9gZ2nPh6AsBovEb6UDHkkgVtFY6VwzCEgaCM3qLx8tHwKv F5mqlIADyQqYvtAQZWz58+0nLzUtNaJlaEWSzPcFsRtN7wBJojZB72DQTRwTY9LOK96Y52 Nr5Ytn7DF14VWYTApLTxJZqGCL18R3k= X-Rspamd-Queue-Id: 186041400B5 Authentication-Results: imf09.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ljoVZ2z5; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf09.hostedemail.com: domain of jiebin.sun@intel.com designates 192.55.52.151 as permitted sender) smtp.mailfrom=jiebin.sun@intel.com X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: 9mjzghxapmjsd557editz6dtfza4tech X-HE-Tag: 1663067202-230486 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:

Hi,

Here are two patches to mitigate the lock contention in ipc/msg.

The 1st patch adds the new interfaces percpu_counter_add_local and
percpu_counter_sub_local. The batch size in percpu_counter_add_batch
should be very large in the heavy-writing, rare-reading case. The
"_local" versions mostly do local adding, which reduces global updates
and mitigates lock contention on writes.

The 2nd patch uses percpu_counter instead of atomic updates in ipc/msg.
The msg_bytes and msg_hdrs atomic counters are frequently updated when
the IPC msg queue is in heavy use, causing heavy cache bouncing and
overhead. Changing them to percpu_counter greatly improves performance.
Since there is one percpu struct per namespace, the additional memory
cost is minimal. The counts are read only in the msgctl call, which is
infrequent, so the per-CPU counts rarely need to be summed.
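
To illustrate why a very large batch helps in the heavy-writing, rare-reading case, here is a minimal user-space sketch of the batching idea. It is not the kernel code: the names model_counter, model_add_batch, model_add_local, model_sub_local, model_sum and MODEL_NR_CPUS are hypothetical, the "CPU" is passed in as a plain index, and the locking around the shared count is left out. The global_updates field is added only to count how often the shared value would be touched.

```c
#include <limits.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4  /* hypothetical CPU count for the model */

/* Simplified stand-in for a per-CPU counter (assumption, not kernel code). */
struct model_counter {
    int64_t count;                 /* shared value; lock-protected in a real kernel */
    int32_t pcpu[MODEL_NR_CPUS];   /* per-CPU deltas not yet folded into count */
    unsigned long global_updates;  /* how many times the shared value was written */
};

/* Add amount on the given CPU; fold into the shared count once the
 * local delta reaches the batch size in either direction. */
static void model_add_batch(struct model_counter *c, int cpu,
                            int64_t amount, int32_t batch)
{
    int64_t sum = (int64_t)c->pcpu[cpu] + amount;

    if (sum >= batch || sum <= -(int64_t)batch) {
        c->count += sum;           /* contended path: touches shared state */
        c->pcpu[cpu] = 0;
        c->global_updates++;
    } else {
        c->pcpu[cpu] = (int32_t)sum;  /* fast path: CPU-local only */
    }
}

/* "_local" flavour: a batch of INT_MAX keeps almost every update local. */
static void model_add_local(struct model_counter *c, int cpu, int64_t amount)
{
    model_add_batch(c, cpu, amount, INT_MAX);
}

/* Subtraction expressed through the local add. */
static void model_sub_local(struct model_counter *c, int cpu, int64_t amount)
{
    model_add_local(c, cpu, -amount);
}

/* Precise read: shared value plus every per-CPU delta (the rare, slow path). */
static int64_t model_sum(const struct model_counter *c)
{
    int64_t ret = c->count;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        ret += c->pcpu[cpu];
    return ret;
}
```

In this model, adding 100 a thousand times on each of the 4 CPUs with a batch of 32 writes the shared value on every call, while model_add_local never writes it. model_sum returns the same total in both cases. The trade-off is that only a full sum gives an accurate value, which is acceptable when reads are as rare as the msgctl path described above.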
Changes in v6:
1. Revise the code comment of percpu_counter_add_local in patch 1/2.
2. Derive percpu_counter_sub_local from percpu_counter_add_local,
   rather than from percpu_counter_add_batch for SMP and
   percpu_counter_sub for non-SMP, to reduce code modification.

Changes in v5:
1. Use INT_MAX as the large batch size in percpu_counter_add_local
   and percpu_counter_sub_local.
2. Use the latest kernel 6.0-rc4 as the baseline for the performance test.
3. Move percpu_counter_add_local and percpu_counter_sub_local from
   percpu_counter.c to percpu_counter.h.

Changes in v3:
1. Add a comment and change log for the new function
   percpu_counter_add_local, covering who should use it and who shouldn't.

Changes in v2:
1. Separate the original patch into two patches.
2. Add error handling for percpu_counter_init.

The performance gain increases as the number of workload threads grows.
Performance gain: 3.99x

CPU: ICX 8380 x 2 sockets
Core number: 40 x 2 physical cores
Benchmark: pts/stress-ng-1.4.0
-- system v message passing (160 threads)

Regards
Jiebin