From: "Huang, Ying" <ying.huang@intel.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net,
	linux-mm@kvack.org, Matthew Wilcox, David Rientjes
Subject: Re: [PATCH v3 3/3] mm/page_alloc: Introduce a new sysctl knob vm.pcp_batch_scale_max
In-Reply-To: (Yafang Shao's message of "Mon, 5 Aug 2024 11:17:26 +0800")
References: <20240804080107.21094-1-laoar.shao@gmail.com>
	<20240804080107.21094-4-laoar.shao@gmail.com>
	<87r0b3g35e.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<87mslrfz9i.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Mon, 05 Aug 2024 12:32:24 +0800
Message-ID: <87cymnfv3b.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13)

Yafang Shao writes:

> On Mon, Aug 5, 2024 at 11:05 AM Huang, Ying wrote:
>>
>> Yafang Shao writes:
>>
>> > On Mon, Aug 5, 2024 at 9:41 AM Huang, Ying wrote:
>> >>
>> >> Yafang Shao writes:
>> >>
>> >> [snip]
>> >>
>> >> >
>> >> > Why introduce a sysctl knob?
>> >> > ============================
>> >> >
>> >> > From the above data, it's clear that different CPU types have varying
>> >> > allocation latencies due to zone->lock contention. Typically, people
>> >> > don't release individual kernel packages for each type of x86_64 CPU.
>> >> >
>> >> > Furthermore, for latency-insensitive applications, we can keep the default
>> >> > setting for better throughput.
>> >>
>> >> Do you have any data to prove that the default setting is better for
>> >> throughput?  If so, that would be strong support for your patch.
>> >
>> > No, I don't. The primary reason we can't change the default value from
>> > 5 to 0 across our fleet of servers is that you initially set it to 5.
>> > The sysadmins believe you had a strong reason for setting it to 5 by
>> > default; otherwise, it would have been careless of the upstream
>> > kernel. I also believe you must have had a solid justification for
>> > setting the default value to 5; otherwise, why would you have
>> > submitted your patches?
>>
>> In commit 52166607ecc9 ("mm: restrict the pcp batch scale factor to
>> avoid too long latency"), I did my best to run tests on the machines
>> available, using a micro-benchmark (will-it-scale/page_fault1) which
>> exercises the kernel page allocator heavily. From the data in that
>> commit, a larger CONFIG_PCP_BATCH_SCALE_MAX helps throughput a little,
>> but not much, while the 99th-percentile alloc/free latency can be kept
>> within about 100us with CONFIG_PCP_BATCH_SCALE_MAX == 5. So we chose 5
>> as the default value.
>>
>> But we can always improve the default value with more data: on more
>> types of machines, with more types of benchmarks, and so on.
>>
>> Your data suggest a smaller default value, because you can show that a
>> larger default value causes latency spikes (as large as tens of
>> milliseconds) for some practical workloads which weren't tested
>> previously. In contrast, we don't have strong data showing the
>> throughput advantages of a larger CONFIG_PCP_BATCH_SCALE_MAX value.
>>
>> So, I suggest using a smaller default value for
>> CONFIG_PCP_BATCH_SCALE_MAX. But we may need more tests to check the
>> data for 1, 2, 3, and 4, in addition to 0 and 5, to determine the best
>> choice.
>
> Which smaller default value would be better?

That depends on further test results.

> How can we ensure that other workloads, which we haven't tested, will
> work well with this new default value?

We cannot. We can only depend on the data available. If new data become
available in the future, we can change the default accordingly.

> If you have a better default value in mind, would you consider sending
> a patch for it? I would be happy to test it with my test case.

If you can test the values 1, 2, 3, and 4 with your workload, that will
be very helpful! Both allocation latency and total free time (if
possible) are valuable.
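To recap what the knob actually bounds, here is a much-simplified
sketch (illustrative only; the real logic is in mm/page_alloc.c and
differs in detail, and the function below is made up for the example):

#define CONFIG_PCP_BATCH_SCALE_MAX 5	/* currently a build-time option */

/* Sketch: how the scale factor caps the PCP free batch. */
static int pcp_free_batch(int batch, int free_factor)
{
	/* free_factor grows with consecutive frees; the knob caps it */
	int scale = free_factor < CONFIG_PCP_BATCH_SCALE_MAX ?
			free_factor : CONFIG_PCP_BATCH_SCALE_MAX;

	/*
	 * All of these pages are freed under one zone->lock hold.  With
	 * a typical pcp->batch of 63, scale max 5 allows up to
	 * 63 << 5 = 2016 pages per hold (good for throughput, but
	 * latency can spike); scale max 0 caps it at 63 pages
	 * (bounded latency, a little less throughput).
	 */
	return batch << scale;
}

With your patch applied, each run should presumably need only
something like "sysctl -w vm.pcp_batch_scale_max=N" before starting
the workload, rather than one kernel build per value.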
--
Best Regards,
Huang, Ying