From: Yafang Shao <laoar.shao@gmail.com>
Date: Mon, 5 Aug 2024 12:48:50 +0800
Subject: Re: [PATCH v3 3/3] mm/page_alloc: Introduce a new sysctl knob vm.pcp_batch_scale_max
To: "Huang, Ying"
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, linux-mm@kvack.org, Matthew Wilcox, David Rientjes
In-Reply-To: <87cymnfv3b.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20240804080107.21094-1-laoar.shao@gmail.com> <20240804080107.21094-4-laoar.shao@gmail.com> <87r0b3g35e.fsf@yhuang6-desk2.ccr.corp.intel.com> <87mslrfz9i.fsf@yhuang6-desk2.ccr.corp.intel.com> <87cymnfv3b.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Mon, Aug 5, 2024 at 12:36 PM Huang, Ying wrote:
>
> Yafang Shao writes:
>
> > On Mon, Aug 5, 2024 at 11:05 AM Huang, Ying wrote:
> >>
> >> Yafang Shao writes:
> >>
> >> > On Mon, Aug 5, 2024 at 9:41 AM Huang, Ying wrote:
> >> >>
> >> >> Yafang Shao writes:
> >> >>
> >> >> [snip]
> >> >>
> >> >> >
> >> >> > Why introduce a sysctl knob?
> >> >> > ============================
> >> >> >
> >> >> > From the above data, it's clear that different CPU types have varying
> >> >> > allocation latencies concerning zone->lock contention. Typically, people
> >> >> > don't release individual kernel packages for each type of x86_64 CPU.
> >> >> >
> >> >> > Furthermore, for latency-insensitive applications, we can keep the default
> >> >> > setting for better throughput.
> >> >>
> >> >> Do you have any data to prove that the default setting is better for
> >> >> throughput? If so, that will be strong support for your patch.
> >> >
> >> > No, I don't. The primary reason we can't change the default value from
> >> > 5 to 0 across our fleet of servers is that you initially set it to 5.
> >> > The sysadmins believe you had a strong reason for setting it to 5 by
> >> > default; otherwise, it would be considered careless for the upstream
> >> > kernel. I also believe you must have had a solid justification for
> >> > setting the default value to 5; otherwise, why would you have
> >> > submitted your patches?
> >>
> >> In commit 52166607ecc9 ("mm: restrict the pcp batch scale factor to
> >> avoid too long latency"), I tried my best to run tests on the machines
> >> available with a micro-benchmark (will-it-scale/page_fault1) which
> >> exercises the kernel page allocator heavily. From the data in the commit,
> >> a larger CONFIG_PCP_BATCH_SCALE_MAX helps throughput a little, but not
> >> much. The 99% alloc/free latency can be kept within about 100us with
> >> CONFIG_PCP_BATCH_SCALE_MAX == 5. So, we chose 5 as the default value.
> >>
> >> But, we can always improve the default value with more data, on more
> >> types of machines and with more types of benchmarks, etc.
> >>
> >> Your data suggest a smaller default value, because you have data showing
> >> that the larger default value has a latency spike issue (as large as tens
> >> of ms) for some practical workloads, which weren't tested previously. In
> >> contrast, we don't have strong data to show the throughput advantages of
> >> a larger CONFIG_PCP_BATCH_SCALE_MAX value.
> >>
> >> So, I suggest using a smaller default value for
> >> CONFIG_PCP_BATCH_SCALE_MAX. But we may need more tests to check the
> >> data for 1, 2, 3, and 4, in addition to 0 and 5, to determine the best
> >> choice.
> >
> > Which smaller default value would be better?
>
> This depends on further test results.

I believe you agree with me that you can't test all workloads.

> > How can we ensure that other workloads, which we haven't tested, will
> > work well with this new default value?
>
> We cannot. We can only depend on the data available. If new data
> becomes available in the future, we can make the change accordingly.

So, your solution is to change the hardcoded value for untested workloads
and then release the kernel package again?

> > If you have a better default value in mind, would you consider sending
> > a patch for it? I would be happy to test it with my test case.
>
> If you can test the values 1, 2, 3, and 4 with your workload, that will
> be very helpful! Both allocation latency and total free time (if
> possible) are valuable.

You know I can't verify it with all workloads, right? You have so much
data to verify, which indicates uncertainty about any default value. Why
not make it tunable and let the user choose the value they prefer?

-- 
Regards
Yafang
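[Editor's note] As context for the values 0 through 5 debated above: the scale factor caps how many pages the kernel processes while holding zone->lock, roughly as batch << scale. The sketch below is an illustrative model, not kernel code; the base batch of 63 is an assumed typical pcp->batch value, which in reality varies per zone.

```python
# Illustrative model (not kernel code) of how the PCP batch scale factor
# bounds the pages processed under a single zone->lock hold.
def pages_per_lock_hold(batch: int, scale: int) -> int:
    # Each increment of the scale factor left-shifts the base batch,
    # doubling the worst-case work done while zone->lock is held.
    return batch << scale

BATCH = 63  # assumed typical pcp->batch; machine/zone dependent
for scale in range(6):  # the range 0..5 discussed in this thread
    print(f"scale={scale}: up to {pages_per_lock_hold(BATCH, scale)} pages per lock hold")
```

This is why the choice trades throughput against latency: scale 5 allows 32x the base batch per lock hold (better amortization of lock overhead), while scale 0 keeps each hold short, reducing the allocation-latency spikes reported in the thread.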