From: Yafang Shao
Date: Mon, 5 Aug 2024 13:36:31 +0800
Subject: Re: [PATCH v3 3/3] mm/page_alloc: Introduce a new sysctl knob vm.pcp_batch_scale_max
To: "Huang, Ying"
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, linux-mm@kvack.org, Matthew Wilcox, David Rientjes

On Mon, Aug 5, 2024 at 1:04 PM Huang, Ying wrote:
>
> Yafang Shao writes:
>
> > On Mon, Aug 5, 2024 at 12:36 PM Huang, Ying wrote:
> >>
> >> Yafang Shao writes:
> >>
> >> > On Mon, Aug 5, 2024 at 11:05 AM Huang, Ying wrote:
> >> >>
> >> >> Yafang Shao writes:
> >> >>
> >> >> > On Mon, Aug 5, 2024 at 9:41 AM Huang, Ying wrote:
> >> >> >>
> >> >> >> Yafang Shao writes:
> >> >> >>
> >> >> >> [snip]
> >> >> >>
> >> >> >> >
> >> >> >> > Why introduce a sysctl knob?
> >> >> >> > ============================
> >> >> >> >
> >> >> >> > From the above data, it's clear that different CPU types have varying
> >> >> >> > allocation latencies concerning zone->lock contention. Typically, people
> >> >> >> > don't release individual kernel packages for each type of x86_64 CPU.
> >> >> >> >
> >> >> >> > Furthermore, for latency-insensitive applications, we can keep the default
> >> >> >> > setting for better throughput.
> >> >> >>
> >> >> >> Do you have any data to prove that the default setting is better for
> >> >> >> throughput?  If so, that will be strong support for your patch.
> >> >> >
> >> >> > No, I don't. The primary reason we can't change the default value from
> >> >> > 5 to 0 across our fleet of servers is that you initially set it to 5.
> >> >> > The sysadmins believe you had a strong reason for setting it to 5 by
> >> >> > default; otherwise, it would be considered careless for the upstream
> >> >> > kernel. I also believe you must have had a solid justification for
> >> >> > setting the default value to 5; otherwise, why would you have
> >> >> > submitted your patches?
> >> >>
> >> >> In commit 52166607ecc9 ("mm: restrict the pcp batch scale factor to
> >> >> avoid too long latency"), I tried my best to run tests on the machines
> >> >> available with a micro-benchmark (will-it-scale/page_fault1) which
> >> >> exercises the kernel page allocator heavily.  From the data in the commit,
> >> >> a larger CONFIG_PCP_BATCH_SCALE_MAX helps throughput a little, but not
> >> >> much.  The 99% alloc/free latency can be kept within about 100us with
> >> >> CONFIG_PCP_BATCH_SCALE_MAX == 5.  So, we chose 5 as the default value.
> >> >>
> >> >> But, we can always improve the default value with more data, on more
> >> >> types of machines and with more types of benchmarks, etc.
> >> >>
> >> >> Your data suggest a smaller default value because you have data to show
> >> >> that a larger default value has the latency spike issue (as large as tens
> >> >> of ms) for some practical workloads, which weren't tested previously.  In
> >> >> contrast, we don't have strong data to show the throughput advantages of
> >> >> a larger CONFIG_PCP_BATCH_SCALE_MAX value.
> >> >>
> >> >> So, I suggest using a smaller default value for
> >> >> CONFIG_PCP_BATCH_SCALE_MAX.  But, we may need more tests to check the
> >> >> data for 1, 2, 3, and 4, in addition to 0 and 5, to determine the best
> >> >> choice.
> >> >
> >> > Which smaller default value would be better?
> >>
> >> This depends on further test results.
> >
> > I believe you agree with me that you can't test all workloads.
> >
> >>
> >> > How can we ensure that other workloads, which we haven't tested, will
> >> > work well with this new default value?
> >>
> >> We cannot.  We can only depend on the data available.  If there are
> >> new data available in the future, we can make the change accordingly.
> >
> > So, your solution is to change the hardcoded value for untested
> > workloads and then release the kernel package again?
> >
> >>
> >> > If you have a better default value in mind, would you consider sending
> >> > a patch for it? I would be happy to test it with my test case.
> >>
> >> If you can test the values 1, 2, 3, and 4 with your workload, that will
> >> be very helpful!  Both allocation latency and total free time (if
> >> possible) are valuable.
> >
> > You know I can't verify it with all workloads, right?
> > You have so much data to verify, which indicates uncertainty about any
> > default value. Why not make it tunable and let the user choose the
> > value they prefer?
>
> We only make decisions based on data available.
> In theory, we cannot test all workloads, because there will be new
> workloads in the future.  If we have data to show that smaller value
> will cause performance regressions for some reasonable workloads, we
> can make it user tunable.

The issue arises when a new workload is discovered; you have to
release a new kernel package for it. If that's your expectation, why
not make it tunable from the start?

Had you made it tunable in your original commit, we wouldn't be having
this non-intuitive discussion repeatedly. Which came first, the
chicken or the egg?

--
Regards
Yafang
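
As a rough illustration of why the scale factor matters for latency: to the best of my understanding of mainline mm/page_alloc.c, the per-CPU pageset batch can be scaled up to at most `batch << CONFIG_PCP_BATCH_SCALE_MAX` pages per zone->lock hold. The sketch below only reproduces that arithmetic. The function name and the base batch of 63 are assumptions chosen for illustration; they are not values taken from this thread.

```python
# Illustrative arithmetic only; this is not kernel code.
# Assumption: the upper bound on pages handled under one zone->lock hold
# is the base PCP batch shifted left by the scale factor.

def max_pages_per_lock_hold(base_batch: int, scale_max: int) -> int:
    """Return the assumed upper bound on pages per zone->lock hold."""
    if base_batch < 0 or scale_max < 0:
        raise ValueError("base_batch and scale_max must be non-negative")
    return base_batch << scale_max


if __name__ == "__main__":
    BASE_BATCH = 63  # hypothetical base batch, chosen for illustration
    for scale in range(0, 6):
        pages = max_pages_per_lock_hold(BASE_BATCH, scale)
        print(f"scale_max={scale}: up to {pages} pages per lock hold")
```

Under these assumptions, moving from 5 to 0 shrinks the bound by a factor of 32, which is the trade-off between lock hold time and batching throughput being argued over in this thread.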