From: Barry Song
Date: Fri, 19 Jul 2024 19:52:32 +1200
Subject: Re: [RFC PATCH v1 3/4] mm: Override mTHP "enabled" defaults at kernel cmdline
To: Ryan Roberts
Cc: Andrew Morton, Hugh Dickins, Jonathan Corbet, "Matthew Wilcox (Oracle)", David Hildenbrand, Lance Yang, Baolin Wang, Gavin Shan, Pankaj Raghav, Daniel Gomez, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240717071257.4141363-1-ryan.roberts@arm.com> <20240717071257.4141363-4-ryan.roberts@arm.com>

On Fri, Jul 19, 2024 at 7:48 PM Ryan Roberts wrote:
>
> On 19/07/2024 01:46, Barry Song wrote:
> > On Wed, Jul 17, 2024 at 7:13 PM Ryan Roberts wrote:
> >>
> >> Add thp_anon= cmdline parameter to allow specifying the default
> >> enablement of each supported anon THP size.
> >> The parameter accepts the
> >> following format and can be provided multiple times to configure each
> >> size:
> >>
> >>        thp_anon=[KMG]:
> >>
> >> See Documentation/admin-guide/mm/transhuge.rst for more details.
> >>
> >> Configuring the defaults at boot time is useful to allow early user
> >> space to take advantage of mTHP before it's been configured through
> >> sysfs.
> >
> > This is exactly what I need and want to implement, as the current behavior
> > is problematic. We need to boot up the system and reach the point where
> > we can set up the sys interfaces to enable mTHP. Many processes miss the
> > opportunity to use mTHP.
> >
> > On the other hand, userspace might have been tuned to detect that mTHP
> > is enabled, such as a .so library. However, it turns out we have had
> > inconsistent settings between the two stages - before and after setting
> > mTHP enabled by sys interfaces.
>
> Good feedback - sounds like I should separate out this patch from the rest of
> the series to get it reviewed and merged faster?

+1

>
> >
> >>
> >> Signed-off-by: Ryan Roberts
> >> ---
> >>  .../admin-guide/kernel-parameters.txt         |  8 +++
> >>  Documentation/admin-guide/mm/transhuge.rst    | 26 +++++++--
> >>  mm/huge_memory.c                              | 55 ++++++++++++++++++-
> >>  3 files changed, 82 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index bc55fb55cd26..48443ad12e3f 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -6592,6 +6592,14 @@
> >>                         : poll all this frequency
> >>                         0: no polling (default)
> >>
> >> +       thp_anon=       [KNL]
> >> +                       Format: [KMG]:always|madvise|never|inherit
> >> +                       Can be used to control the default behavior of the
> >> +                       system with respect to anonymous transparent hugepages.
> >> +                       Can be used multiple times for multiple anon THP sizes.
> >> +                       See Documentation/admin-guide/mm/transhuge.rst for more
> >> +                       details.
> >> +
> >>         threadirqs      [KNL,EARLY]
> >>                         Force threading of all interrupt handlers except those
> >>                         marked explicitly IRQF_NO_THREAD.
> >> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> >> index 1aaf8e3a0b5a..f53d43d986e2 100644
> >> --- a/Documentation/admin-guide/mm/transhuge.rst
> >> +++ b/Documentation/admin-guide/mm/transhuge.rst
> >> @@ -311,13 +311,27 @@ performance.
> >>  Note that any changes to the allowed set of sizes only applies to future
> >>  file-backed THP allocations.
> >>
> >> -Boot parameter
> >> -==============
> >> +Boot parameters
> >> +===============
> >>
> >> -You can change the sysfs boot time defaults of Transparent Hugepage
> >> -Support by passing the parameter ``transparent_hugepage=always`` or
> >> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never``
> >> -to the kernel command line.
> >> +You can change the sysfs boot time default for the top-level "enabled"
> >> +control by passing the parameter ``transparent_hugepage=always`` or
> >> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the
> >> +kernel command line.
> >> +
> >> +Alternatively, each supported anonymous THP size can be controlled by
> >> +passing ``thp_anon=[KMG]:``, where ```` is the THP size
> >> +and ```` is one of ``always``, ``madvise``, ``never`` or
> >> +``inherit``.
> >> +
> >> +For example, the following will set 64K THP to ``always``::
> >> +
> >> +        thp_anon=64K:always
> >> +
> >> +``thp_anon=`` may be specified multiple times to configure all THP sizes as
> >> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes
> >> +not explicitly configured on the command line are implicitly set to
> >> +``never``.
> >>
> >>  Hugepages in tmpfs/shmem
> >>  ========================
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 4249c0bc9388..794d2790d90d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -82,6 +82,7 @@ unsigned long huge_anon_orders_madvise __read_mostly;
> >>  unsigned long huge_anon_orders_inherit __read_mostly;
> >>  unsigned long huge_file_orders_always __read_mostly;
> >>  int huge_file_exec_order __read_mostly = -1;
> >> +static bool anon_orders_configured;
> >>
> >>  unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
> >>                                           unsigned long vm_flags,
> >> @@ -763,7 +764,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
> >>          * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
> >>          * constant so we have to do this here.
> >>          */
> >> -       huge_anon_orders_inherit = BIT(PMD_ORDER);
> >> +       if (!anon_orders_configured) {
> >> +               huge_anon_orders_inherit = BIT(PMD_ORDER);
> >> +               anon_orders_configured = true;
> >> +       }
> >>
> >>         /*
> >>          * For pagecache, default to enabling all orders. powerpc's PMD_ORDER
> >> @@ -955,6 +959,55 @@ static int __init setup_transparent_hugepage(char *str)
> >>  }
> >>  __setup("transparent_hugepage=", setup_transparent_hugepage);
> >>
> >> +static int __init setup_thp_anon(char *str)
> >> +{
> >> +       unsigned long size;
> >> +       char *state;
> >> +       int order;
> >> +       int ret = 0;
> >> +
> >> +       if (!str)
> >> +               goto out;
> >> +
> >> +       size = (unsigned long)memparse(str, &state);
> >> +       order = ilog2(size >> PAGE_SHIFT);
> >> +       if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE ||
> >> +           !(BIT(order) & THP_ORDERS_ALL_ANON))
> >> +               goto out;
> >> +
> >> +       state++;
> >> +
> >> +       if (!strcmp(state, "always")) {
> >> +               clear_bit(order, &huge_anon_orders_inherit);
> >> +               clear_bit(order, &huge_anon_orders_madvise);
> >> +               set_bit(order, &huge_anon_orders_always);
> >> +               ret = 1;
> >> +       } else if (!strcmp(state, "inherit")) {
> >> +               clear_bit(order, &huge_anon_orders_always);
> >> +               clear_bit(order, &huge_anon_orders_madvise);
> >> +               set_bit(order, &huge_anon_orders_inherit);
> >> +               ret = 1;
> >> +       } else if (!strcmp(state, "madvise")) {
> >> +               clear_bit(order, &huge_anon_orders_always);
> >> +               clear_bit(order, &huge_anon_orders_inherit);
> >> +               set_bit(order, &huge_anon_orders_madvise);
> >> +               ret = 1;
> >> +       } else if (!strcmp(state, "never")) {
> >> +               clear_bit(order, &huge_anon_orders_always);
> >> +               clear_bit(order, &huge_anon_orders_inherit);
> >> +               clear_bit(order, &huge_anon_orders_madvise);
> >> +               ret = 1;
> >> +       }
> >> +
> >> +       if (ret)
> >> +               anon_orders_configured = true;
> >> +out:
> >> +       if (!ret)
> >> +               pr_warn("thp_anon=%s: cannot parse, ignored\n", str);
> >> +       return ret;
> >> +}
> >> +__setup("thp_anon=", setup_thp_anon);
> >> +
> >>  pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
> >>  {
> >>         if (likely(vma->vm_flags & VM_WRITE))
> >> --
> >> 2.43.0
> >>
> >
> > Thanks
> > Barry
>
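
For anyone who wants to try the parsing rules outside the kernel, below is a rough userspace sketch of what setup_thp_anon() and the hugepage_init_sysfs() hunk appear to do. It is not the kernel code. It assumes 4K base pages and a PMD order of 9, and it assumes the valid anon orders are 2..PMD_ORDER. The names struct anon_orders, parse_size(), model_thp_anon() and model_init_defaults() are made up for illustration, and parse_size() is only a simplified stand-in for memparse() that handles the K/M/G suffixes.

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_SHIFT 12                  /* assumption: 4K base pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PMD_ORDER  9                   /* assumption: 2M PMD mappings */
#define MIN_ANON_ORDER 2               /* assumption: orders 0 and 1 are not anon THP */

/* Hypothetical stand-in for the three huge_anon_orders_* bitmasks. */
struct anon_orders {
	unsigned long always;
	unsigned long madvise;
	unsigned long inherit;
	int configured;                /* models anon_orders_configured */
};

/* Simplified stand-in for memparse(): a number plus an optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *p;
	unsigned long long v = strtoull(s, &p, 0);

	switch (*p) {
	case 'G': case 'g': v <<= 30; p++; break;
	case 'M': case 'm': v <<= 20; p++; break;
	case 'K': case 'k': v <<= 10; p++; break;
	}
	*end = p;
	return v;
}

/* Returns 1 if the argument (the text after "thp_anon=") was accepted, else 0. */
static int model_thp_anon(struct anon_orders *o, const char *arg)
{
	const char *state;
	unsigned long long size;
	unsigned long bit, *target;
	int order = 0;

	if (!arg)
		return 0;

	size = parse_size(arg, &state);
	/* Need a ':' separator and a power-of-two size larger than one page. */
	if (*state != ':' || size <= PAGE_SIZE || (size & (size - 1)))
		return 0;

	/* log2(size / PAGE_SIZE); exact because size is a power of two. */
	while ((1ULL << (order + PAGE_SHIFT)) < size)
		order++;
	if (order < MIN_ANON_ORDER || order > PMD_ORDER)
		return 0;

	state++;
	if (!strcmp(state, "always"))
		target = &o->always;
	else if (!strcmp(state, "madvise"))
		target = &o->madvise;
	else if (!strcmp(state, "inherit"))
		target = &o->inherit;
	else if (!strcmp(state, "never"))
		target = NULL;         /* "never" means set in no mask */
	else
		return 0;

	/* An order lives in at most one mask, so clear it everywhere first. */
	bit = 1UL << order;
	o->always &= ~bit;
	o->madvise &= ~bit;
	o->inherit &= ~bit;
	if (target)
		*target |= bit;
	o->configured = 1;
	return 1;
}

/* Models the init hunk: only apply the PMD default if nothing was configured. */
static void model_init_defaults(struct anon_orders *o)
{
	if (!o->configured) {
		o->inherit = 1UL << PMD_ORDER;
		o->configured = 1;
	}
}
```

Under those assumptions, calling model_thp_anon() with "64K:always" on a zeroed struct sets bit 4 in the always mask, and a later model_init_defaults() leaves the inherit mask empty. That matches the documented behaviour that sizes not named on the command line end up as never once thp_anon= has been given at least once. Inputs such as "48K:always" (not a power of two), "4K:always" (not larger than a page) and "64K:sometimes" (unknown state) are rejected.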