From: Barry Song
Date: Fri, 19 Jul 2024 12:46:33 +1200
Subject: Re: [RFC PATCH v1 3/4] mm: Override mTHP "enabled" defaults at kernel cmdline
To: Ryan Roberts
Cc: Andrew Morton, Hugh Dickins, Jonathan Corbet, "Matthew Wilcox (Oracle)", David Hildenbrand, Lance Yang, Baolin Wang, Gavin Shan, Pankaj Raghav, Daniel Gomez, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240717071257.4141363-1-ryan.roberts@arm.com> <20240717071257.4141363-4-ryan.roberts@arm.com>
In-Reply-To: <20240717071257.4141363-4-ryan.roberts@arm.com>

On Wed, Jul 17, 2024 at 7:13 PM Ryan Roberts wrote:
>
> Add thp_anon= cmdline parameter to allow specifying the default
> enablement of each supported anon THP size. The parameter accepts the
> following format and can be provided multiple times to configure each
> size:
>
>     thp_anon=<size>[KMG]:<state>
>
> See Documentation/admin-guide/mm/transhuge.rst for more details.
>
> Configuring the defaults at boot time is useful to allow early user
> space to take advantage of mTHP before its been configured through
> sysfs.

This is exactly what I need and want to implement, as the current
behavior is problematic. Today we have to boot the system and reach
the point where the sysfs interfaces can be set up before mTHP is
enabled, so many processes started earlier miss the opportunity to
use mTHP.

On the other hand, userspace (a .so library, for example) might have
been tuned to detect whether mTHP is enabled. It then ends up seeing
inconsistent settings between the two stages: before and after mTHP
is enabled through the sysfs interfaces.
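To make the detection point concrete, here is a minimal userspace
sketch of how a library might check the per-size control. It is only
an illustration, not code from any real library: the helper names
thp_active_state() and thp_read_state() are hypothetical, and it
assumes the per-size sysfs layout described in transhuge.rst
(hugepages-<size>kB/enabled, with the active value shown in square
brackets).

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the active state from the contents of an mTHP "enabled" file,
 * e.g. "always inherit [madvise] never" -> "madvise".
 * Returns 0 on success, -1 if there is no bracketed value or it does
 * not fit in `out`. (Hypothetical helper, for illustration only.)
 */
int thp_active_state(const char *contents, char *out, size_t outlen)
{
	const char *open = strchr(contents, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!open || !close)
		return -1;
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}

/*
 * Read the per-size control for `size_kb` (e.g. 64 for 64K mTHP).
 * Returns 0 and fills `out` on success, -1 if the file is missing,
 * unreadable or malformed. (Hypothetical helper.)
 */
int thp_read_state(unsigned long size_kb, char *out, size_t outlen)
{
	char path[128], buf[128];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/kernel/mm/transparent_hugepage/hugepages-%lukB/enabled",
		 size_kb);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return thp_active_state(buf, out, outlen);
}
```

A process that runs this check before the sysfs files have been
written will see the boot-time default, which is why a value set
later from userspace can disagree with what it observed.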
>
> Signed-off-by: Ryan Roberts
> ---
>  .../admin-guide/kernel-parameters.txt         |  8 +++
>  Documentation/admin-guide/mm/transhuge.rst    | 26 +++++++--
>  mm/huge_memory.c                              | 55 ++++++++++++++++++-
>  3 files changed, 82 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index bc55fb55cd26..48443ad12e3f 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6592,6 +6592,14 @@
>                         : poll all this frequency
>                         0: no polling (default)
>
> +       thp_anon=       [KNL]
> +                       Format: <size>[KMG]:always|madvise|never|inherit
> +                       Can be used to control the default behavior of the
> +                       system with respect to anonymous transparent hugepages.
> +                       Can be used multiple times for multiple anon THP sizes.
> +                       See Documentation/admin-guide/mm/transhuge.rst for more
> +                       details.
> +
>         threadirqs      [KNL,EARLY]
>                         Force threading of all interrupt handlers except those
>                         marked explicitly IRQF_NO_THREAD.
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index 1aaf8e3a0b5a..f53d43d986e2 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -311,13 +311,27 @@ performance.
>  Note that any changes to the allowed set of sizes only applies to future
>  file-backed THP allocations.
>
> -Boot parameter
> -==============
> +Boot parameters
> +===============
>
> -You can change the sysfs boot time defaults of Transparent Hugepage
> -Support by passing the parameter ``transparent_hugepage=always`` or
> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never``
> -to the kernel command line.
> +You can change the sysfs boot time default for the top-level "enabled"
> +control by passing the parameter ``transparent_hugepage=always`` or
> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the
> +kernel command line.
> +
> +Alternatively, each supported anonymous THP size can be controlled by
> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size
> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or
> +``inherit``.
> +
> +For example, the following will set 64K THP to ``always``::
> +
> +       thp_anon=64K:always
> +
> +``thp_anon=`` may be specified multiple times to configure all THP sizes as
> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes
> +not explicitly configured on the command line are implicitly set to
> +``never``.
>
>  Hugepages in tmpfs/shmem
>  ========================
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 4249c0bc9388..794d2790d90d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -82,6 +82,7 @@ unsigned long huge_anon_orders_madvise __read_mostly;
>  unsigned long huge_anon_orders_inherit __read_mostly;
>  unsigned long huge_file_orders_always __read_mostly;
>  int huge_file_exec_order __read_mostly = -1;
> +static bool anon_orders_configured;
>
>  unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>                                          unsigned long vm_flags,
> @@ -763,7 +764,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
>          * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
>          * constant so we have to do this here.
>          */
> -       huge_anon_orders_inherit = BIT(PMD_ORDER);
> +       if (!anon_orders_configured) {
> +               huge_anon_orders_inherit = BIT(PMD_ORDER);
> +               anon_orders_configured = true;
> +       }
>
>         /*
>          * For pagecache, default to enabling all orders. powerpc's PMD_ORDER
> @@ -955,6 +959,55 @@ static int __init setup_transparent_hugepage(char *str)
>  }
>  __setup("transparent_hugepage=", setup_transparent_hugepage);
>
> +static int __init setup_thp_anon(char *str)
> +{
> +       unsigned long size;
> +       char *state;
> +       int order;
> +       int ret = 0;
> +
> +       if (!str)
> +               goto out;
> +
> +       size = (unsigned long)memparse(str, &state);
> +       order = ilog2(size >> PAGE_SHIFT);
> +       if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE ||
> +           !(BIT(order) & THP_ORDERS_ALL_ANON))
> +               goto out;
> +
> +       state++;
> +
> +       if (!strcmp(state, "always")) {
> +               clear_bit(order, &huge_anon_orders_inherit);
> +               clear_bit(order, &huge_anon_orders_madvise);
> +               set_bit(order, &huge_anon_orders_always);
> +               ret = 1;
> +       } else if (!strcmp(state, "inherit")) {
> +               clear_bit(order, &huge_anon_orders_always);
> +               clear_bit(order, &huge_anon_orders_madvise);
> +               set_bit(order, &huge_anon_orders_inherit);
> +               ret = 1;
> +       } else if (!strcmp(state, "madvise")) {
> +               clear_bit(order, &huge_anon_orders_always);
> +               clear_bit(order, &huge_anon_orders_inherit);
> +               set_bit(order, &huge_anon_orders_madvise);
> +               ret = 1;
> +       } else if (!strcmp(state, "never")) {
> +               clear_bit(order, &huge_anon_orders_always);
> +               clear_bit(order, &huge_anon_orders_inherit);
> +               clear_bit(order, &huge_anon_orders_madvise);
> +               ret = 1;
> +       }
> +
> +       if (ret)
> +               anon_orders_configured = true;
> +out:
> +       if (!ret)
> +               pr_warn("thp_anon=%s: cannot parse, ignored\n", str);
> +       return ret;
> +}
> +__setup("thp_anon=", setup_thp_anon);
> +
>  pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
>  {
>         if (likely(vma->vm_flags & VM_WRITE))
> --
> 2.43.0
>

Thanks
Barry
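For anyone who wants to sanity-check which thp_anon= arguments the
quoted parser would accept without booting a kernel, below is a rough
userspace approximation of its validation step. It is a sketch under
stated assumptions, not the kernel code: sketch_memparse() and
sketch_parse_thp_anon() are hypothetical names, the page size is
assumed to be 4K with a PMD order of 9, and the valid anon orders are
assumed to be 2 through PMD order (my reading of THP_ORDERS_ALL_ANON).
Only the K, M and G suffixes are handled.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K base pages */
#define SKETCH_PMD_ORDER   9	/* assumption: 2M PMD with 4K pages */

/*
 * Parse "<number>[KMG]", similar in spirit to the kernel's memparse().
 * On return *end points just past the number and optional suffix.
 */
unsigned long long sketch_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G':
	case 'g':
		v <<= 10;
		/* fall through */
	case 'M':
	case 'm':
		v <<= 10;
		/* fall through */
	case 'K':
	case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Split "<size>[KMG]:<state>" into a page order and a state string.
 * Returns the order on success and points *state at the state name;
 * returns -1 if the argument would be rejected.
 */
int sketch_parse_thp_anon(const char *arg, const char **state)
{
	char *end;
	unsigned long long size = sketch_memparse(arg, &end);
	unsigned long long pages;
	int order = 0;

	/* Size must be a power of two and be followed by ':'. */
	if (*end != ':' || size == 0 || (size & (size - 1)))
		return -1;

	/* Size must be larger than a single page. */
	pages = size >> SKETCH_PAGE_SHIFT;
	if (pages < 2)
		return -1;

	/* Integer log2 of the page count gives the order. */
	while ((pages >> order) > 1)
		order++;

	/* Assumed valid range for anon THP orders: 2..PMD order. */
	if (order < 2 || order > SKETCH_PMD_ORDER)
		return -1;

	end++;
	if (strcmp(end, "always") && strcmp(end, "inherit") &&
	    strcmp(end, "madvise") && strcmp(end, "never"))
		return -1;

	*state = end;
	return order;
}
```

Under these assumptions "64K:always" maps to order 4 and "2M:madvise"
to order 9, while "4K:always", "8K:always" and "48K:always" are all
rejected (single page, order 1, and not a power of two respectively).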