From: Ryan Roberts
Date: Fri, 19 Jul 2024 08:47:58 +0100
Subject: Re: [RFC PATCH v1 3/4] mm: Override mTHP "enabled" defaults at kernel cmdline
To: Barry Song
Cc: Andrew Morton, Hugh Dickins, Jonathan Corbet, "Matthew Wilcox (Oracle)", David Hildenbrand, Lance Yang, Baolin Wang, Gavin Shan, Pankaj Raghav, Daniel Gomez, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240717071257.4141363-1-ryan.roberts@arm.com> <20240717071257.4141363-4-ryan.roberts@arm.com>

On 19/07/2024 01:46, Barry Song wrote:
> On Wed, Jul 17, 2024 at 7:13 PM Ryan Roberts wrote:
>>
>> Add thp_anon= cmdline parameter to allow specifying the default
>> enablement of each supported anon THP size. The parameter accepts the
>> following format and can be provided multiple times to configure each
>> size:
>>
>> thp_anon=<size>[KMG]:<state>
>>
>> See Documentation/admin-guide/mm/transhuge.rst for more details.
>>
>> Configuring the defaults at boot time is useful to allow early user
>> space to take advantage of mTHP before it's been configured through
>> sysfs.
>
> This is exactly what I need and want to implement, as the current behavior
> is problematic. We need to boot up the system and reach the point where
> we can set up the sys interfaces to enable mTHP. Many processes miss the
> opportunity to use mTHP.
>
> On the other hand, userspace might have been tuned to detect that mTHP
> is enabled, such as a .so library. However, it turns out we have had
> inconsistent settings between the two stages - before and after setting
> mTHP enabled by sys interfaces.

Good feedback - sounds like I should separate out this patch from the rest of
the series to get it reviewed and merged faster?

>
>>
>> Signed-off-by: Ryan Roberts
>> ---
>>  .../admin-guide/kernel-parameters.txt      |  8 +++
>>  Documentation/admin-guide/mm/transhuge.rst | 26 +++++++--
>>  mm/huge_memory.c                           | 55 ++++++++++++++++++-
>>  3 files changed, 82 insertions(+), 7 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index bc55fb55cd26..48443ad12e3f 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -6592,6 +6592,14 @@
>>  			: poll all this frequency
>>  			0: no polling (default)
>>
>> +	thp_anon=	[KNL]
>> +			Format: <size>[KMG]:always|madvise|never|inherit
>> +			Can be used to control the default behavior of the
>> +			system with respect to anonymous transparent hugepages.
>> +			Can be used multiple times for multiple anon THP sizes.
>> +			See Documentation/admin-guide/mm/transhuge.rst for more
>> +			details.
>> +
>>  	threadirqs	[KNL,EARLY]
>>  			Force threading of all interrupt handlers except those
>>  			marked explicitly IRQF_NO_THREAD.
>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
>> index 1aaf8e3a0b5a..f53d43d986e2 100644
>> --- a/Documentation/admin-guide/mm/transhuge.rst
>> +++ b/Documentation/admin-guide/mm/transhuge.rst
>> @@ -311,13 +311,27 @@ performance.
>>  Note that any changes to the allowed set of sizes only applies to future
>>  file-backed THP allocations.
>>
>> -Boot parameter
>> -==============
>> +Boot parameters
>> +===============
>>
>> -You can change the sysfs boot time defaults of Transparent Hugepage
>> -Support by passing the parameter ``transparent_hugepage=always`` or
>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never``
>> -to the kernel command line.
>> +You can change the sysfs boot time default for the top-level "enabled"
>> +control by passing the parameter ``transparent_hugepage=always`` or
>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the
>> +kernel command line.
>> +
>> +Alternatively, each supported anonymous THP size can be controlled by
>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size
>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or
>> +``inherit``.
>> +
>> +For example, the following will set 64K THP to ``always``::
>> +
>> +	thp_anon=64K:always
>> +
>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as
>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes
>> +not explicitly configured on the command line are implicitly set to
>> +``never``.
>>
>>  Hugepages in tmpfs/shmem
>>  ========================
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 4249c0bc9388..794d2790d90d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -82,6 +82,7 @@ unsigned long huge_anon_orders_madvise __read_mostly;
>>  unsigned long huge_anon_orders_inherit __read_mostly;
>>  unsigned long huge_file_orders_always __read_mostly;
>>  int huge_file_exec_order __read_mostly = -1;
>> +static bool anon_orders_configured;
>>
>>  unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  					 unsigned long vm_flags,
>> @@ -763,7 +764,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
>>  	 * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time
>>  	 * constant so we have to do this here.
>>  	 */
>> -	huge_anon_orders_inherit = BIT(PMD_ORDER);
>> +	if (!anon_orders_configured) {
>> +		huge_anon_orders_inherit = BIT(PMD_ORDER);
>> +		anon_orders_configured = true;
>> +	}
>>
>>  	/*
>>  	 * For pagecache, default to enabling all orders. powerpc's PMD_ORDER
>> @@ -955,6 +959,55 @@ static int __init setup_transparent_hugepage(char *str)
>>  }
>>  __setup("transparent_hugepage=", setup_transparent_hugepage);
>>
>> +static int __init setup_thp_anon(char *str)
>> +{
>> +	unsigned long size;
>> +	char *state;
>> +	int order;
>> +	int ret = 0;
>> +
>> +	if (!str)
>> +		goto out;
>> +
>> +	size = (unsigned long)memparse(str, &state);
>> +	order = ilog2(size >> PAGE_SHIFT);
>> +	if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE ||
>> +	    !(BIT(order) & THP_ORDERS_ALL_ANON))
>> +		goto out;
>> +
>> +	state++;
>> +
>> +	if (!strcmp(state, "always")) {
>> +		clear_bit(order, &huge_anon_orders_inherit);
>> +		clear_bit(order, &huge_anon_orders_madvise);
>> +		set_bit(order, &huge_anon_orders_always);
>> +		ret = 1;
>> +	} else if (!strcmp(state, "inherit")) {
>> +		clear_bit(order, &huge_anon_orders_always);
>> +		clear_bit(order, &huge_anon_orders_madvise);
>> +		set_bit(order, &huge_anon_orders_inherit);
>> +		ret = 1;
>> +	} else if (!strcmp(state, "madvise")) {
>> +		clear_bit(order, &huge_anon_orders_always);
>> +		clear_bit(order, &huge_anon_orders_inherit);
>> +		set_bit(order, &huge_anon_orders_madvise);
>> +		ret = 1;
>> +	} else if (!strcmp(state, "never")) {
>> +		clear_bit(order, &huge_anon_orders_always);
>> +		clear_bit(order, &huge_anon_orders_inherit);
>> +		clear_bit(order, &huge_anon_orders_madvise);
>> +		ret = 1;
>> +	}
>> +
>> +	if (ret)
>> +		anon_orders_configured = true;
>> +out:
>> +	if (!ret)
>> +		pr_warn("thp_anon=%s: cannot parse, ignored\n", str);
>> +	return ret;
>> +}
>> +__setup("thp_anon=", setup_thp_anon);
>> +
>>  pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
>>  {
>>  	if (likely(vma->vm_flags & VM_WRITE))
>> --
>> 2.43.0
>>
>
> Thanks
> Barry
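
For anyone following the thread who wants to check which thp_anon= strings
the quoted setup_thp_anon() would accept, here is a rough userspace model of
its validation rules. It is only a sketch, not the kernel code: every name in
it (model_memparse, model_parse_thp_anon, struct thp_anon_setting, enum
thp_state, the MODEL_* constants) is made up for illustration. It assumes 4K
base pages and a 2M PMD, and it assumes the set of valid anon orders is
2..PMD_ORDER, which is how I read THP_ORDERS_ALL_ANON. It also handles only
the K/M/G suffixes, whereas the kernel's memparse() accepts more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumptions of this model, not taken from the patch. */
#define MODEL_PAGE_SHIFT 12	/* 4K base pages */
#define MODEL_MIN_ORDER  2	/* assumed smallest anon THP order */
#define MODEL_PMD_ORDER  9	/* 2M PMD with 4K pages */

enum thp_state { THP_NEVER, THP_ALWAYS, THP_MADVISE, THP_INHERIT };

struct thp_anon_setting {
	int order;
	enum thp_state state;
};

/* Parse "<number>[KMG]" and report where parsing stopped. */
static unsigned long long model_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the single suffix character */
		break;
	default:
		break;
	}
	return v;
}

/* Return true and fill *out if arg is a valid "<size>[KMG]:<state>". */
static bool model_parse_thp_anon(const char *arg, struct thp_anon_setting *out)
{
	unsigned long long size;
	char *rest;
	int order = 0;

	assert(out != NULL);
	if (!arg)
		return false;

	size = model_memparse(arg, &rest);

	/* Size must be followed by ':', be a power of two, and exceed a page. */
	if (*rest != ':' || size == 0 || (size & (size - 1)) ||
	    size <= (1ULL << MODEL_PAGE_SHIFT))
		return false;

	/* Integer log2 of the size in pages. */
	while ((size >> MODEL_PAGE_SHIFT) >> (order + 1))
		order++;
	if (order < MODEL_MIN_ORDER || order > MODEL_PMD_ORDER)
		return false;

	rest++;	/* skip ':' */
	if (!strcmp(rest, "always"))
		out->state = THP_ALWAYS;
	else if (!strcmp(rest, "inherit"))
		out->state = THP_INHERIT;
	else if (!strcmp(rest, "madvise"))
		out->state = THP_MADVISE;
	else if (!strcmp(rest, "never"))
		out->state = THP_NEVER;
	else
		return false;

	out->order = order;
	return true;
}
```

Under those assumptions, "64K:always" parses to order 4 and "2M:inherit" to
order 9, while "4K:always" (not larger than a page), "48K:always" (not a
power of two), "64K" (no state) and "64K:sometimes" (unknown state) are all
rejected, which corresponds to the "cannot parse, ignored" warning path in
the patch.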