From mboxrd@z Thu Jan 1 00:00:00 1970
From: Usama Arif
To: Andrew Morton, david@redhat.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com,
	ziy@nvidia.com, laoar.shao@gmail.com, baolin.wang@linux.alibaba.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com,
	ryan.roberts@arm.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH 2/6] prctl: introduce PR_THP_POLICY_DEFAULT_NOHUGE for the process
Date: Thu, 15 May 2025 14:33:31 +0100
Message-ID: <20250515133519.2779639-3-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250515133519.2779639-1-usamaarif642@gmail.com>
References: <20250515133519.2779639-1-usamaarif642@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

This is set via the new PR_SET_THP_POLICY prctl. This will set the
MMF2_THP_VMA_DEFAULT_NOHUGE process flag which changes the default of
new VMAs to be VM_NOHUGEPAGE. The call also modifies all existing VMAs
that are not VM_HUGEPAGE to be VM_NOHUGEPAGE.

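A minimal userspace sketch of how a process might opt in, assuming a
kernel with this series applied. The option and policy values are copied
from the prctl.h hunks of this patch; the helper names
thp_policy_name(), thp_set_default_nohuge() and thp_get_policy() are
hypothetical and exist only for illustration. Passing zero for the
unused arguments of PR_GET_THP_POLICY is an assumption, since that check
is not fully visible in this patch. On a kernel without this series,
option numbers 78 and 79 may be unassigned or may mean something else,
so the calls should only be made on a patched kernel.

```c
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>

/* Values from this patch; not present in released uapi headers. */
#ifndef PR_SET_THP_POLICY
#define PR_SET_THP_POLICY		78
#define PR_GET_THP_POLICY		79
#define PR_THP_POLICY_DEFAULT_HUGE	0
#define PR_THP_POLICY_DEFAULT_NOHUGE	1
#endif

/* Hypothetical helper: printable name for a policy value. */
static inline const char *thp_policy_name(long policy)
{
	switch (policy) {
	case PR_THP_POLICY_DEFAULT_HUGE:
		return "default-huge";
	case PR_THP_POLICY_DEFAULT_NOHUGE:
		return "default-nohuge";
	default:
		return "unknown";
	}
}

/*
 * Hypothetical helper: ask the kernel to make VM_NOHUGEPAGE the default
 * for this process. arg3..arg5 must be zero (see the kernel/sys.c hunk).
 * Arguments are cast to unsigned long because prctl() is variadic.
 * Returns 0 on success or a negative errno value.
 */
static inline int thp_set_default_nohuge(void)
{
	if (prctl(PR_SET_THP_POLICY,
		  (unsigned long)PR_THP_POLICY_DEFAULT_NOHUGE,
		  0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: read back the policy. The kernel side returns the
 * policy value as the prctl() result. Zero for arg2..arg5 is an
 * assumption. Returns the policy value or a negative errno value.
 */
static inline long thp_get_policy(void)
{
	long ret = prctl(PR_GET_THP_POLICY, 0UL, 0UL, 0UL, 0UL);

	return ret == -1 ? -errno : ret;
}
```

A launcher would typically call thp_set_default_nohuge() before exec'ing
the workload, relying on the fork+exec inheritance described in this
commit message, and could log thp_policy_name(thp_get_policy()) to
confirm the setting.
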
The policy is inherited during fork+exec.
On systems where the global policy is set to "always", this effectively
gives the process THPs on an madvise-only basis. In an environment where
different types of workloads are stacked on the same machine, this
allows workloads that benefit from hugepages only on an madvise basis to
run that way, without regressing those that benefit from always having
hugepages.

Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h                       |  1 +
 include/linux/mm_types.h                      |  5 +++-
 include/uapi/linux/prctl.h                    |  1 +
 kernel/sys.c                                  |  8 +++++++
 mm/huge_memory.c                              | 24 +++++++++++++++++++
 tools/include/uapi/linux/prctl.h              |  1 +
 .../trace/beauty/include/uapi/linux/prctl.h   |  1 +
 7 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e652ad9ddbbd..d46bba282701 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -262,6 +262,7 @@ static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma,
 
 void vma_set_thp_policy(struct vm_area_struct *vma);
 void process_vmas_thp_default_huge(struct mm_struct *mm);
+void process_vmas_thp_default_nohuge(struct mm_struct *mm);
 
 unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 					 unsigned long vm_flags,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2fe93965e761..5e770411d8d1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1747,8 +1747,11 @@ enum {
 
 #define MMF2_THP_VMA_DEFAULT_HUGE		0
 #define MMF2_THP_VMA_DEFAULT_HUGE_MASK		(1 << MMF2_THP_VMA_DEFAULT_HUGE)
+#define MMF2_THP_VMA_DEFAULT_NOHUGE		1
+#define MMF2_THP_VMA_DEFAULT_NOHUGE_MASK	(1 << MMF2_THP_VMA_DEFAULT_NOHUGE)
 
-#define MMF2_INIT_MASK		(MMF2_THP_VMA_DEFAULT_HUGE_MASK)
+#define MMF2_INIT_MASK		(MMF2_THP_VMA_DEFAULT_HUGE_MASK |\
+				 MMF2_THP_VMA_DEFAULT_NOHUGE_MASK)
 
 static inline unsigned long mmf_init_flags(unsigned long flags)
 {
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 325c72f40a93..d25458f4db9e 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -367,5 +367,6 @@ struct prctl_mm_map {
 #define PR_SET_THP_POLICY		78
 #define PR_GET_THP_POLICY		79
 #define PR_THP_POLICY_DEFAULT_HUGE	0
+#define PR_THP_POLICY_DEFAULT_NOHUGE	1
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 1115f258f253..d91203e6dd0d 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2663,6 +2663,8 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		if (!!test_bit(MMF2_THP_VMA_DEFAULT_HUGE, &me->mm->flags2))
 			error = PR_THP_POLICY_DEFAULT_HUGE;
+		else if (!!test_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &me->mm->flags2))
+			error = PR_THP_POLICY_DEFAULT_NOHUGE;
 		break;
 	case PR_SET_THP_POLICY:
 		if (arg3 || arg4 || arg5)
@@ -2672,8 +2674,14 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		switch (arg2) {
 		case PR_THP_POLICY_DEFAULT_HUGE:
 			set_bit(MMF2_THP_VMA_DEFAULT_HUGE, &me->mm->flags2);
+			clear_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &me->mm->flags2);
 			process_vmas_thp_default_huge(me->mm);
 			break;
+		case PR_THP_POLICY_DEFAULT_NOHUGE:
+			clear_bit(MMF2_THP_VMA_DEFAULT_HUGE, &me->mm->flags2);
+			set_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &me->mm->flags2);
+			process_vmas_thp_default_nohuge(me->mm);
+			break;
 		default:
 			return -EINVAL;
 		}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 64f66d5295e8..9d70a365ced3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -104,6 +104,8 @@ void vma_set_thp_policy(struct vm_area_struct *vma)
 
 	if (test_bit(MMF2_THP_VMA_DEFAULT_HUGE, &mm->flags2))
 		vm_flags_set(vma, VM_HUGEPAGE);
+	else if (test_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &mm->flags2))
+		vm_flags_set(vma, VM_NOHUGEPAGE);
 }
 
 static void vmas_thp_default_huge(struct mm_struct *mm)
@@ -129,6 +131,28 @@ void process_vmas_thp_default_huge(struct mm_struct *mm)
 	vmas_thp_default_huge(mm);
 }
 
+static void vmas_thp_default_nohuge(struct mm_struct *mm)
+{
+	struct vm_area_struct *vma;
+	unsigned long vm_flags;
+
+	VMA_ITERATOR(vmi, mm, 0);
+	for_each_vma(vmi, vma) {
+		vm_flags = vma->vm_flags;
+		if (vm_flags & VM_HUGEPAGE)
+			continue;
+		vm_flags_set(vma, VM_NOHUGEPAGE);
+	}
+}
+
+void process_vmas_thp_default_nohuge(struct mm_struct *mm)
+{
+	if (test_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &mm->flags2))
+		return;
+
+	set_bit(MMF2_THP_VMA_DEFAULT_NOHUGE, &mm->flags2);
+	vmas_thp_default_nohuge(mm);
+}
 
 unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 					 unsigned long vm_flags,
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index f5945ebfe3f2..e03d0ed890c5 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -331,5 +331,6 @@ struct prctl_mm_map {
 #define PR_SET_THP_POLICY		78
 #define PR_GET_THP_POLICY		79
 #define PR_THP_POLICY_DEFAULT_HUGE	0
+#define PR_THP_POLICY_DEFAULT_NOHUGE	1
 
 #endif /* _LINUX_PRCTL_H */
diff --git a/tools/perf/trace/beauty/include/uapi/linux/prctl.h b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
index 325c72f40a93..d25458f4db9e 100644
--- a/tools/perf/trace/beauty/include/uapi/linux/prctl.h
+++ b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
@@ -367,5 +367,6 @@ struct prctl_mm_map {
 #define PR_SET_THP_POLICY		78
 #define PR_GET_THP_POLICY		79
 #define PR_THP_POLICY_DEFAULT_HUGE	0
+#define PR_THP_POLICY_DEFAULT_NOHUGE	1
 
 #endif /* _LINUX_PRCTL_H */
-- 
2.47.1
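
For readers following the walk in vmas_thp_default_nohuge() above, the
rule it applies to existing VMAs can be modelled in plain userspace C.
This is only a toy sketch: an array of flag words stands in for the VMA
list, and the MODEL_* bit values and the function name
model_default_nohuge() are made up for illustration rather than taken
from the kernel.

```c
#include <stddef.h>

/* Illustrative stand-ins for VM_HUGEPAGE and VM_NOHUGEPAGE. */
#define MODEL_VM_HUGEPAGE	0x1UL
#define MODEL_VM_NOHUGEPAGE	0x2UL

/*
 * Toy model of the existing-VMA pass: every entry that was not
 * explicitly marked huge gets the nohuge bit, and entries that were
 * marked huge are left untouched.
 */
static inline void model_default_nohuge(unsigned long *vm_flags, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (vm_flags[i] & MODEL_VM_HUGEPAGE)
			continue;	/* an explicit hugepage request wins */
		vm_flags[i] |= MODEL_VM_NOHUGEPAGE;
	}
}
```

In this model a region that already carries the huge bit keeps it, which
is what lets madvise(MADV_HUGEPAGE) regions continue to use THPs while
the rest of the address space defaults to VM_NOHUGEPAGE.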