From: Usama Arif
To: Andrew Morton, david@redhat.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com,
	ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	npache@redhat.com, ryan.roberts@arm.com,
	linux-kernel@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH 1/1] prctl: allow overriding system THP policy to always per process
Date: Wed, 7 May 2025 15:00:34 +0100
Message-ID: <20250507141132.2773275-2-usamaarif642@gmail.com>
In-Reply-To: <20250507141132.2773275-1-usamaarif642@gmail.com>
References: <20250507141132.2773275-1-usamaarif642@gmail.com>

Allowing override of global
THP policy per process allows workloads that have been shown to benefit
from hugepages to use them, without regressing workloads that wouldn't
benefit. This will allow such types of workloads to be run/stacked on
the same machine.

It also helps in rolling out hugepages in hyperscaler configurations for
workloads that benefit from them, where a single THP policy is likely to
be used across the entire fleet, and prctl will help override it.

Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h                          |  3 ++-
 include/linux/mm_types.h                         |  7 ++-----
 include/uapi/linux/prctl.h                       |  3 +++
 kernel/sys.c                                     | 16 ++++++++++++++++
 tools/include/uapi/linux/prctl.h                 |  3 +++
 .../perf/trace/beauty/include/uapi/linux/prctl.h |  3 +++
 6 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2f190c90192d..0587dc4b8e2d 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -293,7 +293,8 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 		if (vm_flags & VM_HUGEPAGE)
 			mask |= READ_ONCE(huge_anon_orders_madvise);
 		if (hugepage_global_always() ||
-		    ((vm_flags & VM_HUGEPAGE) && hugepage_global_enabled()))
+		    ((vm_flags & VM_HUGEPAGE) && hugepage_global_enabled()) ||
+		    test_bit(MMF_THP_ALWAYS, &vma->vm_mm->flags))
 			mask |= READ_ONCE(huge_anon_orders_inherit);
 
 		orders &= mask;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e76bade9ebb1..9bcd72b2b191 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1704,11 +1704,8 @@ enum {
 #define MMF_VM_MERGEABLE 16 /* KSM may merge identical pages */
 #define MMF_VM_HUGEPAGE 17 /* set when mm is available for khugepaged */
 
-/*
- * This one-shot flag is dropped due to necessity of changing exe once again
- * on NFS restore
- */
-//#define MMF_EXE_FILE_CHANGED 18 /* see prctl_set_mm_exe_file() */
+/* override inherited page sizes to always for the entire process */
+ #define MMF_THP_ALWAYS 18
 
 #define MMF_HAS_UPROBES 19 /* has uprobes */
 #define MMF_RECALC_UPROBES 20 /* MMF_HAS_UPROBES can be wrong */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 15c18ef4eb11..22c526681562 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -364,4 +364,7 @@ struct prctl_mm_map {
 # define PR_TIMER_CREATE_RESTORE_IDS_ON 1
 # define PR_TIMER_CREATE_RESTORE_IDS_GET 2
 
+#define PR_SET_THP_ALWAYS 78
+#define PR_GET_THP_ALWAYS 79
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index c434968e9f5d..ee56b059ff1f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2658,6 +2658,22 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			clear_bit(MMF_DISABLE_THP, &me->mm->flags);
 		mmap_write_unlock(me->mm);
 		break;
+	case PR_GET_THP_ALWAYS:
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = !!test_bit(MMF_THP_ALWAYS, &me->mm->flags);
+		break;
+	case PR_SET_THP_ALWAYS:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		if (mmap_write_lock_killable(me->mm))
+			return -EINTR;
+		if (arg2)
+			set_bit(MMF_THP_ALWAYS, &me->mm->flags);
+		else
+			clear_bit(MMF_THP_ALWAYS, &me->mm->flags);
+		mmap_write_unlock(me->mm);
+		break;
 	case PR_MPX_ENABLE_MANAGEMENT:
 	case PR_MPX_DISABLE_MANAGEMENT:
 		/* No longer implemented: */
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index 35791791a879..f5f6cff42b3f 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -328,4 +328,7 @@ struct prctl_mm_map {
 # define PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC 0x10 /* Clear the aspect on exec */
 # define PR_PPC_DEXCR_CTRL_MASK 0x1f
 
+#define PR_SET_THP_ALWAYS 78
+#define PR_GET_THP_ALWAYS 79
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/tools/perf/trace/beauty/include/uapi/linux/prctl.h b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
index 15c18ef4eb11..680996d56faf 100644
--- a/tools/perf/trace/beauty/include/uapi/linux/prctl.h
+++ b/tools/perf/trace/beauty/include/uapi/linux/prctl.h
@@ -364,4 +364,7 @@ struct prctl_mm_map {
 # define PR_TIMER_CREATE_RESTORE_IDS_ON 1
 # define PR_TIMER_CREATE_RESTORE_IDS_GET 2
 
+#define PR_SET_THP_ALWAYS 78
+#define PR_GET_THP_ALWAYS 79
+
 #endif /* _LINUX_PRCTL_H */
-- 
2.47.1
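
For readers who want to see the effect of the huge_mm.h hunk in
isolation, here is a minimal user-space sketch of the mask computation
for an anonymous VMA. It is a model only, not kernel code: struct
thp_model and model_anon_mask() are hypothetical names, the booleans
stand in for hugepage_global_always(), hugepage_global_enabled(),
VM_HUGEPAGE and MMF_THP_ALWAYS, and the initial mask value is assumed
from the surrounding kernel function, which the hunk itself does not
show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the sysfs THP state; not a kernel structure. */
struct thp_model {
	unsigned long orders_always;	/* models huge_anon_orders_always */
	unsigned long orders_madvise;	/* models huge_anon_orders_madvise */
	unsigned long orders_inherit;	/* models huge_anon_orders_inherit */
	bool global_always;		/* models hugepage_global_always() */
	bool global_enabled;		/* models hugepage_global_enabled() */
};

/*
 * Compute the mask of allowed orders for an anonymous VMA, following the
 * condition in the patched hunk.
 *
 * vma_hugepage  models (vm_flags & VM_HUGEPAGE)
 * mm_thp_always models test_bit(MMF_THP_ALWAYS, &vma->vm_mm->flags)
 */
unsigned long model_anon_mask(const struct thp_model *m,
			      bool vma_hugepage, bool mm_thp_always)
{
	unsigned long mask;

	assert(m != NULL);

	/* Assumed starting point: orders explicitly set to "always". */
	mask = m->orders_always;

	if (vma_hugepage)
		mask |= m->orders_madvise;

	/* The third operand is the one this patch adds. */
	if (m->global_always ||
	    (vma_hugepage && m->global_enabled) ||
	    mm_thp_always)
		mask |= m->orders_inherit;

	return mask;
}
```

In this model, with a global policy of "madvise" (global_enabled set,
global_always clear) and a region that was never madvised, the inherit
orders are only added to the mask when mm_thp_always is true, which is
the per-process override the patch describes.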