Message-ID: <96eccc48-b632-40b7-9797-1b0780ea59cd@gmail.com>
Date: Wed, 7 May 2025 16:12:04 +0100
Subject: Re: [PATCH 0/1] prctl: allow overriding system THP policy to always
From: Usama Arif
To: Zi Yan
Cc: Andrew Morton, david@redhat.com, linux-mm@kvack.org, hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, linux-kernel@vger.kernel.org, kernel-team@meta.com, Yafang Shao
References: <20250507141132.2773275-1-usamaarif642@gmail.com> <293530AA-1AB7-4FA0-AF40-3A8464DC0198@nvidia.com>
In-Reply-To: <293530AA-1AB7-4FA0-AF40-3A8464DC0198@nvidia.com>

On 07/05/2025 15:57, Zi Yan wrote:
> +Yafang, who is also looking at changing THP config at cgroup/container level.
>
> On 7 May 2025, at 10:00, Usama Arif wrote:
>
>> Allowing override of global THP policy per process allows workloads
>> that have shown to benefit from hugepages to do so, without regressing
>> workloads that wouldn't benefit. This will allow such types of
>> workloads to be run/stacked on the same machine.
>>
>> It also helps in rolling out hugepages in hyperscaler configurations
>> for workloads that benefit from them, where a single THP policy is
>> likely to be used across the entire fleet, and prctl will help override it.
>>
>> An advantage of doing it via prctl vs creating a cgroup specific
>> option (like /sys/fs/cgroup/test/memory.transparent_hugepage.enabled) is
>> that this will work even when there are no cgroups present, and my
>> understanding is there is a strong preference of cgroups controls being
>> hierarchical which usually means them having a numerical value.
>
> Hi Usama,
>
> Do you mind giving an example on how to change THP policy for a set of
> processes running in a container (under a cgroup)?

Hi Zi,

In our case, we create the processes in the cgroup via systemd. We would
enable THP=always for processes in a cgroup the same way we enable KSM
for the cgroup. The change in systemd would be very similar to the line
in [1], where we would set prctl PR_SET_THP_ALWAYS in exec-invoke.
This happens at the start of the process, but by then you would already
know whether you want THP=always for it or not.

[1] https://github.com/systemd/systemd/blob/2e72d3efafa88c1cb4d9b28dd4ade7c6ab7be29a/src/core/exec-invoke.c#L5045

Thanks,
Usama

>
> Yafang mentioned that the prctl approach would require restarting all running
> services[1] and other inflexibilities, so he proposed to use BPF to change THP
> policy[2]. I wonder if Yafang's issues also apply to your case and if you
> have a solution to them.
>
> Thanks.
>
> [1] https://lore.kernel.org/linux-mm/CALOAHbCXMi2GaZdHJaNLXxGsJf-hkDTrztsQiceaBcJ8d8p3cA@mail.gmail.com/
> [2] https://lore.kernel.org/linux-mm/20250429024139.34365-1-laoar.shao@gmail.com/
>>
>>
>> The output and code of test program is below:
>>
>> [root@vm4 vmuser]# echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
>> [root@vm4 vmuser]# echo inherit > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>> [root@vm4 vmuser]# ./a.out
>> Default THP setting:
>> THP is not set to 'always'.
>> PR_SET_THP_ALWAYS = 1
>> THP is set to 'always'.
>> PR_SET_THP_ALWAYS = 0
>> THP is not set to 'always'.
>>
>>
>> #include <stdio.h>
>> #include <string.h>
>> #include <sys/mman.h>
>> #include <sys/prctl.h>
>>
>> #define PR_SET_THP_ALWAYS 78
>> #define SIZE (12 * (2 * 1024 * 1024)) // 24 MB
>>
>> void check_smaps(void) {
>>     FILE *file = fopen("/proc/self/smaps", "r");
>>     if (!file) {
>>         perror("fopen");
>>         return;
>>     }
>>
>>     char line[256];
>>     int is_hugepage = 0;
>>     while (fgets(line, sizeof(line), file)) {
>>         // if (strstr(line, "AnonHugePages:"))
>>         //     printf("%s\n", line);
>>         if (strstr(line, "AnonHugePages:") && strstr(line, "24576 kB"))
>>         {
>>             // printf("%s\n", line);
>>             is_hugepage = 1;
>>             break;
>>         }
>>     }
>>     fclose(file);
>>     if (is_hugepage) {
>>         printf("THP is set to 'always'.\n");
>>     } else {
>>         printf("THP is not set to 'always'.\n");
>>     }
>> }
>>
>> void test_mmap_thp(void) {
>>     char *buffer = (char *)mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
>>                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>     if (buffer == MAP_FAILED) {
>>         perror("mmap");
>>         return;
>>     }
>>     // Touch the memory to ensure it's allocated
>>     memset(buffer, 0, SIZE);
>>     check_smaps();
>>     munmap(buffer, SIZE);
>> }
>>
>> int main() {
>>     printf("Default THP setting: \n");
>>     test_mmap_thp();
>>     printf("PR_SET_THP_ALWAYS = 1 \n");
>>     prctl(PR_SET_THP_ALWAYS, 1, NULL, NULL, NULL);
>>     test_mmap_thp();
>>     printf("PR_SET_THP_ALWAYS = 0 \n");
>>     prctl(PR_SET_THP_ALWAYS, 0, NULL, NULL, NULL);
>>     test_mmap_thp();
>>
>>     return 0;
>> }
>>
>>
>> Usama Arif (1):
>>   prctl: allow overriding system THP policy to always per process
>>
>>  include/linux/huge_mm.h                          |  3 ++-
>>  include/linux/mm_types.h                         |  7 ++-----
>>  include/uapi/linux/prctl.h                       |  3 +++
>>  kernel/sys.c                                     | 16 ++++++++++++++++
>>  tools/include/uapi/linux/prctl.h                 |  3 +++
>>  .../perf/trace/beauty/include/uapi/linux/prctl.h |  3 +++
>>  6 files changed, 29 insertions(+), 6 deletions(-)
>>
>> --
>> 2.47.1
>
>
> --
> Best Regards,
> Yan, Zi