From: Alexei Starovoitov
Date: Tue, 29 Apr 2025 08:19:38 -0700
Subject: Re: [RFC PATCH 3/4] mm: add BPF hook for THP adjustment
To: Yafang Shao
Cc: Andrew Morton, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, linux-mm
References: <20250429024139.34365-1-laoar.shao@gmail.com> <20250429024139.34365-4-laoar.shao@gmail.com>
In-Reply-To: <20250429024139.34365-4-laoar.shao@gmail.com>

On Mon, Apr 28, 2025 at 7:42 PM Yafang Shao wrote:
>
> We will use the @vma parameter in BPF programs to determine whether THP can
> be used. The typical workflow is as follows:
>
> 1. Retrieve the mm_struct from the given @vma.
> 2. Obtain the task_struct associated with the mm_struct
>    It depends on CONFIG_MEMCG.
> 3. Adjust THP behavior dynamically based on task attributes
>    E.g., based on the task’s cgroup
>
> Signed-off-by: Yafang Shao
> ---
>  mm/Makefile   |  3 +++
>  mm/bpf.c      | 36 ++++++++++++++++++++++++++++++++++++
>  mm/bpf.h      | 21 +++++++++++++++++++++
>  mm/internal.h |  3 +++
>  4 files changed, 63 insertions(+)
>  create mode 100644 mm/bpf.c
>  create mode 100644 mm/bpf.h
>
> diff --git a/mm/Makefile b/mm/Makefile
> index e7f6bbf8ae5f..97055da04746 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -99,6 +99,9 @@ obj-$(CONFIG_MIGRATION) += migrate.o
>  obj-$(CONFIG_NUMA) += memory-tiers.o
>  obj-$(CONFIG_DEVICE_MIGRATION) += migrate_device.o
>  obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
> +ifdef CONFIG_BPF_SYSCALL
> +obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += bpf.o
> +endif
>  obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
>  obj-$(CONFIG_MEMCG_V1) += memcontrol-v1.o
>  obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
> diff --git a/mm/bpf.c b/mm/bpf.c
> new file mode 100644
> index 000000000000..72eebcdbad56
> --- /dev/null
> +++ b/mm/bpf.c
> @@ -0,0 +1,36 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Author: Yafang Shao
> + */
> +
> +#include
> +#include
> +
> +__bpf_hook_start();
> +
> +/* Checks if this @vma can use THP. */
> +__weak noinline int
> +mm_bpf_thp_vma_allowable(struct vm_area_struct *vma)
> +{
> +	/* At present, fmod_ret exclusively uses 0 to signify that the return
> +	 * value remains unchanged.
> +	 */
> +	return 0;
> +}
> +
> +__bpf_hook_end();
> +
> +BTF_SET8_START(mm_bpf_fmod_ret_ids)
> +BTF_ID_FLAGS(func, mm_bpf_thp_vma_allowable)
> +BTF_SET8_END(mm_bpf_fmod_ret_ids)
> +
> +static const struct btf_kfunc_id_set mm_bpf_fmodret_set = {
> +	.owner = THIS_MODULE,
> +	.set   = &mm_bpf_fmod_ret_ids,
> +};
> +
> +static int __init bpf_mm_kfunc_init(void)
> +{
> +	return register_btf_fmodret_id_set(&mm_bpf_fmodret_set);
> +}
> +late_initcall(bpf_mm_kfunc_init);
> diff --git a/mm/bpf.h b/mm/bpf.h
> new file mode 100644
> index 000000000000..e03a38084b08
> --- /dev/null
> +++ b/mm/bpf.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __MM_BPF_H
> +#define __MM_BPF_H
> +
> +#define MM_BPF_ALLOWABLE	(1)
> +#define MM_BPF_NOT_ALLOWABLE	(-1)
> +
> +#define MM_BPF_ALLOWABLE_HOOK(func, args...)	{	\
> +	int ret = func(args);				\
> +							\
> +	if (ret == MM_BPF_ALLOWABLE)			\
> +		return 1;				\
> +	if (ret == MM_BPF_NOT_ALLOWABLE)		\
> +		return 0;				\
> +}
> +
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +int mm_bpf_thp_vma_allowable(struct vm_area_struct *vma);
> +#endif
> +
> +#endif
> diff --git a/mm/internal.h b/mm/internal.h
> index aa698a11dd68..c8bf405fa581 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -21,6 +21,7 @@
>
>  /* Internal core VMA manipulation functions. */
>  #include "vma.h"
> +#include "bpf.h"
>
>  struct folio_batch;
>
> @@ -1632,6 +1633,7 @@ static inline bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
>   */
>  static inline bool hugepage_global_enabled(struct vm_area_struct *vma)
>  {
> +	MM_BPF_ALLOWABLE_HOOK(mm_bpf_thp_vma_allowable, vma);
>  	return transparent_hugepage_flags &
>  			((1< (1<
> @@ -1639,6 +1641,7 @@ static inline bool hugepage_global_enabled(struct vm_area_struct *vma)
>
>  static inline bool hugepage_global_always(struct vm_area_struct *vma)
>  {
> +	MM_BPF_ALLOWABLE_HOOK(mm_bpf_thp_vma_allowable, vma);

Please define a clean struct_ops based interface and demonstrate the generality
of the api with both bpf prog and a kernel module.
Do not use fmod_ret since it's global while struct_ops can be made
scoped for use case. Ex: per cgroup.
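
To make the scoping point concrete, here is a rough userspace model of an ops table attached per cgroup, in contrast to the single global hook in the quoted patch. It is only a sketch and not kernel code: it does not use the real struct_ops machinery. Apart from MM_BPF_ALLOWABLE and MM_BPF_NOT_ALLOWABLE, which come from the quoted mm/bpf.h, every name here (vma_model, thp_policy_ops, thp_policy_attach, thp_enabled_for, MAX_CGROUPS, global_default) is hypothetical.

```c
/* Userspace model only. Hypothetical names throughout; not a kernel API. */
#include <stdbool.h>
#include <stddef.h>

/* Verdicts as defined in the quoted mm/bpf.h; 0 means "leave unchanged". */
#define MM_BPF_ALLOWABLE	(1)
#define MM_BPF_NOT_ALLOWABLE	(-1)

/* Hypothetical stand-in for vm_area_struct, reduced to what the model needs. */
struct vma_model {
	int cgroup_id;	/* hypothetical: cgroup of the task owning the mm */
};

/* Hypothetical ops table, in the spirit of a struct_ops interface. */
struct thp_policy_ops {
	int (*vma_allowable)(const struct vma_model *vma);
};

#define MAX_CGROUPS 8	/* hypothetical bound for the model */

/* One policy slot per cgroup; NULL means no policy attached. */
static const struct thp_policy_ops *cgroup_policy[MAX_CGROUPS];

/* Stand-in for the existing global THP setting. */
static bool global_default = false;

/* Attach a policy to exactly one cgroup. Returns false on a bad id. */
bool thp_policy_attach(int cgroup_id, const struct thp_policy_ops *ops)
{
	if (cgroup_id < 0 || cgroup_id >= MAX_CGROUPS)
		return false;
	cgroup_policy[cgroup_id] = ops;
	return true;
}

/*
 * Consult only the policy attached to the vma's cgroup. A verdict of 0, or
 * no attached policy, falls through to the global default, matching the
 * tri-state convention of MM_BPF_ALLOWABLE_HOOK in the quoted patch.
 */
bool thp_enabled_for(const struct vma_model *vma)
{
	const struct thp_policy_ops *ops = NULL;

	if (vma->cgroup_id >= 0 && vma->cgroup_id < MAX_CGROUPS)
		ops = cgroup_policy[vma->cgroup_id];

	if (ops && ops->vma_allowable) {
		int ret = ops->vma_allowable(vma);

		if (ret == MM_BPF_ALLOWABLE)
			return true;
		if (ret == MM_BPF_NOT_ALLOWABLE)
			return false;
	}
	return global_default;
}

/* Two trivial example policies. */
int always_allow(const struct vma_model *vma)
{
	(void)vma;
	return MM_BPF_ALLOWABLE;
}

int always_deny(const struct vma_model *vma)
{
	(void)vma;
	return MM_BPF_NOT_ALLOWABLE;
}

const struct thp_policy_ops allow_ops = { .vma_allowable = always_allow };
const struct thp_policy_ops deny_ops = { .vma_allowable = always_deny };
```

In this model, calling thp_policy_attach(1, &allow_ops) changes the answer only for vmas whose cgroup_id is 1. Every other cgroup keeps the global default. With a single global hook like the fmod_ret one, an attached program would instead see every vma on the system.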