From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Zi Yan <ziy@nvidia.com>,
Liam Howlett <Liam.Howlett@oracle.com>,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
Johannes Weiner <hannes@cmpxchg.org>,
usamaarif642@gmail.com, gutierrez.asier@huawei-partners.com,
Matthew Wilcox <willy@infradead.org>,
Amery Hung <ameryhung@gmail.com>,
David Rientjes <rientjes@google.com>,
Jonathan Corbet <corbet@lwn.net>, Barry Song <21cnbao@gmail.com>,
Shakeel Butt <shakeel.butt@linux.dev>, Tejun Heo <tj@kernel.org>,
lance.yang@linux.dev, Randy Dunlap <rdunlap@infradead.org>,
Chris Mason <clm@meta.com>, bpf <bpf@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH v12 mm-new 06/10] mm: bpf-thp: add support for global mode
Date: Thu, 27 Nov 2025 12:48:33 +0100 [thread overview]
Message-ID: <9f73a5bd-32a0-4d5f-8a3f-7bff8232e408@kernel.org> (raw)
In-Reply-To: <CAADnVQK9kp_5zh0gYvXdJ=3MSuXTbmZT+cah5uhZiGk5qYfckw@mail.gmail.com>
>> To move forward, I'm happy to set the global mode aside for now and
>> potentially drop it in the next version. I'd really like to hear your
>> perspective on the per-process mode. Does this implementation meet
>> your needs?
Unfortunately I haven't had the capacity to follow the evolution of this
patch set, so I'll just comment on some points from my perspective.
First, I agree that the global mode is not what we want, not even as a
fallback.
>
> Attaching st_ops to task_struct or to mm_struct is a can of worms.
> With cgroup-bpf we went through painful bugs with lifetime
> of cgroup vs bpf, dying cgroups, wq deadlock, etc. All these
> problems are behind us. With st_ops in mm_struct it will be more
> painful. I'd rather not go that route.
That's valuable information, thanks. I would have hoped that per-MM
policies would be easier.
Are there some pointers to explore regarding the "can of worms" you
mention when it comes to per-MM policies?
>
> And revisit cgroup instead, since you were way too quick
> to accept the pushback because all you wanted is global mode.
>
> The main reason for pushback was:
> "
> Cgroup was designed for resource management not for grouping processes and
> tune those processes
> "
>
> which was true when cgroup-v2 was designed, but that ship sailed
> years ago when we introduced cgroup-bpf.
Also valuable information.
Personally I don't have a preference regarding per-MM or per-cgroup;
whatever we can get working reliably. It sounds like cgroup-bpf has
sorted out most of the mess.
memcg/cgroup maintainers might disagree, but it's probably worth having
that discussion once again.
> None of the progs are doing resource management and lots of infrastructure,
> container management, and open source projects use cgroup-bpf
> as a grouping of processes. bpf progs attached to cgroup/hook tuple
> only care about processes within that cgroup. No resource management.
> See __cgroup_bpf_check_dev_permission or __cgroup_bpf_run_filter_sysctl
> and others.
> The path is current->cgroup->bpf_progs and progs do exactly
> what cgroup wasn't designed to do. They tune a set of processes.
>
> You should do the same.
>
> Also I really don't see a compelling use case for bpf in THP.
There is a lot more potential there to write fine-tuned policies that
take VMA information into account.
The tests likely reflect what Yafang seems to focus on: IIUC primarily
enabling+disabling traditional THPs (e.g., 2M) on a per-process basis.
Some of what Yafang wants to achieve might, at this point, already be
achievable through the prctl(PR_SET_THP_DISABLE) support, including the
extensions we recently added [1].
Systemd support still seems to be in the works [2] for some of that.
[1] https://lwn.net/Articles/1032014/
[2] https://github.com/systemd/systemd/pull/39085
--
Cheers
David