From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
David Hildenbrand <david@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Zi Yan <ziy@nvidia.com>,
Liam Howlett <Liam.Howlett@oracle.com>,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
Johannes Weiner <hannes@cmpxchg.org>,
usamaarif642@gmail.com, gutierrez.asier@huawei-partners.com,
Matthew Wilcox <willy@infradead.org>,
Amery Hung <ameryhung@gmail.com>,
David Rientjes <rientjes@google.com>,
Jonathan Corbet <corbet@lwn.net>, Barry Song <21cnbao@gmail.com>,
Shakeel Butt <shakeel.butt@linux.dev>, Tejun Heo <tj@kernel.org>,
lance.yang@linux.dev, Randy Dunlap <rdunlap@infradead.org>,
Chris Mason <clm@meta.com>, bpf <bpf@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH v12 mm-new 06/10] mm: bpf-thp: add support for global mode
Date: Wed, 29 Oct 2025 17:57:09 -0700
Message-ID: <CAADnVQK9kp_5zh0gYvXdJ=3MSuXTbmZT+cah5uhZiGk5qYfckw@mail.gmail.com>
In-Reply-To: <CALOAHbD+9gxukoZ3OQvH2fNH2Ff+an+Dx-fzx_+mhb=8fZZ+sw@mail.gmail.com>

On Tue, Oct 28, 2025 at 7:14 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Wed, Oct 29, 2025 at 9:33 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Oct 26, 2025 at 3:03 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > The per-process BPF-THP mode is unsuitable for managing shared resources
> > > such as shmem THP and file-backed THP. This aligns with known cgroup
> > > limitations for similar scenarios [0].
> > >
> > > Introduce a global BPF-THP mode to address this gap. When registered:
> > > - All existing per-process instances are disabled
> > > - New per-process registrations are blocked
> > > - Existing per-process instances remain registered (no forced unregistration)
> > >
> > > The global mode takes precedence over per-process instances. Updates are
> > > type-isolated: global instances can only be updated by new global
> > > instances, and per-process instances by new per-process instances.
> >
> > ...
> >
> > > >         spin_lock(&thp_ops_lock);
> > > > -       /* Each process is exclusively managed by a single BPF-THP. */
> > > > -       if (rcu_access_pointer(mm->bpf_mm.bpf_thp)) {
> > > > +       /* Each process is exclusively managed by a single BPF-THP.
> > > > +        * Global mode disables per-process instances.
> > > > +        */
> > > > +       if (rcu_access_pointer(mm->bpf_mm.bpf_thp) || rcu_access_pointer(bpf_thp_global)) {
> > > >                 err = -EBUSY;
> > > >                 goto out;
> > > >         }
> >
> > You didn't address the issue and instead doubled down
> > on this broken global approach.
> >
> > This bait-and-switch patchset is frankly disingenuous.
> > 'lets code up some per-mm hack, since people will hate it anyway,
> > and I'm not going to use it either, and add this global mode
> > as a fake "fallback"...'
> >
> > The way the previous thread evolved and this followup hack
> > I don't see a genuine desire to find a solution.
> > Just relentless push for global mode.
> >
> > Nacked-by: Alexei Starovoitov <ast@kernel.org>
> >
> > Please carry it in all future patches.
>
> To move forward, I'm happy to set the global mode aside for now and
> potentially drop it in the next version. I'd really like to hear your
> perspective on the per-process mode. Does this implementation meet
> your needs?

Attaching st_ops to task_struct or to mm_struct is a can of worms.
With cgroup-bpf we went through painful bugs with the lifetime
of cgroup vs bpf, dying cgroups, wq deadlocks, etc. All these
problems are behind us. With st_ops in mm_struct it will be more
painful. I'd rather not go that route.

I'd rather revisit cgroup instead, since you were way too quick
to accept the pushback because all you wanted was global mode.

The main reason for the pushback was:
"
Cgroup was designed for resource management not for grouping processes and
tune those processes
"
which was true when cgroup-v2 was designed, but that ship sailed
years ago when we introduced cgroup-bpf.

None of those progs do resource management, and lots of infrastructure,
container management, and open source projects use cgroup-bpf
as a way to group processes. bpf progs attached to a cgroup/hook tuple
only care about processes within that cgroup. No resource management.
See __cgroup_bpf_check_dev_permission() or __cgroup_bpf_run_filter_sysctl()
and others.

The path is current->cgroup->bpf_progs, and the progs do exactly
what cgroup wasn't designed to do: they tune a set of processes.
You should do the same.
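
To make the "current->cgroup->bpf_progs" idea concrete, here is a minimal
userspace sketch of a policy scoped to a group of processes. It is a toy
model under stated assumptions, not kernel code: every name in it
(toy_cgroup, toy_task, toy_thp_orders, the two policies, TOY_PMD_ORDER)
is hypothetical, the policy is a plain function pointer standing in for an
attached bpf prog, and the walk up the ancestors at lookup time is a
simplification that does not claim to mirror how the kernel stores
attached programs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model only. All names below are hypothetical. */

#define TOY_PMD_ORDER 9 /* assumption: 2M PMD with 4K base pages */

/* Stand-in for an attached prog: takes the offered order bitmask,
 * returns the subset of orders it allows. */
typedef unsigned long (*toy_thp_fn)(unsigned long orders);

struct toy_cgroup {
	struct toy_cgroup *parent; /* NULL for the root */
	toy_thp_fn thp_prog;       /* NULL when nothing is attached here */
};

struct toy_task {
	struct toy_cgroup *cgroup; /* the "current->cgroup" step */
};

/* Nearest policy: this cgroup first, then its ancestors. */
static toy_thp_fn toy_effective_prog(const struct toy_cgroup *cg)
{
	for (; cg; cg = cg->parent)
		if (cg->thp_prog)
			return cg->thp_prog;
	return NULL;
}

/* Orders allowed for a task: the policy's answer, clamped to what was
 * offered. With no policy anywhere up the tree, the offer is unchanged. */
static unsigned long toy_thp_orders(const struct toy_task *task,
				    unsigned long orders)
{
	toy_thp_fn prog;

	assert(task != NULL && task->cgroup != NULL);
	prog = toy_effective_prog(task->cgroup);
	return prog ? (prog(orders) & orders) : orders;
}

/* Example policy: never use THP for this group. */
static unsigned long toy_policy_none(unsigned long orders)
{
	(void)orders;
	return 0;
}

/* Example policy: allow only PMD-sized THP for this group. */
static unsigned long toy_policy_pmd_only(unsigned long orders)
{
	return orders & (1UL << TOY_PMD_ORDER);
}
```

In this model a task placed in a child of a cgroup that has
toy_policy_pmd_only attached gets only the PMD order, a task in a sibling
cgroup with toy_policy_none gets no THP orders, and a task in a cgroup
with no policy on its path keeps whatever was offered. The policy only
ever sees the processes in its own subtree, which is the grouping
behaviour described above.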

Also, I really don't see a compelling use case for bpf in THP.
Your selftest is beyond primitive:

+int pmd_order;
+
+SEC("struct_ops/thp_get_order")
+int BPF_PROG(thp_not_eligible, struct vm_area_struct *vma, enum tva_type type,
+            unsigned long orders)
+{
+       /* THPeligible in /proc/pid/smaps is 0 */
+       if (type == TVA_SMAPS)
+               return 0;
+       return pmd_order;
+}

Hard-code this thing. Don't bother with bpf.

Thread overview: 29+ messages
2025-10-26 10:01 [PATCH v12 mm-new 00/10] mm, bpf: BPF-MM, BPF-THP Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 01/10] mm: thp: remove vm_flags parameter from khugepaged_enter_vma() Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 02/10] mm: thp: remove vm_flags parameter from thp_vma_allowable_order() Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 03/10] mm: thp: add support for BPF based THP order selection Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 04/10] mm: thp: decouple THP allocation between swap and page fault paths Yafang Shao
2025-10-27 4:07 ` Barry Song
2025-10-26 10:01 ` [PATCH v12 mm-new 05/10] mm: thp: enable THP allocation exclusively through khugepaged Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 06/10] mm: bpf-thp: add support for global mode Yafang Shao
2025-10-29 1:32 ` Alexei Starovoitov
2025-10-29 2:13 ` Yafang Shao
2025-10-30 0:57 ` Alexei Starovoitov [this message]
2025-10-30 2:40 ` Yafang Shao
2025-11-27 11:48 ` David Hildenbrand (Red Hat)
2025-11-28 2:53 ` Yafang Shao
2025-11-28 7:57 ` Lorenzo Stoakes
2025-11-28 8:18 ` Yafang Shao
2025-11-28 8:31 ` Lorenzo Stoakes
2025-11-28 11:56 ` Yafang Shao
2025-11-28 12:18 ` Lorenzo Stoakes
2025-11-28 12:51 ` Yafang Shao
2025-11-28 8:39 ` David Hildenbrand (Red Hat)
2025-11-28 8:55 ` Lorenzo Stoakes
2025-11-30 13:06 ` Yafang Shao
2025-11-26 15:13 ` Rik van Riel
2025-11-27 2:35 ` Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 07/10] Documentation: add BPF THP Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 08/10] selftests/bpf: add a simple BPF based THP policy Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 09/10] selftests/bpf: add test case to update " Yafang Shao
2025-10-26 10:01 ` [PATCH v12 mm-new 10/10] selftests/bpf: add test case for BPF-THP inheritance across fork Yafang Shao