From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>,
Lance Yang <lance.yang@linux.dev>,
Vlastimil Babka <vbabka@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
usamaarif642@gmail.com, kas@kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
Date: Thu, 5 Mar 2026 12:30:40 +0000 [thread overview]
Message-ID: <6eaeeace-7b0e-4465-8bb0-88cce2b13bb9@lucifer.local> (raw)
In-Reply-To: <aalXA_nP2k2OrHhc@gmail.com>
On Thu, Mar 05, 2026 at 03:48:07AM -0800, Breno Leitao wrote:
> On Wed, Mar 04, 2026 at 04:40:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> > On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> > > Writing "never" (or any other value) multiple times to
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> > > start_stop_khugepaged() each time, even when nothing actually changed.
> > > This causes set_recommended_min_free_kbytes() to run unconditionally,
> > > which is unnecessary and floods the printk buffer with "raising
> > > min_free_kbytes" messages. Example:
> > >
> > > # for i in $(seq 100); do
> > > # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> > > # done
> > >
> > > # dmesg | grep "min_free_kbytes is not updated" | wc -l
> > > 100
> > >
> > > Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> > > variants to detect whether any bit actually flipped, and skip the
> > > start_stop_khugepaged() call entirely when the configuration is
> > > unchanged.
> > >
> > > With this patch, redoing the same operation becomes a no-op.
> > >
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> >
> > General concept is sensible, but let's improve this code please.
>
> Ack! Thanks for the suggestions.
No problem, thanks for the patch! :)
>
> > > spin_unlock(&huge_anon_orders_lock);
> > > } else
> > > ret = -EINVAL;
> > >
> > > - if (ret > 0) {
> > > + if (ret > 0 && changed) {
> > > int err;
> > >
> > > err = start_stop_khugepaged();
> >
> > There's a caveat here as mentioned in reply to Kiryl - I'm concerned users
> > might rely on the set recommended min kbytes even when things don't change.
> >
> > Not sure how likely that is, but it's a user-visible change in how this behaves.
>
> Is there any motivation that users are retouching
> /sys/kernel/mm/transparent_hugepage just to trigger
> set_recommended_min_free_kbytes() ? That seems weird, but, I will keep it in
> the change.
I mean I can't really think of any, but I don't want to risk breaking (weird)
userspace.
>
> > From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
> > From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
> > Date: Wed, 4 Mar 2026 16:37:20 +0000
> > Subject: [PATCH] idea
> >
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> Thanks for the idea. Let me hack on top of it, and propose a v2.
Thanks!
>
> > ---
> > mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
> > 1 file changed, 46 insertions(+), 28 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 0df1f4a17430..97dabbeb9112 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
> > return sysfs_emit(buf, "%s\n", output);
> > }
> >
> > +enum huge_mode {
> > + HUGE_ALWAYS,
> > + HUGE_INHERIT,
> > + HUGE_MADVISE,
> > + HUGE_NUM_MODES,
> > + HUGE_NEVER,
> > +};
> > +
> > +static bool change_anon_orders(int order, enum huge_mode mode)
> > +{
> > + static unsigned long *orders[] = {
> > + &huge_anon_orders_always,
> > + &huge_anon_orders_inherit,
> > + &huge_anon_orders_madvise,
> > + };
> > + bool changed = false;
> > + int i;
> > +
> > + spin_lock(&huge_anon_orders_lock);
> > + for (i = 0; i < HUGE_NUM_MODES; i++) {
>
> > + if (i == mode)
> > + changed |= !test_and_set_bit(order, orders[mode]);
> > + else
> > + changed |= test_and_clear_bit(order, orders[mode]);
>
> I suppose we want s/mode/i in the test_and_{clear,set}_bit() here:
>
> if (i == mode)
> // set for mode
> changed |= !test_and_set_bit(order, orders[i]);
> else
> // clear for !mode
> changed |= test_and_clear_bit(order, orders[i]);
>
> For two reasons:
> * you want to unset "i" when i != mode.
> * you would have an OOB when accessing orders[HUGE_NEVER == 4]
>
>
> > static ssize_t anon_enabled_store(struct kobject *kobj,
> > struct kobj_attribute *attr,
> > const char *buf, size_t count)
> > {
> > int order = to_thpsize(kobj)->order;
> > ssize_t ret = count;
> > + bool changed;
> > +
> > + if (sysfs_streq(buf, "always"))
> > + changed = change_anon_orders(order, HUGE_ALWAYS);
> > + else if (sysfs_streq(buf, "inherit"))
> > + changed = change_anon_orders(order, HUGE_INHERIT);
> > + else if (sysfs_streq(buf, "madvise"))
> > + changed = change_anon_orders(order, HUGE_MADVISE);
> > + else if (sysfs_streq(buf, "never"))
> > + changed = change_anon_orders(order, HUGE_NEVER);
> > + else
> > + return -EINVAL;
>
> I think we can simplify anon_enabled_store() even more, by leveraging sysfs_match_string().
> Something like:
>
> static const char *const anon_mode_strings[] = {
> [HUGE_ALWAYS] = "always",
> [HUGE_INHERIT] = "inherit",
> [HUGE_MADVISE] = "madvise",
> [HUGE_NEVER] = "never",
> NULL,
> };
>
> and then
>
> static ssize_t anon_enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> int mode;
>
> mode = sysfs_match_string(enabled_mode_strings, buf);
> if (mode < 0)
> return mode;
Nice!
>
> if (change_anon_orders(order, mode)) {
> int err = start_stop_khugepaged();
>
> if (err)
> return err;
> } else {
> /* Users expect this even if unchanged. TODO: Put in header... */
> //set_recommended_min_free_kbytes();
> }
> return count;
> }
>
>
>
> Anyway, I like this approach, thanks!. Let me hack a v2 based on it.
Great, thanks!
Note that my code seemed to introduce a splat, so it's buggy, make sure to check
it carefully (that 'untested' proviso was apposite, it turns out! :)
>
> --breno
Cheers, Lorenzo
next prev parent reply other threads:[~2026-03-05 12:30 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-05 11:48 ` Breno Leitao
2026-03-05 12:30 ` Lorenzo Stoakes (Oracle) [this message]
2026-03-05 12:44 ` David Hildenbrand (Arm)
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:41 ` David Hildenbrand (Arm)
2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:46 ` Lorenzo Stoakes (Oracle)
2026-03-05 13:14 ` David Hildenbrand (Arm)
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6eaeeace-7b0e-4465-8bb0-88cce2b13bb9@lucifer.local \
--to=ljs@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=kas@kernel.org \
--cc=kernel-team@meta.com \
--cc=lance.yang@linux.dev \
--cc=leitao@debian.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=ryan.roberts@arm.com \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=vbabka@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox