* [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
@ 2026-03-04 10:22 Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
changed. start_stop_khugepaged() then spams the printk ring buffer with
the exact same message, even when nothing changes.
For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:
# sysctl -w vm.min_free_kbytes=112382
# for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
and you have 100 WARN messages like the following, which is pretty dull:
khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
A similar message shows up when setting thp to "always":
# for i in $(seq 100); do
# echo 1024 > /proc/sys/vm/min_free_kbytes
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
# done
And then, we have 100K messages like:
khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
This is more common when a configuration management system writes the
THP configuration without an extra read, assuming that nothing will
happen if the configuration has not changed; instead, these annoying
messages get printed.
For instance, at Meta's fleet, ~10K servers were producing 3.5M of
these messages per day.
Fix this by doing two things:
* Make the sysfs _store helpers a no-op if there is no state change
* Ratelimit these messages to avoid SPAM on config change
---
Breno Leitao (2):
mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
mm/huge_memory.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)
---
base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
change-id: 20260303-thp_logs-059d6b80f6d6
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
@ 2026-03-04 10:22 ` Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Writing "never" (or any other value) multiple times to
/sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
start_stop_khugepaged() each time, even when nothing actually changed.
This causes set_recommended_min_free_kbytes() to run unconditionally,
which is unnecessary and floods the printk buffer with "raising
min_free_kbytes" messages. Example:
# for i in $(seq 100); do
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# done
# dmesg | grep "min_free_kbytes is not updated" | wc -l
100
Use test_and_set_bit()/test_and_clear_bit() instead of the plain
variants to detect whether any bit actually flipped, and skip the
start_stop_khugepaged() call entirely when the configuration is
unchanged.
With this patch, redoing the same operation becomes a no-op.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..9abfb115e9329 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -520,36 +520,37 @@ static ssize_t anon_enabled_store(struct kobject *kobj,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
+ bool changed = false;
ssize_t ret = count;
if (sysfs_streq(buf, "always")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_always);
+ changed = test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_always);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "inherit")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_inherit);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_inherit);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "madvise")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- set_bit(order, &huge_anon_orders_madvise);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_madvise);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "never")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
spin_unlock(&huge_anon_orders_lock);
} else
ret = -EINVAL;
- if (ret > 0) {
+ if (ret > 0 && changed) {
int err;
err = start_stop_khugepaged();
--
2.47.3
* [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
@ 2026-03-04 10:22 ` Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 1 reply; 9+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Avoid calling start_stop_khugepaged() at the top-level enabled_store()
for /sys/kernel/mm/transparent_hugepage/enabled. Use test_and_set_bit()
and test_and_clear_bit() to detect whether the configuration
actually changed before calling start_stop_khugepaged().
This avoids calling start_stop_khugepaged() unnecessarily.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9abfb115e9329..b6ed44b6e8c02 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -320,21 +320,28 @@ static ssize_t enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
+ bool changed = false;
ssize_t ret = count;
if (sysfs_streq(buf, "always")) {
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
+ changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
} else if (sysfs_streq(buf, "madvise")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
+ changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
} else if (sysfs_streq(buf, "never")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
+ changed |= test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
} else
ret = -EINVAL;
- if (ret > 0) {
+ if (ret > 0 && changed) {
int err = start_stop_khugepaged();
if (err)
ret = err;
--
2.47.3
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
@ 2026-03-04 11:18 ` Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 2 replies; 9+ messages in thread
From: Kiryl Shutsemau @ 2026-03-04 11:18 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kernel-team
On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> Breno Leitao (2):
> mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
I think the same can be achieved cleaner from within start_stop_khugepaged().
Completely untested patch is below.
One noticeable difference is that with the patch we don't kick
khugepaged_wait if khugepaged_thread is already there. But I don't think
it should make a difference.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1dd3cfca610d..80f818d3a094 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2683,18 +2683,18 @@ static void set_recommended_min_free_kbytes(void)
int start_stop_khugepaged(void)
{
- int err = 0;
+ guard(mutex)(&khugepaged_mutex);
+
+ /* Check if anything has to be done */
+ if (hugepage_pmd_enabled() == !!khugepaged_thread)
+ return 0;
- mutex_lock(&khugepaged_mutex);
if (hugepage_pmd_enabled()) {
- if (!khugepaged_thread)
- khugepaged_thread = kthread_run(khugepaged, NULL,
- "khugepaged");
+ khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged");
if (IS_ERR(khugepaged_thread)) {
pr_err("khugepaged: kthread_run(khugepaged) failed\n");
- err = PTR_ERR(khugepaged_thread);
khugepaged_thread = NULL;
- goto fail;
+ return PTR_ERR(khugepaged_thread);
}
if (!list_empty(&khugepaged_scan.mm_head))
@@ -2703,10 +2703,9 @@ int start_stop_khugepaged(void)
kthread_stop(khugepaged_thread);
khugepaged_thread = NULL;
}
+
set_recommended_min_free_kbytes();
-fail:
- mutex_unlock(&khugepaged_mutex);
- return err;
+ return 0;
}
void khugepaged_min_free_kbytes_update(void)
--
Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
@ 2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
1 sibling, 0 replies; 9+ messages in thread
From: Breno Leitao @ 2026-03-04 11:53 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kernel-team
Hello Kiryl,
On Wed, Mar 04, 2026 at 11:18:37AM +0000, Kiryl Shutsemau wrote:
> On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> > Breno Leitao (2):
> > mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> > mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> I think the same can be achieved cleaner from within start_stop_khugepaged().
Thanks for the review.
I considered this approach as well. From my limited experience, the preference
is to return early in the _store() callbacks when the operation is a no-op,
rather than pushing that logic deeper into nested code.
Handling it at the _store() level results in more changes overall, but it is
arguably more straightforward to reason about (!?).
That said, I am no MM subsystem expert, and I am happy to go either way.
--breno
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
` (2 preceding siblings ...)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
@ 2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 0 replies; 9+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:17 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
> start_stop_khugepaged() to be called regardless of whether anything
> changed. start_stop_khugepaged() then spams the printk ring buffer with
> the exact same message, even when nothing changes.
>
> For instance, if you have a custom vm.min_free_kbytes, just touching
> /sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
> Example:
>
> # sysctl -w vm.min_free_kbytes=112382
> # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
>
> and you have 100 WARN messages like the following, which is pretty dull:
>
> khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
>
> A similar message shows up when setting thp to "always":
>
> # for i in $(seq 100); do
> # echo 1024 > /proc/sys/vm/min_free_kbytes
> # echo always > /sys/kernel/mm/transparent_hugepage/enabled
> # done
>
> And then, we have 100K messages like:
You mean 100? :)
It's 100 locally here.
>
> khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
>
> This is more common when a configuration management system writes the
> THP configuration without an extra read, assuming that nothing will
> happen if the configuration has not changed; instead, these annoying
> messages get printed.
Right.
>
> For instance, at Meta's fleet, ~10K servers were producing 3.5M of
> these messages per day.
Yeah... that's not so fun :)
>
> Fix this by doing two things:
> * Make the sysfs _store helpers a no-op if there is no state change
> * Ratelimit these messages to avoid SPAM on config change
Thanks for taking a look at this!
>
> ---
> Breno Leitao (2):
> mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> mm/huge_memory.c | 48 ++++++++++++++++++++++++++++--------------------
> 1 file changed, 28 insertions(+), 20 deletions(-)
> ---
> base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
> change-id: 20260303-thp_logs-059d6b80f6d6
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
Cheers, Lorenzo
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
@ 2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
1 sibling, 0 replies; 9+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:24 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Breno Leitao, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
Johannes Weiner, linux-mm, linux-kernel, usamaarif642,
kernel-team
On Wed, Mar 04, 2026 at 11:18:37AM +0000, Kiryl Shutsemau wrote:
> On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> > Breno Leitao (2):
> > mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> > mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> I think the same can be achieved cleaner from within start_stop_khugepaged().
I'm kind of with you in doing it here but...
>
> Completely untested patch is below.
>
> One noticeable difference is that with the patch we don't kick
> khugepaged_wait if khugepaged_thread is already there. But I don't think
> it should make a difference.
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1dd3cfca610d..80f818d3a094 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2683,18 +2683,18 @@ static void set_recommended_min_free_kbytes(void)
>
> int start_stop_khugepaged(void)
> {
> - int err = 0;
> + guard(mutex)(&khugepaged_mutex);
This is nice :) we need more guard usage in mm.
> +
> + /* Check if anything has to be done */
> + if (hugepage_pmd_enabled() == !!khugepaged_thread)
This is horrible, better to break out the latter expression as a small static
function like:
static bool is_khugepaged_thread_running(void)
{
if (khugepaged_thread)
return true;
return false;
}
> + return 0;
>
> - mutex_lock(&khugepaged_mutex);
> if (hugepage_pmd_enabled()) {
Can we race on this function call, meaning the check above might differ from the one here?
Better to have
bool enabled = hugepage_pmd_enabled();
etc. etc.
> - if (!khugepaged_thread)
> - khugepaged_thread = kthread_run(khugepaged, NULL,
> - "khugepaged");
> + khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged");
> if (IS_ERR(khugepaged_thread)) {
> pr_err("khugepaged: kthread_run(khugepaged) failed\n");
> - err = PTR_ERR(khugepaged_thread);
> khugepaged_thread = NULL;
> - goto fail;
> + return PTR_ERR(khugepaged_thread);
> }
>
> if (!list_empty(&khugepaged_scan.mm_head))
> @@ -2703,10 +2703,9 @@ int start_stop_khugepaged(void)
> kthread_stop(khugepaged_thread);
> khugepaged_thread = NULL;
> }
> +
> set_recommended_min_free_kbytes();
We are setting recommended min free kbytes each time somebody touches
these flags; maybe somebody somewhere relies on that, and this change
would break them?
So maybe we just want to do this each time anyway, but suppress/rate-limit
the output?
> -fail:
> - mutex_unlock(&khugepaged_mutex);
> - return err;
> + return 0;
> }
>
> void khugepaged_min_free_kbytes_update(void)
> --
> Kiryl Shutsemau / Kirill A. Shutemov
Thanks, Lorenzo
* Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
@ 2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 9+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:40 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> Writing "never" (or any other value) multiple times to
> /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> start_stop_khugepaged() each time, even when nothing actually changed.
> This causes set_recommended_min_free_kbytes() to run unconditionally,
> which is unnecessary and floods the printk buffer with "raising
> min_free_kbytes" messages. Example:
>
> # for i in $(seq 100); do
> # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> # done
>
> # dmesg | grep "min_free_kbytes is not updated" | wc -l
> 100
>
> Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> variants to detect whether any bit actually flipped, and skip the
> start_stop_khugepaged() call entirely when the configuration is
> unchanged.
>
> With this patch, redoing the same operation becomes a no-op.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
General concept is sensible, but let's improve this code please.
> ---
> mm/huge_memory.c | 27 ++++++++++++++-------------
> 1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..9abfb115e9329 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -520,36 +520,37 @@ static ssize_t anon_enabled_store(struct kobject *kobj,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> + bool changed = false;
> ssize_t ret = count;
>
> if (sysfs_streq(buf, "always")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_always);
> + changed = test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_always);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "inherit")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_inherit);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_inherit);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "madvise")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - set_bit(order, &huge_anon_orders_madvise);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_madvise);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "never")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
This is badly implemented already (sigh), so it is a little tricky to
decide how to abstract it.
Yes, the existing logic is duplicated, but that doesn't mean we have to
keep doing so :)
To put money where my mouth is attached a (totally untested, in line with
Kiryl's :P) patch to give a sense of how one might achieve this.
As to this vs. Kiryl's... I mean it might be nice to fix this crap up here to be
honest.
Maybe David can have deciding vote ;)
But see below for a caveat...
> spin_unlock(&huge_anon_orders_lock);
> } else
> ret = -EINVAL;
>
> - if (ret > 0) {
> + if (ret > 0 && changed) {
> int err;
>
> err = start_stop_khugepaged();
There's a caveat here as mentioned in reply to Kiryl - I'm concerned users
might rely on set_recommended_min_free_kbytes() being called even when
nothing changes.
Not sure how likely that is, but it's a user-visible change in how this behaves.
Cheers, Lorenzo
>
> --
> 2.47.3
>
----8<----
From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Date: Wed, 4 Mar 2026 16:37:20 +0000
Subject: [PATCH] idea
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
1 file changed, 46 insertions(+), 28 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0df1f4a17430..97dabbeb9112 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
return sysfs_emit(buf, "%s\n", output);
}
+enum huge_mode {
+ HUGE_ALWAYS,
+ HUGE_INHERIT,
+ HUGE_MADVISE,
+ HUGE_NUM_MODES,
+ HUGE_NEVER,
+};
+
+static bool change_anon_orders(int order, enum huge_mode mode)
+{
+ static unsigned long *orders[] = {
+ &huge_anon_orders_always,
+ &huge_anon_orders_inherit,
+ &huge_anon_orders_madvise,
+ };
+ bool changed = false;
+ int i;
+
+ spin_lock(&huge_anon_orders_lock);
+ for (i = 0; i < HUGE_NUM_MODES; i++) {
+ if (i == mode)
+ changed |= !test_and_set_bit(order, orders[mode]);
+ else
+ changed |= test_and_clear_bit(order, orders[i]);
+ }
+ spin_unlock(&huge_anon_orders_lock);
+
+ return changed;
+}
+
static ssize_t anon_enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
ssize_t ret = count;
+ bool changed;
+
+ if (sysfs_streq(buf, "always"))
+ changed = change_anon_orders(order, HUGE_ALWAYS);
+ else if (sysfs_streq(buf, "inherit"))
+ changed = change_anon_orders(order, HUGE_INHERIT);
+ else if (sysfs_streq(buf, "madvise"))
+ changed = change_anon_orders(order, HUGE_MADVISE);
+ else if (sysfs_streq(buf, "never"))
+ changed = change_anon_orders(order, HUGE_NEVER);
+ else
+ return -EINVAL;
- if (sysfs_streq(buf, "always")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_always);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "inherit")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_inherit);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "madvise")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- set_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "never")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else
- ret = -EINVAL;
-
- if (ret > 0) {
+ if (changed) {
int err;
err = start_stop_khugepaged();
if (err)
ret = err;
+ } else {
+ /* Users expect this even if unchanged. TODO: Put in header... */
+ //set_recommended_min_free_kbytes();
}
return ret;
}
--
2.53.0
* Re: [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
@ 2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 9+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:40 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:34AM -0800, Breno Leitao wrote:
> Avoid calling start_stop_khugepaged() at the top-level enabled_store()
> for /sys/kernel/mm/transparent_hugepage/enabled. Use test_and_set_bit()
> and test_and_clear_bit() to detect whether the configuration
> actually changed before calling start_stop_khugepaged().
>
> This avoids calling start_stop_khugepaged() unnecessarily.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
Similar comments to 1/2 ofc.
> ---
> mm/huge_memory.c | 21 ++++++++++++++-------
> 1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9abfb115e9329..b6ed44b6e8c02 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -320,21 +320,28 @@ static ssize_t enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> + bool changed = false;
> ssize_t ret = count;
>
> if (sysfs_streq(buf, "always")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> + changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> } else if (sysfs_streq(buf, "madvise")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> + changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> } else if (sysfs_streq(buf, "never")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> + changed |= test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> } else
> ret = -EINVAL;
>
> - if (ret > 0) {
> + if (ret > 0 && changed) {
> int err = start_stop_khugepaged();
> if (err)
> ret = err;
>
> --
> 2.47.3
>