* [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
@ 2026-03-04 10:22 Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
changed. start_stop_khugepaged() then spams the printk ring buffer with
the exact same message, even when nothing changes.
For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:
# sysctl -w vm.min_free_kbytes=112382
# for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
and you have 100 WARN messages like the following, which is pretty dull:
khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
A similar message shows up when setting thp to "always":
# for i in $(seq 100); do
# echo 1024 > /proc/sys/vm/min_free_kbytes
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
# done
And then, we have 100K messages like:
khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
This is more common when you have a configuration management system that
writes the THP configuration without an extra read, assuming that
nothing will happen if there is no change in the configuration; instead,
these annoying messages get printed on every write.
For instance, at Meta's fleet, ~10K servers were producing 3.5M of
these messages per day.
Fix this by doing two things:
* Make the sysfs _store helpers a no-op if there is no state change
* Rate-limit these messages to avoid spamming on config changes
---
Breno Leitao (2):
mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
mm/huge_memory.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)
---
base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
change-id: 20260303-thp_logs-059d6b80f6d6
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
@ 2026-03-04 10:22 ` Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Writing "never" (or any other value) multiple times to
/sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
start_stop_khugepaged() each time, even when nothing actually changed.
This causes set_recommended_min_free_kbytes() to run unconditionally,
which is unnecessary and floods the printk buffer with "raising
min_free_kbytes" messages. Example:
# for i in $(seq 100); do
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# done
# dmesg | grep "min_free_kbytes is not updated" | wc -l
100
Use test_and_set_bit()/test_and_clear_bit() instead of the plain
variants to detect whether any bit actually flipped, and skip the
start_stop_khugepaged() call entirely when the configuration is
unchanged.
With this patch, redoing the same operation becomes a no-op.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..9abfb115e9329 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -520,36 +520,37 @@ static ssize_t anon_enabled_store(struct kobject *kobj,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
+ bool changed = false;
ssize_t ret = count;
if (sysfs_streq(buf, "always")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_always);
+ changed = test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_always);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "inherit")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_inherit);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_inherit);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "madvise")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- set_bit(order, &huge_anon_orders_madvise);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= !test_and_set_bit(order, &huge_anon_orders_madvise);
spin_unlock(&huge_anon_orders_lock);
} else if (sysfs_streq(buf, "never")) {
spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
+ changed = test_and_clear_bit(order, &huge_anon_orders_always);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
+ changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
spin_unlock(&huge_anon_orders_lock);
} else
ret = -EINVAL;
- if (ret > 0) {
+ if (ret > 0 && changed) {
int err;
err = start_stop_khugepaged();
--
2.47.3
* [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
@ 2026-03-04 10:22 ` Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 1 reply; 16+ messages in thread
From: Breno Leitao @ 2026-03-04 10:22 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Avoid calling start_stop_khugepaged() at the top-level enabled_store()
for /sys/kernel/mm/transparent_hugepage/enabled. Use test_and_set_bit()
and test_and_clear_bit() to detect whether the configuration
actually changed before calling start_stop_khugepaged().
This avoids calling start_stop_khugepaged() unnecessarily.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9abfb115e9329..b6ed44b6e8c02 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -320,21 +320,28 @@ static ssize_t enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
+ bool changed = false;
ssize_t ret = count;
if (sysfs_streq(buf, "always")) {
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
+ changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
} else if (sysfs_streq(buf, "madvise")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
+ changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
} else if (sysfs_streq(buf, "never")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+ changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
+ &transparent_hugepage_flags);
+ changed |= test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ &transparent_hugepage_flags);
} else
ret = -EINVAL;
- if (ret > 0) {
+ if (ret > 0 && changed) {
int err = start_stop_khugepaged();
if (err)
ret = err;
--
2.47.3
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
@ 2026-03-04 11:18 ` Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 2 replies; 16+ messages in thread
From: Kiryl Shutsemau @ 2026-03-04 11:18 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kernel-team
On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> Breno Leitao (2):
> mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
I think the same can be achieved more cleanly from within start_stop_khugepaged().
A completely untested patch is below.
One noticeable difference is that with this patch we don't kick
khugepaged_wait if khugepaged_thread is already running. But I don't
think it should make a difference.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1dd3cfca610d..80f818d3a094 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2683,18 +2683,18 @@ static void set_recommended_min_free_kbytes(void)
int start_stop_khugepaged(void)
{
- int err = 0;
+ guard(mutex)(&khugepaged_mutex);
+
+ /* Check if anything has to be done */
+ if (hugepage_pmd_enabled() == !!khugepaged_thread)
+ return 0;
- mutex_lock(&khugepaged_mutex);
if (hugepage_pmd_enabled()) {
- if (!khugepaged_thread)
- khugepaged_thread = kthread_run(khugepaged, NULL,
- "khugepaged");
+ khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged");
if (IS_ERR(khugepaged_thread)) {
pr_err("khugepaged: kthread_run(khugepaged) failed\n");
- err = PTR_ERR(khugepaged_thread);
khugepaged_thread = NULL;
- goto fail;
+ return PTR_ERR(khugepaged_thread);
}
if (!list_empty(&khugepaged_scan.mm_head))
@@ -2703,10 +2703,9 @@ int start_stop_khugepaged(void)
kthread_stop(khugepaged_thread);
khugepaged_thread = NULL;
}
+
set_recommended_min_free_kbytes();
-fail:
- mutex_unlock(&khugepaged_mutex);
- return err;
+ return 0;
}
void khugepaged_min_free_kbytes_update(void)
--
Kiryl Shutsemau / Kirill A. Shutemov
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
@ 2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
1 sibling, 0 replies; 16+ messages in thread
From: Breno Leitao @ 2026-03-04 11:53 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kernel-team
Hello Kiryl,
On Wed, Mar 04, 2026 at 11:18:37AM +0000, Kiryl Shutsemau wrote:
> On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> > Breno Leitao (2):
> > mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> > mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> I think the same can be achieved cleaner from within start_stop_khugepaged().
Thanks for the review.
I considered this approach as well. From my limited experience, the preference
is to return early in the _store() callbacks when the operation is a no-op,
rather than pushing that logic deeper into nested code.
Handling it at the _store() level results in more changes overall, but it is
arguably more straightforward to reason about (!?).
That said, I am no MM subsystem expert, and I am happy to go either way.
--breno
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
` (2 preceding siblings ...)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
@ 2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
3 siblings, 0 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:17 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
> start_stop_khugepaged() called independent of any change.
> start_stop_khugepaged() SPAMs the printk ring buffer overflow with the
> exact same message, even when nothing changes.
>
> For instance, if you have a custom vm.min_free_kbytes, just touching
> /sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
> Example:
>
> # sysctl -w vm.min_free_kbytes=112382
> # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
>
> and you have 100 WARN messages like the following, which is pretty dull:
>
> khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
>
> A similar message shows up when setting thp to "always":
>
> # for i in $(seq 100); do
> # echo 1024 > /proc/sys/vm/min_free_kbytes
> # echo always > /sys/kernel/mm/transparent_hugepage/enabled
> # done
>
> And then, we have 100K messages like:
You mean 100? :)
It's 100 locally here.
>
> khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
>
> This is more common when you have a configuration management system that
> writes the THP configuration without an extra read, assuming that
> nothing will happen if there is no change in the configuration, but it
> prints these annoying messages.
Right.
>
> For instance, at Meta's fleet, ~10K servers were producing 3.5M of
> these messages per day.
Yeah... that's not so fun :)
>
> Fix this by doing two things:
> * Make the sysfs _store helpers a no-op if there is no state change
> * Ratelimit these messages to avoid SPAM on config change
Thanks for taking a look at this!
>
> ---
> Breno Leitao (2):
> mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> mm/huge_memory.c | 48 ++++++++++++++++++++++++++++--------------------
> 1 file changed, 28 insertions(+), 20 deletions(-)
> ---
> base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
> change-id: 20260303-thp_logs-059d6b80f6d6
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
Cheers, Lorenzo
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
@ 2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:41 ` David Hildenbrand (Arm)
1 sibling, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:24 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Breno Leitao, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
Johannes Weiner, linux-mm, linux-kernel, usamaarif642,
kernel-team
On Wed, Mar 04, 2026 at 11:18:37AM +0000, Kiryl Shutsemau wrote:
> On Wed, Mar 04, 2026 at 02:22:32AM -0800, Breno Leitao wrote:
> > Breno Leitao (2):
> > mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
> > mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
>
> I think the same can be achieved cleaner from within start_stop_khugepaged().
I'm kind of with you in doing it here but...
>
> Completely untested patch is below.
>
> One noticeable difference that with the patch we don't kick
> khugepaged_wait if khugepaged_thread is there. But I don't think it
> should make a difference.
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1dd3cfca610d..80f818d3a094 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2683,18 +2683,18 @@ static void set_recommended_min_free_kbytes(void)
>
> int start_stop_khugepaged(void)
> {
> - int err = 0;
> + guard(mutex)(&khugepaged_mutex);
This is nice :) we need more guard usage in mm.
> +
> + /* Check if anything has to be done */
> + if (hugepage_pmd_enabled() == !!khugepaged_thread)
This is horrible, better to break out the latter expression as a small static
function like:
static bool is_khugepaged_thread_running(void)
{
if (khugepaged_thread)
return true;
return false;
}
> + return 0;
>
> - mutex_lock(&khugepaged_mutex);
> if (hugepage_pmd_enabled()) {
Can we race on this function call, meaning the check above might differ from the one here?
Better to have
bool enabled = hugepage_pmd_enabled();
etc. etc.
> - if (!khugepaged_thread)
> - khugepaged_thread = kthread_run(khugepaged, NULL,
> - "khugepaged");
> + khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged");
> if (IS_ERR(khugepaged_thread)) {
> pr_err("khugepaged: kthread_run(khugepaged) failed\n");
> - err = PTR_ERR(khugepaged_thread);
> khugepaged_thread = NULL;
> - goto fail;
> + return PTR_ERR(khugepaged_thread);
> }
>
> if (!list_empty(&khugepaged_scan.mm_head))
> @@ -2703,10 +2703,9 @@ int start_stop_khugepaged(void)
> kthread_stop(khugepaged_thread);
> khugepaged_thread = NULL;
> }
> +
> set_recommended_min_free_kbytes();
We are setting recommended min free kbytes each time somebody touches
these flags; maybe somebody somewhere relies on that, and this change
would break them?
So maybe we just want to do this each time anyway, but suppress/rate-limit
the output?
> -fail:
> - mutex_unlock(&khugepaged_mutex);
> - return err;
> + return 0;
> }
>
> void khugepaged_min_free_kbytes_update(void)
> --
> Kiryl Shutsemau / Kirill A. Shutemov
Thanks, Lorenzo
* Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
@ 2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-05 11:48 ` Breno Leitao
2026-03-05 12:44 ` David Hildenbrand (Arm)
0 siblings, 2 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:40 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> Writing "never" (or any other value) multiple times to
> /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> start_stop_khugepaged() each time, even when nothing actually changed.
> This causes set_recommended_min_free_kbytes() to run unconditionally,
> which is unnecessary and floods the printk buffer with "raising
> min_free_kbytes" messages. Example:
>
> # for i in $(seq 100); do
> # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> # done
>
> # dmesg | grep "min_free_kbytes is not updated" | wc -l
> 100
>
> Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> variants to detect whether any bit actually flipped, and skip the
> start_stop_khugepaged() call entirely when the configuration is
> unchanged.
>
> With this patch, redoing the same operation becomes a no-op.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
General concept is sensible, but let's improve this code please.
> ---
> mm/huge_memory.c | 27 ++++++++++++++-------------
> 1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..9abfb115e9329 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -520,36 +520,37 @@ static ssize_t anon_enabled_store(struct kobject *kobj,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> + bool changed = false;
> ssize_t ret = count;
>
> if (sysfs_streq(buf, "always")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_always);
> + changed = test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_always);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "inherit")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_inherit);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_inherit);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "madvise")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - set_bit(order, &huge_anon_orders_madvise);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= !test_and_set_bit(order, &huge_anon_orders_madvise);
> spin_unlock(&huge_anon_orders_lock);
> } else if (sysfs_streq(buf, "never")) {
> spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> + changed = test_and_clear_bit(order, &huge_anon_orders_always);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> + changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
This is badly implemented already (sigh) so it's a little tricky to see how
to abstract it.
Yes, the existing logic is duplicated, but that doesn't mean we have to keep
doing so :)
To put my money where my mouth is, I've attached a (totally untested, in
line with Kiryl's :P) patch to give a sense of how one might achieve this.
As to this vs. Kiryl's... I mean it might be nice to fix this crap up here
to be honest.
Maybe David can have deciding vote ;)
But see below for a caveat...
> spin_unlock(&huge_anon_orders_lock);
> } else
> ret = -EINVAL;
>
> - if (ret > 0) {
> + if (ret > 0 && changed) {
> int err;
>
> err = start_stop_khugepaged();
There's a caveat here as mentioned in reply to Kiryl - I'm concerned users
might rely on the set recommended min kbytes even when things don't change.
Not sure how likely that is, but it's a user-visible change in how this behaves.
Cheers, Lorenzo
>
> --
> 2.47.3
>
----8<----
From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Date: Wed, 4 Mar 2026 16:37:20 +0000
Subject: [PATCH] idea
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
1 file changed, 46 insertions(+), 28 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0df1f4a17430..97dabbeb9112 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
return sysfs_emit(buf, "%s\n", output);
}
+enum huge_mode {
+ HUGE_ALWAYS,
+ HUGE_INHERIT,
+ HUGE_MADVISE,
+ HUGE_NUM_MODES,
+ HUGE_NEVER,
+};
+
+static bool change_anon_orders(int order, enum huge_mode mode)
+{
+ static unsigned long *orders[] = {
+ &huge_anon_orders_always,
+ &huge_anon_orders_inherit,
+ &huge_anon_orders_madvise,
+ };
+ bool changed = false;
+ int i;
+
+ spin_lock(&huge_anon_orders_lock);
+ for (i = 0; i < HUGE_NUM_MODES; i++) {
+ if (i == mode)
+ changed |= !test_and_set_bit(order, orders[i]);
+ else
+ changed |= test_and_clear_bit(order, orders[i]);
+ }
+ spin_unlock(&huge_anon_orders_lock);
+
+ return changed;
+}
+
static ssize_t anon_enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
ssize_t ret = count;
+ bool changed;
+
+ if (sysfs_streq(buf, "always"))
+ changed = change_anon_orders(order, HUGE_ALWAYS);
+ else if (sysfs_streq(buf, "inherit"))
+ changed = change_anon_orders(order, HUGE_INHERIT);
+ else if (sysfs_streq(buf, "madvise"))
+ changed = change_anon_orders(order, HUGE_MADVISE);
+ else if (sysfs_streq(buf, "never"))
+ changed = change_anon_orders(order, HUGE_NEVER);
+ else
+ return -EINVAL;
- if (sysfs_streq(buf, "always")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_always);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "inherit")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_inherit);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "madvise")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- set_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "never")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else
- ret = -EINVAL;
-
- if (ret > 0) {
+ if (changed) {
int err;
err = start_stop_khugepaged();
if (err)
ret = err;
+ } else {
+ /* Users expect this even if unchanged. TODO: Put in header... */
+ //set_recommended_min_free_kbytes();
}
return ret;
}
--
2.53.0
* Re: [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store()
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
@ 2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-04 16:40 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 02:22:34AM -0800, Breno Leitao wrote:
> Avoid calling start_stop_khugepaged() at the top-level enabled_store()
> for /sys/kernel/mm/transparent_hugepage/enabled. Use test_and_set_bit()
> and test_and_clear_bit() to detect whether the configuration
> actually changed before calling start_stop_khugepaged().
>
> This avoid calling start_stop_khugepaged() unnecessary.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
Similar comments to 1/2 ofc.
> ---
> mm/huge_memory.c | 21 ++++++++++++++-------
> 1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9abfb115e9329..b6ed44b6e8c02 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -320,21 +320,28 @@ static ssize_t enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> + bool changed = false;
> ssize_t ret = count;
>
> if (sysfs_streq(buf, "always")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> + changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> } else if (sysfs_streq(buf, "madvise")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> + changed |= !test_and_set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> } else if (sysfs_streq(buf, "never")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> + changed = test_and_clear_bit(TRANSPARENT_HUGEPAGE_FLAG,
> + &transparent_hugepage_flags);
> + changed |= test_and_clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + &transparent_hugepage_flags);
> } else
> ret = -EINVAL;
>
> - if (ret > 0) {
> + if (ret > 0 && changed) {
> int err = start_stop_khugepaged();
> if (err)
> ret = err;
>
> --
> 2.47.3
>
* Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
@ 2026-03-05 11:48 ` Breno Leitao
2026-03-05 12:30 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:44 ` David Hildenbrand (Arm)
1 sibling, 1 reply; 16+ messages in thread
From: Breno Leitao @ 2026-03-05 11:48 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Wed, Mar 04, 2026 at 04:40:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> > Writing "never" (or any other value) multiple times to
> > /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> > start_stop_khugepaged() each time, even when nothing actually changed.
> > This causes set_recommended_min_free_kbytes() to run unconditionally,
> > which is unnecessary and floods the printk buffer with "raising
> > min_free_kbytes" messages. Example:
> >
> > # for i in $(seq 100); do
> > # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> > # done
> >
> > # dmesg | grep "min_free_kbytes is not updated" | wc -l
> > 100
> >
> > Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> > variants to detect whether any bit actually flipped, and skip the
> > start_stop_khugepaged() call entirely when the configuration is
> > unchanged.
> >
> > With this patch, redoing the same operation becomes a no-op.
> >
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> General concept is sensible, but let's improve this code please.
Ack! Thanks for the suggestions.
> > spin_unlock(&huge_anon_orders_lock);
> > } else
> > ret = -EINVAL;
> >
> > - if (ret > 0) {
> > + if (ret > 0 && changed) {
> > int err;
> >
> > err = start_stop_khugepaged();
>
> There's a caveat here as mentioned in reply to Kiryl - I'm concerned users
> might rely on the set recommended min kbytes even when things don't change.
>
> Not sure how likely that is, but it's a user-visible change in how this behaves.
Is there any motivation that users are retouching
/sys/kernel/mm/transparent_hugepage just to trigger
set_recommended_min_free_kbytes()? That seems weird, but I will keep it in
the change.
> From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
> From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
> Date: Wed, 4 Mar 2026 16:37:20 +0000
> Subject: [PATCH] idea
>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Thanks for the idea. Let me hack on top of it, and propose a v2.
> ---
> mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
> 1 file changed, 46 insertions(+), 28 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0df1f4a17430..97dabbeb9112 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
> return sysfs_emit(buf, "%s\n", output);
> }
>
> +enum huge_mode {
> + HUGE_ALWAYS,
> + HUGE_INHERIT,
> + HUGE_MADVISE,
> + HUGE_NUM_MODES,
> + HUGE_NEVER,
> +};
> +
> +static bool change_anon_orders(int order, enum huge_mode mode)
> +{
> + static unsigned long *orders[] = {
> + &huge_anon_orders_always,
> + &huge_anon_orders_inherit,
> + &huge_anon_orders_madvise,
> + };
> + bool changed = false;
> + int i;
> +
> + spin_lock(&huge_anon_orders_lock);
> + for (i = 0; i < HUGE_NUM_MODES; i++) {
> + if (i == mode)
> + changed |= !test_and_set_bit(order, orders[mode]);
> + else
> + changed |= test_and_clear_bit(order, orders[mode]);
I suppose we want s/mode/i in the test_and_{clear,set}_bit() here:
if (i == mode)
// set for mode
changed |= !test_and_set_bit(order, orders[i]);
else
// clear for !mode
changed |= test_and_clear_bit(order, orders[i]);
For two reasons:
* you want to clear the bit in orders[i] when i != mode.
* you would have an out-of-bounds access at orders[HUGE_NEVER], since HUGE_NEVER == 4 but orders[] only has HUGE_NUM_MODES (3) entries.
> static ssize_t anon_enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> ssize_t ret = count;
> + bool changed;
> +
> + if (sysfs_streq(buf, "always"))
> + changed = change_anon_orders(order, HUGE_ALWAYS);
> + else if (sysfs_streq(buf, "inherit"))
> + changed = change_anon_orders(order, HUGE_INHERIT);
> + else if (sysfs_streq(buf, "madvise"))
> + changed = change_anon_orders(order, HUGE_MADVISE);
> + else if (sysfs_streq(buf, "never"))
> + changed = change_anon_orders(order, HUGE_NEVER);
> + else
> + return -EINVAL;
I think we can simplify anon_enabled_store() even more, by leveraging sysfs_match_string().
Something like:
static const char *const anon_mode_strings[] = {
[HUGE_ALWAYS] = "always",
[HUGE_INHERIT] = "inherit",
[HUGE_MADVISE] = "madvise",
[HUGE_NEVER] = "never",
NULL,
};
and then
static ssize_t anon_enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
int mode;
mode = sysfs_match_string(anon_mode_strings, buf);
if (mode < 0)
return mode;
if (change_anon_orders(order, mode)) {
int err = start_stop_khugepaged();
if (err)
return err;
} else {
/* Users expect this even if unchanged. TODO: Put in header... */
//set_recommended_min_free_kbytes();
}
return count;
}
Anyway, I like this approach, thanks! Let me hack a v2 based on it.
--breno
* Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-05 11:48 ` Breno Leitao
@ 2026-03-05 12:30 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-05 12:30 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Thu, Mar 05, 2026 at 03:48:07AM -0800, Breno Leitao wrote:
> On Wed, Mar 04, 2026 at 04:40:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> > On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> > > Writing "never" (or any other value) multiple times to
> > > /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> > > start_stop_khugepaged() each time, even when nothing actually changed.
> > > This causes set_recommended_min_free_kbytes() to run unconditionally,
> > > which is unnecessary and floods the printk buffer with "raising
> > > min_free_kbytes" messages. Example:
> > >
> > > # for i in $(seq 100); do
> > > # echo never > /sys/kernel/mm/transparent_hugepage/enabled
> > > # done
> > >
> > > # dmesg | grep "min_free_kbytes is not updated" | wc -l
> > > 100
> > >
> > > Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> > > variants to detect whether any bit actually flipped, and skip the
> > > start_stop_khugepaged() call entirely when the configuration is
> > > unchanged.
> > >
> > > With this patch, redoing the same operation becomes a no-op.
> > >
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> >
> > General concept is sensible, but let's improve this code please.
>
> Ack! Thanks for the suggestions.
No problem, thanks for the patch! :)
>
> > > spin_unlock(&huge_anon_orders_lock);
> > > } else
> > > ret = -EINVAL;
> > >
> > > - if (ret > 0) {
> > > + if (ret > 0 && changed) {
> > > int err;
> > >
> > > err = start_stop_khugepaged();
> >
> > There's a caveat here as mentioned in reply to Kiryl - I'm concerned users
> > might rely on the set recommended min kbytes even when things don't change.
> >
> > Not sure how likely that is, but it's a user-visible change in how this behaves.
>
> Is there any motivation that users are retouching
> /sys/kernel/mm/transparent_hugepage just to trigger
> set_recommended_min_free_kbytes()? That seems weird, but I will keep it in
> the change.
I mean I can't really think of any, but I don't want to risk breaking (weird)
userspace.
>
> > From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
> > From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
> > Date: Wed, 4 Mar 2026 16:37:20 +0000
> > Subject: [PATCH] idea
> >
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> Thanks for the idea. Let me hack on top of it, and propose a v2.
Thanks!
>
> > ---
> > mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
> > 1 file changed, 46 insertions(+), 28 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 0df1f4a17430..97dabbeb9112 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
> > return sysfs_emit(buf, "%s\n", output);
> > }
> >
> > +enum huge_mode {
> > + HUGE_ALWAYS,
> > + HUGE_INHERIT,
> > + HUGE_MADVISE,
> > + HUGE_NUM_MODES,
> > + HUGE_NEVER,
> > +};
> > +
> > +static bool change_anon_orders(int order, enum huge_mode mode)
> > +{
> > + static unsigned long *orders[] = {
> > + &huge_anon_orders_always,
> > + &huge_anon_orders_inherit,
> > + &huge_anon_orders_madvise,
> > + };
> > + bool changed = false;
> > + int i;
> > +
> > + spin_lock(&huge_anon_orders_lock);
> > + for (i = 0; i < HUGE_NUM_MODES; i++) {
>
> > + if (i == mode)
> > + changed |= !test_and_set_bit(order, orders[mode]);
> > + else
> > + changed |= test_and_clear_bit(order, orders[mode]);
>
> I suppose we want s/mode/i in the test_and_{clear,set}_bit() here:
>
> if (i == mode)
> // set for mode
> changed |= !test_and_set_bit(order, orders[i]);
> else
> // clear for !mode
> changed |= test_and_clear_bit(order, orders[i]);
>
> For two reasons:
> * you want to clear the bit in orders[i] when i != mode.
> * you would have an out-of-bounds access at orders[HUGE_NEVER], since HUGE_NEVER == 4 but orders[] only has HUGE_NUM_MODES (3) entries.
>
>
> > static ssize_t anon_enabled_store(struct kobject *kobj,
> > struct kobj_attribute *attr,
> > const char *buf, size_t count)
> > {
> > int order = to_thpsize(kobj)->order;
> > ssize_t ret = count;
> > + bool changed;
> > +
> > + if (sysfs_streq(buf, "always"))
> > + changed = change_anon_orders(order, HUGE_ALWAYS);
> > + else if (sysfs_streq(buf, "inherit"))
> > + changed = change_anon_orders(order, HUGE_INHERIT);
> > + else if (sysfs_streq(buf, "madvise"))
> > + changed = change_anon_orders(order, HUGE_MADVISE);
> > + else if (sysfs_streq(buf, "never"))
> > + changed = change_anon_orders(order, HUGE_NEVER);
> > + else
> > + return -EINVAL;
>
> I think we can simplify anon_enabled_store() even more, by leveraging sysfs_match_string().
> Something like:
>
> static const char *const anon_mode_strings[] = {
> [HUGE_ALWAYS] = "always",
> [HUGE_INHERIT] = "inherit",
> [HUGE_MADVISE] = "madvise",
> [HUGE_NEVER] = "never",
> NULL,
> };
>
> and then
>
> static ssize_t anon_enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> int mode;
>
> mode = sysfs_match_string(anon_mode_strings, buf);
> if (mode < 0)
> return mode;
Nice!
>
> if (change_anon_orders(order, mode)) {
> int err = start_stop_khugepaged();
>
> if (err)
> return err;
> } else {
> /* Users expect this even if unchanged. TODO: Put in header... */
> //set_recommended_min_free_kbytes();
> }
> return count;
> }
>
>
>
> Anyway, I like this approach, thanks! Let me hack a v2 based on it.
Great, thanks!
Note that my code seemed to introduce a splat, so it's buggy; make sure to check
it carefully (that 'untested' proviso was apposite, it turns out! :)
>
> --breno
Cheers, Lorenzo
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
@ 2026-03-05 12:41 ` David Hildenbrand (Arm)
2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-05 12:41 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle), Kiryl Shutsemau
Cc: Breno Leitao, Andrew Morton, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kernel-team
>> +
>> + /* Check if anything has to be done */
>> + if (hugepage_pmd_enabled() == !!khugepaged_thread)
>
> This is horrible, better to break out the latter expression as a small static
> function like:
>
> static bool is_khugepaged_thread_running(void)
> {
> if (khugepaged_thread)
> return true;
> return false;
return khugepaged_thread;
should do the trick, given that the return type of the function is a bool :)
> }
--
Cheers,
David
* Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-05 11:48 ` Breno Leitao
@ 2026-03-05 12:44 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 16+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-05 12:44 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle), Breno Leitao
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
Brendan Jackman, Johannes Weiner, linux-mm, linux-kernel,
usamaarif642, kas, kernel-team
> This is badly implemented already (sigh) so a little tricky as to how to
> abstract.
>
> Yes, the existing logic is duplicated, but that doesn't mean we have to keep doing so :)
>
> To put money where my mouth is attached a (totally untested, in line with
> Kiryl's :P) patch to give a sense of how one might achieve this.
>
> As to this vs. Kiryl's... I mean it might be nice to fix this crap up here to be
> honest.
>
> Maybe David can have deciding vote ;)
I guess there are plenty of cleanups to be had here. And yes, we can
have multiple ones! :)
--
Cheers,
David
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-05 12:41 ` David Hildenbrand (Arm)
@ 2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:46 ` Lorenzo Stoakes (Oracle)
2026-03-05 13:14 ` David Hildenbrand (Arm)
0 siblings, 2 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-05 12:45 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Kiryl Shutsemau, Breno Leitao, Andrew Morton, Lorenzo Stoakes,
Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
Johannes Weiner, linux-mm, linux-kernel, usamaarif642,
kernel-team
On Thu, Mar 05, 2026 at 01:41:33PM +0100, David Hildenbrand (Arm) wrote:
>
> >> +
> >> + /* Check if anything has to be done */
> >> + if (hugepage_pmd_enabled() == !!khugepaged_thread)
> >
> > This is horrible, better to break out the latter expression as a small static
> > function like:
> >
> > static bool is_khugepaged_thread_running(void)
> > {
> > if (khugepaged_thread)
> > return true;
> > return false;
>
> return khugepaged_thread;
>
> should do the trick, given that the return type of the function is a bool :)
Yeah I know :)
I mean it's subjective, but this form makes it clear that what you're returning
is _not_ a boolean.
I was a bit 'meh' when I first saw this pattern, but it's grown on me, sort of a
'make it so simple you can't get confused' kinda thing.
>
> > }
>
> --
> Cheers,
>
> David
Cheers, Lorenzo
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
@ 2026-03-05 12:46 ` Lorenzo Stoakes (Oracle)
2026-03-05 13:14 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 16+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-05 12:46 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Kiryl Shutsemau, Breno Leitao, Andrew Morton, Lorenzo Stoakes,
Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
Johannes Weiner, linux-mm, linux-kernel, usamaarif642,
kernel-team
On Thu, Mar 05, 2026 at 12:45:56PM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 05, 2026 at 01:41:33PM +0100, David Hildenbrand (Arm) wrote:
> >
> > >> +
> > >> + /* Check if anything has to be done */
> > >> + if (hugepage_pmd_enabled() == !!khugepaged_thread)
> > >
> > > This is horrible, better to break out the latter expression as a small static
> > > function like:
> > >
> > > static bool is_khugepaged_thread_running(void)
> > > {
> > > if (khugepaged_thread)
> > > return true;
> > > return false;
> >
> > return khugepaged_thread;
> >
> > should do the trick, given that the return type of the function is a bool :)
>
> Yeah I know :)
>
> I mean it's subjective, but this form makes it clear that what you're returning
> is _not_ a boolean.
Sorry obviously meant to say what you're TESTING not what you're RETURNING.
>
> I was a bit 'meh' when I first saw this pattern, but it's grown on me, sort of a
> 'make it so simple you can't get confused' kinda thing.
>
> >
> > > }
> >
> > --
> > Cheers,
> >
> > David
>
> Cheers, Lorenzo
* Re: [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:46 ` Lorenzo Stoakes (Oracle)
@ 2026-03-05 13:14 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 16+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-05 13:14 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Kiryl Shutsemau, Breno Leitao, Andrew Morton, Lorenzo Stoakes,
Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
Johannes Weiner, linux-mm, linux-kernel, usamaarif642,
kernel-team
On 3/5/26 13:45, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 05, 2026 at 01:41:33PM +0100, David Hildenbrand (Arm) wrote:
>>
>>>
>>> This is horrible, better to break out the latter expression as a small static
>>> function like:
>>>
>>> static bool is_khugepaged_thread_running(void)
>>> {
>>> if (khugepaged_thread)
>>> return true;
>>> return false;
>>
>> return khugepaged_thread;
>>
>> should do the trick, given that the return type of the function is a bool :)
>
> Yeah I know :)
>
> I mean it's subjective, but this form makes it clear that what you're returning
> is _not_ a boolean.
>
> I was a bit 'meh' when I first saw this pattern, but it's grown on me, sort of a
> 'make it so simple you can't get confused' kinda thing.
I consider
if (x)
return true;
return false;
rather ... questionable :)
--
Cheers,
David
end of thread, other threads:[~2026-03-05 13:15 UTC | newest]
Thread overview: 16+ messages
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-05 11:48 ` Breno Leitao
2026-03-05 12:30 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:44 ` David Hildenbrand (Arm)
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
2026-03-04 16:40 ` Lorenzo Stoakes (Oracle)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 11:53 ` Breno Leitao
2026-03-04 16:24 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:41 ` David Hildenbrand (Arm)
2026-03-05 12:45 ` Lorenzo Stoakes (Oracle)
2026-03-05 12:46 ` Lorenzo Stoakes (Oracle)
2026-03-05 13:14 ` David Hildenbrand (Arm)
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)