* [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls
@ 2026-03-05 14:04 Breno Leitao
2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
Lorenzo Stoakes (Oracle),
Breno Leitao
Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
changed. start_stop_khugepaged() then spams the printk ring buffer
with the exact same message, even when nothing has changed.
For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:
# sysctl -w vm.min_free_kbytes=112382
# for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
and you have 100 WARN messages like the following, which is pretty dull:
khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
A similar message shows up when setting thp to "always":
# for i in $(seq 100); do
# echo 1024 > /proc/sys/vm/min_free_kbytes
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
# done
And then, we have 100 messages like:
khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
This is especially common with configuration management systems that
write the THP configuration without reading it first, assuming that a
write which changes nothing is a no-op; instead, every such write
triggers these annoying messages.
For instance, in Meta's fleet, ~10K servers were producing ~3.5M of
these messages per day.
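As an aside, until the kernel side is fixed, a configuration manager can
sidestep the spam by reading the current selection before writing. Below
is a hedged userspace sketch (the helper name write_if_changed is
invented); it relies on sysfs marking the active mode with brackets,
e.g. "always madvise [never]":

```shell
# Write mode "$2" to THP sysfs file "$1" only when the currently
# selected (bracketed) mode differs, avoiding the redundant store.
write_if_changed() {
    cur=$(sed 's/.*\[\([^]]*\)\].*/\1/' "$1")
    [ "$cur" = "$2" ] || echo "$2" > "$1"
}
```

In the fleet case above, this turns the redundant stores into cheap
reads that trigger no printk at all.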
Fix this by making the sysfs _store helpers a no-op if there is no state
change.
This version is heavily based on Lorenzo's suggestions on v1.
---
Changes in v2:
- V2 is heavily based on Lorenzo and Kiryl feedback on v1.
- Link to v1: https://patch.msgid.link/20260304-thp_logs-v1-0-59038218a253@debian.org
---
Breno Leitao (3):
mm: khugepaged: export set_recommended_min_free_kbytes()
mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
mm: huge_memory: refactor enabled_store() with change_enabled()
include/linux/khugepaged.h | 1 +
mm/huge_memory.c | 135 +++++++++++++++++++++++++++++----------------
mm/khugepaged.c | 2 +-
3 files changed, 90 insertions(+), 48 deletions(-)
---
base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
change-id: 20260303-thp_logs-059d6b80f6d6
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes()
2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
2026-03-06 11:20 ` Lorenzo Stoakes (Oracle)
2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
Lorenzo Stoakes (Oracle),
Breno Leitao
Make set_recommended_min_free_kbytes() callable from outside
khugepaged.c by removing the static qualifier and adding a
declaration in include/linux/khugepaged.h.
This allows callers that change THP settings to recalculate
watermarks without going through start_stop_khugepaged().
Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
include/linux/khugepaged.h | 1 +
mm/khugepaged.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index d7a9053ff4fed..76252865fca4b 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -16,6 +16,7 @@ extern void __khugepaged_exit(struct mm_struct *mm);
extern void khugepaged_enter_vma(struct vm_area_struct *vma,
vm_flags_t vm_flags);
extern void khugepaged_min_free_kbytes_update(void);
+extern void set_recommended_min_free_kbytes(void);
extern bool current_is_khugepaged(void);
void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
bool install_pmd);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1dd3cfca610db..56a41c21b44c9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2630,7 +2630,7 @@ static int khugepaged(void *none)
return 0;
}
-static void set_recommended_min_free_kbytes(void)
+void set_recommended_min_free_kbytes(void)
{
struct zone *zone;
int nr_zones = 0;
--
2.47.3
* [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
2026-03-06 6:07 ` Baolin Wang
2026-03-06 11:32 ` Lorenzo Stoakes (Oracle)
2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
2026-03-06 6:14 ` [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Baolin Wang
3 siblings, 2 replies; 15+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
anon_enabled_store() into a new change_anon_orders() helper that
loops over an orders[] array, setting the bit for the selected mode
and clearing the others.
Introduce enum enabled_mode and enabled_mode_strings[] to be shared
with enabled_store() in a subsequent patch.
Use sysfs_match_string() with the enabled_mode_strings[] table to
replace the if/else chain of sysfs_streq() calls.
The helper uses test_and_set_bit()/test_and_clear_bit() to track
whether the state actually changed, so start_stop_khugepaged() is
only called when needed. When the mode is unchanged,
set_recommended_min_free_kbytes() is called directly to preserve
the watermark recalculation behavior of the original code.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
1 file changed, 52 insertions(+), 32 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..19619213f54d1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
return sysfs_emit(buf, "%s\n", output);
}
+enum enabled_mode {
+ ENABLED_ALWAYS,
+ ENABLED_MADVISE,
+ ENABLED_INHERIT,
+ ENABLED_NEVER,
+};
+
+static const char * const enabled_mode_strings[] = {
+ [ENABLED_ALWAYS] = "always",
+ [ENABLED_MADVISE] = "madvise",
+ [ENABLED_INHERIT] = "inherit",
+ [ENABLED_NEVER] = "never",
+};
+
static ssize_t enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
@@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
return sysfs_emit(buf, "%s\n", output);
}
+static bool change_anon_orders(int order, enum enabled_mode mode)
+{
+ static unsigned long *orders[] = {
+ &huge_anon_orders_always,
+ &huge_anon_orders_madvise,
+ &huge_anon_orders_inherit,
+ };
+ bool changed = false;
+ int i;
+
+ spin_lock(&huge_anon_orders_lock);
+ for (i = 0; i < ARRAY_SIZE(orders); i++) {
+ if (i == mode)
+ changed |= !test_and_set_bit(order, orders[i]);
+ else
+ changed |= test_and_clear_bit(order, orders[i]);
+ }
+ spin_unlock(&huge_anon_orders_lock);
+
+ return changed;
+}
+
static ssize_t anon_enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
int order = to_thpsize(kobj)->order;
- ssize_t ret = count;
+ int mode;
- if (sysfs_streq(buf, "always")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_always);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "inherit")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_madvise);
- set_bit(order, &huge_anon_orders_inherit);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "madvise")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- set_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else if (sysfs_streq(buf, "never")) {
- spin_lock(&huge_anon_orders_lock);
- clear_bit(order, &huge_anon_orders_always);
- clear_bit(order, &huge_anon_orders_inherit);
- clear_bit(order, &huge_anon_orders_madvise);
- spin_unlock(&huge_anon_orders_lock);
- } else
- ret = -EINVAL;
+ mode = sysfs_match_string(enabled_mode_strings, buf);
+ if (mode < 0)
+ return -EINVAL;
- if (ret > 0) {
- int err;
+ if (change_anon_orders(order, mode)) {
+ int err = start_stop_khugepaged();
- err = start_stop_khugepaged();
if (err)
- ret = err;
+ return err;
+ } else {
+ /*
+ * Recalculate watermarks even when the mode didn't
+ * change, as the previous code always called
+ * start_stop_khugepaged() which does this internally.
+ */
+ set_recommended_min_free_kbytes();
}
- return ret;
+
+ return count;
}
static struct kobj_attribute anon_enabled_attr =
--
2.47.3
* [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled()
2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
2026-03-06 11:39 ` Lorenzo Stoakes (Oracle)
2026-03-06 6:14 ` [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Baolin Wang
3 siblings, 1 reply; 15+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao
Refactor enabled_store() to use a new change_enabled() helper that
reuses the shared enum enabled_mode and enabled_mode_strings[]
introduced in the previous commit.
The helper uses the same loop pattern as change_anon_orders(),
iterating over an array of flag bit positions and using
test_and_set_bit()/test_and_clear_bit() to track whether the state
actually changed. ENABLED_INHERIT is rejected since it is not a
valid mode for the global THP setting.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/huge_memory.c | 51 ++++++++++++++++++++++++++++++++++++---------------
1 file changed, 36 insertions(+), 15 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 19619213f54d1..dd5011cf36839 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -330,30 +330,51 @@ static const char * const enabled_mode_strings[] = {
[ENABLED_NEVER] = "never",
};
+static bool change_enabled(enum enabled_mode mode)
+{
+ static const unsigned long thp_flags[] = {
+ TRANSPARENT_HUGEPAGE_FLAG,
+ TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+ };
+ bool changed = false;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(thp_flags); i++) {
+ if (i == mode)
+ changed |= !test_and_set_bit(thp_flags[i],
+ &transparent_hugepage_flags);
+ else
+ changed |= test_and_clear_bit(thp_flags[i],
+ &transparent_hugepage_flags);
+ }
+
+ return changed;
+}
+
static ssize_t enabled_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
- ssize_t ret = count;
+ int mode;
- if (sysfs_streq(buf, "always")) {
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- } else if (sysfs_streq(buf, "madvise")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- } else if (sysfs_streq(buf, "never")) {
- clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- } else
- ret = -EINVAL;
+ mode = sysfs_match_string(enabled_mode_strings, buf);
+ if (mode < 0 || mode == ENABLED_INHERIT)
+ return -EINVAL;
- if (ret > 0) {
+ if (change_enabled(mode)) {
int err = start_stop_khugepaged();
+
if (err)
- ret = err;
+ return err;
+ } else {
+ /*
+ * Recalculate watermarks even when the mode didn't
+ * change, as the previous code always called
+ * start_stop_khugepaged() which does this internally.
+ */
+ set_recommended_min_free_kbytes();
}
- return ret;
+ return count;
}
static struct kobj_attribute enabled_attr = __ATTR_RW(enabled);
--
2.47.3
* Re: [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
@ 2026-03-06 6:07 ` Baolin Wang
2026-03-06 10:30 ` Breno Leitao
2026-03-06 11:32 ` Lorenzo Stoakes (Oracle)
1 sibling, 1 reply; 15+ messages in thread
From: Baolin Wang @ 2026-03-06 6:07 UTC (permalink / raw)
To: Breno Leitao, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team
On 3/5/26 10:04 PM, Breno Leitao wrote:
> Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
> anon_enabled_store() into a new change_anon_orders() helper that
> loops over an orders[] array, setting the bit for the selected mode
> and clearing the others.
>
> Introduce enum enabled_mode and enabled_mode_strings[] to be shared
> with enabled_store() in a subsequent patch.
>
> Use sysfs_match_string() with the enabled_mode_strings[] table to
> replace the if/else chain of sysfs_streq() calls.
>
> The helper uses test_and_set_bit()/test_and_clear_bit() to track
> whether the state actually changed, so start_stop_khugepaged() is
> only called when needed. When the mode is unchanged,
> set_recommended_min_free_kbytes() is called directly to preserve
> the watermark recalculation behavior of the original code.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 52 insertions(+), 32 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..19619213f54d1 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
> return sysfs_emit(buf, "%s\n", output);
> }
>
> +enum enabled_mode {
> + ENABLED_ALWAYS,
> + ENABLED_MADVISE,
> + ENABLED_INHERIT,
> + ENABLED_NEVER,
> +};
> +
> +static const char * const enabled_mode_strings[] = {
> + [ENABLED_ALWAYS] = "always",
> + [ENABLED_MADVISE] = "madvise",
> + [ENABLED_INHERIT] = "inherit",
> + [ENABLED_NEVER] = "never",
> +};
> +
> static ssize_t enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> @@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
> return sysfs_emit(buf, "%s\n", output);
> }
>
> +static bool change_anon_orders(int order, enum enabled_mode mode)
> +{
> + static unsigned long *orders[] = {
> + &huge_anon_orders_always,
> + &huge_anon_orders_madvise,
> + &huge_anon_orders_inherit,
> + };
> + bool changed = false;
> + int i;
> +
> + spin_lock(&huge_anon_orders_lock);
> + for (i = 0; i < ARRAY_SIZE(orders); i++) {
> + if (i == mode)
> + changed |= !test_and_set_bit(order, orders[i]);
> + else
> + changed |= test_and_clear_bit(order, orders[i]);
> + }
> + spin_unlock(&huge_anon_orders_lock);
> +
> + return changed;
> +}
> +
> static ssize_t anon_enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> - ssize_t ret = count;
> + int mode;
>
> - if (sysfs_streq(buf, "always")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_always);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "inherit")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_inherit);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "madvise")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - set_bit(order, &huge_anon_orders_madvise);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "never")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - spin_unlock(&huge_anon_orders_lock);
> - } else
> - ret = -EINVAL;
> + mode = sysfs_match_string(enabled_mode_strings, buf);
> + if (mode < 0)
> + return -EINVAL;
>
> - if (ret > 0) {
> - int err;
> + if (change_anon_orders(order, mode)) {
> + int err = start_stop_khugepaged();
Thanks for the cleanup, and the code looks better.
>
> - err = start_stop_khugepaged();
> if (err)
> - ret = err;
> + return err;
> + } else {
> + /*
> + * Recalculate watermarks even when the mode didn't
> + * change, as the previous code always called
> + * start_stop_khugepaged() which does this internally.
> + */
> + set_recommended_min_free_kbytes();
However, this won’t fix your issue. You will still get lots of warning
messages even if no hugepage options are changed.
* Re: [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
` (2 preceding siblings ...)
2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
@ 2026-03-06 6:14 ` Baolin Wang
2026-03-06 10:26 ` Breno Leitao
3 siblings, 1 reply; 15+ messages in thread
From: Baolin Wang @ 2026-03-06 6:14 UTC (permalink / raw)
To: Breno Leitao, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner
Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
Lorenzo Stoakes (Oracle)
On 3/5/26 10:04 PM, Breno Leitao wrote:
> Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
> start_stop_khugepaged() to be called regardless of whether anything
> changed. start_stop_khugepaged() then spams the printk ring buffer
> with the exact same message, even when nothing has changed.
>
> For instance, if you have a custom vm.min_free_kbytes, just touching
> /sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
> Example:
>
> # sysctl -w vm.min_free_kbytes=112382
> # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
>
> and you have 100 WARN messages like the following, which is pretty dull:
>
> khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
>
> A similar message shows up when setting thp to "always":
>
> # for i in $(seq 100); do
> # echo 1024 > /proc/sys/vm/min_free_kbytes
> # echo always > /sys/kernel/mm/transparent_hugepage/enabled
> # done
>
> And then, we have 100 messages like:
>
> khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
>
> This is especially common with configuration management systems that
> write the THP configuration without reading it first, assuming that a
> write which changes nothing is a no-op; instead, every such write
> triggers these annoying messages.
>
> For instance, in Meta's fleet, ~10K servers were producing ~3.5M of
> these messages per day.
>
> Fix this by making the sysfs _store helpers a no-op if there is no state
> change.
>
> This version is heavily based on Lorenzo's suggestions on v1.
Thanks for doing this. And you should also consider the shmem parts:
shmem_enabled_store() and thpsize_shmem_enabled_store(), both of which
will also call start_stop_khugepaged().
That means you can also generate lots of messages with the following script.
# for i in $(seq 100); do
# echo 1024 > /proc/sys/vm/min_free_kbytes
# echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled
# done
> ---
> Changes in v2:
> - V2 is heavily based on Lorenzo and Kiryl feedback on v1.
> - Link to v1: https://patch.msgid.link/20260304-thp_logs-v1-0-59038218a253@debian.org
>
> ---
> Breno Leitao (3):
> mm: khugepaged: export set_recommended_min_free_kbytes()
> mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
> mm: huge_memory: refactor enabled_store() with change_enabled()
>
> include/linux/khugepaged.h | 1 +
> mm/huge_memory.c | 135 +++++++++++++++++++++++++++++----------------
> mm/khugepaged.c | 2 +-
> 3 files changed, 90 insertions(+), 48 deletions(-)
> ---
> base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
> change-id: 20260303-thp_logs-059d6b80f6d6
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
* Re: [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls
2026-03-06 6:14 ` [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Baolin Wang
@ 2026-03-06 10:26 ` Breno Leitao
0 siblings, 0 replies; 15+ messages in thread
From: Breno Leitao @ 2026-03-06 10:26 UTC (permalink / raw)
To: Baolin Wang
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
Brendan Jackman, Johannes Weiner, linux-mm, linux-kernel,
usamaarif642, kas, kernel-team, Lorenzo Stoakes (Oracle)
On Fri, Mar 06, 2026 at 02:14:31PM +0800, Baolin Wang wrote:
>
>
> On 3/5/26 10:04 PM, Breno Leitao wrote:
> > Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
> > start_stop_khugepaged() to be called regardless of whether anything
> > changed. start_stop_khugepaged() then spams the printk ring buffer
> > with the exact same message, even when nothing has changed.
> >
> > For instance, if you have a custom vm.min_free_kbytes, just touching
> > /sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
> > Example:
> >
> > # sysctl -w vm.min_free_kbytes=112382
> > # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done
> >
> > and you have 100 WARN messages like the following, which is pretty dull:
> >
> > khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred
> >
> > A similar message shows up when setting thp to "always":
> >
> > # for i in $(seq 100); do
> > # echo 1024 > /proc/sys/vm/min_free_kbytes
> > # echo always > /sys/kernel/mm/transparent_hugepage/enabled
> > # done
> >
> > And then, we have 100 messages like:
> >
> > khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations
> >
> > This is especially common with configuration management systems that
> > write the THP configuration without reading it first, assuming that a
> > write which changes nothing is a no-op; instead, every such write
> > triggers these annoying messages.
> >
> > For instance, in Meta's fleet, ~10K servers were producing ~3.5M of
> > these messages per day.
> >
> > Fix this by making the sysfs _store helpers a no-op if there is no state
> > change.
> >
> > This version is heavily based on Lorenzo's suggestions on v1.
>
> Thanks for doing this. And you should also consider the shmem parts:
> shmem_enabled_store() and thpsize_shmem_enabled_store(), both of which will
> also call start_stop_khugepaged().
>
> That means you can also generate lots of messages with the following script.
>
> # for i in $(seq 100); do
> # echo 1024 > /proc/sys/vm/min_free_kbytes
> # echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> # done
Thanks for the input; I will include it in the fixes as well.
--breno
* Re: [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-06 6:07 ` Baolin Wang
@ 2026-03-06 10:30 ` Breno Leitao
2026-03-06 11:22 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 15+ messages in thread
From: Breno Leitao @ 2026-03-06 10:30 UTC (permalink / raw)
To: Baolin Wang
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
Brendan Jackman, Johannes Weiner, linux-mm, linux-kernel,
usamaarif642, kas, kernel-team
On Fri, Mar 06, 2026 at 02:07:50PM +0800, Baolin Wang wrote:
> > + mode = sysfs_match_string(enabled_mode_strings, buf);
> > + if (mode < 0)
> > + return -EINVAL;
> > - if (ret > 0) {
> > - int err;
> > + if (change_anon_orders(order, mode)) {
> > + int err = start_stop_khugepaged();
>
> Thanks for the cleanup, and the code looks better.
>
> > - err = start_stop_khugepaged();
> > if (err)
> > - ret = err;
> > + return err;
> > + } else {
> > + /*
> > + * Recalculate watermarks even when the mode didn't
> > + * change, as the previous code always called
> > + * start_stop_khugepaged() which does this internally.
> > + */
> > + set_recommended_min_free_kbytes();
>
> However, this won't fix your issue. You will still get lots of warning
> messages even if no hugepage options are changed.
Correct — this was part of the earlier discussion, where the suggestion
was to retain the set_recommended_min_free_kbytes() call even when it
is a no-op.
As for the warnings, I see a few possible approaches:
* Remove them entirely — are they providing any real value?
* Apply rate limiting to reduce the noise
* Something else?
Thanks for bringing this up,
--breno
* Re: [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes()
2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
@ 2026-03-06 11:20 ` Lorenzo Stoakes (Oracle)
2026-03-06 11:33 ` Lorenzo Stoakes (Oracle)
0 siblings, 1 reply; 15+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-06 11:20 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Thu, Mar 05, 2026 at 06:04:53AM -0800, Breno Leitao wrote:
> Make set_recommended_min_free_kbytes() callable from outside
> khugepaged.c by removing the static qualifier and adding a
> declaration in include/linux/khugepaged.h.
>
> This allows callers that change THP settings to recalculate
> watermarks without going through start_stop_khugepaged().
>
> Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> include/linux/khugepaged.h | 1 +
> mm/khugepaged.c | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> index d7a9053ff4fed..76252865fca4b 100644
> --- a/include/linux/khugepaged.h
> +++ b/include/linux/khugepaged.h
> @@ -16,6 +16,7 @@ extern void __khugepaged_exit(struct mm_struct *mm);
> extern void khugepaged_enter_vma(struct vm_area_struct *vma,
> vm_flags_t vm_flags);
> extern void khugepaged_min_free_kbytes_update(void);
> +extern void set_recommended_min_free_kbytes(void);
Please drop the extern, it's a historic artifact that we clean up when we add
new entries (I don't blame you for including it, it's a sort of 'unwritten'
convention :P)
Also, can we put this in mm/internal.h please? As we want to limit who can call
this even in-kernel.
> extern bool current_is_khugepaged(void);
> void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> bool install_pmd);
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1dd3cfca610db..56a41c21b44c9 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2630,7 +2630,7 @@ static int khugepaged(void *none)
> return 0;
> }
>
> -static void set_recommended_min_free_kbytes(void)
> +void set_recommended_min_free_kbytes(void)
> {
> struct zone *zone;
> int nr_zones = 0;
>
> --
> 2.47.3
>
Thanks, Lorenzo
* Re: [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-06 10:30 ` Breno Leitao
@ 2026-03-06 11:22 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-06 11:22 UTC (permalink / raw)
To: Breno Leitao
Cc: Baolin Wang, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Fri, Mar 06, 2026 at 02:30:22AM -0800, Breno Leitao wrote:
> On Fri, Mar 06, 2026 at 02:07:50PM +0800, Baolin Wang wrote:
> > > + mode = sysfs_match_string(enabled_mode_strings, buf);
> > > + if (mode < 0)
> > > + return -EINVAL;
> > > - if (ret > 0) {
> > > - int err;
> > > + if (change_anon_orders(order, mode)) {
> > > + int err = start_stop_khugepaged();
> >
> > Thanks for the cleanup, and the code looks better.
> >
> > > - err = start_stop_khugepaged();
> > > if (err)
> > > - ret = err;
> > > + return err;
> > > + } else {
> > > + /*
> > > + * Recalculate watermarks even when the mode didn't
> > > + * change, as the previous code always called
> > > + * start_stop_khugepaged() which does this internally.
> > > + */
> > > + set_recommended_min_free_kbytes();
> >
> > However, this won't fix your issue. You will still get lots of warning
> > messages even if no hugepage options are changed.
>
> Correct — this was part of the earlier discussion, where the suggestion
> was to retain the set_recommended_min_free_kbytes() call even when it
> is a no-op.
>
> As for the warnings, I see a few possible approaches:
>
> * Remove them entirely — are they providing any real value?
> * Apply rate limiting to reduce the noise
I think rate limiting is the sensible approach.
> * Something else?
>
> Thanks for bringing this up,
> --breno
>
Cheers, Lorenzo
* Re: [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
2026-03-06 6:07 ` Baolin Wang
@ 2026-03-06 11:32 ` Lorenzo Stoakes (Oracle)
2026-03-06 16:43 ` Breno Leitao
1 sibling, 1 reply; 15+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-06 11:32 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Thu, Mar 05, 2026 at 06:04:54AM -0800, Breno Leitao wrote:
> Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
> anon_enabled_store() into a new change_anon_orders() helper that
> loops over an orders[] array, setting the bit for the selected mode
> and clearing the others.
>
> Introduce enum enabled_mode and enabled_mode_strings[] to be shared
> with enabled_store() in a subsequent patch.
>
> Use sysfs_match_string() with the enabled_mode_strings[] table to
> replace the if/else chain of sysfs_streq() calls.
>
> The helper uses test_and_set_bit()/test_and_clear_bit() to track
> whether the state actually changed, so start_stop_khugepaged() is
> only called when needed. When the mode is unchanged,
> set_recommended_min_free_kbytes() is called directly to preserve
> the watermark recalculation behavior of the original code.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
This is nice, you've improved my idea :)
I know you change another instance of this in 3/3, but I wonder if there are
other cases like this in the THP code we can fixup too?
Anyway, passing tests locally and LGTM so:
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
> mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 52 insertions(+), 32 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..19619213f54d1 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
> return sysfs_emit(buf, "%s\n", output);
> }
>
> +enum enabled_mode {
> + ENABLED_ALWAYS,
> + ENABLED_MADVISE,
> + ENABLED_INHERIT,
> + ENABLED_NEVER,
> +};
> +
> +static const char * const enabled_mode_strings[] = {
> + [ENABLED_ALWAYS] = "always",
> + [ENABLED_MADVISE] = "madvise",
> + [ENABLED_INHERIT] = "inherit",
> + [ENABLED_NEVER] = "never",
> +};
> +
> static ssize_t enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> @@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
> return sysfs_emit(buf, "%s\n", output);
> }
>
> +static bool change_anon_orders(int order, enum enabled_mode mode)
> +{
> + static unsigned long *orders[] = {
> + &huge_anon_orders_always,
> + &huge_anon_orders_madvise,
> + &huge_anon_orders_inherit,
> + };
> + bool changed = false;
> + int i;
> +
> + spin_lock(&huge_anon_orders_lock);
> + for (i = 0; i < ARRAY_SIZE(orders); i++) {
> + if (i == mode)
> + changed |= !test_and_set_bit(order, orders[i]);
> + else
> + changed |= test_and_clear_bit(order, orders[i]);
> + }
> + spin_unlock(&huge_anon_orders_lock);
> +
> + return changed;
> +}
> +
> static ssize_t anon_enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> int order = to_thpsize(kobj)->order;
> - ssize_t ret = count;
> + int mode;
>
> - if (sysfs_streq(buf, "always")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_always);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "inherit")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_madvise);
> - set_bit(order, &huge_anon_orders_inherit);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "madvise")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - set_bit(order, &huge_anon_orders_madvise);
> - spin_unlock(&huge_anon_orders_lock);
> - } else if (sysfs_streq(buf, "never")) {
> - spin_lock(&huge_anon_orders_lock);
> - clear_bit(order, &huge_anon_orders_always);
> - clear_bit(order, &huge_anon_orders_inherit);
> - clear_bit(order, &huge_anon_orders_madvise);
> - spin_unlock(&huge_anon_orders_lock);
> - } else
> - ret = -EINVAL;
> + mode = sysfs_match_string(enabled_mode_strings, buf);
> + if (mode < 0)
> + return -EINVAL;
>
> - if (ret > 0) {
> - int err;
> + if (change_anon_orders(order, mode)) {
> + int err = start_stop_khugepaged();
>
> - err = start_stop_khugepaged();
> if (err)
> - ret = err;
> + return err;
> + } else {
> + /*
> + * Recalculate watermarks even when the mode didn't
> + * change, as the previous code always called
> + * start_stop_khugepaged() which does this internally.
> + */
> + set_recommended_min_free_kbytes();
> }
> - return ret;
> +
> + return count;
> }
>
> static struct kobj_attribute anon_enabled_attr =
>
> --
> 2.47.3
>
* Re: [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes()
2026-03-06 11:20 ` Lorenzo Stoakes (Oracle)
@ 2026-03-06 11:33 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-06 11:33 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Fri, Mar 06, 2026 at 11:20:57AM +0000, Lorenzo Stoakes (Oracle) wrote:
> > include/linux/khugepaged.h | 1 +
> > mm/khugepaged.c | 2 +-
> > 2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
> > index d7a9053ff4fed..76252865fca4b 100644
> > --- a/include/linux/khugepaged.h
> > +++ b/include/linux/khugepaged.h
> > @@ -16,6 +16,7 @@ extern void __khugepaged_exit(struct mm_struct *mm);
> > extern void khugepaged_enter_vma(struct vm_area_struct *vma,
> > vm_flags_t vm_flags);
> > extern void khugepaged_min_free_kbytes_update(void);
> > +extern void set_recommended_min_free_kbytes(void);
>
> Please drop the extern, it's a historic artifact that we clean up when we add
> new entries (I don't blame you for including it, it's a sort of 'unwritten'
> convention :P)
Oh, and with that fixed, feel free to add:
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
To this patch on respin.
Cheers, Lorenzo
* Re: [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled()
2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
@ 2026-03-06 11:39 ` Lorenzo Stoakes (Oracle)
2026-03-06 15:56 ` Breno Leitao
0 siblings, 1 reply; 15+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-06 11:39 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Thu, Mar 05, 2026 at 06:04:55AM -0800, Breno Leitao wrote:
> Refactor enabled_store() to use a new change_enabled() helper that
> reuses the shared enum enabled_mode and enabled_mode_strings[]
> introduced in the previous commit.
>
> The helper uses the same loop pattern as change_anon_orders(),
> iterating over an array of flag bit positions and using
> test_and_set_bit()/test_and_clear_bit() to track whether the state
> actually changed. ENABLED_INHERIT is rejected since it is not a
> valid mode for the global THP setting.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> mm/huge_memory.c | 51 ++++++++++++++++++++++++++++++++++++---------------
> 1 file changed, 36 insertions(+), 15 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 19619213f54d1..dd5011cf36839 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -330,30 +330,51 @@ static const char * const enabled_mode_strings[] = {
> [ENABLED_NEVER] = "never",
> };
>
> +static bool change_enabled(enum enabled_mode mode)
> +{
> + static const unsigned long thp_flags[] = {
> + TRANSPARENT_HUGEPAGE_FLAG,
> + TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
> + };
> + bool changed = false;
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(thp_flags); i++) {
> + if (i == mode)
> + changed |= !test_and_set_bit(thp_flags[i],
> + &transparent_hugepage_flags);
> + else
> + changed |= test_and_clear_bit(thp_flags[i],
> + &transparent_hugepage_flags);
> + }
> +
> + return changed;
> +}
> +
> static ssize_t enabled_store(struct kobject *kobj,
> struct kobj_attribute *attr,
> const char *buf, size_t count)
> {
> - ssize_t ret = count;
> + int mode;
>
> - if (sysfs_streq(buf, "always")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - } else if (sysfs_streq(buf, "madvise")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> - } else if (sysfs_streq(buf, "never")) {
> - clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
> - clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
> - } else
> - ret = -EINVAL;
> + mode = sysfs_match_string(enabled_mode_strings, buf);
> + if (mode < 0 || mode == ENABLED_INHERIT)
> + return -EINVAL;
The mode == ENABLED_INHERIT check is weird, it reads like 'oh a user CAN specify
this, but if they do we error out', but in reality
/sys/kernel/mm/transparent_hugepage/enabled explicitly outputs only always,
madvise and never.
So, even though it's duplicative, I think it's probably saner to have:
static const char * const anon_enabled_mode_strings[] = {
[ENABLED_ALWAYS] = "always",
[ENABLED_MADVISE] = "madvise",
[ENABLED_INHERIT] = "inherit",
[ENABLED_NEVER] = "never",
};
static const char * const global_enabled_mode_strings[] = {
[ENABLED_ALWAYS] = "always",
[ENABLED_MADVISE] = "madvise",
[ENABLED_NEVER] = "never",
};
Just to make it clear what you're doing.
And if you add something for shmem add one for that too.
With that fixed, feel free to add:
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> - if (ret > 0) {
> + if (change_enabled(mode)) {
> int err = start_stop_khugepaged();
> +
> if (err)
> - ret = err;
> + return err;
> + } else {
> + /*
> + * Recalculate watermarks even when the mode didn't
> + * change, as the previous code always called
> + * start_stop_khugepaged() which does this internally.
> + */
> + set_recommended_min_free_kbytes();
> }
> - return ret;
> + return count;
> }
>
> static struct kobj_attribute enabled_attr = __ATTR_RW(enabled);
>
> --
> 2.47.3
>
Cheers, Lorenzo
* Re: [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled()
2026-03-06 11:39 ` Lorenzo Stoakes (Oracle)
@ 2026-03-06 15:56 ` Breno Leitao
0 siblings, 0 replies; 15+ messages in thread
From: Breno Leitao @ 2026-03-06 15:56 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Fri, Mar 06, 2026 at 11:39:28AM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 05, 2026 at 06:04:55AM -0800, Breno Leitao wrote:
> > + mode = sysfs_match_string(enabled_mode_strings, buf);
> > + if (mode < 0 || mode == ENABLED_INHERIT)
> > + return -EINVAL;
>
> The mode == ENABLED_INHERIT check is weird, it reads like 'oh a user CAN specify
> this, but if they do we error out', but in reality
> /sys/kernel/mm/transparent_hugepage/enabled explicitly outputs only always,
> madvise and never.
>
> So, even though it's duplicative, I think it's probably saner to have:
Agree. That would be a better approach.
> With that fixed, feel free to add:
>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Ack, thanks for the review,
--breno
* Re: [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
2026-03-06 11:32 ` Lorenzo Stoakes (Oracle)
@ 2026-03-06 16:43 ` Breno Leitao
0 siblings, 0 replies; 15+ messages in thread
From: Breno Leitao @ 2026-03-06 16:43 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, linux-mm,
linux-kernel, usamaarif642, kas, kernel-team
On Fri, Mar 06, 2026 at 11:32:44AM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 05, 2026 at 06:04:54AM -0800, Breno Leitao wrote:
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> This is nice, you've improved my idea :)
Thanks! Your idea was a good starting point, and digging deeper led me to
sysfs_match_string(), which I wasn't familiar with before — it turned out to
be exactly what we needed here.
Having the modes in anon_enabled_mode_strings[] also opens the door to further
simplification. I haven't thought it through completely yet, but I suspect we
could consolidate quite a few of the _store()/_show() helpers as a follow-up.
> I know you change another instance of this in 3/3, but I wonder if there are
> other cases like this in the THP code we can fixup too?
Yes, there are several other candidates — defrag_store, the ones Baolin Wang
reported, shmem_enabled_store(), and thpsize_shmem_enabled_store() among them.
I'd prefer to address those in a separate patch series once this one lands,
if that works for everyone.
end of thread, other threads:[~2026-03-06 16:44 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
2026-03-06 11:20 ` Lorenzo Stoakes (Oracle)
2026-03-06 11:33 ` Lorenzo Stoakes (Oracle)
2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
2026-03-06 6:07 ` Baolin Wang
2026-03-06 10:30 ` Breno Leitao
2026-03-06 11:22 ` Lorenzo Stoakes (Oracle)
2026-03-06 11:32 ` Lorenzo Stoakes (Oracle)
2026-03-06 16:43 ` Breno Leitao
2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
2026-03-06 11:39 ` Lorenzo Stoakes (Oracle)
2026-03-06 15:56 ` Breno Leitao
2026-03-06 6:14 ` [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Baolin Wang
2026-03-06 10:26 ` Breno Leitao