linux-mm.kvack.org archive mirror
* [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls
@ 2026-03-05 14:04 Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle),
	Breno Leitao

Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
actually changed. start_stop_khugepaged() then spams the printk ring
buffer with the exact same message, even when nothing changes.

For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:

      # sysctl -w vm.min_free_kbytes=112382
      # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done

and you get 100 identical warning messages like the following:

      khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred

A similar message shows up when setting thp to "always":

      # for i in $(seq 100); do
      #       echo 1024 > /proc/sys/vm/min_free_kbytes
      #       echo always > /sys/kernel/mm/transparent_hugepage/enabled
      # done

And again you get 100 messages like:

      khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations

This is common when a configuration management system writes the THP
configuration without reading it first, assuming that a write is a
no-op when the value does not change; instead, every such write prints
one of these messages.

For instance, in Meta's fleet, ~10K servers were producing ~3.5M of
these messages per day.

Fix this by making the sysfs _store helpers a no-op if there is no state
change.

This version is heavily based on Lorenzo's suggestion on v1.

---
Changes in v2:
- V2 is heavily based on Lorenzo and Kiryl feedback on v1.
- Link to v1: https://patch.msgid.link/20260304-thp_logs-v1-0-59038218a253@debian.org

---
Breno Leitao (3):
      mm: khugepaged: export set_recommended_min_free_kbytes()
      mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
      mm: huge_memory: refactor enabled_store() with change_enabled()

 include/linux/khugepaged.h |   1 +
 mm/huge_memory.c           | 135 +++++++++++++++++++++++++++++----------------
 mm/khugepaged.c            |   2 +-
 3 files changed, 90 insertions(+), 48 deletions(-)
---
base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
change-id: 20260303-thp_logs-059d6b80f6d6

Best regards,
--  
Breno Leitao <leitao@debian.org>




* [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes()
  2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
  2 siblings, 0 replies; 4+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle),
	Breno Leitao

Make set_recommended_min_free_kbytes() callable from outside
khugepaged.c by removing the static qualifier and adding a
declaration in include/linux/khugepaged.h.

This allows callers that change THP settings to recalculate
watermarks without going through start_stop_khugepaged().

Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 include/linux/khugepaged.h | 1 +
 mm/khugepaged.c            | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index d7a9053ff4fed..76252865fca4b 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -16,6 +16,7 @@ extern void __khugepaged_exit(struct mm_struct *mm);
 extern void khugepaged_enter_vma(struct vm_area_struct *vma,
 				 vm_flags_t vm_flags);
 extern void khugepaged_min_free_kbytes_update(void);
+extern void set_recommended_min_free_kbytes(void);
 extern bool current_is_khugepaged(void);
 void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 		bool install_pmd);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1dd3cfca610db..56a41c21b44c9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2630,7 +2630,7 @@ static int khugepaged(void *none)
 	return 0;
 }
 
-static void set_recommended_min_free_kbytes(void)
+void set_recommended_min_free_kbytes(void)
 {
 	struct zone *zone;
 	int nr_zones = 0;

-- 
2.47.3




* [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
  2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
  2 siblings, 0 replies; 4+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao

Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
anon_enabled_store() into a new change_anon_orders() helper that
loops over an orders[] array, setting the bit for the selected mode
and clearing the others.

Introduce enum enabled_mode and enabled_mode_strings[] to be shared
with enabled_store() in a subsequent patch.

Use sysfs_match_string() with the enabled_mode_strings[] table to
replace the if/else chain of sysfs_streq() calls.

The helper uses test_and_set_bit()/test_and_clear_bit() to track
whether the state actually changed, so start_stop_khugepaged() is
only called when needed. When the mode is unchanged,
set_recommended_min_free_kbytes() is called directly to preserve
the watermark recalculation behavior of the original code.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 32 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..19619213f54d1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
 	return sysfs_emit(buf, "%s\n", output);
 }
 
+enum enabled_mode {
+	ENABLED_ALWAYS,
+	ENABLED_MADVISE,
+	ENABLED_INHERIT,
+	ENABLED_NEVER,
+};
+
+static const char * const enabled_mode_strings[] = {
+	[ENABLED_ALWAYS]	= "always",
+	[ENABLED_MADVISE]	= "madvise",
+	[ENABLED_INHERIT]	= "inherit",
+	[ENABLED_NEVER]		= "never",
+};
+
 static ssize_t enabled_store(struct kobject *kobj,
 			     struct kobj_attribute *attr,
 			     const char *buf, size_t count)
@@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
 	return sysfs_emit(buf, "%s\n", output);
 }
 
+static bool change_anon_orders(int order, enum enabled_mode mode)
+{
+	static unsigned long *orders[] = {
+		&huge_anon_orders_always,
+		&huge_anon_orders_madvise,
+		&huge_anon_orders_inherit,
+	};
+	bool changed = false;
+	int i;
+
+	spin_lock(&huge_anon_orders_lock);
+	for (i = 0; i < ARRAY_SIZE(orders); i++) {
+		if (i == mode)
+			changed |= !test_and_set_bit(order, orders[i]);
+		else
+			changed |= test_and_clear_bit(order, orders[i]);
+	}
+	spin_unlock(&huge_anon_orders_lock);
+
+	return changed;
+}
+
 static ssize_t anon_enabled_store(struct kobject *kobj,
 				  struct kobj_attribute *attr,
 				  const char *buf, size_t count)
 {
 	int order = to_thpsize(kobj)->order;
-	ssize_t ret = count;
+	int mode;
 
-	if (sysfs_streq(buf, "always")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_always);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "inherit")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_inherit);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "madvise")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		set_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "never")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else
-		ret = -EINVAL;
+	mode = sysfs_match_string(enabled_mode_strings, buf);
+	if (mode < 0)
+		return -EINVAL;
 
-	if (ret > 0) {
-		int err;
+	if (change_anon_orders(order, mode)) {
+		int err = start_stop_khugepaged();
 
-		err = start_stop_khugepaged();
 		if (err)
-			ret = err;
+			return err;
+	} else {
+		/*
+		 * Recalculate watermarks even when the mode didn't
+		 * change, as the previous code always called
+		 * start_stop_khugepaged() which does this internally.
+		 */
+		set_recommended_min_free_kbytes();
 	}
-	return ret;
+
+	return count;
 }
 
 static struct kobj_attribute anon_enabled_attr =

-- 
2.47.3




* [PATCH v2 3/3] mm: huge_memory: refactor enabled_store() with change_enabled()
  2026-03-05 14:04 [PATCH v2 0/3] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 1/3] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
  2026-03-05 14:04 ` [PATCH v2 2/3] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
@ 2026-03-05 14:04 ` Breno Leitao
  2 siblings, 0 replies; 4+ messages in thread
From: Breno Leitao @ 2026-03-05 14:04 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team, Breno Leitao

Refactor enabled_store() to use a new change_enabled() helper that
reuses the shared enum enabled_mode and enabled_mode_strings[]
introduced in the previous commit.

The helper uses the same loop pattern as change_anon_orders(),
iterating over an array of flag bit positions and using
test_and_set_bit()/test_and_clear_bit() to track whether the state
actually changed. ENABLED_INHERIT is rejected since it is not a
valid mode for the global THP setting.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 mm/huge_memory.c | 51 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 19619213f54d1..dd5011cf36839 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -330,30 +330,51 @@ static const char * const enabled_mode_strings[] = {
 	[ENABLED_NEVER]		= "never",
 };
 
+static bool change_enabled(enum enabled_mode mode)
+{
+	static const unsigned long thp_flags[] = {
+		TRANSPARENT_HUGEPAGE_FLAG,
+		TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+	};
+	bool changed = false;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(thp_flags); i++) {
+		if (i == mode)
+			changed |= !test_and_set_bit(thp_flags[i],
+						     &transparent_hugepage_flags);
+		else
+			changed |= test_and_clear_bit(thp_flags[i],
+						      &transparent_hugepage_flags);
+	}
+
+	return changed;
+}
+
 static ssize_t enabled_store(struct kobject *kobj,
 			     struct kobj_attribute *attr,
 			     const char *buf, size_t count)
 {
-	ssize_t ret = count;
+	int mode;
 
-	if (sysfs_streq(buf, "always")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-		set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-	} else if (sysfs_streq(buf, "madvise")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-		set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (sysfs_streq(buf, "never")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else
-		ret = -EINVAL;
+	mode = sysfs_match_string(enabled_mode_strings, buf);
+	if (mode < 0 || mode == ENABLED_INHERIT)
+		return -EINVAL;
 
-	if (ret > 0) {
+	if (change_enabled(mode)) {
 		int err = start_stop_khugepaged();
+
 		if (err)
-			ret = err;
+			return err;
+	} else {
+		/*
+		 * Recalculate watermarks even when the mode didn't
+		 * change, as the previous code always called
+		 * start_stop_khugepaged() which does this internally.
+		 */
+		set_recommended_min_free_kbytes();
 	}
-	return ret;
+	return count;
 }
 
 static struct kobj_attribute enabled_attr = __ATTR_RW(enabled);

-- 
2.47.3



