linux-mm.kvack.org archive mirror
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: Breno Leitao <leitao@debian.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 Zi Yan <ziy@nvidia.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	 "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	 Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>,
	 Lance Yang <lance.yang@linux.dev>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	 Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	usamaarif642@gmail.com, kas@kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store()
Date: Wed, 4 Mar 2026 16:40:22 +0000	[thread overview]
Message-ID: <ec07d7f0-cad4-4d9b-8e40-d4ded8170340@lucifer.local> (raw)
In-Reply-To: <20260304-thp_logs-v1-1-59038218a253@debian.org>

On Wed, Mar 04, 2026 at 02:22:33AM -0800, Breno Leitao wrote:
> Writing "never" (or any other value) multiple times to
> /sys/kernel/mm/transparent_hugepage/hugepages-*/enabled calls
> start_stop_khugepaged() each time, even when nothing actually changed.
> This causes set_recommended_min_free_kbytes() to run unconditionally,
> which is unnecessary and floods the printk buffer with
> "min_free_kbytes is not updated" messages. Example:
>
>   # for i in $(seq 100); do
>   #       echo never > /sys/kernel/mm/transparent_hugepage/enabled
>   # done
>
>   # dmesg | grep "min_free_kbytes is not updated" | wc -l
>   100
>
> Use test_and_set_bit()/test_and_clear_bit() instead of the plain
> variants to detect whether any bit actually flipped, and skip the
> start_stop_khugepaged() call entirely when the configuration is
> unchanged.
>
> With this patch, redoing the same operation becomes a no-op.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>

General concept is sensible, but let's improve this code please.

> ---
>  mm/huge_memory.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..9abfb115e9329 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -520,36 +520,37 @@ static ssize_t anon_enabled_store(struct kobject *kobj,
>  				  const char *buf, size_t count)
>  {
>  	int order = to_thpsize(kobj)->order;
> +	bool changed = false;
>  	ssize_t ret = count;
>
>  	if (sysfs_streq(buf, "always")) {
>  		spin_lock(&huge_anon_orders_lock);
> -		clear_bit(order, &huge_anon_orders_inherit);
> -		clear_bit(order, &huge_anon_orders_madvise);
> -		set_bit(order, &huge_anon_orders_always);
> +		changed = test_and_clear_bit(order, &huge_anon_orders_inherit);
> +		changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> +		changed |= !test_and_set_bit(order, &huge_anon_orders_always);
>  		spin_unlock(&huge_anon_orders_lock);
>  	} else if (sysfs_streq(buf, "inherit")) {
>  		spin_lock(&huge_anon_orders_lock);
> -		clear_bit(order, &huge_anon_orders_always);
> -		clear_bit(order, &huge_anon_orders_madvise);
> -		set_bit(order, &huge_anon_orders_inherit);
> +		changed = test_and_clear_bit(order, &huge_anon_orders_always);
> +		changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);
> +		changed |= !test_and_set_bit(order, &huge_anon_orders_inherit);
>  		spin_unlock(&huge_anon_orders_lock);
>  	} else if (sysfs_streq(buf, "madvise")) {
>  		spin_lock(&huge_anon_orders_lock);
> -		clear_bit(order, &huge_anon_orders_always);
> -		clear_bit(order, &huge_anon_orders_inherit);
> -		set_bit(order, &huge_anon_orders_madvise);
> +		changed = test_and_clear_bit(order, &huge_anon_orders_always);
> +		changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> +		changed |= !test_and_set_bit(order, &huge_anon_orders_madvise);
>  		spin_unlock(&huge_anon_orders_lock);
>  	} else if (sysfs_streq(buf, "never")) {
>  		spin_lock(&huge_anon_orders_lock);
> -		clear_bit(order, &huge_anon_orders_always);
> -		clear_bit(order, &huge_anon_orders_inherit);
> -		clear_bit(order, &huge_anon_orders_madvise);
> +		changed = test_and_clear_bit(order, &huge_anon_orders_always);
> +		changed |= test_and_clear_bit(order, &huge_anon_orders_inherit);
> +		changed |= test_and_clear_bit(order, &huge_anon_orders_madvise);

This is badly implemented already (sigh), so it's a little tricky to see how
best to abstract it.

Yes, the existing logic is duplicated, but that doesn't mean we have to keep doing so :)

To put my money where my mouth is, I've attached a (totally untested, in line
with Kiryl's :P) patch to give a sense of how one might achieve this.

As to this vs. Kiryl's... I mean it might be nice to fix this crap up here to be
honest.

Maybe David can have deciding vote ;)

But see below for a caveat...

>  		spin_unlock(&huge_anon_orders_lock);
>  	} else
>  		ret = -EINVAL;
>
> -	if (ret > 0) {
> +	if (ret > 0 && changed) {
>  		int err;
>
>  		err = start_stop_khugepaged();

There's a caveat here, as mentioned in my reply to Kiryl - I'm concerned users
might rely on set_recommended_min_free_kbytes() being invoked even when
nothing actually changes.

Not sure how likely that is, but it's a user-visible change in how this behaves.

Cheers, Lorenzo

>
> --
> 2.47.3
>

----8<----
From cb2c4c8bf183ef0d10068cfd12c12d19cb17a241 Mon Sep 17 00:00:00 2001
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Date: Wed, 4 Mar 2026 16:37:20 +0000
Subject: [PATCH] idea

Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
 mm/huge_memory.c | 74 ++++++++++++++++++++++++++++++------------------
 1 file changed, 46 insertions(+), 28 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0df1f4a17430..97dabbeb9112 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -515,46 +515,64 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
 	return sysfs_emit(buf, "%s\n", output);
 }

+enum huge_mode {
+	HUGE_ALWAYS,
+	HUGE_INHERIT,
+	HUGE_MADVISE,
+	HUGE_NUM_MODES,
+	HUGE_NEVER,
+};
+
+static bool change_anon_orders(int order, enum huge_mode mode)
+{
+	static unsigned long *orders[] = {
+		&huge_anon_orders_always,
+		&huge_anon_orders_inherit,
+		&huge_anon_orders_madvise,
+	};
+	bool changed = false;
+	int i;
+
+	spin_lock(&huge_anon_orders_lock);
+	for (i = 0; i < HUGE_NUM_MODES; i++) {
+		if (i == mode)
+			changed |= !test_and_set_bit(order, orders[i]);
+		else
+			changed |= test_and_clear_bit(order, orders[i]);
+	}
+	spin_unlock(&huge_anon_orders_lock);
+
+	return changed;
+}
+
 static ssize_t anon_enabled_store(struct kobject *kobj,
 				 struct kobj_attribute *attr,
 				 const char *buf, size_t count)
 {
 	int order = to_thpsize(kobj)->order;
 	ssize_t ret = count;
+	bool changed;
+
+	if (sysfs_streq(buf, "always"))
+		changed = change_anon_orders(order, HUGE_ALWAYS);
+	else if (sysfs_streq(buf, "inherit"))
+		changed = change_anon_orders(order, HUGE_INHERIT);
+	else if (sysfs_streq(buf, "madvise"))
+		changed = change_anon_orders(order, HUGE_MADVISE);
+	else if (sysfs_streq(buf, "never"))
+		changed = change_anon_orders(order, HUGE_NEVER);
+	else
+		return -EINVAL;

-	if (sysfs_streq(buf, "always")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_always);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "inherit")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_inherit);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "madvise")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		set_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "never")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else
-		ret = -EINVAL;
-
-	if (ret > 0) {
+	if (changed) {
 		int err;

 		err = start_stop_khugepaged();
 		if (err)
 			ret = err;
+	} else {
+		/* Users may expect this even if unchanged. TODO: declare in a header */
+		/* set_recommended_min_free_kbytes(); */
 	}
 	return ret;
 }
--
2.53.0


Thread overview: 9+ messages
2026-03-04 10:22 [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-04 10:22 ` [PATCH 1/2] mm: thp: avoid calling start_stop_khugepaged() in anon_enabled_store() Breno Leitao
2026-03-04 16:40   ` Lorenzo Stoakes (Oracle) [this message]
2026-03-04 10:22 ` [PATCH 2/2] mm: thp: avoid calling start_stop_khugepaged() in enabled_store() Breno Leitao
2026-03-04 16:40   ` Lorenzo Stoakes (Oracle)
2026-03-04 11:18 ` [PATCH 0/2] mm: thp: reduce unnecessary start_stop_khugepaged() calls Kiryl Shutsemau
2026-03-04 11:53   ` Breno Leitao
2026-03-04 16:24   ` Lorenzo Stoakes (Oracle)
2026-03-04 16:17 ` Lorenzo Stoakes (Oracle)
