From: Breno Leitao <leitao@debian.org>
To: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Shuah Khan <shuah@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-kselftest@vger.kernel.org, kernel-team@meta.com,
Breno Leitao <leitao@debian.org>
Subject: [PATCH 1/2] mm/kmemleak: dedupe verbose scan output by allocation backtrace
Date: Tue, 21 Apr 2026 06:45:04 -0700
Message-ID: <20260421-kmemleak_dedup-v1-1-65e31c6cdf0c@debian.org>
In-Reply-To: <20260421-kmemleak_dedup-v1-0-65e31c6cdf0c@debian.org>
In kmemleak's verbose mode, every unreferenced object found during
a scan is logged with its full header, hex dump and 16-frame backtrace.
Workloads that leak many objects from a single allocation site flood
dmesg with byte-for-byte identical backtraces, drowning out distinct
leaks and other kernel messages.
Dedupe within each scan using stackdepot's trace_handle as the key: for
every leaked object, look up an entry in a per-scan xarray keyed by
trace_handle. The first sighting stores a representative object; later
sightings just bump a counter. After the scan, walk the xarray once and
emit each unique backtrace, followed by a single summary line when more
than one object shares it.
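For illustration only, the record/flush flow can be sketched in user-space
C. This is not the kernel code: the fixed-size array stands in for the
xarray, an integer id stands in for the representative object, and all
sketch_* names and MAX_SITES are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_SITES 16	/* hypothetical cap; the patch uses an xarray instead */

struct sketch_entry {
	unsigned int handle;	/* stands in for the stackdepot trace_handle */
	int first_id;		/* representative object (first sighting) */
	unsigned long count;	/* objects seen with this handle */
};

struct sketch_table {
	struct sketch_entry e[MAX_SITES];
	int used;
};

/* Record one leaked object; returns 0 if recorded, -1 if the table is full. */
static int sketch_record(struct sketch_table *t, unsigned int handle, int obj_id)
{
	assert(t != NULL);
	for (int i = 0; i < t->used; i++) {
		if (t->e[i].handle == handle) {
			t->e[i].count++;	/* later sighting: bump the counter */
			return 0;
		}
	}
	if (t->used == MAX_SITES)
		return -1;	/* a caller would print the object immediately */
	t->e[t->used++] = (struct sketch_entry){ handle, obj_id, 1 };
	return 0;
}

/* Print one record per unique handle plus a summary; returns lines printed. */
static int sketch_flush(struct sketch_table *t)
{
	int lines = 0;

	for (int i = 0; i < t->used; i++) {
		printf("unreferenced object %d (trace %u)\n",
		       t->e[i].first_id, t->e[i].handle);
		lines++;
		if (t->e[i].count > 1) {
			printf("  ... and %lu more object(s) with the same backtrace\n",
			       t->e[i].count - 1);
			lines++;
		}
	}
	t->used = 0;	/* the table is per-scan, so empty it after printing */
	return lines;
}
```

Recording three objects with one handle and a fourth with another yields
two full records and a single summary line in this sketch.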
Note that the contents of /sys/kernel/debug/kmemleak are unchanged; only
the verbose console output is collapsed.
Note 1: The xarray operations and kmalloc(GFP_ATOMIC) for the dedup
entry must happen outside object->lock. object->lock is a raw spinlock,
while the slab path takes higher wait-context locks (n->list_lock);
taking those with a raw spinlock held is flagged by lockdep as an
invalid wait context. trace_handle is read under object->lock, which
serialises with kmemleak_update_trace()'s writer, so it is safe to
capture it there and use the copy after dropping the lock.
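The "copy under the lock, allocate after unlocking" ordering from Note 1
can be sketched in user-space C as follows. This is an illustration under
stated assumptions, not the patch itself: the atomic_flag spinlock merely
stands in for object->lock, malloc() stands in for kmalloc(GFP_ATOMIC),
and struct tracked and the toy_*/snapshot_*/record_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct tracked {
	atomic_flag lock;		/* toy spinlock standing in for object->lock */
	unsigned int trace_handle;
	int reported;
};

static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Mark the object reported and return its handle, or 0 if already reported. */
static unsigned int snapshot_handle(struct tracked *t)
{
	unsigned int handle = 0;

	assert(t != NULL);
	toy_lock(&t->lock);
	if (!t->reported) {
		t->reported = 1;
		handle = t->trace_handle;	/* copy while the lock is held */
	}
	toy_unlock(&t->lock);
	return handle;
}

/* Allocate the bookkeeping slot only after the lock has been dropped. */
static unsigned int *record_outside_lock(struct tracked *t)
{
	unsigned int handle = snapshot_handle(t);
	unsigned int *slot;

	if (!handle)
		return NULL;
	slot = malloc(sizeof(*slot));	/* never called with the lock held */
	if (slot)
		*slot = handle;
	return slot;
}
```

The allocator is only ever entered with the toy lock released, which is
the property Note 1 relies on.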
Note 2: Stashed object pointers carry a get_object() reference across
rcu_read_unlock() that dedup_flush() drops after printing, preventing
use-after-free if the underlying allocation is freed concurrently.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
mm/kmemleak.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 111 insertions(+), 2 deletions(-)
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 2eff0d6b622b6..046847d372777 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -92,6 +92,7 @@
#include <linux/nodemask.h>
#include <linux/mm.h>
#include <linux/workqueue.h>
+#include <linux/xarray.h>
#include <linux/crc32.h>
#include <asm/sections.h>
@@ -1684,6 +1685,82 @@ static void kmemleak_cond_resched(struct kmemleak_object *object)
put_object(object);
}
+/*
+ * Per-scan dedup table for verbose leak printing. Each entry collapses all
+ * leaks that share one allocation backtrace (keyed by stackdepot
+ * trace_handle) into a single representative object plus a count.
+ */
+struct kmemleak_dedup_entry {
+ struct kmemleak_object *object;
+ unsigned long count;
+};
+
+/*
+ * Record a leaked object in the dedup table. The representative object's
+ * use_count is incremented so it can be safely dereferenced by dedup_flush()
+ * outside the RCU read section; dedup_flush() drops the reference. On
+ * allocation failure (or a concurrent insert) the object is printed
+ * immediately, preserving today's "always log every leak" guarantee.
+ * Caller must not hold object->lock and must hold rcu_read_lock().
+ */
+static void dedup_record(struct xarray *dedup, struct kmemleak_object *object,
+ depot_stack_handle_t trace_handle)
+{
+ struct kmemleak_dedup_entry *entry;
+
+ entry = xa_load(dedup, trace_handle);
+ if (entry) {
+ /* This is a known beast, just increase the counter */
+ entry->count++;
+ return;
+ }
+
+ /*
+ * A brand new report. Take a reference on the object here
+ * (object->use_count); dedup_flush() drops it with put_object().
+ */
+ entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
+ if (entry && get_object(object)) {
+ if (xa_insert(dedup, trace_handle, entry, GFP_ATOMIC) == 0) {
+ entry->object = object;
+ entry->count = 1;
+ return;
+ }
+ put_object(object);
+ }
+ kfree(entry);
+
+ /*
+ * Fallback if kmalloc(), get_object() or xa_insert() fails: print now
+ */
+ raw_spin_lock_irq(&object->lock);
+ print_unreferenced(NULL, object);
+ raw_spin_unlock_irq(&object->lock);
+}
+
+/*
+ * Drain the dedup table: print one full record per unique backtrace,
+ * followed by a summary line whenever more than one object shared it.
+ * Releases the reference dedup_record() took on each representative object.
+ */
+static void dedup_flush(struct xarray *dedup)
+{
+ struct kmemleak_dedup_entry *entry;
+ unsigned long idx;
+
+ xa_for_each(dedup, idx, entry) {
+ raw_spin_lock_irq(&entry->object->lock);
+ print_unreferenced(NULL, entry->object);
+ raw_spin_unlock_irq(&entry->object->lock);
+ if (entry->count > 1)
+ pr_warn(" ... and %lu more object(s) with the same backtrace\n",
+ entry->count - 1);
+ put_object(entry->object);
+ kfree(entry);
+ xa_erase(dedup, idx);
+ }
+}
+
/*
* Scan data sections and all the referenced memory blocks allocated via the
* kernel's standard allocators. This function must be called with the
@@ -1834,10 +1911,19 @@ static void kmemleak_scan(void)
return;
/*
- * Scanning result reporting.
+ * Scanning result reporting. When verbose printing is enabled, dedupe
+ * by stackdepot trace_handle so each unique backtrace is logged once
+ * per scan, annotated with the number of objects that share it. The
+ * per-leak count below still reflects every object, and
+ * /sys/kernel/debug/kmemleak still lists them individually.
*/
+ struct xarray dedup;
+
+ xa_init(&dedup);
rcu_read_lock();
list_for_each_entry_rcu(object, &object_list, object_list) {
+ depot_stack_handle_t trace_handle;
+
if (need_resched())
kmemleak_cond_resched(object);
@@ -1849,18 +1935,41 @@ static void kmemleak_scan(void)
if (!color_white(object))
continue;
raw_spin_lock_irq(&object->lock);
+ trace_handle = 0;
if (unreferenced_object(object) &&
!(object->flags & OBJECT_REPORTED)) {
object->flags |= OBJECT_REPORTED;
if (kmemleak_verbose)
- print_unreferenced(NULL, object);
+ trace_handle = object->trace_handle;
new_leaks++;
}
raw_spin_unlock_irq(&object->lock);
+
+ /*
+ * Dedup bookkeeping must happen outside object->lock.
+ * dedup_record() may call kmalloc(GFP_ATOMIC), and the slab
+ * path takes locks (n->list_lock, etc.) at a higher
+ * wait-context level than the raw_spinlock_t object->lock.
+ *
+ * Passing object without object->lock here is safe:
+ * - the surrounding rcu_read_lock() keeps the memory alive
+ * even if a concurrent kmemleak_free() drops use_count to
+ * zero and queues free_object_rcu();
+ * - dedup_record() only manipulates use_count via the atomic
+ * get_object()/put_object() helpers and stores the bare
+ * pointer into the xarray;
+ * - on the fallback print path it re-acquires object->lock
+ * before calling print_unreferenced().
+ */
+ if (trace_handle)
+ dedup_record(&dedup, object, trace_handle);
}
rcu_read_unlock();
+ /* Flush'em all */
+ dedup_flush(&dedup);
+ xa_destroy(&dedup);
if (new_leaks) {
kmemleak_found_leaks = true;
--
2.52.0