linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/4] page_owner: print stacks and their counter
@ 2023-11-20  8:42 Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20  8:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
	Eric Dumazet, Alexander Potapenko, Oscar Salvador

Changes v5 -> v6:
     - Rebase on top of v6.7-rc1
     - Move stack_record struct to the header
     - Addressed feedback from Vlastimil
       (some code tweaks and changelogs suggestions)

Changes v4 -> v5:
     - Addressed feedback from Alexander Potapenko

Changes v3 -> v4:
     - Rebase (long time has passed)
     - Use boolean instead of enum for action by Alexander Potapenko
     - (I left some feedback untouched because it's been long and
        would like to discuss it here now instead of re-vamping
        and old thread)

Changes v2 -> v3:
     - Replace interface in favor of seq operations (suggested by Vlastimil)
     - Use debugfs interface to store/read valued (suggested by Ammar)

Hi,

page_owner is a great debug functionality tool that lets us know
about all pages that have been allocated/freed and their stacktrace.
This comes very handy when one has to debug memory leaks, since with
some scripting we can see outstanding allocations, which might end up
pointing to a leak.

In my experience, that is one of the most useful cases, but it can get
really tedious to screen through all pages and try to reconstruct the
stack <-> allocated/freed relationship. There is a lot of noise
to cancel off.

This patchset aims to ease that by adding a new functionality into page_owner.
What this does is to create a new read-only file "page_owner_stacks",
which prints out only the stacktraces followed by their outstanding
number of allocations (being that the times the stacktrace has allocated
but not freed yet).

This lets us have a clear overview of stacks <-> allocated/freed relationship
without the need to fiddle with pages and trying to match free stacktraces
with allocated stacktraces.

This is achieved by adding a new refcount_t field in the stack_record struct,
incrementing that refcount_t everytime the same stacktrace allocates,
and decrementing it when it frees a page. Details can be seen in the
respective patches.

We also create another file called "page_owner_threshold", which let us
specify a threshold, so when reading from "page_owner_stacks",
we will only see those stacktraces which counting goes beyond the
threshold we specified.

A PoC can be found below:

 # cat /sys/kernel/debug/page_owner_threshold
  0
 # cat /sys/kernel/debug/page_owner_stacks > stacks_full.txt
 # head -32 stacks_full.txt
  prep_new_page+0x10d/0x180
  get_page_from_freelist+0x1bd6/0x1e10
  __alloc_pages+0x194/0x360
  alloc_page_interleave+0x13/0x90
  new_slab+0x31d/0x530
  ___slab_alloc+0x5d7/0x720
  __slab_alloc.isra.85+0x4a/0x90
  kmem_cache_alloc+0x455/0x4a0
  acpi_ps_alloc_op+0x57/0x8f
  acpi_ps_create_scope_op+0x12/0x23
  acpi_ps_execute_method+0x102/0x2c1
  acpi_ns_evaluate+0x343/0x4da
  acpi_evaluate_object+0x1cb/0x392
  acpi_run_osc+0x135/0x260
  acpi_init+0x165/0x4ed
  do_one_initcall+0x3e/0x200
 stack count: 2
 
  free_pcp_prepare+0x287/0x5c0
  free_unref_page+0x1c/0xd0
  __mmdrop+0x50/0x160
  finish_task_switch+0x249/0x2b0
  __schedule+0x2c3/0x960
  schedule+0x44/0xb0
  futex_wait_queue+0x70/0xd0
  futex_wait+0x160/0x250
  do_futex+0x11c/0x1b0
  __x64_sys_futex+0x5e/0x1d0
  do_syscall_64+0x37/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 stack count: 1
 
  
 
 # echo 10000 > /sys/kernel/debug/page_owner_threshold
 # cat /sys/kernel/debug/page_owner_stacks > stacks_10000.txt
 # cat stacks_10000.txt 
  prep_new_page+0x10d/0x180
  get_page_from_freelist+0x1bd6/0x1e10
  __alloc_pages+0x194/0x360
  folio_alloc+0x17/0x40
  page_cache_ra_unbounded+0x96/0x170
  filemap_get_pages+0x23d/0x5e0
  filemap_read+0xbf/0x3a0
  __kernel_read+0x136/0x2f0
  kernel_read_file+0x197/0x2d0
  kernel_read_file_from_fd+0x54/0x90
  __do_sys_finit_module+0x89/0x120
  do_syscall_64+0x37/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 stack count: 36195
 
  prep_new_page+0x10d/0x180
  get_page_from_freelist+0x1bd6/0x1e10
  __alloc_pages+0x194/0x360
  folio_alloc+0x17/0x40
  page_cache_ra_unbounded+0x96/0x170
  filemap_get_pages+0x23d/0x5e0
  filemap_read+0xbf/0x3a0
  new_sync_read+0x106/0x180
  vfs_read+0x16f/0x190
  ksys_read+0xa5/0xe0
  do_syscall_64+0x37/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 stack count: 44484
 
  prep_new_page+0x10d/0x180
  get_page_from_freelist+0x1bd6/0x1e10
  __alloc_pages+0x194/0x360
  folio_alloc+0x17/0x40
  page_cache_ra_unbounded+0x96/0x170
  filemap_get_pages+0xdd/0x5e0
  filemap_read+0xbf/0x3a0
  new_sync_read+0x106/0x180
  vfs_read+0x16f/0x190
  ksys_read+0xa5/0xe0
  do_syscall_64+0x37/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 stack count: 17874

Oscar Salvador (4):
  lib/stackdepot: Add a refcount field in stack_record
  lib/stackdepot: Move stack_record struct definition into the header
  mm,page_owner: Add page_owner_stacks file to print out only stacks and
    their counter
  mm,page_owner: Filter out stacks by a threshold counter

 include/linux/stackdepot.h |  38 ++++++++++++
 lib/stackdepot.c           | 114 ++++++++++++++++++++++--------------
 mm/page_owner.c            | 115 +++++++++++++++++++++++++++++++++++++
 3 files changed, 224 insertions(+), 43 deletions(-)

-- 
2.42.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record
  2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
@ 2023-11-20  8:42 ` Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20  8:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
	Eric Dumazet, Alexander Potapenko, Oscar Salvador

We want to filter out the page_owner output and print only those
stacks for which their counter (number of outstanding allocations) goes
beyond a certain threshold.
This gives us the chance to get rid of a lot of noise.
In order to do that, we need to know how many outstanding allocations
with a particular stack (for allocation) do we have, so we add a new
refcount_t field in the stack_record struct.

Note that this might increase the size of the struct for some
architectures.
E.g: x86_64 is not affected due to alignment, but x86 32bits might.

Besides adding the refcount, this patch also introduces
stack_depot_{inc,dec}_count for the allocation/free handle.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/stackdepot.h |  2 ++
 lib/stackdepot.c           | 53 +++++++++++++++++++++++++++++++-------
 mm/page_owner.c            |  6 +++++
 3 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index e58306783d8e..6ba4fcdb0c5f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -94,6 +94,8 @@ static inline int stack_depot_early_init(void)	{ return 0; }
 depot_stack_handle_t __stack_depot_save(unsigned long *entries,
 					unsigned int nr_entries,
 					gfp_t gfp_flags, bool can_alloc);
+void stack_depot_inc_count(depot_stack_handle_t handle);
+void stack_depot_dec_count(depot_stack_handle_t handle);
 
 /**
  * stack_depot_save - Save a stack trace to stack depot
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 2f5aa851834e..d35edac430c4 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -60,6 +60,7 @@ struct stack_record {
 	u32 hash;			/* Hash in the hash table */
 	u32 size;			/* Number of stored frames */
 	union handle_parts handle;
+	refcount_t count;		/* Number of the same repeated stacks */
 	unsigned long entries[];	/* Variable-sized array of frames */
 };
 
@@ -305,6 +306,7 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
 	stack->handle.offset = pool_offset >> DEPOT_STACK_ALIGN;
 	stack->handle.valid = 1;
 	stack->handle.extra = 0;
+	refcount_set(&stack->count, 1);
 	memcpy(stack->entries, entries, flex_array_size(stack, entries, size));
 	pool_offset += required_size;
 	/*
@@ -457,8 +459,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_depot_save);
 
-unsigned int stack_depot_fetch(depot_stack_handle_t handle,
-			       unsigned long **entries)
+static struct stack_record *stack_depot_getstack(depot_stack_handle_t handle)
 {
 	union handle_parts parts = { .handle = handle };
 	/*
@@ -470,6 +471,26 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 	size_t offset = parts.offset << DEPOT_STACK_ALIGN;
 	struct stack_record *stack;
 
+	if (!handle)
+		return NULL;
+
+	if (parts.pool_index > pool_index_cached) {
+		WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
+		     parts.pool_index, pool_index_cached, handle);
+		return NULL;
+	}
+	pool = stack_pools[parts.pool_index];
+	if (!pool)
+		return NULL;
+	stack = pool + offset;
+	return stack;
+}
+
+unsigned int stack_depot_fetch(depot_stack_handle_t handle,
+			       unsigned long **entries)
+{
+	struct stack_record *stack;
+
 	*entries = NULL;
 	/*
 	 * Let KMSAN know *entries is initialized. This shall prevent false
@@ -480,21 +501,33 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 	if (!handle)
 		return 0;
 
-	if (parts.pool_index > pool_index_cached) {
-		WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
-			parts.pool_index, pool_index_cached, handle);
-		return 0;
-	}
-	pool = stack_pools[parts.pool_index];
-	if (!pool)
+	stack = stack_depot_getstack(handle);
+	if (!stack)
 		return 0;
-	stack = pool + offset;
 
 	*entries = stack->entries;
 	return stack->size;
 }
 EXPORT_SYMBOL_GPL(stack_depot_fetch);
 
+void stack_depot_inc_count(depot_stack_handle_t handle)
+{
+	struct stack_record *stack = NULL;
+
+	stack = stack_depot_getstack(handle);
+	if (stack)
+		refcount_inc(&stack->count);
+}
+
+void stack_depot_dec_count(depot_stack_handle_t handle)
+{
+	struct stack_record *stack = NULL;
+
+	stack = stack_depot_getstack(handle);
+	if (stack)
+		refcount_dec(&stack->count);
+}
+
 void stack_depot_print(depot_stack_handle_t stack)
 {
 	unsigned long *entries;
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 4f13ce7d2452..d53316d0d9be 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -139,6 +139,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
 	int i;
 	struct page_ext *page_ext;
 	depot_stack_handle_t handle;
+	depot_stack_handle_t alloc_handle;
 	struct page_owner *page_owner;
 	u64 free_ts_nsec = local_clock();
 
@@ -146,6 +147,9 @@ void __reset_page_owner(struct page *page, unsigned short order)
 	if (unlikely(!page_ext))
 		return;
 
+	page_owner = get_page_owner(page_ext);
+	alloc_handle = page_owner->handle;
+
 	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
 	for (i = 0; i < (1 << order); i++) {
 		__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
@@ -155,6 +159,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
 		page_ext = page_ext_next(page_ext);
 	}
 	page_ext_put(page_ext);
+	stack_depot_dec_count(alloc_handle);
 }
 
 static inline void __set_page_owner_handle(struct page_ext *page_ext,
@@ -196,6 +201,7 @@ noinline void __set_page_owner(struct page *page, unsigned short order,
 		return;
 	__set_page_owner_handle(page_ext, handle, order, gfp_mask);
 	page_ext_put(page_ext);
+	stack_depot_inc_count(handle);
 }
 
 void __set_page_owner_migrate_reason(struct page *page, int reason)
-- 
2.42.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header
  2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
@ 2023-11-20  8:42 ` Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20  8:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
	Eric Dumazet, Alexander Potapenko, Oscar Salvador

In order to move the heavy lifting into page_owner code, this one
needs to have access to stack_record structure, which right now
sits in lib/stackdepot.c.
Move it to the stackdepot.h header so page_owner can access
stack_record's struct fields.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/stackdepot.h | 34 ++++++++++++++++++++++++++++++++++
 lib/stackdepot.c           | 34 ----------------------------------
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 6ba4fcdb0c5f..269a828a5e94 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -32,6 +32,40 @@ typedef u32 depot_stack_handle_t;
  */
 #define STACK_DEPOT_EXTRA_BITS 5
 
+#define DEPOT_HANDLE_BITS (sizeof(depot_stack_handle_t) * 8)
+
+#define DEPOT_VALID_BITS 1
+#define DEPOT_POOL_ORDER 2 /* Pool size order, 4 pages */
+#define DEPOT_POOL_SIZE (1LL << (PAGE_SHIFT + DEPOT_POOL_ORDER))
+#define DEPOT_STACK_ALIGN 4
+#define DEPOT_OFFSET_BITS (DEPOT_POOL_ORDER + PAGE_SHIFT - DEPOT_STACK_ALIGN)
+#define DEPOT_POOL_INDEX_BITS (DEPOT_HANDLE_BITS - DEPOT_VALID_BITS - \
+			       DEPOT_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
+#define DEPOT_POOLS_CAP 8192
+#define DEPOT_MAX_POOLS \
+	(((1LL << (DEPOT_POOL_INDEX_BITS)) < DEPOT_POOLS_CAP) ? \
+	(1LL << (DEPOT_POOL_INDEX_BITS)) : DEPOT_POOLS_CAP)
+
+/* Compact structure that stores a reference to a stack. */
+union handle_parts {
+	depot_stack_handle_t handle;
+	struct {
+		u32 pool_index	: DEPOT_POOL_INDEX_BITS;
+		u32 offset	: DEPOT_OFFSET_BITS;
+		u32 valid	: DEPOT_VALID_BITS;
+		u32 extra	: STACK_DEPOT_EXTRA_BITS;
+	};
+};
+
+struct stack_record {
+	struct stack_record *next;	/* Link in the hash table */
+	u32 hash;			/* Hash in the hash table */
+	u32 size;			/* Number of stored frames */
+	union handle_parts handle;
+	refcount_t count;		/* Number of the same repeated stacks */
+	unsigned long entries[];	/* Variable-sized array of frames */
+};
+
 /*
  * Using stack depot requires its initialization, which can be done in 3 ways:
  *
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index d35edac430c4..1343d3095bc1 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -30,40 +30,6 @@
 #include <linux/memblock.h>
 #include <linux/kasan-enabled.h>
 
-#define DEPOT_HANDLE_BITS (sizeof(depot_stack_handle_t) * 8)
-
-#define DEPOT_VALID_BITS 1
-#define DEPOT_POOL_ORDER 2 /* Pool size order, 4 pages */
-#define DEPOT_POOL_SIZE (1LL << (PAGE_SHIFT + DEPOT_POOL_ORDER))
-#define DEPOT_STACK_ALIGN 4
-#define DEPOT_OFFSET_BITS (DEPOT_POOL_ORDER + PAGE_SHIFT - DEPOT_STACK_ALIGN)
-#define DEPOT_POOL_INDEX_BITS (DEPOT_HANDLE_BITS - DEPOT_VALID_BITS - \
-			       DEPOT_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
-#define DEPOT_POOLS_CAP 8192
-#define DEPOT_MAX_POOLS \
-	(((1LL << (DEPOT_POOL_INDEX_BITS)) < DEPOT_POOLS_CAP) ? \
-	 (1LL << (DEPOT_POOL_INDEX_BITS)) : DEPOT_POOLS_CAP)
-
-/* Compact structure that stores a reference to a stack. */
-union handle_parts {
-	depot_stack_handle_t handle;
-	struct {
-		u32 pool_index	: DEPOT_POOL_INDEX_BITS;
-		u32 offset	: DEPOT_OFFSET_BITS;
-		u32 valid	: DEPOT_VALID_BITS;
-		u32 extra	: STACK_DEPOT_EXTRA_BITS;
-	};
-};
-
-struct stack_record {
-	struct stack_record *next;	/* Link in the hash table */
-	u32 hash;			/* Hash in the hash table */
-	u32 size;			/* Number of stored frames */
-	union handle_parts handle;
-	refcount_t count;		/* Number of the same repeated stacks */
-	unsigned long entries[];	/* Variable-sized array of frames */
-};
-
 static bool stack_depot_disabled;
 static bool __stack_depot_early_init_requested __initdata = IS_ENABLED(CONFIG_STACKDEPOT_ALWAYS_INIT);
 static bool __stack_depot_early_init_passed __initdata;
-- 
2.42.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter
  2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
  2023-11-20  8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
@ 2023-11-20  8:42 ` Oscar Salvador
  2023-11-20  8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
  2023-11-20  9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
  4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20  8:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
	Eric Dumazet, Alexander Potapenko, Oscar Salvador

We are not interested in the owner of each individual pfn,
but how many outstanding allocations are there for each unique
allocating stack trace.

Right now, getting that information is quite hard as one needs
to fiddle with page_owner output, screen through pfns and make
the links.

So, instead, let us add a new file called 'page_owner_stacks'
that shows just that.
Such file will only show the stacktrace once followed by its
counter, which represents the number of outstanding allocations.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/stackdepot.h |  2 +
 lib/stackdepot.c           | 29 +++++++++++++
 mm/page_owner.c            | 87 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 118 insertions(+)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 269a828a5e94..09c478b1bf73 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -146,6 +146,8 @@ void stack_depot_dec_count(depot_stack_handle_t handle);
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
 				      unsigned int nr_entries, gfp_t gfp_flags);
 
+struct stack_record *stack_depot_get_next_stack(unsigned long *table,
+						struct stack_record *curr_stack);
 /**
  * stack_depot_fetch - Fetch a stack trace from stack depot
  *
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 1343d3095bc1..65708c0c1e50 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -452,6 +452,35 @@ static struct stack_record *stack_depot_getstack(depot_stack_handle_t handle)
 	return stack;
 }
 
+struct stack_record *stack_depot_get_next_stack(unsigned long *table,
+						struct stack_record *curr_stack)
+{
+	unsigned long nr_table = *table;
+	struct stack_record *next = NULL, **stacks;
+	unsigned long stack_table_entries = stack_hash_mask + 1;
+
+	if (!curr_stack) {
+		if (nr_table) {
+new_table:
+			nr_table++;
+			if (nr_table >= stack_table_entries)
+				goto out;
+		}
+	stacks = &stack_table[nr_table];
+	curr_stack = (struct stack_record *)stacks;
+	next = curr_stack;
+	} else {
+		next = curr_stack->next;
+	}
+
+	if (!next)
+		goto new_table;
+
+out:
+	*table = nr_table;
+	return next;
+}
+
 unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 			       unsigned long **entries)
 {
diff --git a/mm/page_owner.c b/mm/page_owner.c
index d53316d0d9be..509c11e506db 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -719,6 +719,91 @@ static const struct file_operations proc_page_owner_operations = {
 	.llseek		= lseek_page_owner,
 };
 
+static void *stack_start(struct seq_file *m, loff_t *ppos)
+{
+	unsigned long *nr_table = m->private;
+	void *stack;
+
+	/* First time */
+	if (*ppos == 0)
+		*nr_table = 0;
+
+	if (*ppos == -1UL)
+		return NULL;
+
+	stack = stack_depot_get_next_stack(nr_table, NULL);
+
+	return stack;
+}
+
+static void *stack_next(struct seq_file *m, void *v, loff_t *ppos)
+{
+	unsigned long *nr_table = m->private;
+	void *next_stack;
+
+	next_stack = stack_depot_get_next_stack(nr_table, v);
+	*ppos = next_stack ? *ppos + 1 : -1UL;
+
+	return next_stack;
+}
+
+static int stack_depot_get_stack_info(struct stack_record *stack, char *buf)
+{
+	if (!stack->size || stack->size < 0 ||
+	    stack->size > PAGE_SIZE || stack->handle.valid != 1 ||
+	    refcount_read(&stack->count) < 1)
+		return 0;
+
+	return stack_trace_snprint(buf, PAGE_SIZE, stack->entries, stack->size, 0);
+}
+
+static int stack_print(struct seq_file *m, void *v)
+{
+	char *buf;
+	int ret = 0;
+	struct stack_record *stack = (struct stack_record *)v;
+
+	buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+	ret += stack_depot_get_stack_info(stack, buf);
+	if (!ret)
+		goto out;
+
+	scnprintf(buf + ret, PAGE_SIZE - ret, "stack_count: %d\n\n",
+		  refcount_read(&stack->count));
+
+	seq_printf(m, buf);
+	seq_puts(m, "\n\n");
+out:
+	kfree(buf);
+
+	return 0;
+}
+
+static void stack_stop(struct seq_file *m, void *v)
+{
+}
+
+static const struct seq_operations page_owner_stack_op = {
+	.start  = stack_start,
+	.next   = stack_next,
+	.stop   = stack_stop,
+	.show   = stack_print
+};
+
+static int page_owner_stack_open(struct inode *inode, struct file *file)
+{
+	return seq_open_private(file, &page_owner_stack_op,
+				sizeof(unsigned long));
+}
+
+const struct file_operations page_owner_stack_operations = {
+	.open           = page_owner_stack_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = seq_release,
+};
+
 static int __init pageowner_init(void)
 {
 	if (!static_branch_unlikely(&page_owner_inited)) {
@@ -728,6 +813,8 @@ static int __init pageowner_init(void)
 
 	debugfs_create_file("page_owner", 0400, NULL, NULL,
 			    &proc_page_owner_operations);
+	debugfs_create_file("page_owner_stacks", 0400, NULL, NULL,
+			    &page_owner_stack_operations);
 
 	return 0;
 }
-- 
2.42.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
  2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
                   ` (2 preceding siblings ...)
  2023-11-20  8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
@ 2023-11-20  8:43 ` Oscar Salvador
  2023-11-20 14:17   ` kernel test robot
  2023-11-20 19:52   ` kernel test robot
  2023-11-20  9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
  4 siblings, 2 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20  8:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
	Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
	Eric Dumazet, Alexander Potapenko, Oscar Salvador

We want to be able to filter out the output based on a certain
threshold, so we can leave out stacks with a low counter of
outstanding allocations.

We can control the threshold value by a new file called
'page_owner_threshold', which is 0 by default.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/page_owner.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 509c11e506db..6ee345dafd01 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -41,6 +41,8 @@ static depot_stack_handle_t dummy_handle;
 static depot_stack_handle_t failure_handle;
 static depot_stack_handle_t early_handle;
 
+static unsigned long page_owner_stack_threshold;
+
 static void init_early_allocated_pages(void);
 
 static int __init early_page_owner_param(char *buf)
@@ -763,6 +765,9 @@ static int stack_print(struct seq_file *m, void *v)
 	int ret = 0;
 	struct stack_record *stack = (struct stack_record *)v;
 
+	if (refcount_read(&stack->count) < page_owner_stack_threshold)
+		return 0;
+
 	buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
 
 	ret += stack_depot_get_stack_info(stack, buf);
@@ -804,6 +809,21 @@ const struct file_operations page_owner_stack_operations = {
 	.release        = seq_release,
 };
 
+int page_owner_threshold_get(void *data, u64 *val)
+{
+	*val = page_owner_stack_threshold;
+	return 0;
+}
+
+int page_owner_threshold_set(void *data, u64 val)
+{
+	page_owner_stack_threshold = val;
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(proc_page_owner_threshold, &page_owner_threshold_get,
+			&page_owner_threshold_set, "%llu");
+
 static int __init pageowner_init(void)
 {
 	if (!static_branch_unlikely(&page_owner_inited)) {
@@ -815,6 +835,8 @@ static int __init pageowner_init(void)
 			    &proc_page_owner_operations);
 	debugfs_create_file("page_owner_stacks", 0400, NULL, NULL,
 			    &page_owner_stack_operations);
+	debugfs_create_file("page_owner_threshold", 0600, NULL, NULL,
+			    &proc_page_owner_threshold);
 
 	return 0;
 }
-- 
2.42.0



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v6 0/4] page_owner: print stacks and their counter
  2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
                   ` (3 preceding siblings ...)
  2023-11-20  8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
@ 2023-11-20  9:07 ` Vlastimil Babka
  2023-11-20 17:54   ` Andrey Konovalov
  4 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2023-11-20  9:07 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Waiman Long,
	Suren Baghdasaryan, Marco Elver, Andrey Konovalov, Eric Dumazet,
	Alexander Potapenko

On 11/20/23 09:42, Oscar Salvador wrote:
> Changes v5 -> v6:
>      - Rebase on top of v6.7-rc1

Hi,

I think at this point it would be better to rebase on top of
https://lore.kernel.org/all/cover.1698077459.git.andreyknvl@google.com/
which already contains the refcount field in stack_record.
(and maybe help with the review of that series as well? Hopefully it can get
to mm/unstable and -next towards 6.8 soon :)

Vlastimil


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
  2023-11-20  8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
@ 2023-11-20 14:17   ` kernel test robot
  2023-11-20 19:52   ` kernel test robot
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-11-20 14:17 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List, linux-kernel,
	Michal Hocko, Vlastimil Babka, Waiman Long, Suren Baghdasaryan,
	Marco Elver, Andrey Konovalov, Eric Dumazet, Alexander Potapenko,
	Oscar Salvador

Hi Oscar,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.7-rc2 next-20231120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/lib-stackdepot-Add-a-refcount-field-in-stack_record/20231120-164913
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20231120084300.4368-5-osalvador%40suse.de
patch subject: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
config: arm-randconfig-001-20231120 (https://download.01.org/0day-ci/archive/20231120/202311202220.b6pHpn7J-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231120/202311202220.b6pHpn7J-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311202220.b6pHpn7J-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/page_owner.c:819:5: warning: no previous prototype for function 'page_owner_threshold_get' [-Wmissing-prototypes]
     819 | int page_owner_threshold_get(void *data, u64 *val)
         |     ^
   mm/page_owner.c:819:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     819 | int page_owner_threshold_get(void *data, u64 *val)
         | ^
         | static 
>> mm/page_owner.c:825:5: warning: no previous prototype for function 'page_owner_threshold_set' [-Wmissing-prototypes]
     825 | int page_owner_threshold_set(void *data, u64 val)
         |     ^
   mm/page_owner.c:825:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     825 | int page_owner_threshold_set(void *data, u64 val)
         | ^
         | static 
   2 warnings generated.


vim +/page_owner_threshold_get +819 mm/page_owner.c

   818	
 > 819	int page_owner_threshold_get(void *data, u64 *val)
   820	{
   821		*val = page_owner_stack_threshold;
   822		return 0;
   823	}
   824	
 > 825	int page_owner_threshold_set(void *data, u64 val)
   826	{
   827		page_owner_stack_threshold = val;
   828		return 0;
   829	}
   830	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v6 0/4] page_owner: print stacks and their counter
  2023-11-20  9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
@ 2023-11-20 17:54   ` Andrey Konovalov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrey Konovalov @ 2023-11-20 17:54 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Andrew Morton, linux-kernel, linux-mm, Michal Hocko, Waiman Long,
	Suren Baghdasaryan, Marco Elver, Eric Dumazet,
	Alexander Potapenko, Vlastimil Babka

On Mon, Nov 20, 2023 at 10:07 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 11/20/23 09:42, Oscar Salvador wrote:
> > Changes v5 -> v6:
> >      - Rebase on top of v6.7-rc1
>
> Hi,
>
> I think at this point it would be better to rebase on top of
> https://lore.kernel.org/all/cover.1698077459.git.andreyknvl@google.com/
> which already contains the refcount field in stack_record.
> (and maybe help with the review of that series as well? Hopefully it can get
> to mm/unstable and -next towards 6.8 soon :)

Yes, please; I just mailed a v4 that is based on v6.7-rc2:
https://lore.kernel.org/linux-mm/cover.1700502145.git.andreyknvl@google.com/T/#t

You should also be able to use the new
stack_depot_save_flags(STACK_DEPOT_FLAG_GET)/stack_depot_put API to
keep only the relevant stack traces.

Thank you!


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
  2023-11-20  8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
  2023-11-20 14:17   ` kernel test robot
@ 2023-11-20 19:52   ` kernel test robot
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-11-20 19:52 UTC (permalink / raw)
  To: Oscar Salvador, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, linux-kernel,
	Michal Hocko, Vlastimil Babka, Waiman Long, Suren Baghdasaryan,
	Marco Elver, Andrey Konovalov, Eric Dumazet, Alexander Potapenko,
	Oscar Salvador

Hi Oscar,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.7-rc2 next-20231120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/lib-stackdepot-Add-a-refcount-field-in-stack_record/20231120-164913
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20231120084300.4368-5-osalvador%40suse.de
patch subject: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
config: arc-allmodconfig (https://download.01.org/0day-ci/archive/20231121/202311210354.BwQoLzVO-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231121/202311210354.BwQoLzVO-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311210354.BwQoLzVO-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/page_owner.c:819:5: warning: no previous prototype for 'page_owner_threshold_get' [-Wmissing-prototypes]
     819 | int page_owner_threshold_get(void *data, u64 *val)
         |     ^~~~~~~~~~~~~~~~~~~~~~~~
>> mm/page_owner.c:825:5: warning: no previous prototype for 'page_owner_threshold_set' [-Wmissing-prototypes]
     825 | int page_owner_threshold_set(void *data, u64 val)
         |     ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/page_owner_threshold_get +819 mm/page_owner.c

   818	
 > 819	int page_owner_threshold_get(void *data, u64 *val)
   820	{
   821		*val = page_owner_stack_threshold;
   822		return 0;
   823	}
   824	
 > 825	int page_owner_threshold_set(void *data, u64 val)
   826	{
   827		page_owner_stack_threshold = val;
   828		return 0;
   829	}
   830	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-11-20 19:55 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-20  8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
2023-11-20  8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
2023-11-20  8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
2023-11-20  8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
2023-11-20  8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
2023-11-20 14:17   ` kernel test robot
2023-11-20 19:52   ` kernel test robot
2023-11-20  9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
2023-11-20 17:54   ` Andrey Konovalov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox