* [PATCH v6 0/4] page_owner: print stacks and their counter
@ 2023-11-20 8:42 Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20 8:42 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
Eric Dumazet, Alexander Potapenko, Oscar Salvador
Changes v5 -> v6:
- Rebase on top of v6.7-rc1
- Move stack_record struct to the header
- Addressed feedback from Vlastimil
(some code tweaks and changelogs suggestions)
Changes v4 -> v5:
- Addressed feedback from Alexander Potapenko
Changes v3 -> v4:
- Rebase (long time has passed)
- Use boolean instead of enum for action by Alexander Potapenko
- (I left some feedback untouched because it's been long and
would like to discuss it here now instead of re-vamping
and old thread)
Changes v2 -> v3:
- Replace interface in favor of seq operations (suggested by Vlastimil)
- Use debugfs interface to store/read valued (suggested by Ammar)
Hi,
page_owner is a great debug functionality tool that lets us know
about all pages that have been allocated/freed and their stacktrace.
This comes very handy when one has to debug memory leaks, since with
some scripting we can see outstanding allocations, which might end up
pointing to a leak.
In my experience, that is one of the most useful cases, but it can get
really tedious to screen through all pages and try to reconstruct the
stack <-> allocated/freed relationship. There is a lot of noise
to cancel off.
This patchset aims to ease that by adding a new functionality into page_owner.
What this does is to create a new read-only file "page_owner_stacks",
which prints out only the stacktraces followed by their outstanding
number of allocations (being that the times the stacktrace has allocated
but not freed yet).
This lets us have a clear overview of stacks <-> allocated/freed relationship
without the need to fiddle with pages and trying to match free stacktraces
with allocated stacktraces.
This is achieved by adding a new refcount_t field in the stack_record struct,
incrementing that refcount_t everytime the same stacktrace allocates,
and decrementing it when it frees a page. Details can be seen in the
respective patches.
We also create another file called "page_owner_threshold", which let us
specify a threshold, so when reading from "page_owner_stacks",
we will only see those stacktraces which counting goes beyond the
threshold we specified.
A PoC can be found below:
# cat /sys/kernel/debug/page_owner_threshold
0
# cat /sys/kernel/debug/page_owner_stacks > stacks_full.txt
# head -32 stacks_full.txt
prep_new_page+0x10d/0x180
get_page_from_freelist+0x1bd6/0x1e10
__alloc_pages+0x194/0x360
alloc_page_interleave+0x13/0x90
new_slab+0x31d/0x530
___slab_alloc+0x5d7/0x720
__slab_alloc.isra.85+0x4a/0x90
kmem_cache_alloc+0x455/0x4a0
acpi_ps_alloc_op+0x57/0x8f
acpi_ps_create_scope_op+0x12/0x23
acpi_ps_execute_method+0x102/0x2c1
acpi_ns_evaluate+0x343/0x4da
acpi_evaluate_object+0x1cb/0x392
acpi_run_osc+0x135/0x260
acpi_init+0x165/0x4ed
do_one_initcall+0x3e/0x200
stack count: 2
free_pcp_prepare+0x287/0x5c0
free_unref_page+0x1c/0xd0
__mmdrop+0x50/0x160
finish_task_switch+0x249/0x2b0
__schedule+0x2c3/0x960
schedule+0x44/0xb0
futex_wait_queue+0x70/0xd0
futex_wait+0x160/0x250
do_futex+0x11c/0x1b0
__x64_sys_futex+0x5e/0x1d0
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
stack count: 1
# echo 10000 > /sys/kernel/debug/page_owner_threshold
# cat /sys/kernel/debug/page_owner_stacks > stacks_10000.txt
# cat stacks_10000.txt
prep_new_page+0x10d/0x180
get_page_from_freelist+0x1bd6/0x1e10
__alloc_pages+0x194/0x360
folio_alloc+0x17/0x40
page_cache_ra_unbounded+0x96/0x170
filemap_get_pages+0x23d/0x5e0
filemap_read+0xbf/0x3a0
__kernel_read+0x136/0x2f0
kernel_read_file+0x197/0x2d0
kernel_read_file_from_fd+0x54/0x90
__do_sys_finit_module+0x89/0x120
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
stack count: 36195
prep_new_page+0x10d/0x180
get_page_from_freelist+0x1bd6/0x1e10
__alloc_pages+0x194/0x360
folio_alloc+0x17/0x40
page_cache_ra_unbounded+0x96/0x170
filemap_get_pages+0x23d/0x5e0
filemap_read+0xbf/0x3a0
new_sync_read+0x106/0x180
vfs_read+0x16f/0x190
ksys_read+0xa5/0xe0
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
stack count: 44484
prep_new_page+0x10d/0x180
get_page_from_freelist+0x1bd6/0x1e10
__alloc_pages+0x194/0x360
folio_alloc+0x17/0x40
page_cache_ra_unbounded+0x96/0x170
filemap_get_pages+0xdd/0x5e0
filemap_read+0xbf/0x3a0
new_sync_read+0x106/0x180
vfs_read+0x16f/0x190
ksys_read+0xa5/0xe0
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
stack count: 17874
Oscar Salvador (4):
lib/stackdepot: Add a refcount field in stack_record
lib/stackdepot: Move stack_record struct definition into the header
mm,page_owner: Add page_owner_stacks file to print out only stacks and
their counter
mm,page_owner: Filter out stacks by a threshold counter
include/linux/stackdepot.h | 38 ++++++++++++
lib/stackdepot.c | 114 ++++++++++++++++++++++--------------
mm/page_owner.c | 115 +++++++++++++++++++++++++++++++++++++
3 files changed, 224 insertions(+), 43 deletions(-)
--
2.42.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
@ 2023-11-20 8:42 ` Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20 8:42 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
Eric Dumazet, Alexander Potapenko, Oscar Salvador
We want to filter out the page_owner output and print only those
stacks for which their counter (number of outstanding allocations) goes
beyond a certain threshold.
This gives us the chance to get rid of a lot of noise.
In order to do that, we need to know how many outstanding allocations
with a particular stack (for allocation) do we have, so we add a new
refcount_t field in the stack_record struct.
Note that this might increase the size of the struct for some
architectures.
E.g: x86_64 is not affected due to alignment, but x86 32bits might.
Besides adding the refcount, this patch also introduces
stack_depot_{inc,dec}_count for the allocation/free handle.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
include/linux/stackdepot.h | 2 ++
lib/stackdepot.c | 53 +++++++++++++++++++++++++++++++-------
mm/page_owner.c | 6 +++++
3 files changed, 51 insertions(+), 10 deletions(-)
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index e58306783d8e..6ba4fcdb0c5f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -94,6 +94,8 @@ static inline int stack_depot_early_init(void) { return 0; }
depot_stack_handle_t __stack_depot_save(unsigned long *entries,
unsigned int nr_entries,
gfp_t gfp_flags, bool can_alloc);
+void stack_depot_inc_count(depot_stack_handle_t handle);
+void stack_depot_dec_count(depot_stack_handle_t handle);
/**
* stack_depot_save - Save a stack trace to stack depot
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 2f5aa851834e..d35edac430c4 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -60,6 +60,7 @@ struct stack_record {
u32 hash; /* Hash in the hash table */
u32 size; /* Number of stored frames */
union handle_parts handle;
+ refcount_t count; /* Number of the same repeated stacks */
unsigned long entries[]; /* Variable-sized array of frames */
};
@@ -305,6 +306,7 @@ depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
stack->handle.offset = pool_offset >> DEPOT_STACK_ALIGN;
stack->handle.valid = 1;
stack->handle.extra = 0;
+ refcount_set(&stack->count, 1);
memcpy(stack->entries, entries, flex_array_size(stack, entries, size));
pool_offset += required_size;
/*
@@ -457,8 +459,7 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries,
}
EXPORT_SYMBOL_GPL(stack_depot_save);
-unsigned int stack_depot_fetch(depot_stack_handle_t handle,
- unsigned long **entries)
+static struct stack_record *stack_depot_getstack(depot_stack_handle_t handle)
{
union handle_parts parts = { .handle = handle };
/*
@@ -470,6 +471,26 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
size_t offset = parts.offset << DEPOT_STACK_ALIGN;
struct stack_record *stack;
+ if (!handle)
+ return NULL;
+
+ if (parts.pool_index > pool_index_cached) {
+ WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
+ parts.pool_index, pool_index_cached, handle);
+ return NULL;
+ }
+ pool = stack_pools[parts.pool_index];
+ if (!pool)
+ return NULL;
+ stack = pool + offset;
+ return stack;
+}
+
+unsigned int stack_depot_fetch(depot_stack_handle_t handle,
+ unsigned long **entries)
+{
+ struct stack_record *stack;
+
*entries = NULL;
/*
* Let KMSAN know *entries is initialized. This shall prevent false
@@ -480,21 +501,33 @@ unsigned int stack_depot_fetch(depot_stack_handle_t handle,
if (!handle)
return 0;
- if (parts.pool_index > pool_index_cached) {
- WARN(1, "pool index %d out of bounds (%d) for stack id %08x\n",
- parts.pool_index, pool_index_cached, handle);
- return 0;
- }
- pool = stack_pools[parts.pool_index];
- if (!pool)
+ stack = stack_depot_getstack(handle);
+ if (!stack)
return 0;
- stack = pool + offset;
*entries = stack->entries;
return stack->size;
}
EXPORT_SYMBOL_GPL(stack_depot_fetch);
+void stack_depot_inc_count(depot_stack_handle_t handle)
+{
+ struct stack_record *stack = NULL;
+
+ stack = stack_depot_getstack(handle);
+ if (stack)
+ refcount_inc(&stack->count);
+}
+
+void stack_depot_dec_count(depot_stack_handle_t handle)
+{
+ struct stack_record *stack = NULL;
+
+ stack = stack_depot_getstack(handle);
+ if (stack)
+ refcount_dec(&stack->count);
+}
+
void stack_depot_print(depot_stack_handle_t stack)
{
unsigned long *entries;
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 4f13ce7d2452..d53316d0d9be 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -139,6 +139,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
int i;
struct page_ext *page_ext;
depot_stack_handle_t handle;
+ depot_stack_handle_t alloc_handle;
struct page_owner *page_owner;
u64 free_ts_nsec = local_clock();
@@ -146,6 +147,9 @@ void __reset_page_owner(struct page *page, unsigned short order)
if (unlikely(!page_ext))
return;
+ page_owner = get_page_owner(page_ext);
+ alloc_handle = page_owner->handle;
+
handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
for (i = 0; i < (1 << order); i++) {
__clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
@@ -155,6 +159,7 @@ void __reset_page_owner(struct page *page, unsigned short order)
page_ext = page_ext_next(page_ext);
}
page_ext_put(page_ext);
+ stack_depot_dec_count(alloc_handle);
}
static inline void __set_page_owner_handle(struct page_ext *page_ext,
@@ -196,6 +201,7 @@ noinline void __set_page_owner(struct page *page, unsigned short order,
return;
__set_page_owner_handle(page_ext, handle, order, gfp_mask);
page_ext_put(page_ext);
+ stack_depot_inc_count(handle);
}
void __set_page_owner_migrate_reason(struct page *page, int reason)
--
2.42.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
@ 2023-11-20 8:42 ` Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20 8:42 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
Eric Dumazet, Alexander Potapenko, Oscar Salvador
In order to move the heavy lifting into page_owner code, this one
needs to have access to stack_record structure, which right now
sits in lib/stackdepot.c.
Move it to the stackdepot.h header so page_owner can access
stack_record's struct fields.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
include/linux/stackdepot.h | 34 ++++++++++++++++++++++++++++++++++
lib/stackdepot.c | 34 ----------------------------------
2 files changed, 34 insertions(+), 34 deletions(-)
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 6ba4fcdb0c5f..269a828a5e94 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -32,6 +32,40 @@ typedef u32 depot_stack_handle_t;
*/
#define STACK_DEPOT_EXTRA_BITS 5
+#define DEPOT_HANDLE_BITS (sizeof(depot_stack_handle_t) * 8)
+
+#define DEPOT_VALID_BITS 1
+#define DEPOT_POOL_ORDER 2 /* Pool size order, 4 pages */
+#define DEPOT_POOL_SIZE (1LL << (PAGE_SHIFT + DEPOT_POOL_ORDER))
+#define DEPOT_STACK_ALIGN 4
+#define DEPOT_OFFSET_BITS (DEPOT_POOL_ORDER + PAGE_SHIFT - DEPOT_STACK_ALIGN)
+#define DEPOT_POOL_INDEX_BITS (DEPOT_HANDLE_BITS - DEPOT_VALID_BITS - \
+ DEPOT_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
+#define DEPOT_POOLS_CAP 8192
+#define DEPOT_MAX_POOLS \
+ (((1LL << (DEPOT_POOL_INDEX_BITS)) < DEPOT_POOLS_CAP) ? \
+ (1LL << (DEPOT_POOL_INDEX_BITS)) : DEPOT_POOLS_CAP)
+
+/* Compact structure that stores a reference to a stack. */
+union handle_parts {
+ depot_stack_handle_t handle;
+ struct {
+ u32 pool_index : DEPOT_POOL_INDEX_BITS;
+ u32 offset : DEPOT_OFFSET_BITS;
+ u32 valid : DEPOT_VALID_BITS;
+ u32 extra : STACK_DEPOT_EXTRA_BITS;
+ };
+};
+
+struct stack_record {
+ struct stack_record *next; /* Link in the hash table */
+ u32 hash; /* Hash in the hash table */
+ u32 size; /* Number of stored frames */
+ union handle_parts handle;
+ refcount_t count; /* Number of the same repeated stacks */
+ unsigned long entries[]; /* Variable-sized array of frames */
+};
+
/*
* Using stack depot requires its initialization, which can be done in 3 ways:
*
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index d35edac430c4..1343d3095bc1 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -30,40 +30,6 @@
#include <linux/memblock.h>
#include <linux/kasan-enabled.h>
-#define DEPOT_HANDLE_BITS (sizeof(depot_stack_handle_t) * 8)
-
-#define DEPOT_VALID_BITS 1
-#define DEPOT_POOL_ORDER 2 /* Pool size order, 4 pages */
-#define DEPOT_POOL_SIZE (1LL << (PAGE_SHIFT + DEPOT_POOL_ORDER))
-#define DEPOT_STACK_ALIGN 4
-#define DEPOT_OFFSET_BITS (DEPOT_POOL_ORDER + PAGE_SHIFT - DEPOT_STACK_ALIGN)
-#define DEPOT_POOL_INDEX_BITS (DEPOT_HANDLE_BITS - DEPOT_VALID_BITS - \
- DEPOT_OFFSET_BITS - STACK_DEPOT_EXTRA_BITS)
-#define DEPOT_POOLS_CAP 8192
-#define DEPOT_MAX_POOLS \
- (((1LL << (DEPOT_POOL_INDEX_BITS)) < DEPOT_POOLS_CAP) ? \
- (1LL << (DEPOT_POOL_INDEX_BITS)) : DEPOT_POOLS_CAP)
-
-/* Compact structure that stores a reference to a stack. */
-union handle_parts {
- depot_stack_handle_t handle;
- struct {
- u32 pool_index : DEPOT_POOL_INDEX_BITS;
- u32 offset : DEPOT_OFFSET_BITS;
- u32 valid : DEPOT_VALID_BITS;
- u32 extra : STACK_DEPOT_EXTRA_BITS;
- };
-};
-
-struct stack_record {
- struct stack_record *next; /* Link in the hash table */
- u32 hash; /* Hash in the hash table */
- u32 size; /* Number of stored frames */
- union handle_parts handle;
- refcount_t count; /* Number of the same repeated stacks */
- unsigned long entries[]; /* Variable-sized array of frames */
-};
-
static bool stack_depot_disabled;
static bool __stack_depot_early_init_requested __initdata = IS_ENABLED(CONFIG_STACKDEPOT_ALWAYS_INIT);
static bool __stack_depot_early_init_passed __initdata;
--
2.42.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
@ 2023-11-20 8:42 ` Oscar Salvador
2023-11-20 8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
2023-11-20 9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
4 siblings, 0 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20 8:42 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
Eric Dumazet, Alexander Potapenko, Oscar Salvador
We are not interested in the owner of each individual pfn,
but how many outstanding allocations are there for each unique
allocating stack trace.
Right now, getting that information is quite hard as one needs
to fiddle with page_owner output, screen through pfns and make
the links.
So, instead, let us add a new file called 'page_owner_stacks'
that shows just that.
Such file will only show the stacktrace once followed by its
counter, which represents the number of outstanding allocations.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
include/linux/stackdepot.h | 2 +
lib/stackdepot.c | 29 +++++++++++++
mm/page_owner.c | 87 ++++++++++++++++++++++++++++++++++++++
3 files changed, 118 insertions(+)
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 269a828a5e94..09c478b1bf73 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -146,6 +146,8 @@ void stack_depot_dec_count(depot_stack_handle_t handle);
depot_stack_handle_t stack_depot_save(unsigned long *entries,
unsigned int nr_entries, gfp_t gfp_flags);
+struct stack_record *stack_depot_get_next_stack(unsigned long *table,
+ struct stack_record *curr_stack);
/**
* stack_depot_fetch - Fetch a stack trace from stack depot
*
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 1343d3095bc1..65708c0c1e50 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -452,6 +452,35 @@ static struct stack_record *stack_depot_getstack(depot_stack_handle_t handle)
return stack;
}
+struct stack_record *stack_depot_get_next_stack(unsigned long *table,
+ struct stack_record *curr_stack)
+{
+ unsigned long nr_table = *table;
+ struct stack_record *next = NULL, **stacks;
+ unsigned long stack_table_entries = stack_hash_mask + 1;
+
+ if (!curr_stack) {
+ if (nr_table) {
+new_table:
+ nr_table++;
+ if (nr_table >= stack_table_entries)
+ goto out;
+ }
+ stacks = &stack_table[nr_table];
+ curr_stack = (struct stack_record *)stacks;
+ next = curr_stack;
+ } else {
+ next = curr_stack->next;
+ }
+
+ if (!next)
+ goto new_table;
+
+out:
+ *table = nr_table;
+ return next;
+}
+
unsigned int stack_depot_fetch(depot_stack_handle_t handle,
unsigned long **entries)
{
diff --git a/mm/page_owner.c b/mm/page_owner.c
index d53316d0d9be..509c11e506db 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -719,6 +719,91 @@ static const struct file_operations proc_page_owner_operations = {
.llseek = lseek_page_owner,
};
+static void *stack_start(struct seq_file *m, loff_t *ppos)
+{
+ unsigned long *nr_table = m->private;
+ void *stack;
+
+ /* First time */
+ if (*ppos == 0)
+ *nr_table = 0;
+
+ if (*ppos == -1UL)
+ return NULL;
+
+ stack = stack_depot_get_next_stack(nr_table, NULL);
+
+ return stack;
+}
+
+static void *stack_next(struct seq_file *m, void *v, loff_t *ppos)
+{
+ unsigned long *nr_table = m->private;
+ void *next_stack;
+
+ next_stack = stack_depot_get_next_stack(nr_table, v);
+ *ppos = next_stack ? *ppos + 1 : -1UL;
+
+ return next_stack;
+}
+
+static int stack_depot_get_stack_info(struct stack_record *stack, char *buf)
+{
+ if (!stack->size || stack->size < 0 ||
+ stack->size > PAGE_SIZE || stack->handle.valid != 1 ||
+ refcount_read(&stack->count) < 1)
+ return 0;
+
+ return stack_trace_snprint(buf, PAGE_SIZE, stack->entries, stack->size, 0);
+}
+
+static int stack_print(struct seq_file *m, void *v)
+{
+ char *buf;
+ int ret = 0;
+ struct stack_record *stack = (struct stack_record *)v;
+
+ buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+ ret += stack_depot_get_stack_info(stack, buf);
+ if (!ret)
+ goto out;
+
+ scnprintf(buf + ret, PAGE_SIZE - ret, "stack_count: %d\n\n",
+ refcount_read(&stack->count));
+
+ seq_printf(m, buf);
+ seq_puts(m, "\n\n");
+out:
+ kfree(buf);
+
+ return 0;
+}
+
+static void stack_stop(struct seq_file *m, void *v)
+{
+}
+
+static const struct seq_operations page_owner_stack_op = {
+ .start = stack_start,
+ .next = stack_next,
+ .stop = stack_stop,
+ .show = stack_print
+};
+
+static int page_owner_stack_open(struct inode *inode, struct file *file)
+{
+ return seq_open_private(file, &page_owner_stack_op,
+ sizeof(unsigned long));
+}
+
+const struct file_operations page_owner_stack_operations = {
+ .open = page_owner_stack_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
static int __init pageowner_init(void)
{
if (!static_branch_unlikely(&page_owner_inited)) {
@@ -728,6 +813,8 @@ static int __init pageowner_init(void)
debugfs_create_file("page_owner", 0400, NULL, NULL,
&proc_page_owner_operations);
+ debugfs_create_file("page_owner_stacks", 0400, NULL, NULL,
+ &page_owner_stack_operations);
return 0;
}
--
2.42.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
` (2 preceding siblings ...)
2023-11-20 8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
@ 2023-11-20 8:43 ` Oscar Salvador
2023-11-20 14:17 ` kernel test robot
2023-11-20 19:52 ` kernel test robot
2023-11-20 9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
4 siblings, 2 replies; 9+ messages in thread
From: Oscar Salvador @ 2023-11-20 8:43 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Vlastimil Babka,
Waiman Long, Suren Baghdasaryan, Marco Elver, Andrey Konovalov,
Eric Dumazet, Alexander Potapenko, Oscar Salvador
We want to be able to filter out the output based on a certain
threshold, so we can leave out stacks with a low counter of
outstanding allocations.
We can control the threshold value by a new file called
'page_owner_threshold', which is 0 by default.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
mm/page_owner.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 509c11e506db..6ee345dafd01 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -41,6 +41,8 @@ static depot_stack_handle_t dummy_handle;
static depot_stack_handle_t failure_handle;
static depot_stack_handle_t early_handle;
+static unsigned long page_owner_stack_threshold;
+
static void init_early_allocated_pages(void);
static int __init early_page_owner_param(char *buf)
@@ -763,6 +765,9 @@ static int stack_print(struct seq_file *m, void *v)
int ret = 0;
struct stack_record *stack = (struct stack_record *)v;
+ if (refcount_read(&stack->count) < page_owner_stack_threshold)
+ return 0;
+
buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
ret += stack_depot_get_stack_info(stack, buf);
@@ -804,6 +809,21 @@ const struct file_operations page_owner_stack_operations = {
.release = seq_release,
};
+int page_owner_threshold_get(void *data, u64 *val)
+{
+ *val = page_owner_stack_threshold;
+ return 0;
+}
+
+int page_owner_threshold_set(void *data, u64 val)
+{
+ page_owner_stack_threshold = val;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(proc_page_owner_threshold, &page_owner_threshold_get,
+ &page_owner_threshold_set, "%llu");
+
static int __init pageowner_init(void)
{
if (!static_branch_unlikely(&page_owner_inited)) {
@@ -815,6 +835,8 @@ static int __init pageowner_init(void)
&proc_page_owner_operations);
debugfs_create_file("page_owner_stacks", 0400, NULL, NULL,
&page_owner_stack_operations);
+ debugfs_create_file("page_owner_threshold", 0600, NULL, NULL,
+ &proc_page_owner_threshold);
return 0;
}
--
2.42.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v6 0/4] page_owner: print stacks and their counter
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
` (3 preceding siblings ...)
2023-11-20 8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
@ 2023-11-20 9:07 ` Vlastimil Babka
2023-11-20 17:54 ` Andrey Konovalov
4 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2023-11-20 9:07 UTC (permalink / raw)
To: Oscar Salvador, Andrew Morton
Cc: linux-kernel, linux-mm, Michal Hocko, Waiman Long,
Suren Baghdasaryan, Marco Elver, Andrey Konovalov, Eric Dumazet,
Alexander Potapenko
On 11/20/23 09:42, Oscar Salvador wrote:
> Changes v5 -> v6:
> - Rebase on top of v6.7-rc1
Hi,
I think at this point it would be better to rebase on top of
https://lore.kernel.org/all/cover.1698077459.git.andreyknvl@google.com/
which already contains the refcount field in stack_record.
(and maybe help with the review of that series as well? Hopefully it can get
to mm/unstable and -next towards 6.8 soon :)
Vlastimil
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
2023-11-20 8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
@ 2023-11-20 14:17 ` kernel test robot
2023-11-20 19:52 ` kernel test robot
1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-11-20 14:17 UTC (permalink / raw)
To: Oscar Salvador, Andrew Morton
Cc: llvm, oe-kbuild-all, Linux Memory Management List, linux-kernel,
Michal Hocko, Vlastimil Babka, Waiman Long, Suren Baghdasaryan,
Marco Elver, Andrey Konovalov, Eric Dumazet, Alexander Potapenko,
Oscar Salvador
Hi Oscar,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.7-rc2 next-20231120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/lib-stackdepot-Add-a-refcount-field-in-stack_record/20231120-164913
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20231120084300.4368-5-osalvador%40suse.de
patch subject: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
config: arm-randconfig-001-20231120 (https://download.01.org/0day-ci/archive/20231120/202311202220.b6pHpn7J-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231120/202311202220.b6pHpn7J-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311202220.b6pHpn7J-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> mm/page_owner.c:819:5: warning: no previous prototype for function 'page_owner_threshold_get' [-Wmissing-prototypes]
819 | int page_owner_threshold_get(void *data, u64 *val)
| ^
mm/page_owner.c:819:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
819 | int page_owner_threshold_get(void *data, u64 *val)
| ^
| static
>> mm/page_owner.c:825:5: warning: no previous prototype for function 'page_owner_threshold_set' [-Wmissing-prototypes]
825 | int page_owner_threshold_set(void *data, u64 val)
| ^
mm/page_owner.c:825:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
825 | int page_owner_threshold_set(void *data, u64 val)
| ^
| static
2 warnings generated.
vim +/page_owner_threshold_get +819 mm/page_owner.c
818
> 819 int page_owner_threshold_get(void *data, u64 *val)
820 {
821 *val = page_owner_stack_threshold;
822 return 0;
823 }
824
> 825 int page_owner_threshold_set(void *data, u64 val)
826 {
827 page_owner_stack_threshold = val;
828 return 0;
829 }
830
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v6 0/4] page_owner: print stacks and their counter
2023-11-20 9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
@ 2023-11-20 17:54 ` Andrey Konovalov
0 siblings, 0 replies; 9+ messages in thread
From: Andrey Konovalov @ 2023-11-20 17:54 UTC (permalink / raw)
To: Oscar Salvador
Cc: Andrew Morton, linux-kernel, linux-mm, Michal Hocko, Waiman Long,
Suren Baghdasaryan, Marco Elver, Eric Dumazet,
Alexander Potapenko, Vlastimil Babka
On Mon, Nov 20, 2023 at 10:07 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 11/20/23 09:42, Oscar Salvador wrote:
> > Changes v5 -> v6:
> > - Rebase on top of v6.7-rc1
>
> Hi,
>
> I think at this point it would be better to rebase on top of
> https://lore.kernel.org/all/cover.1698077459.git.andreyknvl@google.com/
> which already contains the refcount field in stack_record.
> (and maybe help with the review of that series as well? Hopefully it can get
> to mm/unstable and -next towards 6.8 soon :)
Yes, please; I just mailed a v4 that is based on v6.7-rc2:
https://lore.kernel.org/linux-mm/cover.1700502145.git.andreyknvl@google.com/T/#t
You should also be able to use the new
stack_depot_save_flags(STACK_DEPOT_FLAG_GET)/stack_depot_put API to
keep only the relevant stack traces.
Thank you!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
2023-11-20 8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
2023-11-20 14:17 ` kernel test robot
@ 2023-11-20 19:52 ` kernel test robot
1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-11-20 19:52 UTC (permalink / raw)
To: Oscar Salvador, Andrew Morton
Cc: oe-kbuild-all, Linux Memory Management List, linux-kernel,
Michal Hocko, Vlastimil Babka, Waiman Long, Suren Baghdasaryan,
Marco Elver, Andrey Konovalov, Eric Dumazet, Alexander Potapenko,
Oscar Salvador
Hi Oscar,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on linus/master v6.7-rc2 next-20231120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/lib-stackdepot-Add-a-refcount-field-in-stack_record/20231120-164913
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20231120084300.4368-5-osalvador%40suse.de
patch subject: [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter
config: arc-allmodconfig (https://download.01.org/0day-ci/archive/20231121/202311210354.BwQoLzVO-lkp@intel.com/config)
compiler: arceb-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231121/202311210354.BwQoLzVO-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311210354.BwQoLzVO-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> mm/page_owner.c:819:5: warning: no previous prototype for 'page_owner_threshold_get' [-Wmissing-prototypes]
819 | int page_owner_threshold_get(void *data, u64 *val)
| ^~~~~~~~~~~~~~~~~~~~~~~~
>> mm/page_owner.c:825:5: warning: no previous prototype for 'page_owner_threshold_set' [-Wmissing-prototypes]
825 | int page_owner_threshold_set(void *data, u64 val)
| ^~~~~~~~~~~~~~~~~~~~~~~~
vim +/page_owner_threshold_get +819 mm/page_owner.c
818
> 819 int page_owner_threshold_get(void *data, u64 *val)
820 {
821 *val = page_owner_stack_threshold;
822 return 0;
823 }
824
> 825 int page_owner_threshold_set(void *data, u64 val)
826 {
827 page_owner_stack_threshold = val;
828 return 0;
829 }
830
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-11-20 19:55 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-20 8:42 [PATCH v6 0/4] page_owner: print stacks and their counter Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 1/4] lib/stackdepot: Add a refcount field in stack_record Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 2/4] lib/stackdepot: Move stack_record struct definition into the header Oscar Salvador
2023-11-20 8:42 ` [PATCH v6 3/4] mm,page_owner: Add page_owner_stacks file to print out only stacks and their counter Oscar Salvador
2023-11-20 8:43 ` [PATCH v6 4/4] mm,page_owner: Filter out stacks by a threshold counter Oscar Salvador
2023-11-20 14:17 ` kernel test robot
2023-11-20 19:52 ` kernel test robot
2023-11-20 9:07 ` [PATCH v6 0/4] page_owner: print stacks and their counter Vlastimil Babka
2023-11-20 17:54 ` Andrey Konovalov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox