linux-mm.kvack.org archive mirror
* [PATCH] mm/oom_kill: add sysctl_oom_dump_stack to control kernel stack dumping on OOM
@ 2025-12-11 20:24 Yang Xin
  2025-12-11 20:34 ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Yang Xin @ 2025-12-11 20:24 UTC (permalink / raw)
  To: akpm, mhocko, rientjes, shakeel.butt; +Cc: linux-mm, linux-kernel, Yang Xin

    Most OOM kills triggered by user-space processes produce kernel stack
    traces that are not helpful for diagnosing the root cause. These traces
    usually just show the page fault handler or system call entry.

    Furthermore, dump_stack() can be expensive. It often runs with
    interrupts disabled or holds the console lock for a long time,
    potentially causing system latencies and preventing the system from
    responding to other events.

    This patch adds a new sysctl vm.oom_dump_stack to control this
    behavior. Writing '0' to /proc/sys/vm/oom_dump_stack suppresses the
    kernel stack dump during OOM kills, while '1' (the default) preserves
    the existing behavior.

Signed-off-by: Yang Xin <redleaf@linux.alibaba.com>
---
 mm/oom_kill.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5eb11fbba704..a51dbd2e6912 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -56,6 +56,7 @@
 static int sysctl_panic_on_oom;
 static int sysctl_oom_kill_allocating_task;
 static int sysctl_oom_dump_tasks = 1;
+static int sysctl_oom_dump_stack = 1;
 
 /*
  * Serializes oom killer invocations (out_of_memory()) from all contexts to
@@ -464,7 +465,9 @@ static void dump_header(struct oom_control *oc)
 	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
 		pr_warn("COMPACTION is disabled!!!\n");
 
-	dump_stack();
+	if (sysctl_oom_dump_stack)
+		dump_stack();
+
 	if (is_memcg_oom(oc))
 		mem_cgroup_print_oom_meminfo(oc->memcg);
 	else {
@@ -736,6 +739,13 @@ static const struct ctl_table vm_oom_kill_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "oom_dump_stack",
+		.data		= &sysctl_oom_dump_stack,
+		.maxlen		= sizeof(sysctl_oom_dump_stack),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 };
 #endif
 
-- 
2.30.2
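
For readers who want to try the knob described in the commit message, the following is a minimal userspace sketch of toggling it from C, assuming a kernel with this patch applied (the file /proc/sys/vm/oom_dump_stack does not exist otherwise). The helper names write_int_knob() and read_int_knob() are hypothetical and are not part of the patch; writing to the real sysctl file requires root.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Write an integer to a sysctl-style file.
 * Returns 0 on success or a negative errno value on failure.
 * (Hypothetical helper, not part of the patch.)
 */
static int write_int_knob(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/* The write may only reach the file at close time, so check it. */
	if (fclose(f) != 0 && !err)
		err = -errno;
	return err;
}

/*
 * Read the integer back from the same kind of file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int read_int_knob(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int err = 0;

	if (!f)
		return -errno;
	if (fscanf(f, "%d", value) != 1)
		err = -EINVAL;
	fclose(f);
	return err;
}
```

With the patch applied, write_int_knob("/proc/sys/vm/oom_dump_stack", 0) would suppress the stack dump and writing 1 would restore the default. Note that the table entry uses proc_dointvec, so values other than 0 and 1 are also accepted and any non-zero value enables the dump.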




* Re: [PATCH] mm/oom_kill: add sysctl_oom_dump_stack to control kernel stack dumping on OOM
  2025-12-11 20:24 [PATCH] mm/oom_kill: add sysctl_oom_dump_stack to control kernel stack dumping on OOM Yang Xin
@ 2025-12-11 20:34 ` Michal Hocko
  2025-12-14  1:55   ` David Rientjes
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2025-12-11 20:34 UTC (permalink / raw)
  To: Yang Xin; +Cc: akpm, rientjes, shakeel.butt, linux-mm, linux-kernel, Yang Xin

On Fri 12-12-25 04:24:33, Yang Xin wrote:
>     Most OOM kills triggered by user-space processes produce kernel stack
>     traces that are not helpful for diagnosing the root cause. These traces
>     usually just show the page fault handler or system call entry.
> 
>     Furthermore, dump_stack() can be expensive. It often runs with
>     interrupts disabled or holds the console lock for a long time,
>     potentially causing system latencies and preventing the system from
>     responding to other events.
> 
>     This patch adds a new sysctl vm.oom_dump_stack to control this
>     behavior. Writing '0' to /proc/sys/vm/oom_dump_stack suppresses the
>     kernel stack dump during OOM kills, while '1' (the default) preserves
>     the existing behavior.

While I do not fundamentally object to ways of suppressing stack traces
for OOM, I would really like to hear more about what kind of overhead we
are talking about here (stack traces are reported for tracing and in
other low-latency situations) and why this matters for a path as cold as
OOM.

Also, we are getting way too many of these sysctls. Maybe it is time to
look for a more customizable way to configure OOM output that doesn't
require a sysctl per output feature.

> Signed-off-by: Yang Xin <redleaf@linux.alibaba.com>
> ---
>  mm/oom_kill.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5eb11fbba704..a51dbd2e6912 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -56,6 +56,7 @@
>  static int sysctl_panic_on_oom;
>  static int sysctl_oom_kill_allocating_task;
>  static int sysctl_oom_dump_tasks = 1;
> +static int sysctl_oom_dump_stack = 1;
>  
>  /*
>   * Serializes oom killer invocations (out_of_memory()) from all contexts to
> @@ -464,7 +465,9 @@ static void dump_header(struct oom_control *oc)
>  	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
>  		pr_warn("COMPACTION is disabled!!!\n");
>  
> -	dump_stack();
> +	if (sysctl_oom_dump_stack)
> +		dump_stack();
> +
>  	if (is_memcg_oom(oc))
>  		mem_cgroup_print_oom_meminfo(oc->memcg);
>  	else {
> @@ -736,6 +739,13 @@ static const struct ctl_table vm_oom_kill_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec,
>  	},
> +	{
> +                .procname       = "oom_dump_stack",
> +                .data           = &sysctl_oom_dump_stack,
> +                .maxlen         = sizeof(sysctl_oom_dump_stack),
> +                .mode           = 0644,
> +                .proc_handler   = proc_dointvec,
> +        },
>  };
>  #endif
>  
> -- 
> 2.30.2
> 

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm/oom_kill: add sysctl_oom_dump_stack to control kernel stack dumping on OOM
  2025-12-11 20:34 ` Michal Hocko
@ 2025-12-14  1:55   ` David Rientjes
  0 siblings, 0 replies; 3+ messages in thread
From: David Rientjes @ 2025-12-14  1:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yang Xin, akpm, shakeel.butt, linux-mm, linux-kernel, Yang Xin

On Thu, 11 Dec 2025, Michal Hocko wrote:

> On Fri 12-12-25 04:24:33, Yang Xin wrote:
> >     Most OOM kills triggered by user-space processes produce kernel stack
> >     traces that are not helpful for diagnosing the root cause. These traces
> >     usually just show the page fault handler or system call entry.
> > 
> >     Furthermore, dump_stack() can be expensive. It often runs with
> >     interrupts disabled or holds the console lock for a long time,
> >     potentially causing system latencies and preventing the system from
> >     responding to other events.
> > 
> >     This patch adds a new sysctl vm.oom_dump_stack to control this
> >     behavior. Writing '0' to /proc/sys/vm/oom_dump_stack suppresses the
> >     kernel stack dump during OOM kills, while '1' (the default) preserves
> >     the existing behavior.
> 
> While I do not fundamentally object to ways of suppressing stack traces
> for OOM, I would really like to hear more about what kind of overhead we
> are talking about here (stack traces are reported for tracing and in
> other low-latency situations) and why this matters for a path as cold as
> OOM.
> 
> Also, we are getting way too many of these sysctls. Maybe it is time to
> look for a more customizable way to configure OOM output that doesn't
> require a sysctl per output feature.
> 

Strongly agree; I don't think this requires yet another sysctl.  It's also
global, so it will affect all OOM kills, including the ones where a stack
trace would actually be really helpful for understanding the issue.
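
One possible reading of the "more customizable way" suggested in this thread is a single bitmask that selects which sections of the OOM report are printed, instead of one boolean sysctl per section. Nobody in the thread proposes this concretely; the following is only a userspace sketch of the idea, and every name in it (enum oom_report_flag, oom_report_enabled(), dump_header_sketch()) is hypothetical and does not exist in the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-section flags; these names are illustrative only. */
enum oom_report_flag {
	OOM_REPORT_STACK   = 1 << 0,	/* the dump_stack() call in dump_header() */
	OOM_REPORT_MEMINFO = 1 << 1,	/* memory state summary */
	OOM_REPORT_TASKS   = 1 << 2,	/* per-task listing (cf. oom_dump_tasks) */
	OOM_REPORT_ALL     = OOM_REPORT_STACK | OOM_REPORT_MEMINFO |
			     OOM_REPORT_TASKS,
};

/* True if the given section is selected in the mask. */
static bool oom_report_enabled(unsigned int mask, enum oom_report_flag flag)
{
	return (mask & (unsigned int)flag) != 0;
}

/*
 * Userspace stand-in for dump_header(): prints a placeholder line for
 * each enabled section and returns how many sections were printed.
 */
static int dump_header_sketch(unsigned int mask)
{
	int printed = 0;

	if (oom_report_enabled(mask, OOM_REPORT_STACK)) {
		puts("[stack trace]");
		printed++;
	}
	if (oom_report_enabled(mask, OOM_REPORT_MEMINFO)) {
		puts("[memory info]");
		printed++;
	}
	if (oom_report_enabled(mask, OOM_REPORT_TASKS)) {
		puts("[task list]");
		printed++;
	}
	return printed;
}
```

For example, dump_header_sketch(OOM_REPORT_ALL & ~OOM_REPORT_STACK) prints the memory info and task list but skips the stack trace. A mask like this would reduce the number of knobs, but it would still be global, so it would not by itself address the concern that the setting affects every OOM kill.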

