Re: [PATCH v2] watchdog/mm: Allow dumping memory info in pretimeout

From: Vincent Whitchurch <Vincent.Whitchurch@axis.com>
To: "linux@roeck-us.net" <linux@roeck-us.net>,
	"wim@linux-watchdog.org" <wim@linux-watchdog.org>,
	Vincent Whitchurch <Vincent.Whitchurch@axis.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	kernel <kernel@axis.com>,
	"linux-watchdog@vger.kernel.org" <linux-watchdog@vger.kernel.org>
Subject: Re: [PATCH v2] watchdog/mm: Allow dumping memory info in pretimeout
Date: Wed, 14 Jun 2023 07:42:46 +0000
Message-ID: <041dcfd3d4e45c387fa1f6f49f53ccb59967b104.camel@axis.com>
In-Reply-To: <41ecdf8d-59be-ded0-1ace-0a7cadabbcc3@roeck-us.net>

On Mon, 2023-06-12 at 07:53 -0700, Guenter Roeck wrote:
> On 6/12/23 00:26, Vincent Whitchurch wrote:
> > On my (embedded) systems, the most common cause of hitting the watchdog
> > (pre)timeout is due to thrashing.  Diagnosing these problems is hard
> > without knowing the memory state at the point of the watchdog hit.  In
> > order to make this information available, add a module parameter to the
> > watchdog pretimeout panic governor to ask it to dump memory info and the
> > OOM task list (using a new helper in the OOM code) before triggering the
> > panic.
> 
> Personally I don't think this is the right way of approaching this problem.
> First, the userspace task controlling the watchdog should run as realtime
> task, forced to be in memory, and not be affected by thrashing.

That may not be appropriate in all cases since you may want the watchdog
to hit when the system as a whole really is unusable.

> Second, the problem should be observable well before the watchdog fires.

Yes, there are ways to try to detect it earlier (e.g. PSI) and attempt
recovery, even if the kernel's OOM killer itself is very slow to react.
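
To illustrate the kind of early detection meant here, a minimal userspace
sketch that reads the kernel's PSI memory file follows.  It assumes a
kernel built with CONFIG_PSI so that /proc/pressure/memory exists.  The
helper names and the 10% threshold are hypothetical, chosen only for this
example, and none of this is part of the patch.

```python
from pathlib import Path

# Exposed by kernels built with CONFIG_PSI; absent otherwise.
PSI_MEMORY = Path("/proc/pressure/memory")


def parse_psi(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}.

    Each input line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def memory_under_pressure(text, threshold=10.0):
    """Return True if the 'full' avg10 stall percentage reaches threshold.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    psi = parse_psi(text)
    return psi.get("full", {}).get("avg10", 0.0) >= threshold


if __name__ == "__main__":
    if PSI_MEMORY.exists():
        print(parse_psi(PSI_MEMORY.read_text()))
    else:
        print("PSI is not available on this kernel")
```

A monitoring daemon could poll this and attempt recovery well before the
watchdog pretimeout is reached.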

But if those attempts fail for whatever reason and we actually do end up
hitting the watchdog, something like this patch provides information
which is invaluable for diagnosing the problem.

> Last but not least, I don't think it is appropriate to intertwine
> watchdog code with oom handling code as suggested here.

The show_mem() function is in lib/, so it is already outside of the OOM
handling code.  The oom_dump_tasks() function could perhaps be refactored
and moved to a neutral location, which would avoid the intertwining.
