From: Roman Gushchin <guro@fb.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-mm@kvack.org, Alexander Viro <viro@zeniv.linux.org.uk>,
Ingo Molnar <mingo@kernel.org>,
kernel-team@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status
Date: Thu, 28 Sep 2017 14:53:57 +0100
Message-ID: <20170928135357.GA8470@castle.DHCP.thefacebook.com>
In-Reply-To: <20170927163106.84b9622f183f087eff7f6da7@linux-foundation.org>
On Wed, Sep 27, 2017 at 04:31:06PM -0700, Andrew Morton wrote:
> On Wed, 20 Sep 2017 16:06:34 -0700 Roman Gushchin <guro@fb.com> wrote:
>
> > Right now there is no convenient way to check whether a process is
> > currently being coredumped.
> >
> > It might be necessary to recognize such a state to avoid killing
> > the process and ending up with a broken coredump.
> > Writing a large core might take significant time, and the process
> > is unresponsive while it is written, so it might be killed by timeout
> > if another process is monitoring and killing/restarting
> > hanging tasks.
> >
> > To make it possible to detect that a process is being
> > coredumped, we can expose a boolean CoreDumping flag
> > in /proc/pid/status.
> >
> > Example:
> > $ cat core.sh
> > #!/bin/sh
> >
> > echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> > sleep 1000 &
> > PID=$!
> >
> > cat /proc/$PID/status | grep CoreDumping
> > kill -ABRT $PID
> > sleep 1
> > cat /proc/$PID/status | grep CoreDumping
> >
> > $ ./core.sh
> > CoreDumping: 0
> > CoreDumping: 1
>
> I assume you have some real-world use case which benefits from this.
Sure, we're getting a noticeable number of corrupted coredump files
on machines in our fleet, just because processes are being killed
by timeout in the middle of writing the core.
We do have a process health check, and an agent is responsible
for restarting processes which do not respond to health check requests.
Writing a large coredump to disk can easily exceed a reasonable timeout
(especially on an overloaded machine).
This flag will allow the agent to identify processes which are being
coredumped, extend the timeout for them, and let them produce a full
coredump file.
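As a rough illustration of how a monitoring agent might use the flag,
here is a minimal sketch in shell. It is not our actual agent: the
function names (is_coredumping, health_timeout), the optional
status-file argument and the timeout values are all hypothetical.
It assumes only the "CoreDumping: 0/1" line shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the status file reports
# "CoreDumping: 1". The path defaults to /proc/<pid>/status; the second
# argument exists only so the parsing can be tried on a sample file.
is_coredumping() {
    pid=$1
    status_file=${2:-/proc/$pid/status}
    # Missing or unreadable file (e.g. process already gone): not dumping.
    [ -r "$status_file" ] || return 1
    flag=$(awk '$1 == "CoreDumping:" { print $2; exit }' "$status_file" 2>/dev/null)
    # An empty flag (kernel without this patch) also counts as not dumping.
    [ "$flag" = "1" ]
}

# Illustrative values in seconds, not taken from any real deployment.
NORMAL_TIMEOUT=30
DUMP_TIMEOUT=600

# Hypothetical policy: print the health check timeout to apply to a PID.
health_timeout() {
    if is_coredumping "$@"; then
        echo "$DUMP_TIMEOUT"
    else
        echo "$NORMAL_TIMEOUT"
    fi
}
```

An agent could call health_timeout "$PID" before deciding to kill an
unresponsive process. On a kernel without the flag, the sketch falls
back to the normal timeout.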
>
> > fs/proc/array.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
>
> A Documentation/ update would be appropriate? Include a brief mention of
> *why* someone might want to use this...
>
>
Here it is. Thank you!
--