From: Andrew Morton <akpm@linux-foundation.org>
To: Lance Yang <ioworker0@gmail.com>
Cc: cunhuang@tencent.com, leonylgao@tencent.com,
j.granados@samsung.com, jsiddle@redhat.com,
kent.overstreet@linux.dev, 21cnbao@gmail.com,
ryan.roberts@arm.com, david@redhat.com, ziy@nvidia.com,
libang.li@antgroup.com, baolin.wang@linux.alibaba.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 0/2] hung_task: add detect count for hung tasks
Date: Wed, 23 Oct 2024 21:28:15 -0700
Message-ID: <20241023212815.240844bdf83e4dc17b66b88c@linux-foundation.org>
In-Reply-To: <CAK1f24mGk4pCqf37zXaZbqbTOzLVBqRNnGmf4wEUA9MGYFGoig@mail.gmail.com>

On Thu, 24 Oct 2024 11:28:01 +0800 Lance Yang <ioworker0@gmail.com> wrote:
> Hi Andrew,
>
> Thanks a lot for paying attention!
>
> On Thu, Oct 24, 2024 at 10:05 AM Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> > On Tue, 22 Oct 2024 19:47:34 +0800 Lance Yang <ioworker0@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > This patchset adds a counter, hung_task_detect_count, to track the number of
> > > times hung tasks are detected. This counter provides a straightforward way
> > > to monitor hung task events without manually checking dmesg logs.
> > >
> > > With this counter in place, system issues can be spotted quickly, allowing
> > > admins to step in promptly before system load spikes occur, even if the
> > > hung_task_warnings value has been decreased to 0 well before.
> > >
> > > Recently, we encountered a situation where warnings about hung tasks were
> > > buried in dmesg logs during load spikes. Introducing this counter could
> > > have helped us detect such issues earlier and improve our analysis efficiency.
> > >
> >
> > Isn't the answer to this problem "write a better parser"? I mean,
>
> Yeah, I certainly agree that having a good parser is important, and I'm
> working on that as well ;)
>
> > we're providing userspace with information which is already available.
>
> IMHO, there are two reasons why this counter remains valuable:
>
> 1) It allows us to easily detect hung tasks in time before load spikes occur,
> using simple and common monitoring tools like Prometheus.

But the new sysctl_hung_task_detect_count counter gets incremented a
microsecond before the printk comes out. I don't understand the
difference.

> 2) It ensures that we remain aware of hung tasks even when the
> hung_task_warnings value has already been decreased to 0 well before.

That makes sense, I guess. But fleshing this out with a real
operational scenario would help persuade reviewers of the benefit of
this change.

So please describe the utility with full details - sell it to us!