From: Kent Overstreet <kent.overstreet@linux.dev>
To: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Muchun Song <muchun.song@linux.dev>,
Linux-MM <linux-mm@kvack.org>,
linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
Date: Thu, 30 Nov 2023 20:47:45 -0500
Message-ID: <20231201014745.b2ud4w3ymztdtctu@moria.home.lan>
In-Reply-To: <ZWhEawxI1CT8stu9@tiehlicka>
On Thu, Nov 30, 2023 at 09:14:35AM +0100, Michal Hocko wrote:
> On Wed 29-11-23 18:11:47, Kent Overstreet wrote:
> > Considering that you're an MM guy, and that shrinkers are pretty much
> > universally used by _filesystem_ people - I'm not sure your experience
> > is the most relevant here?
>
> I really do not understand where you have concluded that. In those years
> of analysis I was not debugging my _own_ code. I was dealing with
> customer reports and I would not really blame them to specifically
> trigger any class of OOM reports.
I've also spent a considerable amount of time debugging OOM issues, and
a lot of that took a lot longer than it should have due to insufficient
visibility into what the system was doing.
I'm talking about things like tuning journal reclaim/writeback behaviour
(this is a tricky one! shrinkers can't shrink if all items are dirty,
but random update workloads really suffer if we're biasing too much in
favour of memory reclaim, i.e. limiting dirty ratio too much), or
debugging tests in fstests that really like to exhaust memory on just
the inode cache.
If you can take the time to understand what other people are trying to
do and share your own perspective on what you find useful - instead of
just saying "I've spent a lot of time on OOM reports and I haven't needed
any of this/this is just for debugging" - we'll be able to have a much
more productive discussion.
Regarding another point you guys have been making - that this is "just
for developers debugging their own code" - that's a terribly dismissive
attitude to take as well.
Debugging doesn't stop when we're done testing the code on our local
machine and push it out to be merged; we're constantly debugging our
own code as it is running in the wild based on sparse bug reports with
at most a dmesg log. That dmesg log needs to, whenever possible, have
all the information we need to debug the issue.
In bcachefs, I have made this principle a _high_ priority; when I have a
bug in front of me, if there are visibility improvements that would make
the issue easier to debug I prioritize those _first_, and then fix the
actual bug. That's been one of the guiding principles that have enabled
me to work efficiently.
Code should tell you _what_ went wrong when something goes wrong,
whenever possible. That's not just for ourselves as individual
developers; it makes our code more maintainable by the people that come
after us.
> > For one, the patchset adds tracking for when a shrinker was last asked
> > to free something, vs. when it was actually freed. So right there, we
> > can finally see at a glance when a shrinker has gotten stuck and which
> > one.
>
> The primary problem I have with this is how to decide whether to dump
> shrinker data and/or which shrinkers to mention. How do you know that it
> is the specific shrinker which has contributed to the OOM state?
> Printing that data unconditionally will very likely be just additional
> balast in most production situations. Sure if you are doing a filesystem
> development and you are tuning your specific shrinker then this might be
> a really important information to have. But then it is a debugging devel
> tool rather than something we want or need to have in a generic oom
> report.
Like I've mentioned before, this patchset only reports on the top 10
shrinkers, by number of objects. If we can plumb through reporting on
memory usage in _bytes_, that would help even more with deciding what to
report on.
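As a rough illustration of what "top 10 by number of objects" amounts to,
here is a minimal userspace sketch. It is not code from the patchset:
struct shrinker_stat, top_insert() and top_shrinkers() are hypothetical
names made up for this example, and it assumes each shrinker's object
count has already been collected into a plain array.

```c
#include <stddef.h>
#include <string.h>

#define TOP_N 10

/* Hypothetical stand-in for per-shrinker data; not the kernel's struct shrinker. */
struct shrinker_stat {
	const char	*name;
	unsigned long	nr_objects;
};

/*
 * Insert @s into @top, which is kept sorted by nr_objects in descending
 * order and holds at most TOP_N entries. Returns the new entry count.
 */
static size_t top_insert(struct shrinker_stat *top, size_t nr,
			 const struct shrinker_stat *s)
{
	size_t pos = nr;

	/* Walk up past every kept entry that is smaller than the new one. */
	while (pos > 0 && top[pos - 1].nr_objects < s->nr_objects)
		pos--;

	if (pos == TOP_N)
		return nr;	/* list is full and @s is not bigger than any entry */

	if (nr < TOP_N)
		nr++;

	/* Shift the tail down by one slot; the last entry falls off if full. */
	memmove(&top[pos + 1], &top[pos], (nr - pos - 1) * sizeof(*top));
	top[pos] = *s;
	return nr;
}

/* Pick the TOP_N largest entries of @all into @top; returns how many were kept. */
static size_t top_shrinkers(const struct shrinker_stat *all, size_t nr_all,
			    struct shrinker_stat *top)
{
	size_t nr = 0;

	for (size_t i = 0; i < nr_all; i++)
		nr = top_insert(top, nr, &all[i]);
	return nr;
}
```

A fixed-size sorted array needs no allocation, which is the kind of
property you'd want on a report path that runs when memory is already
exhausted. Sorting by bytes instead of objects would only change the
field being compared.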
> All that being said, I am with you on the fact that the oom report in
> its current form could see improvements.
I'm glad we're finally in agreement on something!
If you want to share your own ideas on what could be improved and what
you find useful, maybe we could find some more common ground.