From: Dave Chinner <david@fromorbit.com>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Michal Hocko <mhocko@suse.com>,
	Muchun Song <muchun.song@linux.dev>,
	Linux-MM <linux-mm@kvack.org>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
Date: Wed, 6 Dec 2023 19:16:04 +1100
Message-ID: <ZXAtxBKZmKhFxwYB@dread.disaster.area>
In-Reply-To: <ZWo7ncdgPsj6rP7_@P9FQF9L96D.corp.robot.car>

On Fri, Dec 01, 2023 at 12:01:33PM -0800, Roman Gushchin wrote:
> On Fri, Dec 01, 2023 at 12:18:44PM +1100, Dave Chinner wrote:
> > On Thu, Nov 30, 2023 at 11:01:23AM -0800, Roman Gushchin wrote:
> > > On Wed, Nov 29, 2023 at 10:21:49PM -0500, Kent Overstreet wrote:
> > > > On Thu, Nov 30, 2023 at 11:09:42AM +0800, Qi Zheng wrote:
> > > > > For non-bcachefs developers, who knows what those statistics mean?
> > 
> > > Ok, a simple question then:
> > > why can't you dump /proc/slabinfo after the OOM?
> > 
> > Taken to its logical conclusion, we arrive at:
> > 
> > 	OOM-kill doesn't need to output anything at all except for
> > 	what it killed because we can dump
> > 	/proc/{mem,zone,vmalloc,buddy,slab}info after the OOM....
> > 
> > As it is, even asking such a question shows that you haven't looked
> > at the OOM kill output for a long time - it already reports the slab
> > cache usage information for caches that are reclaimable.
> > 
> > That is, if too much accounted slab cache based memory consumption
> > is detected at OOM-kill, it will call dump_unreclaimable_slab() to
> > dump all the SLAB_RECLAIM_ACCOUNT caches (i.e. those with shrinkers)
> > to the console as part of the OOM-kill output.
> 
> You are right, I missed that, partly because most of the OOMs I had to deal
> with recently were memcg OOMs.
> 
> This changes my perspective on Kent's patches: if we dump this information
> already, it might not be a bad idea to do it more nicely. So I take my words
> back here.
> 
> > 
> > The problem Kent is trying to address is that this output *isn't
> > sufficient to debug shrinker based memory reclaim issues*. It hasn't
> > been for a long time, and so we've all got our own special debug
> > patches and methods for checking that shrinkers are doing what they
> > are supposed to. Kent is trying to formalise one of the more useful
> > general methods for exposing that internal information when OOM
> > occurs...
> > 
> > Indeed, I can think of several uses for a shrinker->to_text() output
> > that we simply cannot do right now.
> > 
> > Any shrinker that does garbage collection on something that is not a
> > pure slab cache (e.g. xfs buffer cache, xfs inode gc subsystem,
> > graphics memory allocators, binder, etc) has no visibility of the
> > actual memory being used by the subsystem in the OOM-kill output.
> > This information isn't in /proc/slabinfo, it's not accounted by a
> > SLAB_RECLAIM_ACCOUNT cache, and it's not accounted by anything in
> > the core mm statistics.
> > 
> > e.g. How does anyone other than an XFS expert know that the 500k of
> > active xfs_buf handles in the slab cache actually pins 15GB of
> > cached metadata allocated directly from the page allocator, not just
> > the 150MB of slab cache the handles take up?
> > 
> > Another example is that an inode can pin lots of heap memory (e.g.
> > for in-memory extent lists) and that may not be freeable until the
> > inode is reclaimed. So while the slab cache might not be excessively
> > large, we might have a million inodes with a billion cumulative
> > extents cached in memory and it is the heap memory consumed by the
> > cached extents that is consuming the 30GB of "missing" kernel memory
> > that is causing OOM-kills to occur.
> > 
> > How is a user or developer supposed to know when one of these
> > situations has occurred given the current lack of memory usage
> > introspection into subsystems?
> 
> What would be the proper solution to this problem from your point of view?
> What functionality/API can mm provide to make the life of fs developers
> better here?

What can we do better?

The first thing we can do better that comes to mind is to merge
Kent's patches that allow the shrinker owner to output debug
information when requested by the infrastructure.

Then we - the shrinker implementers - have some control of our own
destiny.  We can add whatever we need to solve shrinker and OOM
problems related to our shrinkers not doing the right thing.

But without that callout from the infrastructure and the
infrastructure to drive it at appropriate times, we will make zero
progress improving the situation. 
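
To make the shape of that callout concrete, here is a rough userspace
sketch. It is not the code from Kent's patches: text_buf,
shrinker_sketch, buf_cache_stats and report_shrinkers() are all
hypothetical names made up for illustration, with text_buf standing in
for the kernel's seq_buf. The idea is only that the infrastructure walks
the shrinkers and lets each owner print what it knows, such as page
allocator memory pinned by handles that live in a small slab cache.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's seq_buf: a bounded text buffer. */
struct text_buf {
	char	*buf;
	size_t	size;
	size_t	len;
};

/* Append formatted text; output that does not fit is silently truncated. */
static void text_buf_printf(struct text_buf *out, const char *fmt, ...)
{
	va_list args;
	int n;

	if (out->len >= out->size)
		return;

	va_start(args, fmt);
	n = vsnprintf(out->buf + out->len, out->size - out->len, fmt, args);
	va_end(args);

	if (n < 0)
		return;
	if ((size_t)n >= out->size - out->len)
		out->len = out->size - 1;	/* truncated, buffer is full */
	else
		out->len += (size_t)n;
}

/* Hypothetical shrinker with an optional owner-supplied debug callout. */
struct shrinker_sketch {
	const char	*name;
	void		*private_data;
	void		(*to_text)(struct text_buf *out,
				   struct shrinker_sketch *s);
};

/* Hypothetical per-subsystem counters the core mm cannot see. */
struct buf_cache_stats {
	unsigned long long	nr_handles;
	unsigned long long	slab_bytes;
	unsigned long long	pinned_page_bytes;
};

/* Owner side: report memory pinned outside the slab cache. */
static void buf_cache_to_text(struct text_buf *out, struct shrinker_sketch *s)
{
	const struct buf_cache_stats *st = s->private_data;

	text_buf_printf(out, "  handles: %llu\n", st->nr_handles);
	text_buf_printf(out, "  slab bytes: %llu\n", st->slab_bytes);
	text_buf_printf(out, "  pinned page bytes: %llu\n",
			st->pinned_page_bytes);
}

/* Infrastructure side: drive the callout, e.g. from an OOM report. */
static void report_shrinkers(struct text_buf *out,
			     struct shrinker_sketch *list, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		text_buf_printf(out, "%s:\n", list[i].name);
		if (list[i].to_text)
			list[i].to_text(out, &list[i]);
	}
}
```

A shrinker without a to_text method only gets its name printed, so
nothing is forced on owners that have nothing useful to say.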

Yes, the code may not be perfect and, yes, it may not be useful to
mm developers, but for the people who have to debug shrinker related
problems in production systems we need all the help we can get. We
certainly don't care if it isn't perfect, just having something we
can partially tailor to our individual needs is far, far better
than the current situation of nothing at all...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

