From: Kent Overstreet <kent.overstreet@linux.dev>
To: David Wang <00107082@163.com>
Cc: Suren Baghdasaryan <surenb@google.com>,
	akpm@linux-foundation.org,  hannes@cmpxchg.org,
	pasha.tatashin@soleen.com, souravpanda@google.com,
	 vbabka@suse.cz, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] alloc_tag: add option to pick the first codetag along callchain
Date: Wed, 7 Jan 2026 11:13:25 -0500
Message-ID: <aV6BnhG9yjX10O27@moria.home.lan>
In-Reply-To: <5b356cfa.4ab3.19b97196d49.Coremail.00107082@163.com>

On Wed, Jan 07, 2026 at 02:16:24PM +0800, David Wang wrote:
> 
> At 2026-01-07 12:07:34, "Kent Overstreet" <kent.overstreet@linux.dev> wrote:
> >I'm curious why you need to change __filemap_get_folio()? In filesystem
> >land we just lump that under "pagecache", but I guess you're doing more
> >interesting things with it in driver land?
> 
> Oh, in [1] there is a report about a possible memory leak in cephfs (the issue is still open, tracked in [2]):
> a large chunk of memory could not be released even after dropping caches.
> Memory allocation profiling shows that memory belongs to __filemap_get_folio,
> something like
> >> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
> >> >         12M     2987 mm/execmem.c:41 func:execmem_vmalloc 
> >> >         12M        3 kernel/dma/pool.c:96 func:atomic_pool_expand 
> >> >         13M      751 mm/slub.c:3061 func:alloc_slab_page 
> >> >         16M        8 mm/khugepaged.c:1069 func:alloc_charge_folio 
> >> >         18M     4355 mm/memory.c:1190 func:folio_prealloc 
> >> >         24M     6119 mm/memory.c:1192 func:folio_prealloc 
> >> >         58M    14784 mm/page_ext.c:271 func:alloc_page_ext 
> >> >         61M    15448 mm/readahead.c:189 func:ractl_alloc_folio 
> >> >         79M     6726 mm/slub.c:3059 func:alloc_slab_page 
> >> >         11G  2674488 mm/filemap.c:2012 func:__filemap_get_folio
> 
> After adding codetag to __filemap_get_folio, it shows
> 
> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
> >         10M     2541 drivers/block/zram/zram_drv.c:1597 [zram] func:zram_meta_alloc
> >         12M     3001 mm/execmem.c:41 func:execmem_vmalloc
> >         12M     3605 kernel/fork.c:311 func:alloc_thread_stack_node 
> >         16M      992 mm/slub.c:3061 func:alloc_slab_page 
> >         20M    35544 lib/xarray.c:378 func:xas_alloc 
> >         31M     7704 mm/memory.c:1192 func:folio_prealloc 
> >         69M    17562 mm/memory.c:1190 func:folio_prealloc 
> >        104M     8212 mm/slub.c:3059 func:alloc_slab_page 
> >        124M    30075 mm/readahead.c:189 func:ractl_alloc_folio 
> >        2.6G   661392 fs/netfs/buffered_read.c:635 [netfs] func:netfs_write_begin 
> >
> 
> Helpful or not, I am not sure. So far no bug has been spotted in the cephfs write path.
> But at least it provides more information and narrows down the scope of suspicion.
> 
> 
> https://lore.kernel.org/lkml/2a9ba88e.3aa6.19b0b73dd4e.Coremail.00107082@163.com/  [1]
> https://tracker.ceph.com/issues/74156   [2]

Well, my first thought when looking at that is that memory allocation
profiling is unlikely to be any more help there. Once you're dealing
with the page cache, if you're looking at a genuine leak it would pretty
much have to be a folio refcount leak, and the code that leaked the ref
could be anything that touched that folio - you're looking at a pretty
wide scope.

Unfortunately, we're not great at visibility and introspection in mm/,
and refcount bugs tend to be hard in general.

Better mm introspection would be helpful to say definitively that you're
looking at a refcount leak, but then once that's determined it's still
going to be pretty painful to track down.

The approach I took in bcachefs for refcount bugs was to write a small
library that in debug mode splits a refcount into sub-refcounts, and
then enumerate every single codepath that takes refs and give each one a
distinct sub-ref - this means in debug mode we can instantly pinpoint
the function that's buggy (and even better, with the new CLASS() and
guard() stuff these sorts of bugs have been going away).
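
To make the idea concrete, here is a minimal userspace sketch of a
split refcount. It is not the bcachefs code: the names split_ref,
ref_site and the three example sites are hypothetical, and it uses C11
atomics rather than kernel primitives. Each get/put names the codepath
it belongs to, so an unbalanced counter points at the site that leaked.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical enumeration of every codepath that takes a reference. */
enum ref_site {
	REF_SITE_LOOKUP,
	REF_SITE_READAHEAD,
	REF_SITE_WRITEBACK,
	REF_SITE_NR,
};

static const char * const ref_site_names[REF_SITE_NR] = {
	[REF_SITE_LOOKUP]	= "lookup",
	[REF_SITE_READAHEAD]	= "readahead",
	[REF_SITE_WRITEBACK]	= "writeback",
};

struct split_ref {
	atomic_long	total;			/* the ordinary refcount */
	atomic_long	sub[REF_SITE_NR];	/* debug only: one counter per site */
};

static void split_ref_init(struct split_ref *r)
{
	atomic_init(&r->total, 0);
	for (int i = 0; i < REF_SITE_NR; i++)
		atomic_init(&r->sub[i], 0);
}

static void split_ref_get(struct split_ref *r, enum ref_site site)
{
	atomic_fetch_add(&r->sub[site], 1);
	atomic_fetch_add(&r->total, 1);
}

static void split_ref_put(struct split_ref *r, enum ref_site site)
{
	long old = atomic_fetch_sub(&r->sub[site], 1);

	/* A put without a matching get from the same site is a bug. */
	assert(old > 0);
	(void)old;
	atomic_fetch_sub(&r->total, 1);
}

/* Returns the first site still holding refs, or REF_SITE_NR if none. */
static enum ref_site split_ref_first_leak(struct split_ref *r)
{
	for (int i = 0; i < REF_SITE_NR; i++)
		if (atomic_load(&r->sub[i]) != 0)
			return (enum ref_site)i;
	return REF_SITE_NR;
}

static const char *split_ref_site_name(enum ref_site site)
{
	return ref_site_names[site];
}
```

When the total never drops to zero, split_ref_first_leak() says which
site still holds a reference instead of leaving you to audit every
caller.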

But grafting that onto folio refcounts would be a hell of a chore.

OTOH, converting code to CLASS() and guards is much more
straightforward - just a matter of writing little helpers if you need
them and then a bunch of mechanical conversions, and it's well worth it.
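
As a rough illustration of why that conversion removes a class of leaks,
here is a userspace sketch using the GCC/Clang cleanup attribute, which
is the mechanism the kernel's CLASS() and guard() helpers are built on.
The struct obj type, obj_get()/obj_put() and the OBJ_REF() macro are
hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>

/* Hypothetical refcounted object. */
struct obj {
	int refs;
};

static struct obj *obj_get(struct obj *o)
{
	o->refs++;
	return o;
}

static void obj_put(struct obj *o)
{
	assert(o->refs > 0);
	o->refs--;
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void obj_put_cleanup(struct obj **op)
{
	if (*op)
		obj_put(*op);
}

/* Take a ref that is dropped automatically when 'name' goes out of scope. */
#define OBJ_REF(name, o) \
	struct obj *name __attribute__((cleanup(obj_put_cleanup))) = obj_get(o)

static int use_obj(struct obj *o, int fail_early)
{
	OBJ_REF(ref, o);

	if (fail_early)
		return -1;	/* ref is dropped here, no explicit put needed */

	return ref->refs;	/* value is read first, then the ref is dropped */
}
```

Every return path in use_obj() releases the reference, so an early
error return cannot forget the put.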

But, I'm reading through the Ceph code, and it has /less/ code involving
folio refcounts than I would expect.

Has anyone checked if the bug reproduces without zswap? I've definitely
seen a lot of bug reports involving that code.


