linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Boris Burkov <boris@bur.io>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
	linux-mm@kvack.org, linmiaohe@huawei.com
Subject: Re: [PATCH RFC] mm: fix refcount check in mapping_evict_folio
Date: Wed, 14 Aug 2024 04:46:13 +0100
Message-ID: <ZrwohUyp85wtLK-I@casper.infradead.org>
In-Reply-To: <20240814032715.GA400993@zen.localdomain>

On Tue, Aug 13, 2024 at 08:27:15PM -0700, Boris Burkov wrote:
> On Wed, Aug 14, 2024 at 04:15:25AM +0100, Matthew Wilcox wrote:
> > On Tue, Aug 13, 2024 at 12:58:09PM -0700, Shakeel Butt wrote:
> > > > +	/*
> > > > +	 * The refcount will be elevated if any page in the folio is mapped.
> > > > +	 *
> > > > +	 * The refcounts break down as follows:
> > > > +	 * 1 per mapped page
> > > > +	 * 1 from folio_attach_private, if private is set
> > > > +	 * 1 from allocating the page in the first place
> > > > +	 * 1 from the caller
> > > > +	 */
> > > 
> > > I think the above explanation is correct at least from my code
> > > inspection. Most of the callers are related to memory failure. I would
> > > reword the "1 per mapped page" to "1 per page in page cache" or
> > > something as mapped here might mean mapped in page tables.
> > 
> > It's not though.  The "1 from allocating the page in the first place"
> > is donated to the page cache.  It's late here and I don't have the
> > ability to work through what's really going on here.
> 
> Can you explain what you mean by "donated to the page cache" more
> precisely?
> 
> Perhaps there is something better btrfs can do with its refcounting
> as it calls alloc_pages_bulk_array, then filemap_add_folio, and finally
> folio_attach_private. But I am not sure which of those refcounts we can
> (or should?) drop.

Look at how readahead works for normal files; ignore what btrfs is doing
because it's probably wrong.  I'm going to use the term "expected
refcount" because there may also be temporary speculative refcounts
from stale references (either GUP or pagecache).

                folio = filemap_alloc_folio(gfp_mask, 0);
(expected refcount 1)
                ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
(expected refcount 1 + nr_pages)
        read_pages(ractl);
                aops->readahead(rac);
... calls readahead_folio() which calls folio_put()
(expected refcount nr_pages)

if filesystem calls folio_attach_private(), add one to the expected
refcount.
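
The sequence above can be replayed as a small userspace C model. This is
only a sketch of the bookkeeping described in this mail, not kernel code:
struct toy_folio and every toy_* function are hypothetical stand-ins for
the kernel functions named above, and the real refcount is an atomic
counter rather than a plain long.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a folio; not the kernel's struct folio. */
struct toy_folio {
	long refcount;
	long nr_pages;
	bool has_private;
};

/* Models filemap_alloc_folio(): the allocator hands back one reference. */
static struct toy_folio toy_alloc_folio(long nr_pages)
{
	assert(nr_pages > 0);
	struct toy_folio f = {
		.refcount = 1,
		.nr_pages = nr_pages,
		.has_private = false,
	};
	return f;
}

/* Models filemap_add_folio(): the page cache takes nr_pages references. */
static void toy_add_to_page_cache(struct toy_folio *f)
{
	f->refcount += f->nr_pages;
}

/* Models the folio_put() reached through readahead_folio(). */
static void toy_readahead_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	f->refcount -= 1;
}

/* Models folio_attach_private(): private data pins one more reference. */
static void toy_attach_private(struct toy_folio *f)
{
	assert(!f->has_private);
	f->has_private = true;
	f->refcount += 1;
}
```

Calling these in order for a four-page folio steps the count through
1, 5, 4 and then 5 once private data is attached, matching the
"expected refcount" annotations in the trace.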

That's it.  Folios in the pagecache should have a refcount of nr_pages,
plus one if private data exists.  Every caller who has called
filemap_get_folio() holds an extra refcount.  Every user mapping of a
page adds one to the refcount (and to the mapcount).
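
Written out as arithmetic, the rule looks like the sketch below. The
function and parameter names are made up for illustration and are not
kernel API. The second helper shows how a caller might use the rule to
spot references it cannot account for, and is an assumption about how
such a check could be phrased, not a copy of any kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Expected refcount of a page cache folio, following the rules in this
 * mail: nr_pages for the page cache, one for private data, one per
 * caller holding a lookup reference, one per user mapping of a page.
 * Temporary speculative references are deliberately not modelled.
 */
static long expected_refcount(long nr_pages, bool has_private,
			      long nr_lookup_refs, long nr_user_mappings)
{
	assert(nr_pages > 0);
	assert(nr_lookup_refs >= 0 && nr_user_mappings >= 0);
	return nr_pages + (has_private ? 1 : 0) +
	       nr_lookup_refs + nr_user_mappings;
}

/*
 * True if the observed refcount is higher than the page cache, private
 * data and the caller's own references can explain, e.g. because a page
 * is mapped into userspace or someone holds a speculative reference.
 */
static bool has_unexplained_refs(long observed, long nr_pages,
				 bool has_private, long nr_caller_refs)
{
	return observed > expected_refcount(nr_pages, has_private,
					    nr_caller_refs, 0);
}
```

For a four-page folio with private data and one caller reference the
expected value is 4 + 1 + 1 = 6; two user mappings on top of that give 8,
which the second helper reports as unexplained.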

If btrfs superblocks have an extra refcount, they're wrong and should
have it put somewhere.


At some point, I intend to reduce the number of atomic operations we do
by having filemap_add_folio() increment by one fewer than it currently
does, and removing the folio_put() in readahead_folio().  I haven't been
brave enough to do that yet.

I also think we should not increment the refcount by nr_pages when we
add it to the page cache.  Incrementing by one should be sufficient.
And that would mean that we can just delete the "folio_ref_add()"
in __filemap_add_folio().
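
To compare the current scheme with the two ideas above, here is a toy
userspace model. It is a sketch under stated assumptions, not a patch:
the enum, struct and function names are hypothetical, ref_ops only counts
refcount updates made after allocation, and the "one fewer" case is
counted as one update even when the increment would be zero.

```c
#include <assert.h>

/* Hypothetical labels for the three schemes discussed in this mail. */
enum toy_scheme {
	TOY_CURRENT,	/* add nr_pages, then folio_put() in readahead_folio() */
	TOY_ONE_FEWER,	/* add nr_pages - 1, no folio_put() */
	TOY_SINGLE_REF,	/* allocation ref is donated, nothing is added */
};

struct toy_result {
	long refcount;	/* refcount once readahead has handed the folio over */
	int ref_ops;	/* refcount updates performed after allocation */
};

static struct toy_result toy_readahead_refs(enum toy_scheme scheme,
					    long nr_pages)
{
	/* Allocation always starts the folio at one reference. */
	struct toy_result r = { .refcount = 1, .ref_ops = 0 };

	assert(nr_pages > 0);
	switch (scheme) {
	case TOY_CURRENT:
		r.refcount += nr_pages;		/* filemap_add_folio() */
		r.ref_ops++;
		r.refcount -= 1;		/* folio_put() */
		r.ref_ops++;
		break;
	case TOY_ONE_FEWER:
		r.refcount += nr_pages - 1;	/* single, smaller add */
		r.ref_ops++;
		break;
	case TOY_SINGLE_REF:
		/* The page cache simply keeps the allocation reference. */
		break;
	}
	return r;
}
```

In this model the first two schemes end at the same refcount of nr_pages
and differ only in the number of updates, while the third leaves the page
cache holding a single reference regardless of folio size.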

