Re: [PATCH 1/5] mm: Do not reclaim private data from pinned page

From: Jan Kara <jack@suse.cz>
To: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH 1/5] mm: Do not reclaim private data from pinned page
Date: Tue, 14 Feb 2023 14:00:00 +0100
Message-ID: <20230214130000.s5kynjhjiyrpvzxx@quack3>
In-Reply-To: <df6e150f-9d5c-6f68-f234-3e1ef419f464@redhat.com>

On Mon 13-02-23 10:01:35, David Hildenbrand wrote:
> On 09.02.23 13:31, Jan Kara wrote:
> > If the page is pinned, there's no point in trying to reclaim it.
> > Furthermore if the page is from the page cache we don't want to reclaim
> > fs-private data from the page because the pinning process may be writing
> > to the page at any time and reclaiming fs private info on a dirty page
> > can upset the filesystem (see link below).
> > 
> > Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >   mm/vmscan.c | 10 ++++++++++
> >   1 file changed, 10 insertions(+)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index bf3eedf0209c..ab3911a8b116 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> >   			}
> >   		}
> > +		/*
> > +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> > +		 * No point in trying to reclaim folio if it is pinned.
> > +		 * Furthermore we don't want to reclaim underlying fs metadata
> > +		 * if the folio is pinned and thus potentially modified by the
> > +		 * pinning process, as that may upset the filesystem.
> > +		 */
> > +		if (folio_maybe_dma_pinned(folio))
> > +			goto activate_locked;
> > +
> >   		mapping = folio_mapping(folio);
> >   		if (folio_test_dirty(folio)) {
> >   			/*
> 
> At this point, we made sure that the folio is completely unmapped. However,
> we specify "TTU_BATCH_FLUSH", so rmap code might defer a TLB flush and
> consequently defer an IPI sync.
> 
> I remember that this check here is fine regarding GUP-fast: even if
> concurrent GUP-fast pins the page after our check here, it should observe
> the changed PTE and unpin it again.
>  
> Checking after unmapping makes sense: we reduce the likelihood of false
> positives when a file-backed page is mapped many times (>= 1024). OTOH, we
> might unmap pinned pages because we cannot really detect it early.
> 
> For anon pages, we have an early (racy) check, which turned out "ok" in
> practice, because we don't frequently have that many anon pages that are
> shared by that many processes. I assume we don't want something similar for
> pagecache pages, because having a single page mapped by many processes can
> happen easily and would prevent reclaim.

Yeah, I think pagecache pages shared by many processes are more likely.
Furthermore I think pinned pagecache pages are rather rare so unmapping
them before checking seems fine to me. Obviously we can reconsider if
reality proves me wrong ;).
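
To illustrate why checking after unmapping reduces false positives, here is
a minimal userspace sketch of the refcount heuristic for an order-0 folio.
It is a model only, not kernel code: struct model_folio and all model_*
helpers are hypothetical names made up for this illustration. The only
thing taken from the kernel is the bias value of 1024
(GUP_PIN_COUNTING_BIAS), which a pin adds to the refcount, and the
assumption that every page table mapping holds one ordinary reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define MODEL_PIN_BIAS (1u << 10)

/* Hypothetical stand-in for an order-0 folio; only the refcount matters. */
struct model_folio {
	unsigned int refcount;
};

/* Each page table mapping takes one ordinary reference. */
static void model_map(struct model_folio *f, unsigned int nr)
{
	f->refcount += nr;
}

/* Unmapping drops the references held by the mappings. */
static void model_unmap(struct model_folio *f, unsigned int nr)
{
	assert(f->refcount >= nr);
	f->refcount -= nr;
}

/* A pin is recorded by adding the whole bias to the refcount. */
static void model_pin(struct model_folio *f)
{
	f->refcount += MODEL_PIN_BIAS;
}

/*
 * The heuristic: a refcount at or above the bias "may" mean a pin.
 * Many ordinary references are indistinguishable from one pin.
 */
static bool model_maybe_dma_pinned(const struct model_folio *f)
{
	return f->refcount >= MODEL_PIN_BIAS;
}
```

In this model a folio holding one page cache reference plus 1024 mappings
is reported as maybe pinned although nobody pinned it. Once the mappings
are dropped the refcount falls back to 1 and the check returns false, while
a folio that really was pinned still trips it after unmapping.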

> I once had a patch lying around that documented for the existing
> folio_maybe_dma_pinned() for anon pages exactly that (racy+false positives
> with many mappings).
> 
> Long story short, I assume this change is fine.

Thanks for the thorough verification :)

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
