From: Jan Kara <jack@suse.cz>
To: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
David Howells <dhowells@redhat.com>
Subject: Re: [PATCH 1/5] mm: Do not reclaim private data from pinned page
Date: Tue, 14 Feb 2023 14:00:00 +0100 [thread overview]
Message-ID: <20230214130000.s5kynjhjiyrpvzxx@quack3> (raw)
In-Reply-To: <df6e150f-9d5c-6f68-f234-3e1ef419f464@redhat.com>
On Mon 13-02-23 10:01:35, David Hildenbrand wrote:
> On 09.02.23 13:31, Jan Kara wrote:
> > If the page is pinned, there's no point in trying to reclaim it.
> > Furthermore if the page is from the page cache we don't want to reclaim
> > fs-private data from the page because the pinning process may be writing
> > to the page at any time and reclaiming fs private info on a dirty page
> > can upset the filesystem (see link below).
> >
> > Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > mm/vmscan.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index bf3eedf0209c..ab3911a8b116 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> > }
> > }
> > + /*
> > + * Folio is unmapped now so it cannot be newly pinned anymore.
> > + * No point in trying to reclaim folio if it is pinned.
> > + * Furthermore we don't want to reclaim underlying fs metadata
> > + * if the folio is pinned and thus potentially modified by the
> > + * pinning process is that may upset the filesystem.
> > + */
> > + if (folio_maybe_dma_pinned(folio))
> > + goto activate_locked;
> > +
> > mapping = folio_mapping(folio);
> > if (folio_test_dirty(folio)) {
> > /*
>
> At this point, we made sure that the folio is completely unmapped. However,
> we specify "TTU_BATCH_FLUSH", so rmap code might defer a TLB flush and
> consequently defer an IPI sync.
>
> I remember that this check here is fine regarding GUP-fast: even if
> concurrent GUP-fast pins the page after our check here, it should observe
> the changed PTE and unpin it again.
>
> Checking after unmapping makes sense: we reduce the likelyhood of false
> positives when a file-backed page is mapped many times (>= 1024). OTOH, we
> might unmap pinned pages because we cannot really detect it early.
>
> For anon pages, we have an early (racy) check, which turned out "ok" in
> practice, because we don't frequently have that many anon pages that are
> shared by that many processes. I assume we don't want something similar for
> pagecache pages, because having a single page mapped by many processes can
> happen easily and would prevent reclaim.
Yeah, I think pagecache pages shared by many processes are more likely.
Furthermore I think pinned pagecache pages are rather rare so unmapping
them before checking seems fine to me. Obviously we can reconsider if
reality would prove me wrong ;).
> I once had a patch lying around that documented for the existing
> folio_maybe_dma_pinned() for anon pages exactly that (racy+false positives
> with many mappings).
>
> Long story short, I assume this change is fine.
Thanks for the throughout verification :)
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2023-02-14 13:00 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-09 12:31 [PATCH RFC 0/5] Writeback handling of pinned pages Jan Kara
2023-02-09 12:31 ` [PATCH 1/5] mm: Do not reclaim private data from pinned page Jan Kara
2023-02-09 16:17 ` Matthew Wilcox
2023-02-10 11:29 ` Jan Kara
2023-02-13 9:55 ` Christoph Hellwig
2023-02-14 13:06 ` Jan Kara
2023-02-14 21:40 ` John Hubbard
2023-02-16 11:56 ` Jan Kara
2023-02-13 9:01 ` David Hildenbrand
2023-02-14 13:00 ` Jan Kara [this message]
2023-02-09 12:31 ` [PATCH 2/5] ext4: Drop workaround for mm reclaiming fs private page data Jan Kara
2023-02-09 12:31 ` [PATCH 3/5] mm: Do not try to write pinned folio during memory cleaning writeback Jan Kara
2023-02-10 1:54 ` John Hubbard
2023-02-10 2:10 ` John Hubbard
2023-02-10 10:42 ` Jan Kara
2023-02-10 10:54 ` Jan Kara
2023-02-09 12:31 ` [PATCH 4/5] block: Add support for bouncing pinned pages Jan Kara
2023-02-13 9:59 ` Christoph Hellwig
2023-02-14 13:56 ` Jan Kara
2023-02-15 4:59 ` Dave Chinner
2023-02-15 6:24 ` Christoph Hellwig
2023-02-16 12:33 ` Jan Kara
2023-02-20 6:22 ` Christoph Hellwig
2023-02-27 11:39 ` Jan Kara
2023-02-27 13:36 ` Christoph Hellwig
2023-02-09 12:31 ` [PATCH 5/5] iomap: Bounce pinned pages during writeback Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230214130000.s5kynjhjiyrpvzxx@quack3 \
--to=jack@suse.cz \
--cc=david@redhat.com \
--cc=dhowells@redhat.com \
--cc=jhubbard@nvidia.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox