From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
Lorenzo Stoakes <lstoakes@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] mm: Do not reclaim private data from pinned page
Date: Tue, 2 May 2023 17:53:08 +0200
Message-ID: <42b80e03-fc72-dfa0-f18d-d6006ea48e76@redhat.com>
In-Reply-To: <ZFEw6DzzZX54z3B/@x1n>

On 02.05.23 17:48, Peter Xu wrote:
> On Tue, May 02, 2023 at 05:33:22PM +0200, David Hildenbrand wrote:
>> On 02.05.23 17:26, Peter Xu wrote:
>>> On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
>>>> If the page is pinned, there's no point in trying to reclaim it.
>>>> Furthermore if the page is from the page cache we don't want to reclaim
>>>> fs-private data from the page because the pinning process may be writing
>>>> to the page at any time and reclaiming fs private info on a dirty page
>>>> can upset the filesystem (see link below).
>>>>
>>>> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
>>>> Signed-off-by: Jan Kara <jack@suse.cz>
>>>> ---
>>>> mm/vmscan.c | 10 ++++++++++
>>>> 1 file changed, 10 insertions(+)
>>>>
>>>> This was the non-controversial part of my series [1] dealing with pinned pages
>>>> in filesystems. It is already a win as it avoids crashes in the filesystem and
>>>> we can drop workarounds for this in ext4. Can we merge it please?
>>>>
>>>> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index bf3eedf0209c..401a379ea99a 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>>  			}
>>>>  		}
>>>>
>>>> +		/*
>>>> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
>>>> +		 * No point in trying to reclaim folio if it is pinned.
>>>> +		 * Furthermore we don't want to reclaim underlying fs metadata
>>>> +		 * if the folio is pinned and thus potentially modified by the
>>>> +		 * pinning process as that may upset the filesystem.
>>>> +		 */
>>>> +		if (folio_maybe_dma_pinned(folio))
>>>> +			goto activate_locked;
>>>> +
>>>>  		mapping = folio_mapping(folio);
>>>>  		if (folio_test_dirty(folio)) {
>>>>  			/*
>>>> --
>>>> 2.35.3
>>>>
>>>>
>>>
>>> IIUC we have similar handling for anon (feb889fb40fafc). Should we merge
>>> the two sites and just move the check earlier? Thanks,
>>>
>>
>> feb889fb40fafc introduced a best-effort check that is racy, as the page is
>> still mapped (can still get pinned). Further, we mostly get false positives
>> only if a page is shared very often (1024 times), which rarely happens with
>> anon pages. Now that we handle COW+pinning correctly using
>> PageAnonExclusive, that check only optimizes for the "already pinned" case.
>> But it's not required for correctness anymore (so it can be racy).
>>
>> Here, however, we want more precision, and not false positives simply
>> because a page is mapped many times (which can happen easily) or can still
>> get pinned while mapped.
>
> Ah makes sense, thanks.
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> This is not obvious, though, if we simply read the two commits. It would be
> great to describe the relationship between the two checks somewhere, in
> either a comment or the commit message.

I once had a patch lying around to document the existing check:

https://github.com/davidhildenbrand/linux/commit/abb01d42a99b56e2c5e707ba80ddc8b05ad7d618
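
To illustrate the false-positive point from above, here is a minimal
userspace sketch. It is a simplified model, not kernel code: it assumes a
small (order-0) folio whose pins are folded into the refcount in units of
1024 (the kernel's GUP_PIN_COUNTING_BIAS), and all model_* names and
PIN_BIAS are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors GUP_PIN_COUNTING_BIAS (1024) for small folios. */
#define PIN_BIAS 1024

/* Hypothetical stand-in for a small folio: one combined counter. */
struct model_folio {
	int refcount;	/* ordinary references + PIN_BIAS per pin */
};

/* An ordinary reference, e.g. one more mapping of a shared page. */
static void model_get(struct model_folio *f)
{
	f->refcount += 1;
}

static void model_put(struct model_folio *f)
{
	assert(f->refcount > 0);
	f->refcount -= 1;
}

/* A FOLL_PIN-style pin adds a whole bias unit to the same counter. */
static void model_pin(struct model_folio *f)
{
	f->refcount += PIN_BIAS;
}

static void model_unpin(struct model_folio *f)
{
	assert(f->refcount >= PIN_BIAS);
	f->refcount -= PIN_BIAS;
}

/*
 * Heuristic in the spirit of folio_maybe_dma_pinned(): it cannot tell
 * one pin apart from 1024 ordinary references.
 */
static bool model_maybe_pinned(const struct model_folio *f)
{
	return f->refcount >= PIN_BIAS;
}
```

In this model, a folio with 1024 ordinary references and no pin at all is
reported as "maybe pinned". Once the mappings are gone and their references
dropped, only a real pin can keep the counter at or above the bias, which is
why the check in the patch sits after the unmap.
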
--
Thanks,
David / dhildenb