[PATCH] mm: Do not reclaim private data from pinned page

linux-mm.kvack.org archive mirror

* [PATCH] mm: Do not reclaim private data from pinned page
@ 2023-04-28 12:41 Jan Kara
  2023-04-28 12:58 ` Matthew Wilcox
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Jan Kara @ 2023-04-28 12:41 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-fsdevel, Lorenzo Stoakes, Andrew Morton, Christoph Hellwig,
	David Hildenbrand, Jan Kara

If the page is pinned, there's no point in trying to reclaim it.
Furthermore if the page is from the page cache we don't want to reclaim
fs-private data from the page because the pinning process may be writing
to the page at any time and reclaiming fs private info on a dirty page
can upset the filesystem (see link below).

Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/vmscan.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

This was the non-controversial part of my series [1] dealing with pinned pages
in filesystems. It is already a win as it avoids crashes in the filesystem and
we can drop workarounds for this in ext4. Can we merge it please?

[1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bf3eedf0209c..401a379ea99a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			}
 		}
 
+		/*
+		 * Folio is unmapped now so it cannot be newly pinned anymore.
+		 * No point in trying to reclaim folio if it is pinned.
+		 * Furthermore we don't want to reclaim underlying fs metadata
+		 * if the folio is pinned and thus potentially modified by the
+		 * pinning process as that may upset the filesystem.
+		 */
+		if (folio_maybe_dma_pinned(folio))
+			goto activate_locked;
+
 		mapping = folio_mapping(folio);
 		if (folio_test_dirty(folio)) {
 			/*
-- 
2.35.3




* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
@ 2023-04-28 12:58 ` Matthew Wilcox
  2023-04-28 13:05 ` Lorenzo Stoakes
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2023-04-28 12:58 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, Lorenzo Stoakes, Andrew Morton,
	Christoph Hellwig, David Hildenbrand

On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).
> 
> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
  2023-04-28 12:58 ` Matthew Wilcox
@ 2023-04-28 13:05 ` Lorenzo Stoakes
  2023-04-29  4:50 ` Christoph Hellwig
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes @ 2023-04-28 13:05 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, Andrew Morton, Christoph Hellwig,
	David Hildenbrand

On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).
>
> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/vmscan.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> This was the non-controversial part of my series [1] dealing with pinned pages
> in filesystems. It is already a win as it avoids crashes in the filesystem and
> we can drop workarounds for this in ext4. Can we merge it please?
>
> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bf3eedf0209c..401a379ea99a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  			}
>  		}
>
> +		/*
> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> +		 * No point in trying to reclaim folio if it is pinned.
> +		 * Furthermore we don't want to reclaim underlying fs metadata
> +		 * if the folio is pinned and thus potentially modified by the
> +		 * pinning process as that may upset the filesystem.
> +		 */
> +		if (folio_maybe_dma_pinned(folio))
> +			goto activate_locked;
> +
>  		mapping = folio_mapping(folio);
>  		if (folio_test_dirty(folio)) {
>  			/*
> --
> 2.35.3
>

This seems very sensible and helps ameliorate problematic GUP/file
interactions so this seems a no-brainer.

Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
  2023-04-28 12:58 ` Matthew Wilcox
  2023-04-28 13:05 ` Lorenzo Stoakes
@ 2023-04-29  4:50 ` Christoph Hellwig
  2023-05-01 18:12 ` John Hubbard
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2023-04-29  4:50 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, Lorenzo Stoakes, Andrew Morton,
	Christoph Hellwig, David Hildenbrand

On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
                   ` (2 preceding siblings ...)
  2023-04-29  4:50 ` Christoph Hellwig
@ 2023-05-01 18:12 ` John Hubbard
  2023-05-02 14:45 ` David Hildenbrand
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: John Hubbard @ 2023-05-01 18:12 UTC (permalink / raw)
  To: Jan Kara, linux-mm
  Cc: linux-fsdevel, Lorenzo Stoakes, Andrew Morton, Christoph Hellwig,
	David Hildenbrand

On 4/28/23 05:41, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).
> 
> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/vmscan.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> This was the non-controversial part of my series [1] dealing with pinned pages
> in filesystems. It is already a win as it avoids crashes in the filesystem and
> we can drop workarounds for this in ext4. Can we merge it please?
> 
> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bf3eedf0209c..401a379ea99a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  			}
>  		}
>  
> +		/*
> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> +		 * No point in trying to reclaim folio if it is pinned.
> +		 * Furthermore we don't want to reclaim underlying fs metadata
> +		 * if the folio is pinned and thus potentially modified by the
> +		 * pinning process as that may upset the filesystem.
> +		 */
> +		if (folio_maybe_dma_pinned(folio))
> +			goto activate_locked;
> +

This is huge! At long last. In fact, with this in the queue, I'm going to close
out our internal bug report from 2018 that launched this whole maybe-dma-pinned 
odyssey. :)

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA

>  		mapping = folio_mapping(folio);
>  		if (folio_test_dirty(folio)) {
>  			/*





* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
                   ` (3 preceding siblings ...)
  2023-05-01 18:12 ` John Hubbard
@ 2023-05-02 14:45 ` David Hildenbrand
  2023-05-02 15:26 ` Peter Xu
  2023-05-02 20:20 ` Andrew Morton
  6 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2023-05-02 14:45 UTC (permalink / raw)
  To: Jan Kara, linux-mm
  Cc: linux-fsdevel, Lorenzo Stoakes, Andrew Morton, Christoph Hellwig

On 28.04.23 14:41, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).
> 
> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>   mm/vmscan.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> This was the non-controversial part of my series [1] dealing with pinned pages
> in filesystems. It is already a win as it avoids crashes in the filesystem and
> we can drop workarounds for this in ext4. Can we merge it please?
> 
> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bf3eedf0209c..401a379ea99a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>   			}
>   		}
>   
> +		/*
> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> +		 * No point in trying to reclaim folio if it is pinned.
> +		 * Furthermore we don't want to reclaim underlying fs metadata
> +		 * if the folio is pinned and thus potentially modified by the
> +		 * pinning process as that may upset the filesystem.
> +		 */
> +		if (folio_maybe_dma_pinned(folio))
> +			goto activate_locked;
> +
>   		mapping = folio_mapping(folio);
>   		if (folio_test_dirty(folio)) {
>   			/*

Acked-by: David Hildenbrand <david@redhat.com>

Thanks!

-- 
Thanks,

David / dhildenb



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
                   ` (4 preceding siblings ...)
  2023-05-02 14:45 ` David Hildenbrand
@ 2023-05-02 15:26 ` Peter Xu
  2023-05-02 15:33   ` David Hildenbrand
  2023-05-02 20:20 ` Andrew Morton
  6 siblings, 1 reply; 12+ messages in thread
From: Peter Xu @ 2023-05-02 15:26 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, Lorenzo Stoakes, Andrew Morton,
	Christoph Hellwig, David Hildenbrand

On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).
> 
> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  mm/vmscan.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> This was the non-controversial part of my series [1] dealing with pinned pages
> in filesystems. It is already a win as it avoids crashes in the filesystem and
> we can drop workarounds for this in ext4. Can we merge it please?
> 
> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bf3eedf0209c..401a379ea99a 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  			}
>  		}
>  
> +		/*
> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> +		 * No point in trying to reclaim folio if it is pinned.
> +		 * Furthermore we don't want to reclaim underlying fs metadata
> +		 * if the folio is pinned and thus potentially modified by the
> +		 * pinning process as that may upset the filesystem.
> +		 */
> +		if (folio_maybe_dma_pinned(folio))
> +			goto activate_locked;
> +
>  		mapping = folio_mapping(folio);
>  		if (folio_test_dirty(folio)) {
>  			/*
> -- 
> 2.35.3
> 
> 

IIUC we have similar handling for anon (feb889fb40fafc).  Should we merge
the two sites and just move the check earlier?  Thanks,

-- 
Peter Xu



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-05-02 15:26 ` Peter Xu
@ 2023-05-02 15:33   ` David Hildenbrand
  2023-05-02 15:48     ` Peter Xu
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2023-05-02 15:33 UTC (permalink / raw)
  To: Peter Xu, Jan Kara
  Cc: linux-mm, linux-fsdevel, Lorenzo Stoakes, Andrew Morton,
	Christoph Hellwig

On 02.05.23 17:26, Peter Xu wrote:
> On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
>> If the page is pinned, there's no point in trying to reclaim it.
>> Furthermore if the page is from the page cache we don't want to reclaim
>> fs-private data from the page because the pinning process may be writing
>> to the page at any time and reclaiming fs private info on a dirty page
>> can upset the filesystem (see link below).
>>
>> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
>> Signed-off-by: Jan Kara <jack@suse.cz>
>> ---
>>   mm/vmscan.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> This was the non-controversial part of my series [1] dealing with pinned pages
>> in filesystems. It is already a win as it avoids crashes in the filesystem and
>> we can drop workarounds for this in ext4. Can we merge it please?
>>
>> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index bf3eedf0209c..401a379ea99a 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>   			}
>>   		}
>>   
>> +		/*
>> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
>> +		 * No point in trying to reclaim folio if it is pinned.
>> +		 * Furthermore we don't want to reclaim underlying fs metadata
>> +		 * if the folio is pinned and thus potentially modified by the
>> +		 * pinning process as that may upset the filesystem.
>> +		 */
>> +		if (folio_maybe_dma_pinned(folio))
>> +			goto activate_locked;
>> +
>>   		mapping = folio_mapping(folio);
>>   		if (folio_test_dirty(folio)) {
>>   			/*
>> -- 
>> 2.35.3
>>
>>
> 
> IIUC we have similar handling for anon (feb889fb40fafc).  Should we merge
> the two sites and just move the check earlier?  Thanks,
> 

feb889fb40fafc introduced a best-effort check that is racy, as the page 
is still mapped (can still get pinned). Further, we get false positives 
mostly only if a page is shared very often (1024 times), which happens
rarely with anon pages. Now that we handle COW+pinning correctly using 
PageAnonExclusive, that check only optimizes for the "already pinned" 
case. But it's not required for correctness anymore (so it can be racy).

Here, however, we want more precision, and not false positives simply 
because a page is mapped many times (which can happen easily) or can 
still get pinned while mapped.
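
A minimal userspace sketch of the false-positive behaviour described above,
assuming an order-0 folio where each mapping takes one reference and each
pin adds a bias of 1024 to the same counter. The names folio_model,
model_map, model_pin, model_maybe_pinned and flagged are hypothetical and
are not kernel API; only the bias value mirrors the kernel's
GUP_PIN_COUNTING_BIAS. Large folios, which track pins in a separate
counter, are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define PIN_BIAS (1u << 10)

/* Hypothetical stand-in for a small folio: a single shared counter. */
struct folio_model {
	unsigned int refcount;
};

/* Each page table mapping holds one ordinary reference. */
static void model_map(struct folio_model *f)
{
	f->refcount += 1;
}

/* Each pin adds the whole bias to the same counter. */
static void model_pin(struct folio_model *f)
{
	f->refcount += PIN_BIAS;
}

/* The heuristic: "maybe pinned" once the counter reaches the bias. */
static bool model_maybe_pinned(const struct folio_model *f)
{
	return f->refcount >= PIN_BIAS;
}

/*
 * Build a folio with one base reference (e.g. held by the page cache),
 * nr_maps mappings and nr_pins pins, then apply the heuristic.
 */
static bool flagged(unsigned int nr_maps, unsigned int nr_pins)
{
	struct folio_model f = { .refcount = 1 };
	unsigned int i;

	/* Keep the model's arithmetic well away from unsigned wraparound. */
	assert(nr_maps <= (1u << 20) && nr_pins <= (1u << 20));

	for (i = 0; i < nr_maps; i++)
		model_map(&f);
	for (i = 0; i < nr_pins; i++)
		model_pin(&f);

	return model_maybe_pinned(&f);
}
```

In this model flagged(1023, 0) is true although nothing is pinned, which is
the "mapped many times" false positive, while flagged(0, 0) is false and
flagged(0, 1) is true. Checking only after the folio has been unmapped
removes the mapping references from the count, so the heuristic is far less
likely to misfire there.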
-- 
Thanks,

David / dhildenb



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-05-02 15:33   ` David Hildenbrand
@ 2023-05-02 15:48     ` Peter Xu
  2023-05-02 15:53       ` David Hildenbrand
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Xu @ 2023-05-02 15:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Jan Kara, linux-mm, linux-fsdevel, Lorenzo Stoakes,
	Andrew Morton, Christoph Hellwig

On Tue, May 02, 2023 at 05:33:22PM +0200, David Hildenbrand wrote:
> On 02.05.23 17:26, Peter Xu wrote:
> > On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
> > > If the page is pinned, there's no point in trying to reclaim it.
> > > Furthermore if the page is from the page cache we don't want to reclaim
> > > fs-private data from the page because the pinning process may be writing
> > > to the page at any time and reclaiming fs private info on a dirty page
> > > can upset the filesystem (see link below).
> > > 
> > > Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > ---
> > >   mm/vmscan.c | 10 ++++++++++
> > >   1 file changed, 10 insertions(+)
> > > 
> > > This was the non-controversial part of my series [1] dealing with pinned pages
> > > in filesystems. It is already a win as it avoids crashes in the filesystem and
> > > we can drop workarounds for this in ext4. Can we merge it please?
> > > 
> > > [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
> > > 
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index bf3eedf0209c..401a379ea99a 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> > >   			}
> > >   		}
> > > +		/*
> > > +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> > > +		 * No point in trying to reclaim folio if it is pinned.
> > > +		 * Furthermore we don't want to reclaim underlying fs metadata
> > > +		 * if the folio is pinned and thus potentially modified by the
> > > +		 * pinning process as that may upset the filesystem.
> > > +		 */
> > > +		if (folio_maybe_dma_pinned(folio))
> > > +			goto activate_locked;
> > > +
> > >   		mapping = folio_mapping(folio);
> > >   		if (folio_test_dirty(folio)) {
> > >   			/*
> > > -- 
> > > 2.35.3
> > > 
> > > 
> > 
> > IIUC we have similar handling for anon (feb889fb40fafc).  Should we merge
> > the two sites and just move the check earlier?  Thanks,
> > 
> 
> feb889fb40fafc introduced a best-effort check that is racy, as the page is
> still mapped (can still get pinned). Further, we get false positives mostly
> only if a page is shared very often (1024 times), which happens rarely with
> anon pages. Now that we handle COW+pinning correctly using
> PageAnonExclusive, that check only optimizes for the "already pinned" case.
> But it's not required for correctness anymore (so it can be racy).
> 
> Here, however, we want more precision, and not false positives simply
> because a page is mapped many times (which can happen easily) or can still
> get pinned while mapped.

Ah makes sense, thanks.

Acked-by: Peter Xu <peterx@redhat.com>

This is not obvious, though, from simply reading the two commits. It would
be great to mention the relationship between the two checks somewhere, in
either a comment or the commit message.

-- 
Peter Xu



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-05-02 15:48     ` Peter Xu
@ 2023-05-02 15:53       ` David Hildenbrand
  0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2023-05-02 15:53 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jan Kara, linux-mm, linux-fsdevel, Lorenzo Stoakes,
	Andrew Morton, Christoph Hellwig

On 02.05.23 17:48, Peter Xu wrote:
> On Tue, May 02, 2023 at 05:33:22PM +0200, David Hildenbrand wrote:
>> On 02.05.23 17:26, Peter Xu wrote:
>>> On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
>>>> If the page is pinned, there's no point in trying to reclaim it.
>>>> Furthermore if the page is from the page cache we don't want to reclaim
>>>> fs-private data from the page because the pinning process may be writing
>>>> to the page at any time and reclaiming fs private info on a dirty page
>>>> can upset the filesystem (see link below).
>>>>
>>>> Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
>>>> Signed-off-by: Jan Kara <jack@suse.cz>
>>>> ---
>>>>    mm/vmscan.c | 10 ++++++++++
>>>>    1 file changed, 10 insertions(+)
>>>>
>>>> This was the non-controversial part of my series [1] dealing with pinned pages
>>>> in filesystems. It is already a win as it avoids crashes in the filesystem and
>>>> we can drop workarounds for this in ext4. Can we merge it please?
>>>>
>>>> [1] https://lore.kernel.org/all/20230209121046.25360-1-jack@suse.cz/
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index bf3eedf0209c..401a379ea99a 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>>    			}
>>>>    		}
>>>> +		/*
>>>> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
>>>> +		 * No point in trying to reclaim folio if it is pinned.
>>>> +		 * Furthermore we don't want to reclaim underlying fs metadata
>>>> +		 * if the folio is pinned and thus potentially modified by the
>>>> +		 * pinning process as that may upset the filesystem.
>>>> +		 */
>>>> +		if (folio_maybe_dma_pinned(folio))
>>>> +			goto activate_locked;
>>>> +
>>>>    		mapping = folio_mapping(folio);
>>>>    		if (folio_test_dirty(folio)) {
>>>>    			/*
>>>> -- 
>>>> 2.35.3
>>>>
>>>>
>>>
>>> IIUC we have similar handling for anon (feb889fb40fafc).  Should we merge
>>> the two sites and just move the check earlier?  Thanks,
>>>
>>
>> feb889fb40fafc introduced a best-effort check that is racy, as the page is
>> still mapped (can still get pinned). Further, we get false positives mostly
>> only if a page is shared very often (1024 times), which happens rarely with
>> anon pages. Now that we handle COW+pinning correctly using
>> PageAnonExclusive, that check only optimizes for the "already pinned" case.
>> But it's not required for correctness anymore (so it can be racy).
>>
>> Here, however, we want more precision, and not false positives simply
>> because a page is mapped many times (which can happen easily) or can still
>> get pinned while mapped.
> 
> Ah makes sense, thanks.
> 
> Acked-by: Peter Xu <peterx@redhat.com>
> 
> This is not obvious, though, from simply reading the two commits. It would
> be great to mention the relationship between the two checks somewhere, in
> either a comment or the commit message.

I once had a patch lying around to document the existing check:

https://github.com/davidhildenbrand/linux/commit/abb01d42a99b56e2c5e707ba80ddc8b05ad7d618

-- 
Thanks,

David / dhildenb



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-04-28 12:41 [PATCH] mm: Do not reclaim private data from pinned page Jan Kara
                   ` (5 preceding siblings ...)
  2023-05-02 15:26 ` Peter Xu
@ 2023-05-02 20:20 ` Andrew Morton
  2023-05-03  9:51   ` Jan Kara
  6 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2023-05-02 20:20 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-mm, linux-fsdevel, Lorenzo Stoakes, Christoph Hellwig,
	David Hildenbrand

On Fri, 28 Apr 2023 14:41:40 +0200 Jan Kara <jack@suse.cz> wrote:

> If the page is pinned, there's no point in trying to reclaim it.
> Furthermore if the page is from the page cache we don't want to reclaim
> fs-private data from the page because the pinning process may be writing
> to the page at any time and reclaiming fs private info on a dirty page
> can upset the filesystem (see link below).

Obviously I'll add a cc:stable here.  I'm suspecting it's so old that
there's no real Fixes: target that makes sense?

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>  			}
>  		}
>  
> +		/*
> +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> +		 * No point in trying to reclaim folio if it is pinned.
> +		 * Furthermore we don't want to reclaim underlying fs metadata
> +		 * if the folio is pinned and thus potentially modified by the
> +		 * pinning process as that may upset the filesystem.
> +		 */
> +		if (folio_maybe_dma_pinned(folio))
> +			goto activate_locked;
> +

So I expect the -stable maintainers will be looking for a pre-folios
version of this when the time comes.



* Re: [PATCH] mm: Do not reclaim private data from pinned page
  2023-05-02 20:20 ` Andrew Morton
@ 2023-05-03  9:51   ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2023-05-03  9:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, linux-mm, linux-fsdevel, Lorenzo Stoakes,
	Christoph Hellwig, David Hildenbrand

On Tue 02-05-23 13:20:20, Andrew Morton wrote:
> On Fri, 28 Apr 2023 14:41:40 +0200 Jan Kara <jack@suse.cz> wrote:
> 
> > If the page is pinned, there's no point in trying to reclaim it.
> > Furthermore if the page is from the page cache we don't want to reclaim
> > fs-private data from the page because the pinning process may be writing
> > to the page at any time and reclaiming fs private info on a dirty page
> > can upset the filesystem (see link below).
> 
> Obviously I'll add a cc:stable here.  I'm suspecting it's so old that
> there's no real Fixes: target that makes sense?

In principle the problem has been there ever since MM started to track dirty
shared pages and filesystems started to use .page_mkwrite callbacks. So
for a very long time, yes. That being said, the fix makes sense only since
we added the page pinning infrastructure and started using it in various
places, which is not that long ago (the first patches in this direction were
merged in 5.7, in 2020). So we could mark it for stable with 5.7+.

> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> >  			}
> >  		}
> >  
> > +		/*
> > +		 * Folio is unmapped now so it cannot be newly pinned anymore.
> > +		 * No point in trying to reclaim folio if it is pinned.
> > +		 * Furthermore we don't want to reclaim underlying fs metadata
> > +		 * if the folio is pinned and thus potentially modified by the
> > +		 * pinning process as that may upset the filesystem.
> > +		 */
> > +		if (folio_maybe_dma_pinned(folio))
> > +			goto activate_locked;
> > +
> 
> So I expect the -stable maintainers will be looking for a pre-folios
> version of this when the time comes.

Yeah, right. Luckily that's going to be pretty easy :).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


