* [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
@ 2024-06-25 5:04 Hugh Dickins
2024-06-25 9:01 ` Kefeng Wang
2024-06-25 20:01 ` David Hildenbrand
0 siblings, 2 replies; 6+ messages in thread
From: Hugh Dickins @ 2024-06-25 5:04 UTC (permalink / raw)
To: Andrew Morton
Cc: Kefeng Wang, Hugh Dickins, linux-mm, Tony Luck, Miaohe Lin,
nao.horiguchi, Matthew Wilcox, David Hildenbrand, Muchun Song,
Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan, Vishal Moola,
Alistair Popple, Jane Chu, Oscar Salvador
Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
"Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
folio is already isolated and locked during migration, so suppose that
there is no functional change."
That was a mistake. Freezing a folio's refcount to 0 is much like taking
a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
around until the folio is unfrozen. If the task freezing is preempted (or
calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
trying to reclaim swap while PTL is held, all the other CPUs in reclaim
spinning for that PTL.
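For illustration, the lookup side behaves roughly like this (a simplified
sketch of the filemap_get_entry() pattern; the helper name is made up and
this is not the exact kernel source):

/*
 * Simplified sketch of the filemap_get_entry() pattern (hypothetical
 * helper, not the exact kernel source): while the folio's refcount is
 * frozen at 0, folio_try_get() keeps failing, so the lookup spins under
 * rcu_read_lock() until the freezer calls folio_ref_unfreeze() - which
 * is why the freezing task must not sleep or be preempted meanwhile.
 */
static struct folio *lookup_sketch(struct address_space *mapping, pgoff_t index)
{
	XA_STATE(xas, &mapping->i_pages, index);
	struct folio *folio;

	rcu_read_lock();
repeat:
	xas_reset(&xas);
	folio = xas_load(&xas);
	if (folio && !xa_is_value(folio) && !folio_try_get(folio))
		goto repeat;	/* refcount frozen: spin until unfrozen */
	rcu_read_unlock();
	return folio;
}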
I'm uncertain whether it's necessary for interrupts to be disabled as
well as preemption, but since they have to be disabled for the page
cache migration, it's much the best to do it all together as before.
So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
preliminary folio_ref_count() check, which does make sense before
trying to copy the folio's data.
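In outline, the corrected order of operations is as below (a condensed
sketch with a made-up helper name, merging __migrate_folio() and
__folio_migrate_mapping() from the diff that follows; error handling and
the !mapping case are abbreviated):

/*
 * Condensed sketch (hypothetical helper) of the corrected ordering,
 * merging __migrate_folio() and __folio_migrate_mapping() from the
 * diff below; error handling and the !mapping case are abbreviated.
 */
static int migrate_order_sketch(struct address_space *mapping,
		struct folio *dst, struct folio *src, int expected_count)
{
	XA_STATE(xas, &mapping->i_pages, folio_index(src));
	int ret;

	/* Cheap check first: no point copying if migration cannot succeed. */
	if (folio_ref_count(src) != expected_count)
		return -EAGAIN;

	/* folio_mc_copy() may cond_resched(): the folio must not be frozen here. */
	ret = folio_mc_copy(dst, src);
	if (unlikely(ret))
		return ret;

	/* Freeze only under xas_lock_irq(), much like taking a spinlock. */
	xas_lock_irq(&xas);
	if (!folio_ref_freeze(src, expected_count)) {
		xas_unlock_irq(&xas);
		return -EAGAIN;
	}
	/* ... replace src by dst in the page cache, update stats ... */
	folio_ref_unfreeze(src, expected_count - folio_nr_pages(src));
	xas_unlock_irq(&xas);
	return MIGRATEPAGE_SUCCESS;
}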
Use "expected_count" for the expected count throughout.
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
1 file changed, 28 insertions(+), 31 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 27f070f64f27..8beedbb42a93 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space *mapping,
* 2 for folios with a mapping
* 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
*/
-static void __folio_migrate_mapping(struct address_space *mapping,
- struct folio *newfolio, struct folio *folio, int expected_cnt)
+static int __folio_migrate_mapping(struct address_space *mapping,
+ struct folio *newfolio, struct folio *folio, int expected_count)
{
XA_STATE(xas, &mapping->i_pages, folio_index(folio));
struct zone *oldzone, *newzone;
@@ -415,13 +415,18 @@ static void __folio_migrate_mapping(struct address_space *mapping,
newfolio->mapping = folio->mapping;
if (folio_test_swapbacked(folio))
__folio_set_swapbacked(newfolio);
- return;
+ return MIGRATEPAGE_SUCCESS;
}
oldzone = folio_zone(folio);
newzone = folio_zone(newfolio);
xas_lock_irq(&xas);
+ if (!folio_ref_freeze(folio, expected_count)) {
+ xas_unlock_irq(&xas);
+ return -EAGAIN;
+ }
+
/* Now we know that no one else is looking at the folio */
newfolio->index = folio->index;
newfolio->mapping = folio->mapping;
@@ -456,7 +461,7 @@ static void __folio_migrate_mapping(struct address_space *mapping,
* old folio by unfreezing to one less reference.
* We know this isn't the last reference.
*/
- folio_ref_unfreeze(folio, expected_cnt - nr);
+ folio_ref_unfreeze(folio, expected_count - nr);
xas_unlock(&xas);
/* Leave irq disabled to prevent preemption while updating stats */
@@ -504,23 +509,19 @@ static void __folio_migrate_mapping(struct address_space *mapping,
}
}
local_irq_enable();
+
+ return MIGRATEPAGE_SUCCESS;
}
int folio_migrate_mapping(struct address_space *mapping, struct folio *newfolio,
struct folio *folio, int extra_count)
{
- int expected_cnt = folio_expected_refs(mapping, folio) + extra_count;
+ int expected_count = folio_expected_refs(mapping, folio) + extra_count;
- if (!mapping) {
- if (folio_ref_count(folio) != expected_cnt)
- return -EAGAIN;
- } else {
- if (!folio_ref_freeze(folio, expected_cnt))
- return -EAGAIN;
- }
+ if (folio_ref_count(folio) != expected_count)
+ return -EAGAIN;
- __folio_migrate_mapping(mapping, newfolio, folio, expected_cnt);
- return MIGRATEPAGE_SUCCESS;
+ return __folio_migrate_mapping(mapping, newfolio, folio, expected_count);
}
EXPORT_SYMBOL(folio_migrate_mapping);
@@ -534,16 +535,18 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
XA_STATE(xas, &mapping->i_pages, folio_index(src));
int ret, expected_count = folio_expected_refs(mapping, src);
- if (!folio_ref_freeze(src, expected_count))
+ if (folio_ref_count(src) != expected_count)
return -EAGAIN;
ret = folio_mc_copy(dst, src);
- if (unlikely(ret)) {
- folio_ref_unfreeze(src, expected_count);
+ if (unlikely(ret))
return ret;
- }
xas_lock_irq(&xas);
+ if (!folio_ref_freeze(src, expected_count)) {
+ xas_unlock_irq(&xas);
+ return -EAGAIN;
+ }
dst->index = src->index;
dst->mapping = src->mapping;
@@ -660,24 +663,18 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
struct folio *src, void *src_private,
enum migrate_mode mode)
{
- int ret, expected_cnt = folio_expected_refs(mapping, src);
+ int ret, expected_count = folio_expected_refs(mapping, src);
- if (!mapping) {
- if (folio_ref_count(src) != expected_cnt)
- return -EAGAIN;
- } else {
- if (!folio_ref_freeze(src, expected_cnt))
- return -EAGAIN;
- }
+ if (folio_ref_count(src) != expected_count)
+ return -EAGAIN;
ret = folio_mc_copy(dst, src);
- if (unlikely(ret)) {
- if (mapping)
- folio_ref_unfreeze(src, expected_cnt);
+ if (unlikely(ret))
return ret;
- }
- __folio_migrate_mapping(mapping, dst, src, expected_cnt);
+ ret = __folio_migrate_mapping(mapping, dst, src, expected_count);
+ if (ret != MIGRATEPAGE_SUCCESS)
+ return ret;
if (src_private)
folio_attach_private(dst, folio_detach_private(src));
--
2.35.3
* Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
2024-06-25 5:04 [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq() Hugh Dickins
@ 2024-06-25 9:01 ` Kefeng Wang
2024-06-25 19:23 ` Andrew Morton
2024-06-25 19:51 ` Hugh Dickins
2024-06-25 20:01 ` David Hildenbrand
1 sibling, 2 replies; 6+ messages in thread
From: Kefeng Wang @ 2024-06-25 9:01 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton
Cc: linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi, Matthew Wilcox,
David Hildenbrand, Muchun Song, Benjamin LaHaise, jglisse,
Zi Yan, Jiaqi Yan, Vishal Moola, Alistair Popple, Jane Chu,
Oscar Salvador
On 2024/6/25 13:04, Hugh Dickins wrote:
> Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
> "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
> folio is already isolated and locked during migration, so suppose that
> there is no functional change."
>
> That was a mistake. Freezing a folio's refcount to 0 is much like taking
> a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
> around until the folio is unfrozen. If the task freezing is preempted (or
> calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
> in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
> trying to reclaim swap while PTL is held, all the other CPUs in reclaim
> spinning for that PTL.
Oh, thanks for pointing this out, I didn't take that into account.
>
> I'm uncertain whether it's necessary for interrupts to be disabled as
> well as preemption, but since they have to be disabled for the page
> cache migration, it's much the best to do it all together as before.
> So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
> preliminary folio_ref_count() check, which does make sense before
> trying to copy the folio's data.
This is what my RFC version[1] does, which adds the same reference check to
avoid the unnecessary folio_mc_copy().
Should I resend all patches, or should Andrew directly pick this one?
Thanks.
[1]
https://lore.kernel.org/linux-mm/20240129070934.3717659-7-wangkefeng.wang@huawei.com/
>
> Use "expected_count" for the expected count throughout.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
> 1 file changed, 28 insertions(+), 31 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 27f070f64f27..8beedbb42a93 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space *mapping,
> * 2 for folios with a mapping
> * 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
> */
> -static void __folio_migrate_mapping(struct address_space *mapping,
> - struct folio *newfolio, struct folio *folio, int expected_cnt)
> +static int __folio_migrate_mapping(struct address_space *mapping,
> + struct folio *newfolio, struct folio *folio, int expected_count)
We can rename it back to folio_migrate_mapping().
> {
> XA_STATE(xas, &mapping->i_pages, folio_index(folio));
> struct zone *oldzone, *newzone;
> @@ -415,13 +415,18 @@ static void __folio_migrate_mapping(struct address_space *mapping,
> newfolio->mapping = folio->mapping;
> if (folio_test_swapbacked(folio))
> __folio_set_swapbacked(newfolio);
> - return;
> + return MIGRATEPAGE_SUCCESS;
> }
>
> oldzone = folio_zone(folio);
> newzone = folio_zone(newfolio);
>
> xas_lock_irq(&xas);
> + if (!folio_ref_freeze(folio, expected_count)) {
> + xas_unlock_irq(&xas);
> + return -EAGAIN;
> + }
> +
> /* Now we know that no one else is looking at the folio */
> newfolio->index = folio->index;
> newfolio->mapping = folio->mapping;
> @@ -456,7 +461,7 @@ static void __folio_migrate_mapping(struct address_space *mapping,
> * old folio by unfreezing to one less reference.
> * We know this isn't the last reference.
> */
> - folio_ref_unfreeze(folio, expected_cnt - nr);
> + folio_ref_unfreeze(folio, expected_count - nr);
>
> xas_unlock(&xas);
> /* Leave irq disabled to prevent preemption while updating stats */
> @@ -504,23 +509,19 @@ static void __folio_migrate_mapping(struct address_space *mapping,
> }
> }
> local_irq_enable();
> +
> + return MIGRATEPAGE_SUCCESS;
> }
>
> int folio_migrate_mapping(struct address_space *mapping, struct folio *newfolio,
> struct folio *folio, int extra_count)
> {
> - int expected_cnt = folio_expected_refs(mapping, folio) + extra_count;
> + int expected_count = folio_expected_refs(mapping, folio) + extra_count;
>
> - if (!mapping) {
> - if (folio_ref_count(folio) != expected_cnt)
> - return -EAGAIN;
> - } else {
> - if (!folio_ref_freeze(folio, expected_cnt))
> - return -EAGAIN;
> - }
> + if (folio_ref_count(folio) != expected_count)
> + return -EAGAIN;
>
> - __folio_migrate_mapping(mapping, newfolio, folio, expected_cnt);
> - return MIGRATEPAGE_SUCCESS;
> + return __folio_migrate_mapping(mapping, newfolio, folio, expected_count);
> }
> EXPORT_SYMBOL(folio_migrate_mapping);
>
> @@ -534,16 +535,18 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
> XA_STATE(xas, &mapping->i_pages, folio_index(src));
> int ret, expected_count = folio_expected_refs(mapping, src);
>
> - if (!folio_ref_freeze(src, expected_count))
> + if (folio_ref_count(src) != expected_count)
> return -EAGAIN;
>
> ret = folio_mc_copy(dst, src);
> - if (unlikely(ret)) {
> - folio_ref_unfreeze(src, expected_count);
> + if (unlikely(ret))
> return ret;
> - }
>
> xas_lock_irq(&xas);
> + if (!folio_ref_freeze(src, expected_count)) {
> + xas_unlock_irq(&xas);
> + return -EAGAIN;
> + }
>
> dst->index = src->index;
> dst->mapping = src->mapping;
> @@ -660,24 +663,18 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
> struct folio *src, void *src_private,
> enum migrate_mode mode)
> {
> - int ret, expected_cnt = folio_expected_refs(mapping, src);
> + int ret, expected_count = folio_expected_refs(mapping, src);
>
> - if (!mapping) {
> - if (folio_ref_count(src) != expected_cnt)
> - return -EAGAIN;
> - } else {
> - if (!folio_ref_freeze(src, expected_cnt))
> - return -EAGAIN;
> - }
> + if (folio_ref_count(src) != expected_count)
> + return -EAGAIN;
>
> ret = folio_mc_copy(dst, src);
> - if (unlikely(ret)) {
> - if (mapping)
> - folio_ref_unfreeze(src, expected_cnt);
> + if (unlikely(ret))
> return ret;
> - }
>
> - __folio_migrate_mapping(mapping, dst, src, expected_cnt);
> + ret = __folio_migrate_mapping(mapping, dst, src, expected_count);
> + if (ret != MIGRATEPAGE_SUCCESS)
> + return ret;
>
> if (src_private)
> folio_attach_private(dst, folio_detach_private(src));
* Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
2024-06-25 9:01 ` Kefeng Wang
@ 2024-06-25 19:23 ` Andrew Morton
2024-06-25 19:51 ` Hugh Dickins
1 sibling, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2024-06-25 19:23 UTC (permalink / raw)
To: Kefeng Wang
Cc: Hugh Dickins, linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi,
Matthew Wilcox, David Hildenbrand, Muchun Song, Benjamin LaHaise,
jglisse, Zi Yan, Jiaqi Yan, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador
On Tue, 25 Jun 2024 17:01:16 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> Should I resend all patches, or Andrew directly pick this one?
I think a resend please - Hugh's fix doesn't fit into the series very well.
* Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
2024-06-25 9:01 ` Kefeng Wang
2024-06-25 19:23 ` Andrew Morton
@ 2024-06-25 19:51 ` Hugh Dickins
2024-06-26 7:24 ` Kefeng Wang
1 sibling, 1 reply; 6+ messages in thread
From: Hugh Dickins @ 2024-06-25 19:51 UTC (permalink / raw)
To: Kefeng Wang
Cc: Hugh Dickins, Andrew Morton, linux-mm, Tony Luck, Miaohe Lin,
nao.horiguchi, Matthew Wilcox, David Hildenbrand, Muchun Song,
Benjamin LaHaise, jglisse, Zi Yan, Jiaqi Yan, Vishal Moola,
Alistair Popple, Jane Chu, Oscar Salvador
On Tue, 25 Jun 2024, Kefeng Wang wrote:
> On 2024/6/25 13:04, Hugh Dickins wrote:
> > Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
> > "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
> > folio is already isolated and locked during migration, so suppose that
> > there is no functional change."
> >
> > That was a mistake. Freezing a folio's refcount to 0 is much like taking
> > a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
> > around until the folio is unfrozen. If the task freezing is preempted (or
> > calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
> > in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
> > trying to reclaim swap while PTL is held, all the other CPUs in reclaim
> > spinning for that PTL.
>
> Oh, thanks for pointing this out, I didn't take that into account.
> >
> > I'm uncertain whether it's necessary for interrupts to be disabled as
> > well as preemption, but since they have to be disabled for the page
> > cache migration, it's much the best to do it all together as before.
> > So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
> > preliminary folio_ref_count() check, which does make sense before
> > trying to copy the folio's data.
>
> This is what my RFC version[1] does, which adds the same reference check to
> avoid the unnecessary folio_mc_copy().
>
> Should I resend all patches, or should Andrew directly pick this one?
Andrew asks for a resend: I was only aiming to fix the bug, and
have no perspective on how much of the series remains worthwhile.
>
>
> Thanks.
>
> [1]
> https://lore.kernel.org/linux-mm/20240129070934.3717659-7-wangkefeng.wang@huawei.com/
>
>
> >
> > Use "expected_count" for the expected count throughout.
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > ---
> > mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
> > 1 file changed, 28 insertions(+), 31 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 27f070f64f27..8beedbb42a93 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space
> > *mapping,
> > * 2 for folios with a mapping
> > * 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
> > */
> > -static void __folio_migrate_mapping(struct address_space *mapping,
> > - struct folio *newfolio, struct folio *folio, int expected_cnt)
> > +static int __folio_migrate_mapping(struct address_space *mapping,
> > + struct folio *newfolio, struct folio *folio, int
> > expected_count)
>
> We can rename it back to folio_migrate_mapping().
Almost. I did want to remove the __ layer, but internally the last
parameter is expected_count, whereas externally it is extra_count:
each has merit in its place, so I left them as is.
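That is, quoting the resulting wrapper from the patch: the exported entry
point takes extra_count and converts it into an absolute expected_count
exactly once, before calling the internal helper:

int folio_migrate_mapping(struct address_space *mapping, struct folio *newfolio,
		struct folio *folio, int extra_count)
{
	int expected_count = folio_expected_refs(mapping, folio) + extra_count;

	if (folio_ref_count(folio) != expected_count)
		return -EAGAIN;

	return __folio_migrate_mapping(mapping, newfolio, folio, expected_count);
}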
Hugh
* Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
2024-06-25 19:51 ` Hugh Dickins
@ 2024-06-26 7:24 ` Kefeng Wang
0 siblings, 0 replies; 6+ messages in thread
From: Kefeng Wang @ 2024-06-26 7:24 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrew Morton, linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi,
Matthew Wilcox, David Hildenbrand, Muchun Song, Benjamin LaHaise,
jglisse, Zi Yan, Jiaqi Yan, Vishal Moola, Alistair Popple,
Jane Chu, Oscar Salvador
On 2024/6/26 3:51, Hugh Dickins wrote:
> On Tue, 25 Jun 2024, Kefeng Wang wrote:
>> On 2024/6/25 13:04, Hugh Dickins wrote:
>>> Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
>>> "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
>>> folio is already isolated and locked during migration, so suppose that
>>> there is no functional change."
>>>
>>> That was a mistake. Freezing a folio's refcount to 0 is much like taking
>>> a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
>>> around until the folio is unfrozen. If the task freezing is preempted (or
>>> calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
>>> in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
>>> trying to reclaim swap while PTL is held, all the other CPUs in reclaim
>>> spinning for that PTL.
>>
>> Oh, thanks for pointing this out, I didn't take that into account.
>>>
>>> I'm uncertain whether it's necessary for interrupts to be disabled as
>>> well as preemption, but since they have to be disabled for the page
>>> cache migration, it's much the best to do it all together as before.
>>> So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
>>> preliminary folio_ref_count() check, which does make sense before
>>> trying to copy the folio's data.
>>
>> This is what my RFC version[1] does, which adds the same reference check to
>> avoid the unnecessary folio_mc_copy().
>>
>> Should I resend all patches, or should Andrew directly pick this one?
>
> Andrew asks for a resend: I was only aiming to fix the bug, and
> have no perspective on how much of the series remains worthwhile.
>
>>
>>
>> Thanks.
>>
>> [1]
>> https://lore.kernel.org/linux-mm/20240129070934.3717659-7-wangkefeng.wang@huawei.com/
>>
>>
>>>
>>> Use "expected_count" for the expected count throughout.
>>>
>>> Signed-off-by: Hugh Dickins <hughd@google.com>
>>> ---
>>> mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
>>> 1 file changed, 28 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 27f070f64f27..8beedbb42a93 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space
>>> *mapping,
>>> * 2 for folios with a mapping
>>> * 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
>>> */
>>> -static void __folio_migrate_mapping(struct address_space *mapping,
>>> - struct folio *newfolio, struct folio *folio, int expected_cnt)
>>> +static int __folio_migrate_mapping(struct address_space *mapping,
>>> + struct folio *newfolio, struct folio *folio, int
>>> expected_count)
>>
>> We can rename it back to folio_migrate_mapping().
>
> Almost. I did want to remove the __ layer, but internally the last
> parameter is expected_count, whereas externally it is extra_count:
> each has merit in its place, so I left them as is.
Indeed. I'll refresh all the patches; we still need this
__folio_migrate_mapping, thanks.
>
> Hugh
>
* Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
2024-06-25 5:04 [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq() Hugh Dickins
2024-06-25 9:01 ` Kefeng Wang
@ 2024-06-25 20:01 ` David Hildenbrand
1 sibling, 0 replies; 6+ messages in thread
From: David Hildenbrand @ 2024-06-25 20:01 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton
Cc: Kefeng Wang, linux-mm, Tony Luck, Miaohe Lin, nao.horiguchi,
Matthew Wilcox, Muchun Song, Benjamin LaHaise, jglisse, Zi Yan,
Jiaqi Yan, Vishal Moola, Alistair Popple, Jane Chu,
Oscar Salvador
On 25.06.24 07:04, Hugh Dickins wrote:
> Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
> "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
> folio is already isolated and locked during migration, so suppose that
> there is no functional change."
>
> That was a mistake. Freezing a folio's refcount to 0 is much like taking
> a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
> around until the folio is unfrozen. If the task freezing is preempted (or
> calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
> in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
> trying to reclaim swap while PTL is held, all the other CPUs in reclaim
> spinning for that PTL.
Very much agreed. I think we should add a proper comment to
folio_ref_freeze() to spell that out.
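Something along these lines, perhaps (only a sketch of possible wording,
not a patch; the wrapper below is roughly how folio_ref_freeze() is
defined in include/linux/page_ref.h):

/*
 * Possible wording, as a sketch only:
 *
 * Freezing the refcount to 0 makes concurrent lookups such as
 * filemap_get_entry() spin under rcu_read_lock() until the folio is
 * unfrozen again, so it acts much like taking a spinlock: the caller
 * must not sleep, call cond_resched(), or otherwise risk being
 * preempted for long while the folio remains frozen.
 */
static inline bool folio_ref_freeze(struct folio *folio, int count)
{
	return page_ref_freeze(&folio->page, count);
}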
--
Cheers,
David / dhildenb