linux-mm.kvack.org archive mirror
* [PATCH v1] mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization
@ 2025-04-15  9:50 David Hildenbrand
  2025-04-16  7:46 ` Oscar Salvador
  0 siblings, 1 reply; 2+ messages in thread
From: David Hildenbrand @ 2025-04-15  9:50 UTC
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, syzbot+5e8feb543ca8e12e0ede, Andrew Morton

In __folio_remove_rmap() for RMAP_LEVEL_PMD/RMAP_LEVEL_PUD and with
CONFIG_PAGE_MAPCOUNT, we first decrement the folio mapcount (and
recompute mapped shared vs. mapped exclusively) and only then adjust the
entire mapcount.

This means that another process might stumble in do_wp_page() over a
PTE-mapped PMD folio that is indicated as "exclusively mapped", but still
has an entire mapcount (PMD mapping), because it is racing with the process
that is unmapping the folio (PMD mapping). Note that do_wp_page() will
back off once it detects the remaining folio reference held by the
process that is still unmapping the folio.
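
To make the window easier to follow, here is a toy userspace model of the
ordering described above. It is only a sketch: struct toy_folio, the helper
names and the concrete counts (512 PTE mappings in the parent plus one PMD
mapping in the child, assuming a 2 MiB THP with 4 KiB pages) are
illustrative assumptions, not kernel code. It also folds the final reference
drop into the second unmap step, which the kernel does separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio's counters; NOT the kernel's struct folio. */
struct toy_folio {
	int large_mapcount;	/* all mappings, PTE and PMD */
	int entire_mapcount;	/* PMD mappings only */
	int refcount;
	int nr_pages;
	bool maybe_mapped_shared;
};

enum wp_result { WP_BACKED_OFF, WP_CHECKS_OK, WP_WOULD_WARN };

/* Assumed state after fork: parent PTE-maps 512 pages, child PMD-maps. */
static struct toy_folio toy_folio_after_fork(void)
{
	struct toy_folio f = {
		.large_mapcount = 513, .entire_mapcount = 1,
		.refcount = 513, .nr_pages = 512,
		.maybe_mapped_shared = true,
	};
	return f;
}

static bool toy_sanity_violated(const struct toy_folio *f)
{
	return f->large_mapcount > f->nr_pages || f->entire_mapcount;
}

/* Old ordering: sanity checks run before mapcount vs. refcount. */
static enum wp_result wp_check_early(const struct toy_folio *f)
{
	if (f->maybe_mapped_shared)
		return WP_BACKED_OFF;
	if (toy_sanity_violated(f))
		return WP_WOULD_WARN;
	if (f->large_mapcount != f->refcount)
		return WP_BACKED_OFF;
	return WP_CHECKS_OK;
}

/* New ordering: back off on a mismatch first, then sanity check. */
static enum wp_result wp_check_late(const struct toy_folio *f)
{
	if (f->maybe_mapped_shared)
		return WP_BACKED_OFF;
	if (f->large_mapcount != f->refcount)
		return WP_BACKED_OFF;
	if (toy_sanity_violated(f))
		return WP_WOULD_WARN;
	return WP_CHECKS_OK;
}

/* Unmapper, step 1: mapcount drops and the folio looks exclusive. */
static void toy_unmap_pmd_step1(struct toy_folio *f)
{
	f->large_mapcount--;
	f->maybe_mapped_shared = false;
}

/* Unmapper, step 2: entire mapcount and (simplified) reference go away. */
static void toy_unmap_pmd_step2(struct toy_folio *f)
{
	f->entire_mapcount--;
	f->refcount--;
}
```

Between the two steps the model reports a would-be warning for the early
ordering and a plain back-off for the late ordering; once both steps are
done, both orderings pass.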

This will trigger the early VM_WARN_ON_ONCE(folio_entire_mapcount(folio))
check in do_wp_page(), which can easily be reproduced by looping a couple
of times over allocating a PMD THP, forking a child that immediately
unmaps it again, and concurrently writing to the THP in the parent.
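
A minimal sketch of such a loop, written from the description above and not
taken from the actual reproducer: the function names and the 2 MiB THP size
are assumptions, and whether the warning fires depends on timing, on THP
being enabled, and on a kernel built with the relevant debug options.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define THP_SIZE (2UL << 20)	/* assumption: 2 MiB PMD size (e.g. x86-64) */

/* One iteration; returns 0 on success, -1 on any syscall failure. */
static int thp_cow_race_once(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	/* Over-allocate so that an aligned 2 MiB window fits inside. */
	char *raw = mmap(NULL, 2 * THP_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;

	char *thp = (char *)(((uintptr_t)raw + THP_SIZE - 1) & ~(THP_SIZE - 1));
	assert(thp >= raw && thp + THP_SIZE <= raw + 2 * THP_SIZE);
#ifdef MADV_HUGEPAGE
	/* Best effort: if THP is unavailable the loop still runs. */
	(void)madvise(thp, THP_SIZE, MADV_HUGEPAGE);
#endif
	memset(thp, 1, THP_SIZE);	/* fault in, ideally as one PMD THP */

	pid_t pid = fork();
	if (pid < 0) {
		munmap(raw, 2 * THP_SIZE);
		return -1;
	}
	if (pid == 0) {
		/* Child: drop its mapping of the THP right away. */
		munmap(thp, THP_SIZE);
		_exit(0);
	}

	/* Parent: write-fault every page while the child is unmapping. */
	for (size_t off = 0; off < THP_SIZE; off += page)
		((volatile char *)thp)[off] = 2;

	int status;
	int ok = waitpid(pid, &status, 0) == pid &&
		 WIFEXITED(status) && WEXITSTATUS(status) == 0;
	munmap(raw, 2 * THP_SIZE);
	return ok ? 0 : -1;
}

/* Repeat the race attempt; returns 0 if every iteration succeeded. */
static int thp_cow_race_loop(int iterations)
{
	for (int i = 0; i < iterations; i++)
		if (thp_cow_race_once())
			return -1;
	return 0;
}
```

Calling thp_cow_race_loop() with a few thousand iterations from main() and
watching dmesg would be the intended use; the return value only reports
syscall failures, not whether the warning was hit.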

[  252.738129][T16470] ------------[ cut here ]------------
[  252.739267][T16470] WARNING: CPU: 3 PID: 16470 at mm/memory.c:3738 do_wp_page+0x2a75/0x2c00
[  252.740968][T16470] Modules linked in:
[  252.741958][T16470] CPU: 3 UID: 0 PID: 16470 Comm: ...
...
[  252.765841][T16470]  <TASK>
[  252.766419][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
[  252.767558][T16470]  ? rcu_is_watching+0x12/0x60
[  252.768525][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
[  252.769645][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
[  252.770778][T16470]  ? lock_acquire+0x33/0x80
[  252.771697][T16470]  ? __handle_mm_fault+0x5e8/0x3e40
[  252.772735][T16470]  ? __handle_mm_fault+0x5e8/0x3e40
[  252.773781][T16470]  __handle_mm_fault+0x1869/0x3e40
[  252.774839][T16470]  handle_mm_fault+0x22a/0x640
[  252.775808][T16470]  do_user_addr_fault+0x618/0x1000
[  252.776847][T16470]  exc_page_fault+0x68/0xd0
[  252.777775][T16470]  asm_exc_page_fault+0x26/0x30

While we could adjust the sequence in __folio_remove_rmap(), let's rather
move the mapcount sanity checks after the mapcount vs. refcount
stabilization phase. With this fix, a simple reproducer no longer triggers
the warning.

While at it, convert the two VM_WARN_ON_ONCE() we are moving to
VM_WARN_ON_ONCE_FOLIO().

Reported-by: syzbot+5e8feb543ca8e12e0ede@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/67fab4fe.050a0220.2c5fcf.0011.GAE@google.com
Fixes: 1da190f4d0a6 ("mm: Copy-on-Write (COW) reuse support for PTE-mapped THP")
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 2d8c265fc7d60..625886d40e091 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3734,8 +3734,6 @@ static bool __wp_can_reuse_large_anon_folio(struct folio *folio,
 		return false;
 
 	VM_WARN_ON_ONCE(folio_test_ksm(folio));
-	VM_WARN_ON_ONCE(folio_mapcount(folio) > folio_nr_pages(folio));
-	VM_WARN_ON_ONCE(folio_entire_mapcount(folio));
 
 	if (unlikely(folio_test_swapcache(folio))) {
 		/*
@@ -3760,6 +3758,8 @@ static bool __wp_can_reuse_large_anon_folio(struct folio *folio,
 	if (folio_large_mapcount(folio) != folio_ref_count(folio))
 		goto unlock;
 
+	VM_WARN_ON_ONCE_FOLIO(folio_large_mapcount(folio) > folio_nr_pages(folio), folio);
+	VM_WARN_ON_ONCE_FOLIO(folio_entire_mapcount(folio), folio);
 	VM_WARN_ON_ONCE(folio_mm_id(folio, 0) != vma->vm_mm->mm_id &&
 			folio_mm_id(folio, 1) != vma->vm_mm->mm_id);
 
-- 
2.49.0




* Re: [PATCH v1] mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization
  2025-04-15  9:50 [PATCH v1] mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization David Hildenbrand
@ 2025-04-16  7:46 ` Oscar Salvador
  0 siblings, 0 replies; 2+ messages in thread
From: Oscar Salvador @ 2025-04-16  7:46 UTC
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, syzbot+5e8feb543ca8e12e0ede, Andrew Morton

On Tue, Apr 15, 2025 at 11:50:07AM +0200, David Hildenbrand wrote:
> In __folio_remove_rmap() for RMAP_LEVEL_PMD/RMAP_LEVEL_PUD and with
> CONFIG_PAGE_MAPCOUNT, we first decrement the folio mapcount (and
> recompute mapped shared vs. mapped exclusively) and only then adjust the
> entire mapcount.
> 
> This means that another process might stumble in do_wp_page() over a
> PTE-mapped PMD folio that is indicated as "exclusively mapped", but still
> has an entire mapcount (PMD mapping), because it is racing with the process
> that is unmapping the folio (PMD mapping). Note that do_wp_page() will
> back off once it detects the remaining folio reference held by the
> process that is still unmapping the folio.
> 
> This will trigger the early VM_WARN_ON_ONCE(folio_entire_mapcount(folio))
> check in do_wp_page(), which can easily be reproduced by looping a couple
> of times over allocating a PMD THP, forking a child that immediately
> unmaps it again, and concurrently writing to the THP in the parent.
> 
> [  252.738129][T16470] ------------[ cut here ]------------
> [  252.739267][T16470] WARNING: CPU: 3 PID: 16470 at mm/memory.c:3738 do_wp_page+0x2a75/0x2c00
> [  252.740968][T16470] Modules linked in:
> [  252.741958][T16470] CPU: 3 UID: 0 PID: 16470 Comm: ...
> ...
> [  252.765841][T16470]  <TASK>
> [  252.766419][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  252.767558][T16470]  ? rcu_is_watching+0x12/0x60
> [  252.768525][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  252.769645][T16470]  ? srso_alias_return_thunk+0x5/0xfbef5
> [  252.770778][T16470]  ? lock_acquire+0x33/0x80
> [  252.771697][T16470]  ? __handle_mm_fault+0x5e8/0x3e40
> [  252.772735][T16470]  ? __handle_mm_fault+0x5e8/0x3e40
> [  252.773781][T16470]  __handle_mm_fault+0x1869/0x3e40
> [  252.774839][T16470]  handle_mm_fault+0x22a/0x640
> [  252.775808][T16470]  do_user_addr_fault+0x618/0x1000
> [  252.776847][T16470]  exc_page_fault+0x68/0xd0
> [  252.777775][T16470]  asm_exc_page_fault+0x26/0x30
> 
> While we could adjust the sequence in __folio_remove_rmap(), let's rather
> move the mapcount sanity checks after the mapcount vs. refcount
> stabilization phase. With this fix, a simple reproducer no longer triggers
> the warning.
> 
> While at it, convert the two VM_WARN_ON_ONCE() we are moving to
> VM_WARN_ON_ONCE_FOLIO().
> 
> Reported-by: syzbot+5e8feb543ca8e12e0ede@syzkaller.appspotmail.com
> Closes: https://lkml.kernel.org/r/67fab4fe.050a0220.2c5fcf.0011.GAE@google.com
> Fixes: 1da190f4d0a6 ("mm: Copy-on-Write (COW) reuse support for PTE-mapped THP")
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

 

-- 
Oscar Salvador
SUSE Labs

