* [PATCH v2 0/3] mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()
@ 2024-06-17 23:11 Barry Song
2024-06-17 23:11 ` [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap Barry Song
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Barry Song @ 2024-06-17 23:11 UTC (permalink / raw)
To: akpm, david, linux-mm
Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts,
shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed,
yuzhao
From: Barry Song <v-songbaohua@oppo.com>
This series was originally suggested by David while we were trying to
weaken the WARN_ON in __folio_add_anon_rmap() as part of bringing up
mTHP swapin[1]. The patchset is also preparatory work for mTHP
swapin.
folio_add_new_anon_rmap() assumes that new anon rmaps are always
exclusive. However, this assumption does not hold for cases like
do_swap_page(), where a new anon folio might be added to the swapcache
and is not necessarily exclusive.
The patchset extends the rmap flags to allow folio_add_new_anon_rmap()
to handle both exclusive and non-exclusive new anon folios.
The do_swap_page() function is updated to use this extended API with
rmap flags. Consequently, all new anon folios now consistently use
folio_add_new_anon_rmap().
The special case for !folio_test_anon() in __folio_add_anon_rmap() can
be safely removed.
In conclusion, new anon folios always use folio_add_new_anon_rmap(),
regardless of exclusivity. Old anon folios continue to use
__folio_add_anon_rmap() via folio_add_anon_rmap_pmd() and
folio_add_anon_rmap_ptes().
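
Condensed from the diffs below, the resulting calling convention looks
roughly like this (variable names are illustrative; see the actual call
sites in do_swap_page() and unuse_pte()):

	/* callers that know the folio is exclusive keep the old behaviour */
	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);

	/* swap-in: a folio coming from the swapcache may be shared */
	if (pte_swp_exclusive(orig_pte))
		rmap_flags |= RMAP_EXCLUSIVE;
	if (!folio_test_anon(folio))
		folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
	else
		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, addr,
					 rmap_flags);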
[1] https://lore.kernel.org/linux-mm/20240118111036.72641-6-21cnbao@gmail.com/
-v2:
* fix crashes during swapoff reported by Shuai Yuan, thanks; David
  also commented on unuse_pte() for the swapoff case;
* add comments for the !anon case, as suggested by David, thanks;
* add Shuai Yuan's Tested-by tags; thanks for testing on real phones;
* refine the changelog;
-v1(RFC):
https://lore.kernel.org/linux-mm/20240613000721.23093-1-21cnbao@gmail.com/
Barry Song (3):
mm: extend rmap flags arguments for folio_add_new_anon_rmap
mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false
mm: remove folio_test_anon(folio)==false path in
__folio_add_anon_rmap()
include/linux/rmap.h | 2 +-
kernel/events/uprobes.c | 2 +-
mm/huge_memory.c | 2 +-
mm/khugepaged.c | 2 +-
mm/memory.c | 18 +++++++++++++-----
mm/migrate_device.c | 2 +-
mm/rmap.c | 34 +++++++++++++---------------------
mm/swapfile.c | 15 ++++++++++++---
mm/userfaultfd.c | 2 +-
9 files changed, 44 insertions(+), 35 deletions(-)
--
2.34.1
^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap
  2024-06-17 23:11 [PATCH v2 0/3] mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap() Barry Song
@ 2024-06-17 23:11 ` Barry Song
  2024-06-22  3:02   ` Barry Song
  2024-06-17 23:11 ` [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false Barry Song
  2024-06-17 23:11 ` [PATCH v2 3/3] mm: remove folio_test_anon(folio)==false path in __folio_add_anon_rmap() Barry Song
  2 siblings, 1 reply; 15+ messages in thread
From: Barry Song @ 2024-06-17 23:11 UTC (permalink / raw)
  To: akpm, david, linux-mm
  Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts,
      shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed,
      yuzhao, Shuai Yuan

From: Barry Song <v-songbaohua@oppo.com>

In the case of a swap-in, a new anonymous folio is not necessarily
exclusive. This patch updates the rmap flags to allow a new anonymous
folio to be treated as either exclusive or non-exclusive. To maintain
the existing behavior, we always use EXCLUSIVE as the default setting.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Shuai Yuan <yuanshuai@oppo.com>
---
 include/linux/rmap.h    |  2 +-
 kernel/events/uprobes.c |  2 +-
 mm/huge_memory.c        |  2 +-
 mm/khugepaged.c         |  2 +-
 mm/memory.c             | 10 +++++-----
 mm/migrate_device.c     |  2 +-
 mm/rmap.c               | 17 ++++++++++-------
 mm/swapfile.c           |  2 +-
 mm/userfaultfd.c        |  2 +-
 9 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b1bbe237ea4c..23d336e5d502 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -244,7 +244,7 @@ void folio_add_anon_rmap_ptes(struct folio *, struct page *, int nr_pages,
 void folio_add_anon_rmap_pmd(struct folio *, struct page *,
 		struct vm_area_struct *, unsigned long address, rmap_t flags);
 void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
-		unsigned long address);
+		unsigned long address, rmap_t flags);
 void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages,
 		struct vm_area_struct *);
 #define folio_add_file_rmap_pte(folio, page, vma) \
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 2c83ba776fc7..c20368aa33dd 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -181,7 +181,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	if (new_page) {
 		folio_get(new_folio);
-		folio_add_new_anon_rmap(new_folio, vma, addr);
+		folio_add_new_anon_rmap(new_folio, vma, addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(new_folio, vma);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 425374ae06ed..9e2357ab9b9a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -973,7 +973,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		entry = mk_huge_pmd(page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
-		folio_add_new_anon_rmap(folio, vma, haddr);
+		folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 774a97e6e2da..4d759a7487d0 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1213,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
-	folio_add_new_anon_rmap(folio, vma, address);
+	folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(folio, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
diff --git a/mm/memory.c b/mm/memory.c
index b7137d9c99a9..1f24ecdafe05 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -930,7 +930,7 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	*prealloc = NULL;
 	copy_user_highpage(&new_folio->page, page, addr, src_vma);
 	__folio_mark_uptodate(new_folio);
-	folio_add_new_anon_rmap(new_folio, dst_vma, addr);
+	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(new_folio, dst_vma);
 	rss[MM_ANONPAGES]++;
@@ -3400,7 +3400,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		 * some TLBs while the old PTE remains in others.
 		 */
 		ptep_clear_flush(vma, vmf->address, vmf->pte);
-		folio_add_new_anon_rmap(new_folio, vma, vmf->address);
+		folio_add_new_anon_rmap(new_folio, vma, vmf->address, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(new_folio, vma);
 		BUG_ON(unshare && pte_write(entry));
 		set_pte_at(mm, vmf->address, vmf->pte, entry);
@@ -4337,7 +4337,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	/* ksm created a completely new copy */
 	if (unlikely(folio != swapcache && swapcache)) {
-		folio_add_new_anon_rmap(folio, vma, address);
+		folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 	} else {
 		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
@@ -4592,7 +4592,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
 #endif
-	folio_add_new_anon_rmap(folio, vma, addr);
+	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(folio, vma);
 setpte:
 	if (vmf_orig_pte_uffd_wp(vmf))
@@ -4790,7 +4790,7 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio,
 	/* copy-on-write page */
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		VM_BUG_ON_FOLIO(nr != 1, folio);
-		folio_add_new_anon_rmap(folio, vma, addr);
+		folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 	} else {
 		folio_add_file_rmap_ptes(folio, page, nr, vma);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 051d0a3ccbee..6d66dc1c6ffa 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -658,7 +658,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		goto unlock_abort;
 	inc_mm_counter(mm, MM_ANONPAGES);
-	folio_add_new_anon_rmap(folio, vma, addr);
+	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 	if (!folio_is_zone_device(folio))
 		folio_add_lru_vma(folio, vma);
 	folio_get(folio);
diff --git a/mm/rmap.c b/mm/rmap.c
index a3c99ac63155..2b19bb92eda5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1406,25 +1406,26 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page,
  * This means the inc-and-test can be bypassed.
  * The folio does not have to be locked.
  *
- * If the folio is pmd-mappable, it is accounted as a THP. As the folio
- * is new, it's assumed to be mapped exclusively by a single process.
+ * If the folio is pmd-mappable, it is accounted as a THP.
  */
 void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
-		unsigned long address)
+		unsigned long address, rmap_t flags)
 {
 	int nr = folio_nr_pages(folio);
 	int nr_pmdmapped = 0;
+	bool exclusive = flags & RMAP_EXCLUSIVE;
 
 	VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
 	VM_BUG_ON_VMA(address < vma->vm_start ||
 			address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
 	__folio_set_swapbacked(folio);
-	__folio_set_anon(folio, vma, address, true);
+	__folio_set_anon(folio, vma, address, exclusive);
 
 	if (likely(!folio_test_large(folio))) {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_mapcount, 0);
-		SetPageAnonExclusive(&folio->page);
+		if (exclusive)
+			SetPageAnonExclusive(&folio->page);
 	} else if (!folio_test_pmd_mappable(folio)) {
 		int i;
 
@@ -1433,7 +1434,8 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 			/* increment count (starts at -1) */
 			atomic_set(&page->_mapcount, 0);
-			SetPageAnonExclusive(page);
+			if (exclusive)
+				SetPageAnonExclusive(page);
 		}
 
 		/* increment count (starts at -1) */
@@ -1445,7 +1447,8 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_large_mapcount, 0);
 		atomic_set(&folio->_nr_pages_mapped, ENTIRELY_MAPPED);
-		SetPageAnonExclusive(&folio->page);
+		if (exclusive)
+			SetPageAnonExclusive(&folio->page);
 		nr_pmdmapped = nr;
 	}
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9c6d8e557c0f..ae1d2700f6a3 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1911,7 +1911,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
 	} else { /* ksm created a completely new copy */
-		folio_add_new_anon_rmap(folio, vma, addr);
+		folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
 	}
 	new_pte = pte_mkold(mk_pte(page, vma->vm_page_prot));
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 5e7f2801698a..8dedaec00486 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -216,7 +216,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd,
 		folio_add_lru(folio);
 		folio_add_file_rmap_pte(folio, page, dst_vma);
 	} else {
-		folio_add_new_anon_rmap(folio, dst_vma, dst_addr);
+		folio_add_new_anon_rmap(folio, dst_vma, dst_addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, dst_vma);
 	}
 
-- 
2.34.1

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap 2024-06-17 23:11 ` [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap Barry Song @ 2024-06-22 3:02 ` Barry Song 0 siblings, 0 replies; 15+ messages in thread From: Barry Song @ 2024-06-22 3:02 UTC (permalink / raw) To: akpm Cc: baolin.wang, chrisl, david, linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao > > From: Barry Song <v-songbaohua@oppo.com> > > In the case of a swap-in, a new anonymous folio is not necessarily > exclusive. This patch updates the rmap flags to allow a new anonymous > folio to be treated as either exclusive or non-exclusive. To maintain > the existing behavior, we always use EXCLUSIVE as the default setting. > > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > Tested-by: Shuai Yuan <yuanshuai@oppo.com> > --- Hi Andrew, Could you please help squash the following change (a documentation enhancement suggested by David) into this patch? From: Barry Song <v-songbaohua@oppo.com> Date: Sat, 22 Jun 2024 14:51:38 +1200 Subject: [PATCH] mm: enhence doc for extend rmap flags arguments for folio_add_new_anon_rmap Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Barry Song <v-songbaohua@oppo.com> --- mm/rmap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/rmap.c b/mm/rmap.c index df1a43295c85..9a8d9c848168 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1394,7 +1394,9 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page, * * Like folio_add_anon_rmap_*() but must only be called on *new* folios. * This means the inc-and-test can be bypassed. - * The folio does not have to be locked. + * The folio doesn't necessarily need to be locked while it's exclusive + * unless two threads map it concurrently. However, the folio must be + * locked if it's shared. * * If the folio is pmd-mappable, it is accounted as a THP. */ @@ -1406,6 +1408,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, int nr_pmdmapped = 0; VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio); VM_BUG_ON_VMA(address < vma->vm_start || address + (nr << PAGE_SHIFT) > vma->vm_end, vma); __folio_set_swapbacked(folio); -- 2.34.1 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false
  2024-06-17 23:11 [PATCH v2 0/3] mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap() Barry Song
  2024-06-17 23:11 ` [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap Barry Song
@ 2024-06-17 23:11 ` Barry Song
  2024-06-18  9:54   ` David Hildenbrand
  2024-06-20  7:46   ` David Hildenbrand
  2024-06-17 23:11 ` [PATCH v2 3/3] mm: remove folio_test_anon(folio)==false path in __folio_add_anon_rmap() Barry Song
  2 siblings, 2 replies; 15+ messages in thread
From: Barry Song @ 2024-06-17 23:11 UTC (permalink / raw)
  To: akpm, david, linux-mm
  Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts,
      shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed,
      yuzhao, Shuai Yuan

From: Barry Song <v-songbaohua@oppo.com>

For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap()
with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will
suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating
the process of bringing up mTHP swapin.

static __always_inline void __folio_add_anon_rmap(struct folio *folio,
		struct page *page, int nr_pages, struct vm_area_struct *vma,
		unsigned long address, rmap_t flags, enum rmap_level level)
{
	...
	if (unlikely(!folio_test_anon(folio))) {
		VM_WARN_ON_FOLIO(folio_test_large(folio) &&
				 level != RMAP_LEVEL_PMD, folio);
	}
	...
}

It also improves the code’s readability. Currently, all new anonymous
folios calling folio_add_anon_rmap_ptes() are order-0. This ensures
that new folios cannot be partially exclusive; they are either entirely
exclusive or entirely shared.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Shuai Yuan <yuanshuai@oppo.com>
---
 mm/memory.c   |  8 ++++++++
 mm/swapfile.c | 13 +++++++++++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1f24ecdafe05..620654c13b2f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (unlikely(folio != swapcache && swapcache)) {
 		folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
+	} else if (!folio_test_anon(folio)) {
+		/*
+		 * We currently only expect small !anon folios, for which we now
+		 * that they are either fully exclusive or fully shared. If we
+		 * ever get large folios here, we have to be careful.
+		 */
+		VM_WARN_ON_ONCE(folio_test_large(folio));
+		folio_add_new_anon_rmap(folio, vma, address, rmap_flags);
 	} else {
 		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address,
 					rmap_flags);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ae1d2700f6a3..69efa1a57087 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
 		if (pte_swp_exclusive(old_pte))
 			rmap_flags |= RMAP_EXCLUSIVE;
-
-		folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
+		/*
+		 * We currently only expect small !anon folios, for which we now that
+		 * they are either fully exclusive or fully shared. If we ever get
+		 * large folios here, we have to be careful.
+		 */
+		if (!folio_test_anon(folio)) {
+			VM_WARN_ON_ONCE(folio_test_large(folio));
+			folio_add_new_anon_rmap(folio, vma, addr, rmap_flags);
+		} else {
+			folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags);
+		}
 	} else { /* ksm created a completely new copy */
 		folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 		folio_add_lru_vma(folio, vma);
-- 
2.34.1

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-17 23:11 ` [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false Barry Song @ 2024-06-18 9:54 ` David Hildenbrand 2024-06-20 7:46 ` David Hildenbrand 1 sibling, 0 replies; 15+ messages in thread From: David Hildenbrand @ 2024-06-18 9:54 UTC (permalink / raw) To: Barry Song, akpm, linux-mm Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On 18.06.24 01:11, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() > with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will > suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating > the process of bringing up mTHP swapin. > > static __always_inline void __folio_add_anon_rmap(struct folio *folio, > struct page *page, int nr_pages, struct vm_area_struct *vma, > unsigned long address, rmap_t flags, enum rmap_level level) > { > ... > if (unlikely(!folio_test_anon(folio))) { > VM_WARN_ON_FOLIO(folio_test_large(folio) && > level != RMAP_LEVEL_PMD, folio); > } > ... > } > > It also improves the code’s readability. Currently, all new anonymous > folios calling folio_add_anon_rmap_ptes() are order-0. This ensures > that new folios cannot be partially exclusive; they are either entirely > exclusive or entirely shared. > > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > Tested-by: Shuai Yuan <yuanshuai@oppo.com> > --- > mm/memory.c | 8 ++++++++ > mm/swapfile.c | 13 +++++++++++-- > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 1f24ecdafe05..620654c13b2f 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > if (unlikely(folio != swapcache && swapcache)) { > folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); > folio_add_lru_vma(folio, vma); > + } else if (!folio_test_anon(folio)) { > + /* > + * We currently only expect small !anon folios, for which we now s/now/know/ > + * that they are either fully exclusive or fully shared. If we > + * ever get large folios here, we have to be careful. > + */ > + VM_WARN_ON_ONCE(folio_test_large(folio)); > + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); > } else { > folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, > rmap_flags); > diff --git a/mm/swapfile.c b/mm/swapfile.c > index ae1d2700f6a3..69efa1a57087 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, > VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); > if (pte_swp_exclusive(old_pte)) > rmap_flags |= RMAP_EXCLUSIVE; > - > - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > + /* > + * We currently only expect small !anon folios, for which we now that s/now/know/ > + * they are either fully exclusive or fully shared. If we ever get > + * large folios here, we have to be careful. 
> + */ > + if (!folio_test_anon(folio)) { > + VM_WARN_ON_ONCE(folio_test_large(folio)); > + folio_add_new_anon_rmap(folio, vma, addr, rmap_flags); > + } else { > + folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > + } > } else { /* ksm created a completely new copy */ > folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE); > folio_add_lru_vma(folio, vma); Thanks! Acked-by: David Hildenbrand <david@redhat.com> -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-17 23:11 ` [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false Barry Song 2024-06-18 9:54 ` David Hildenbrand @ 2024-06-20 7:46 ` David Hildenbrand 2024-06-20 8:33 ` Barry Song 1 sibling, 1 reply; 15+ messages in thread From: David Hildenbrand @ 2024-06-20 7:46 UTC (permalink / raw) To: Barry Song, akpm, linux-mm Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On 18.06.24 01:11, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() > with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will > suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating > the process of bringing up mTHP swapin. > > static __always_inline void __folio_add_anon_rmap(struct folio *folio, > struct page *page, int nr_pages, struct vm_area_struct *vma, > unsigned long address, rmap_t flags, enum rmap_level level) > { > ... > if (unlikely(!folio_test_anon(folio))) { > VM_WARN_ON_FOLIO(folio_test_large(folio) && > level != RMAP_LEVEL_PMD, folio); > } > ... > } > > It also improves the code’s readability. Currently, all new anonymous > folios calling folio_add_anon_rmap_ptes() are order-0. This ensures > that new folios cannot be partially exclusive; they are either entirely > exclusive or entirely shared. > > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > Tested-by: Shuai Yuan <yuanshuai@oppo.com> > --- > mm/memory.c | 8 ++++++++ > mm/swapfile.c | 13 +++++++++++-- > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index 1f24ecdafe05..620654c13b2f 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > if (unlikely(folio != swapcache && swapcache)) { > folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); > folio_add_lru_vma(folio, vma); > + } else if (!folio_test_anon(folio)) { > + /* > + * We currently only expect small !anon folios, for which we now > + * that they are either fully exclusive or fully shared. If we > + * ever get large folios here, we have to be careful. > + */ > + VM_WARN_ON_ONCE(folio_test_large(folio)); > + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); > } else { > folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, > rmap_flags); > diff --git a/mm/swapfile.c b/mm/swapfile.c > index ae1d2700f6a3..69efa1a57087 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, > VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); > if (pte_swp_exclusive(old_pte)) > rmap_flags |= RMAP_EXCLUSIVE; > - > - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > + /* > + * We currently only expect small !anon folios, for which we now that > + * they are either fully exclusive or fully shared. If we ever get > + * large folios here, we have to be careful. 
> + */ > + if (!folio_test_anon(folio)) { > + VM_WARN_ON_ONCE(folio_test_large(folio)); (comment applies to both cases) Thinking about Hugh's comment, we should likely add here: VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); [the check we are removing from __folio_add_anon_rmap()] and document for folio_add_new_anon_rmap() in patch #1, that when dealing with folios that might be mapped concurrently by others, the folio lock must be held. > + folio_add_new_anon_rmap(folio, vma, addr, rmap_flags); > + } else { > + folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > + } > } else { /* ksm created a completely new copy */ > folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE); > folio_add_lru_vma(folio, vma); -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-20 7:46 ` David Hildenbrand @ 2024-06-20 8:33 ` Barry Song 2024-06-20 8:49 ` David Hildenbrand 0 siblings, 1 reply; 15+ messages in thread From: Barry Song @ 2024-06-20 8:33 UTC (permalink / raw) To: David Hildenbrand Cc: akpm, linux-mm, baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote: > > On 18.06.24 01:11, Barry Song wrote: > > From: Barry Song <v-songbaohua@oppo.com> > > > > For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() > > with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will > > suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating > > the process of bringing up mTHP swapin. > > > > static __always_inline void __folio_add_anon_rmap(struct folio *folio, > > struct page *page, int nr_pages, struct vm_area_struct *vma, > > unsigned long address, rmap_t flags, enum rmap_level level) > > { > > ... > > if (unlikely(!folio_test_anon(folio))) { > > VM_WARN_ON_FOLIO(folio_test_large(folio) && > > level != RMAP_LEVEL_PMD, folio); > > } > > ... > > } > > > > It also improves the code’s readability. Currently, all new anonymous > > folios calling folio_add_anon_rmap_ptes() are order-0. This ensures > > that new folios cannot be partially exclusive; they are either entirely > > exclusive or entirely shared. > > > > Suggested-by: David Hildenbrand <david@redhat.com> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > > Tested-by: Shuai Yuan <yuanshuai@oppo.com> > > --- > > mm/memory.c | 8 ++++++++ > > mm/swapfile.c | 13 +++++++++++-- > > 2 files changed, 19 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memory.c b/mm/memory.c > > index 1f24ecdafe05..620654c13b2f 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > if (unlikely(folio != swapcache && swapcache)) { > > folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); > > folio_add_lru_vma(folio, vma); > > + } else if (!folio_test_anon(folio)) { > > + /* > > + * We currently only expect small !anon folios, for which we now > > + * that they are either fully exclusive or fully shared. If we > > + * ever get large folios here, we have to be careful. > > + */ > > + VM_WARN_ON_ONCE(folio_test_large(folio)); > > + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); > > } else { > > folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, > > rmap_flags); > > diff --git a/mm/swapfile.c b/mm/swapfile.c > > index ae1d2700f6a3..69efa1a57087 100644 > > --- a/mm/swapfile.c > > +++ b/mm/swapfile.c > > @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, > > VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); > > if (pte_swp_exclusive(old_pte)) > > rmap_flags |= RMAP_EXCLUSIVE; > > - > > - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > > + /* > > + * We currently only expect small !anon folios, for which we now that > > + * they are either fully exclusive or fully shared. If we ever get > > + * large folios here, we have to be careful. 
> > + */ > > + if (!folio_test_anon(folio)) { > > + VM_WARN_ON_ONCE(folio_test_large(folio)); > > (comment applies to both cases) > > Thinking about Hugh's comment, we should likely add here: > > VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); > > [the check we are removing from __folio_add_anon_rmap()] > > and document for folio_add_new_anon_rmap() in patch #1, that when > dealing with folios that might be mapped concurrently by others, the > folio lock must be held. I assume you mean something like the following for patch#1? diff --git a/mm/rmap.c b/mm/rmap.c index df1a43295c85..20986b25f1b2 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page, * * Like folio_add_anon_rmap_*() but must only be called on *new* folios. * This means the inc-and-test can be bypassed. - * The folio does not have to be locked. + * The folio doesn't necessarily need to be locked while it's exclusive unless two threads + * map it concurrently. However, the folio must be locked if it's shared. * * If the folio is pmd-mappable, it is accounted as a THP. */ @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, int nr_pmdmapped = 0; VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); + VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio); VM_BUG_ON_VMA(address < vma->vm_start || address + (nr << PAGE_SHIFT) > vma->vm_end, vma); __folio_set_swapbacked(folio); > > > + folio_add_new_anon_rmap(folio, vma, addr, rmap_flags); > > + } else { > > + folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > > + } > > } else { /* ksm created a completely new copy */ > > folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE); > > folio_add_lru_vma(folio, vma); > > -- > Cheers, > > David / dhildenb > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-20 8:33 ` Barry Song @ 2024-06-20 8:49 ` David Hildenbrand 2024-06-20 9:59 ` Barry Song 0 siblings, 1 reply; 15+ messages in thread From: David Hildenbrand @ 2024-06-20 8:49 UTC (permalink / raw) To: Barry Song Cc: akpm, linux-mm, baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On 20.06.24 10:33, Barry Song wrote: > On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 18.06.24 01:11, Barry Song wrote: >>> From: Barry Song <v-songbaohua@oppo.com> >>> >>> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() >>> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will >>> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating >>> the process of bringing up mTHP swapin. >>> >>> static __always_inline void __folio_add_anon_rmap(struct folio *folio, >>> struct page *page, int nr_pages, struct vm_area_struct *vma, >>> unsigned long address, rmap_t flags, enum rmap_level level) >>> { >>> ... >>> if (unlikely(!folio_test_anon(folio))) { >>> VM_WARN_ON_FOLIO(folio_test_large(folio) && >>> level != RMAP_LEVEL_PMD, folio); >>> } >>> ... >>> } >>> >>> It also improves the code’s readability. Currently, all new anonymous >>> folios calling folio_add_anon_rmap_ptes() are order-0. This ensures >>> that new folios cannot be partially exclusive; they are either entirely >>> exclusive or entirely shared. >>> >>> Suggested-by: David Hildenbrand <david@redhat.com> >>> Signed-off-by: Barry Song <v-songbaohua@oppo.com> >>> Tested-by: Shuai Yuan <yuanshuai@oppo.com> >>> --- >>> mm/memory.c | 8 ++++++++ >>> mm/swapfile.c | 13 +++++++++++-- >>> 2 files changed, 19 insertions(+), 2 deletions(-) >>> >>> diff --git a/mm/memory.c b/mm/memory.c >>> index 1f24ecdafe05..620654c13b2f 100644 >>> --- a/mm/memory.c >>> +++ b/mm/memory.c >>> @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) >>> if (unlikely(folio != swapcache && swapcache)) { >>> folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); >>> folio_add_lru_vma(folio, vma); >>> + } else if (!folio_test_anon(folio)) { >>> + /* >>> + * We currently only expect small !anon folios, for which we now >>> + * that they are either fully exclusive or fully shared. If we >>> + * ever get large folios here, we have to be careful. >>> + */ >>> + VM_WARN_ON_ONCE(folio_test_large(folio)); >>> + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); >>> } else { >>> folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, >>> rmap_flags); >>> diff --git a/mm/swapfile.c b/mm/swapfile.c >>> index ae1d2700f6a3..69efa1a57087 100644 >>> --- a/mm/swapfile.c >>> +++ b/mm/swapfile.c >>> @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, >>> VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); >>> if (pte_swp_exclusive(old_pte)) >>> rmap_flags |= RMAP_EXCLUSIVE; >>> - >>> - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); >>> + /* >>> + * We currently only expect small !anon folios, for which we now that >>> + * they are either fully exclusive or fully shared. If we ever get >>> + * large folios here, we have to be careful. 
>>> + */ >>> + if (!folio_test_anon(folio)) { >>> + VM_WARN_ON_ONCE(folio_test_large(folio)); >> >> (comment applies to both cases) >> >> Thinking about Hugh's comment, we should likely add here: >> >> VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); >> >> [the check we are removing from __folio_add_anon_rmap()] >> >> and document for folio_add_new_anon_rmap() in patch #1, that when >> dealing with folios that might be mapped concurrently by others, the >> folio lock must be held. > > I assume you mean something like the following for patch#1? > > diff --git a/mm/rmap.c b/mm/rmap.c > index df1a43295c85..20986b25f1b2 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio > *folio, struct page *page, > * > * Like folio_add_anon_rmap_*() but must only be called on *new* folios. > * This means the inc-and-test can be bypassed. > - * The folio does not have to be locked. > + * The folio doesn't necessarily need to be locked while it's > exclusive unless two threads > + * map it concurrently. However, the folio must be locked if it's shared. > * > * If the folio is pmd-mappable, it is accounted as a THP. > */ > @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio > *folio, struct vm_area_struct *vma, > int nr_pmdmapped = 0; > > VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); > + VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio); For now this would likely do. I was concerned about a concurrent scenario in the exclusive case, but that shouldn't really happen I guess. -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-20 8:49 ` David Hildenbrand @ 2024-06-20 9:59 ` Barry Song 2024-06-21 9:18 ` David Hildenbrand 0 siblings, 1 reply; 15+ messages in thread From: Barry Song @ 2024-06-20 9:59 UTC (permalink / raw) To: David Hildenbrand Cc: akpm, linux-mm, baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On Thu, Jun 20, 2024 at 8:49 PM David Hildenbrand <david@redhat.com> wrote: > > On 20.06.24 10:33, Barry Song wrote: > > On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 18.06.24 01:11, Barry Song wrote: > >>> From: Barry Song <v-songbaohua@oppo.com> > >>> > >>> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() > >>> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will > >>> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating > >>> the process of bringing up mTHP swapin. > >>> > >>> static __always_inline void __folio_add_anon_rmap(struct folio *folio, > >>> struct page *page, int nr_pages, struct vm_area_struct *vma, > >>> unsigned long address, rmap_t flags, enum rmap_level level) > >>> { > >>> ... > >>> if (unlikely(!folio_test_anon(folio))) { > >>> VM_WARN_ON_FOLIO(folio_test_large(folio) && > >>> level != RMAP_LEVEL_PMD, folio); > >>> } > >>> ... > >>> } > >>> > >>> It also improves the code’s readability. Currently, all new anonymous > >>> folios calling folio_add_anon_rmap_ptes() are order-0. This ensures > >>> that new folios cannot be partially exclusive; they are either entirely > >>> exclusive or entirely shared. > >>> > >>> Suggested-by: David Hildenbrand <david@redhat.com> > >>> Signed-off-by: Barry Song <v-songbaohua@oppo.com> > >>> Tested-by: Shuai Yuan <yuanshuai@oppo.com> > >>> --- > >>> mm/memory.c | 8 ++++++++ > >>> mm/swapfile.c | 13 +++++++++++-- > >>> 2 files changed, 19 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/mm/memory.c b/mm/memory.c > >>> index 1f24ecdafe05..620654c13b2f 100644 > >>> --- a/mm/memory.c > >>> +++ b/mm/memory.c > >>> @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > >>> if (unlikely(folio != swapcache && swapcache)) { > >>> folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); > >>> folio_add_lru_vma(folio, vma); > >>> + } else if (!folio_test_anon(folio)) { > >>> + /* > >>> + * We currently only expect small !anon folios, for which we now > >>> + * that they are either fully exclusive or fully shared. If we > >>> + * ever get large folios here, we have to be careful. > >>> + */ > >>> + VM_WARN_ON_ONCE(folio_test_large(folio)); > >>> + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); > >>> } else { > >>> folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, > >>> rmap_flags); > >>> diff --git a/mm/swapfile.c b/mm/swapfile.c > >>> index ae1d2700f6a3..69efa1a57087 100644 > >>> --- a/mm/swapfile.c > >>> +++ b/mm/swapfile.c > >>> @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, > >>> VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); > >>> if (pte_swp_exclusive(old_pte)) > >>> rmap_flags |= RMAP_EXCLUSIVE; > >>> - > >>> - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); > >>> + /* > >>> + * We currently only expect small !anon folios, for which we now that > >>> + * they are either fully exclusive or fully shared. 
If we ever get > >>> + * large folios here, we have to be careful. > >>> + */ > >>> + if (!folio_test_anon(folio)) { > >>> + VM_WARN_ON_ONCE(folio_test_large(folio)); > >> > >> (comment applies to both cases) > >> > >> Thinking about Hugh's comment, we should likely add here: > >> > >> VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); > >> > >> [the check we are removing from __folio_add_anon_rmap()] > >> > >> and document for folio_add_new_anon_rmap() in patch #1, that when > >> dealing with folios that might be mapped concurrently by others, the > >> folio lock must be held. > > > > I assume you mean something like the following for patch#1? > > > > diff --git a/mm/rmap.c b/mm/rmap.c > > index df1a43295c85..20986b25f1b2 100644 > > --- a/mm/rmap.c > > +++ b/mm/rmap.c > > @@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio > > *folio, struct page *page, > > * > > * Like folio_add_anon_rmap_*() but must only be called on *new* folios. > > * This means the inc-and-test can be bypassed. > > - * The folio does not have to be locked. > > + * The folio doesn't necessarily need to be locked while it's > > exclusive unless two threads > > + * map it concurrently. However, the folio must be locked if it's shared. > > * > > * If the folio is pmd-mappable, it is accounted as a THP. > > */ > > @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio > > *folio, struct vm_area_struct *vma, > > int nr_pmdmapped = 0; > > > > VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); > > + VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio); > > For now this would likely do. I was concerned about a concurrent > scenario in the exclusive case, but that shouldn't really happen I guess. > Since this is primarily a documentation update, I'll wait for two or three days to see if there are any more Linux-next reports before sending v3 combining these fixes together(I've already fixed another doc warn reported by lkp) and seek Andrew's assistance to drop v2 and apply v3. > -- > Cheers, > > David / dhildenb > Thanks Barry ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-20 9:59 ` Barry Song @ 2024-06-21 9:18 ` David Hildenbrand 2024-06-22 3:20 ` Barry Song 0 siblings, 1 reply; 15+ messages in thread From: David Hildenbrand @ 2024-06-21 9:18 UTC (permalink / raw) To: Barry Song Cc: akpm, linux-mm, baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On 20.06.24 11:59, Barry Song wrote: > On Thu, Jun 20, 2024 at 8:49 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 20.06.24 10:33, Barry Song wrote: >>> On Thu, Jun 20, 2024 at 7:46 PM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> On 18.06.24 01:11, Barry Song wrote: >>>>> From: Barry Song <v-songbaohua@oppo.com> >>>>> >>>>> For the !folio_test_anon(folio) case, we can now invoke folio_add_new_anon_rmap() >>>>> with the rmap flags set to either EXCLUSIVE or non-EXCLUSIVE. This action will >>>>> suppress the VM_WARN_ON_FOLIO check within __folio_add_anon_rmap() while initiating >>>>> the process of bringing up mTHP swapin. >>>>> >>>>> static __always_inline void __folio_add_anon_rmap(struct folio *folio, >>>>> struct page *page, int nr_pages, struct vm_area_struct *vma, >>>>> unsigned long address, rmap_t flags, enum rmap_level level) >>>>> { >>>>> ... >>>>> if (unlikely(!folio_test_anon(folio))) { >>>>> VM_WARN_ON_FOLIO(folio_test_large(folio) && >>>>> level != RMAP_LEVEL_PMD, folio); >>>>> } >>>>> ... >>>>> } >>>>> >>>>> It also improves the code’s readability. Currently, all new anonymous >>>>> folios calling folio_add_anon_rmap_ptes() are order-0. This ensures >>>>> that new folios cannot be partially exclusive; they are either entirely >>>>> exclusive or entirely shared. >>>>> >>>>> Suggested-by: David Hildenbrand <david@redhat.com> >>>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com> >>>>> Tested-by: Shuai Yuan <yuanshuai@oppo.com> >>>>> --- >>>>> mm/memory.c | 8 ++++++++ >>>>> mm/swapfile.c | 13 +++++++++++-- >>>>> 2 files changed, 19 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/mm/memory.c b/mm/memory.c >>>>> index 1f24ecdafe05..620654c13b2f 100644 >>>>> --- a/mm/memory.c >>>>> +++ b/mm/memory.c >>>>> @@ -4339,6 +4339,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) >>>>> if (unlikely(folio != swapcache && swapcache)) { >>>>> folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE); >>>>> folio_add_lru_vma(folio, vma); >>>>> + } else if (!folio_test_anon(folio)) { >>>>> + /* >>>>> + * We currently only expect small !anon folios, for which we now >>>>> + * that they are either fully exclusive or fully shared. If we >>>>> + * ever get large folios here, we have to be careful. >>>>> + */ >>>>> + VM_WARN_ON_ONCE(folio_test_large(folio)); >>>>> + folio_add_new_anon_rmap(folio, vma, address, rmap_flags); >>>>> } else { >>>>> folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, >>>>> rmap_flags); >>>>> diff --git a/mm/swapfile.c b/mm/swapfile.c >>>>> index ae1d2700f6a3..69efa1a57087 100644 >>>>> --- a/mm/swapfile.c >>>>> +++ b/mm/swapfile.c >>>>> @@ -1908,8 +1908,17 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, >>>>> VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); >>>>> if (pte_swp_exclusive(old_pte)) >>>>> rmap_flags |= RMAP_EXCLUSIVE; >>>>> - >>>>> - folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); >>>>> + /* >>>>> + * We currently only expect small !anon folios, for which we now that >>>>> + * they are either fully exclusive or fully shared. 
If we ever get >>>>> + * large folios here, we have to be careful. >>>>> + */ >>>>> + if (!folio_test_anon(folio)) { >>>>> + VM_WARN_ON_ONCE(folio_test_large(folio)); >>>> >>>> (comment applies to both cases) >>>> >>>> Thinking about Hugh's comment, we should likely add here: >>>> >>>> VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); >>>> >>>> [the check we are removing from __folio_add_anon_rmap()] >>>> >>>> and document for folio_add_new_anon_rmap() in patch #1, that when >>>> dealing with folios that might be mapped concurrently by others, the >>>> folio lock must be held. >>> >>> I assume you mean something like the following for patch#1? >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index df1a43295c85..20986b25f1b2 100644 >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -1394,7 +1394,8 @@ void folio_add_anon_rmap_pmd(struct folio >>> *folio, struct page *page, >>> * >>> * Like folio_add_anon_rmap_*() but must only be called on *new* folios. >>> * This means the inc-and-test can be bypassed. >>> - * The folio does not have to be locked. >>> + * The folio doesn't necessarily need to be locked while it's >>> exclusive unless two threads >>> + * map it concurrently. However, the folio must be locked if it's shared. >>> * >>> * If the folio is pmd-mappable, it is accounted as a THP. >>> */ >>> @@ -1406,6 +1407,7 @@ void folio_add_new_anon_rmap(struct folio >>> *folio, struct vm_area_struct *vma, >>> int nr_pmdmapped = 0; >>> >>> VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio); >>> + VM_WARN_ON_FOLIO(!exclusive && !folio_test_locked(folio), folio); >> >> For now this would likely do. I was concerned about a concurrent >> scenario in the exclusive case, but that shouldn't really happen I guess. >> > > Since this is primarily a documentation update, I'll wait for two or > three days to see if > there are any more Linux-next reports before sending v3 combining these fixes > together(I've already fixed another doc warn reported by lkp) and seek Andrew's > assistance to drop v2 and apply v3. Feel free to send fixup patches for such small stuff (for example, as reply to this mail). Usually, no need for a new series. Andrew will squash all fixups before merging it to mm-stable. -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-21 9:18 ` David Hildenbrand @ 2024-06-22 3:20 ` Barry Song 2024-06-24 23:25 ` Andrew Morton 0 siblings, 1 reply; 15+ messages in thread From: Barry Song @ 2024-06-22 3:20 UTC (permalink / raw) To: david, akpm Cc: baolin.wang, chrisl, linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao > > > > Since this is primarily a documentation update, I'll wait for two or > > three days to see if > > there are any more Linux-next reports before sending v3 combining these fixes > > together(I've already fixed another doc warn reported by lkp) and seek Andrew's > > assistance to drop v2 and apply v3. > > Feel free to send fixup patches for such small stuff (for example, as > reply to this mail). Usually, no need for a new series. Andrew will > squash all fixups before merging it to mm-stable. Hi Andrew, Can you please squash this change(another one suggested by David)? From: Barry Song <v-songbaohua@oppo.com> Date: Sat, 22 Jun 2024 15:14:53 +1200 Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Barry Song <v-songbaohua@oppo.com> --- mm/memory.c | 1 + mm/swapfile.c | 1 + 2 files changed, 2 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index 00728ea95583..982d81c83d49 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) * here, we have to be careful. */ VM_WARN_ON_ONCE(folio_test_large(folio)); + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); folio_add_new_anon_rmap(folio, vma, address, rmap_flags); } else { folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, address, diff --git a/mm/swapfile.c b/mm/swapfile.c index b99b9f397c1c..ace2440ec0b7 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1923,6 +1923,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, */ if (!folio_test_anon(folio)) { VM_WARN_ON_ONCE(folio_test_large(folio)); + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); folio_add_new_anon_rmap(folio, vma, addr, rmap_flags); } else { folio_add_anon_rmap_pte(folio, page, vma, addr, rmap_flags); -- 2.34.1 > > -- > Cheers, > > David / dhildenb > Thanks Barry ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-22 3:20 ` Barry Song @ 2024-06-24 23:25 ` Andrew Morton 2024-06-24 23:42 ` Barry Song 0 siblings, 1 reply; 15+ messages in thread From: Andrew Morton @ 2024-06-24 23:25 UTC (permalink / raw) To: Barry Song Cc: david, baolin.wang, chrisl, linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao On Sat, 22 Jun 2024 15:20:02 +1200 Barry Song <21cnbao@gmail.com> wrote: > > > > > > Since this is primarily a documentation update, I'll wait for two or > > > three days to see if > > > there are any more Linux-next reports before sending v3 combining these fixes > > > together(I've already fixed another doc warn reported by lkp) and seek Andrew's > > > assistance to drop v2 and apply v3. > > > > Feel free to send fixup patches for such small stuff (for example, as > > reply to this mail). Usually, no need for a new series. Andrew will > > squash all fixups before merging it to mm-stable. > > Hi Andrew, > > Can you please squash this change(another one suggested by David)? sure, but... > From: Barry Song <v-songbaohua@oppo.com> > Date: Sat, 22 Jun 2024 15:14:53 +1200 > Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if > folio_test_anon(folio)==false The only description we have here is "enhance doc" > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > * here, we have to be careful. > */ > VM_WARN_ON_ONCE(folio_test_large(folio)); > + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); And these aren't documentation changes. Please send along a small changelog for this patch. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false 2024-06-24 23:25 ` Andrew Morton @ 2024-06-24 23:42 ` Barry Song 0 siblings, 0 replies; 15+ messages in thread From: Barry Song @ 2024-06-24 23:42 UTC (permalink / raw) To: akpm Cc: baolin.wang, chrisl, david, linux-kernel, linux-mm, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuanshuai, yuzhao On Tue, Jun 25, 2024 at 11:25 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Sat, 22 Jun 2024 15:20:02 +1200 Barry Song <21cnbao@gmail.com> wrote: > > > > > > > > > Since this is primarily a documentation update, I'll wait for two or > > > > three days to see if > > > > there are any more Linux-next reports before sending v3 combining these fixes > > > > together(I've already fixed another doc warn reported by lkp) and seek Andrew's > > > > assistance to drop v2 and apply v3. > > > > > > Feel free to send fixup patches for such small stuff (for example, as > > > reply to this mail). Usually, no need for a new series. Andrew will > > > squash all fixups before merging it to mm-stable. > > > > Hi Andrew, > > > > Can you please squash this change(another one suggested by David)? > > sure, but... > > > From: Barry Song <v-songbaohua@oppo.com> > > Date: Sat, 22 Jun 2024 15:14:53 +1200 > > Subject: [PATCH] enhance doc- mm: use folio_add_new_anon_rmap() if > > folio_test_anon(folio)==false > > The only description we have here is "enhance doc" > > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -4346,6 +4346,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > * here, we have to be careful. > > */ > > VM_WARN_ON_ONCE(folio_test_large(folio)); > > + VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); > > And these aren't documentation changes. Please send along a small > changelog for this patch. Thanks for the suggestion. Could we have this in changelog? For new anon(!anon), there's a possibility that multiple concurrent threads might execute "if (!anon) folio_add_new_anon_rmap()" in parallel. In such cases, the threads should lock the folio before executing this sequence. We use VM_WARN_ON_FOLIO() to verify if this condition holds true. > Thanks Barry ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 3/3] mm: remove folio_test_anon(folio)==false path in __folio_add_anon_rmap()
  2024-06-17 23:11 [PATCH v2 0/3] mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap() Barry Song
  2024-06-17 23:11 ` [PATCH v2 1/3] mm: extend rmap flags arguments for folio_add_new_anon_rmap Barry Song
  2024-06-17 23:11 ` [PATCH v2 2/3] mm: use folio_add_new_anon_rmap() if folio_test_anon(folio)==false Barry Song
@ 2024-06-17 23:11 ` Barry Song
  2024-06-18  9:55   ` David Hildenbrand
  2 siblings, 1 reply; 15+ messages in thread
From: Barry Song @ 2024-06-17 23:11 UTC (permalink / raw)
  To: akpm, david, linux-mm
  Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts,
      shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed,
      yuzhao, Shuai Yuan

From: Barry Song <v-songbaohua@oppo.com>

The folio_test_anon(folio)==false cases has been relocated to
folio_add_new_anon_rmap(). Additionally, four other callers
consistently pass anonymous folios.

stack 1:
remove_migration_pmd
	-> folio_add_anon_rmap_pmd
	   -> __folio_add_anon_rmap

stack 2:
__split_huge_pmd_locked
	-> folio_add_anon_rmap_ptes
	   -> __folio_add_anon_rmap

stack 3:
remove_migration_pmd
	-> folio_add_anon_rmap_pmd
	   -> __folio_add_anon_rmap (RMAP_LEVEL_PMD)

stack 4:
try_to_merge_one_page
	-> replace_page
	   -> folio_add_anon_rmap_pte
	      -> __folio_add_anon_rmap

__folio_add_anon_rmap() only needs to handle the cases
folio_test_anon(folio)==true now.
We can remove the !folio_test_anon(folio)) path within
__folio_add_anon_rmap() now.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Shuai Yuan <yuanshuai@oppo.com>
---
 mm/rmap.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 2b19bb92eda5..ddcdda752982 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1297,23 +1297,12 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio,
 {
 	int i, nr, nr_pmdmapped = 0;
 
+	VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
+
 	nr = __folio_add_rmap(folio, page, nr_pages, level, &nr_pmdmapped);
 
-	if (unlikely(!folio_test_anon(folio))) {
-		VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
-		/*
-		 * For a PTE-mapped large folio, we only know that the single
-		 * PTE is exclusive. Further, __folio_set_anon() might not get
-		 * folio->index right when not given the address of the head
-		 * page.
-		 */
-		VM_WARN_ON_FOLIO(folio_test_large(folio) &&
-				 level != RMAP_LEVEL_PMD, folio);
-		__folio_set_anon(folio, vma, address,
-				 !!(flags & RMAP_EXCLUSIVE));
-	} else if (likely(!folio_test_ksm(folio))) {
+	if (likely(!folio_test_ksm(folio)))
 		__page_check_anon_rmap(folio, page, vma, address);
-	}
 
 	__folio_mod_stat(folio, nr, nr_pmdmapped);
 
-- 
2.34.1

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH v2 3/3] mm: remove folio_test_anon(folio)==false path in __folio_add_anon_rmap() 2024-06-17 23:11 ` [PATCH v2 3/3] mm: remove folio_test_anon(folio)==false path in __folio_add_anon_rmap() Barry Song @ 2024-06-18 9:55 ` David Hildenbrand 0 siblings, 0 replies; 15+ messages in thread From: David Hildenbrand @ 2024-06-18 9:55 UTC (permalink / raw) To: Barry Song, akpm, linux-mm Cc: baolin.wang, chrisl, linux-kernel, mhocko, ryan.roberts, shy828301, surenb, v-songbaohua, willy, ying.huang, yosryahmed, yuzhao, Shuai Yuan On 18.06.24 01:11, Barry Song wrote: > From: Barry Song <v-songbaohua@oppo.com> > > The folio_test_anon(folio)==false cases has been relocated to > folio_add_new_anon_rmap(). Additionally, four other callers > consistently pass anonymous folios. > > stack 1: > remove_migration_pmd > -> folio_add_anon_rmap_pmd > -> __folio_add_anon_rmap > > stack 2: > __split_huge_pmd_locked > -> folio_add_anon_rmap_ptes > -> __folio_add_anon_rmap > > stack 3: > remove_migration_pmd > -> folio_add_anon_rmap_pmd > -> __folio_add_anon_rmap (RMAP_LEVEL_PMD) > > stack 4: > try_to_merge_one_page > -> replace_page > -> folio_add_anon_rmap_pte > -> __folio_add_anon_rmap > > __folio_add_anon_rmap() only needs to handle the cases > folio_test_anon(folio)==true now. > We can remove the !folio_test_anon(folio)) path within > __folio_add_anon_rmap() now. > > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Barry Song <v-songbaohua@oppo.com> > Tested-by: Shuai Yuan <yuanshuai@oppo.com> > --- > mm/rmap.c | 17 +++-------------- > 1 file changed, 3 insertions(+), 14 deletions(-) > > diff --git a/mm/rmap.c b/mm/rmap.c > index 2b19bb92eda5..ddcdda752982 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -1297,23 +1297,12 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio, > { > int i, nr, nr_pmdmapped = 0; > > + VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio); > + > nr = __folio_add_rmap(folio, page, nr_pages, level, &nr_pmdmapped); > > - if (unlikely(!folio_test_anon(folio))) { > - VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio); > - /* > - * For a PTE-mapped large folio, we only know that the single > - * PTE is exclusive. Further, __folio_set_anon() might not get > - * folio->index right when not given the address of the head > - * page. > - */ > - VM_WARN_ON_FOLIO(folio_test_large(folio) && > - level != RMAP_LEVEL_PMD, folio); > - __folio_set_anon(folio, vma, address, > - !!(flags & RMAP_EXCLUSIVE)); > - } else if (likely(!folio_test_ksm(folio))) { > + if (likely(!folio_test_ksm(folio))) > __page_check_anon_rmap(folio, page, vma, address); > - } > > __folio_mod_stat(folio, nr, nr_pmdmapped); > Lovely! Acked-by: David Hildenbrand <david@redhat.com> -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread