* [PATCH 0/4] mm: mremap: fix move page tables
From: Kefeng Wang @ 2023-07-31 7:48 UTC
To: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm
Cc: linux-arm-kernel, linux-kernel, Kefeng Wang
The first three patches use the correct TLB flush functions when moving
page tables, and patch 4 is a small optimization for hugepages on arm64.

Kefeng Wang (4):
  mm: hugetlb: use flush_hugetlb_tlb_range() in
    move_hugetlb_page_tables()
  mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
  mm: mremap: use flush_pud_tlb_range in move_normal_pud()
  arm64: tlb: set huge page size to stride for hugepage

 arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
 mm/hugetlb.c                      |  4 ++--
 mm/mremap.c                       |  4 ++--
 3 files changed, 15 insertions(+), 14 deletions(-)
--
2.41.0

* [PATCH 1/4] mm: hugetlb: use flush_hugetlb_tlb_range() in move_hugetlb_page_tables()
From: Kefeng Wang @ 2023-07-31 7:48 UTC
To: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm
Cc: linux-arm-kernel, linux-kernel, Kefeng Wang

Archs may need to do special things when flushing hugepage tlb,
so use the more applicable flush_hugetlb_tlb_range() instead of
flush_tlb_range().

Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/hugetlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 64a3239b6407..ac876bfba340 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5281,9 +5281,9 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	}
 
 	if (shared_pmd)
-		flush_tlb_range(vma, range.start, range.end);
+		flush_hugetlb_tlb_range(vma, range.start, range.end);
 	else
-		flush_tlb_range(vma, old_end - len, old_end);
+		flush_hugetlb_tlb_range(vma, old_end - len, old_end);
 	mmu_notifier_invalidate_range_end(&range);
 	i_mmap_unlock_write(mapping);
 	hugetlb_vma_unlock_write(vma);
-- 
2.41.0
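
(Context for the fix above: flush_hugetlb_tlb_range() falls back to the
plain range flush unless an architecture overrides it, so this change is
a no-op except on archs that do, such as powerpc. The generic fallback
in mm/hugetlb.c is roughly the following; sketched from memory, so check
the tree for the exact form:)

    #ifndef flush_hugetlb_tlb_range
    #define flush_hugetlb_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
    #endif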

* Re: [PATCH 1/4] mm: hugetlb: use flush_hugetlb_tlb_range() in move_hugetlb_page_tables()
From: Mike Kravetz @ 2023-07-31 23:40 UTC
To: Kefeng Wang
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On 07/31/23 15:48, Kefeng Wang wrote:
> Archs may need to do special things when flushing hugepage tlb,
> so use the more applicable flush_hugetlb_tlb_range() instead of
> flush_tlb_range().
>
> Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Thanks!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Although, I missed this in 550a7d60bd5e :(

Looks like only powerpc provides an arch specific flush_hugetlb_tlb_range
today.
-- 
Mike Kravetz

> ---
>  mm/hugetlb.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 64a3239b6407..ac876bfba340 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5281,9 +5281,9 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
>  	}
>
>  	if (shared_pmd)
> -		flush_tlb_range(vma, range.start, range.end);
> +		flush_hugetlb_tlb_range(vma, range.start, range.end);
>  	else
> -		flush_tlb_range(vma, old_end - len, old_end);
> +		flush_hugetlb_tlb_range(vma, old_end - len, old_end);
>  	mmu_notifier_invalidate_range_end(&range);
>  	i_mmap_unlock_write(mapping);
>  	hugetlb_vma_unlock_write(vma);
> -- 
> 2.41.0
>

* Re: [PATCH 1/4] mm: hugetlb: use flush_hugetlb_tlb_range() in move_hugetlb_page_tables()
From: Mina Almasry @ 2023-08-03 0:26 UTC
To: Mike Kravetz, James Houghton
Cc: Kefeng Wang, Andrew Morton, Catalin Marinas, Will Deacon,
    Muchun Song, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 4:40 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 07/31/23 15:48, Kefeng Wang wrote:
> > Archs may need to do special things when flushing hugepage tlb,
> > so use the more applicable flush_hugetlb_tlb_range() instead of
> > flush_tlb_range().
> >
> > Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>
> Thanks!
>
> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
>

Sorry for jumping in late, but given the concerns raised around HGM and
the deviation between hugetlb and the rest of MM, does it make sense to
try to make an incremental effort towards avoiding hugetlb
specialization?

In the context of this patch, I would prefer that the arch upgrade
flush_tlb_range() to handle hugetlb correctly, instead of adding more
hugetlb specific deviations, ala flush_hugetlb_tlb_range. While it's at
it, maybe replace flush_hugetlb_tlb_range() in the code with
flush_tlb_range().

Although, I don't have the expertise to judge if upgrading
flush_tlb_range() to handle hugetlb is easy or feasible at all.

> Although, I missed this in 550a7d60bd5e :(
>
> Looks like only powerpc provides an arch specific flush_hugetlb_tlb_range
> today.
> -- 
> Mike Kravetz
>
> > ---
> >  mm/hugetlb.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 64a3239b6407..ac876bfba340 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -5281,9 +5281,9 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
> >  	}
> >
> >  	if (shared_pmd)
> > -		flush_tlb_range(vma, range.start, range.end);
> > +		flush_hugetlb_tlb_range(vma, range.start, range.end);
> >  	else
> > -		flush_tlb_range(vma, old_end - len, old_end);
> > +		flush_hugetlb_tlb_range(vma, old_end - len, old_end);
> >  	mmu_notifier_invalidate_range_end(&range);
> >  	i_mmap_unlock_write(mapping);
> >  	hugetlb_vma_unlock_write(vma);
> > -- 
> > 2.41.0
> >

-- 
Thanks,
Mina

* Re: [PATCH 1/4] mm: hugetlb: use flush_hugetlb_tlb_range() in move_hugetlb_page_tables()
From: Muchun Song @ 2023-08-01 2:06 UTC
To: Kefeng Wang
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

> On Jul 31, 2023, at 15:48, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> Archs may need to do special things when flushing hugepage tlb,
> so use the more applicable flush_hugetlb_tlb_range() instead of
> flush_tlb_range().
>
> Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Acked-by: Muchun Song <songmuchun@bytedance.com>

* [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
From: Kefeng Wang @ 2023-07-31 7:48 UTC
To: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm
Cc: linux-arm-kernel, linux-kernel, Kefeng Wang

Archs may need to do special things when flushing thp tlb,
so use the more applicable flush_pmd_tlb_range() instead of
flush_tlb_range().

Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/mremap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 11e06e4ab33b..1883205fa22b 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -284,7 +284,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 	VM_BUG_ON(!pmd_none(*new_pmd));
 
 	pmd_populate(mm, new_pmd, pmd_pgtable(pmd));
-	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
+	flush_pmd_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
 	spin_unlock(old_ptl);
-- 
2.41.0

* Re: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
From: Catalin Marinas @ 2023-07-31 11:05 UTC
To: Kefeng Wang
Cc: Andrew Morton, Will Deacon, Mike Kravetz, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 03:48:27PM +0800, Kefeng Wang wrote:
> Archs may need to do special things when flushing thp tlb,
> so use the more applicable flush_pmd_tlb_range() instead of
> flush_tlb_range().
>
> Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  mm/mremap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 11e06e4ab33b..1883205fa22b 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -284,7 +284,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
>  	VM_BUG_ON(!pmd_none(*new_pmd));
>
>  	pmd_populate(mm, new_pmd, pmd_pgtable(pmd));
> -	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
> +	flush_pmd_tlb_range(vma, old_addr, old_addr + PMD_SIZE);

I don't think that's correct for arm64. The assumption in the
flush_p*d_tlb_range() was that they are called only for block mappings
at that p*d level (and we use FEAT_TTL on arm64 indicating that the leaf
level is level 2 for pmd, 1 for pud). IIUC move_normal_pmd() is only
called for table pmds which would have a leaf level of 3 (the pte).

Same for the next patch doing the equivalent for the pud.

-- 
Catalin
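
(To make the point above concrete: on arm64 the p*d range flushes encode
a block-sized stride plus a TTL hint telling the hardware at which level
the leaf entry lives. The definitions in arch/arm64/include/asm/pgtable.h
are roughly as below, sketched from memory. Used on a moved *table* pmd,
they would issue a single invalidation with a "leaf at level 2" hint,
which can miss the 512 level-3 PTE entries the table actually maps:)

    /* Only valid when the entry is a block mapping at that level. */
    #define flush_pmd_tlb_range(vma, addr, end)	\
    	__flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2)
    #define flush_pud_tlb_range(vma, addr, end)	\
    	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)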

* Re: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
From: Kefeng Wang @ 2023-07-31 11:20 UTC
To: Catalin Marinas
Cc: Andrew Morton, Will Deacon, Mike Kravetz, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On 2023/7/31 19:05, Catalin Marinas wrote:
> On Mon, Jul 31, 2023 at 03:48:27PM +0800, Kefeng Wang wrote:
>> Archs may need to do special things when flushing thp tlb,
>> so use the more applicable flush_pmd_tlb_range() instead of
>> flush_tlb_range().
>>
>> Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>  mm/mremap.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/mremap.c b/mm/mremap.c
>> index 11e06e4ab33b..1883205fa22b 100644
>> --- a/mm/mremap.c
>> +++ b/mm/mremap.c
>> @@ -284,7 +284,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
>>  	VM_BUG_ON(!pmd_none(*new_pmd));
>>
>>  	pmd_populate(mm, new_pmd, pmd_pgtable(pmd));
>> -	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
>> +	flush_pmd_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
>
> I don't think that's correct for arm64. The assumption in the
> flush_p*d_tlb_range() was that they are called only for block mappings
> at that p*d level (and we use FEAT_TTL on arm64 indicating that the leaf
> level is level 2 for pmd, 1 for pud). IIUC move_normal_pmd() is only
> called for table pmds which would have a leaf level of 3 (the pte).

oops, yes, this is for the NORMAL_PMD case, not HPAGE_PMD, please ignore
patches 2/3.

> Same for the next patch doing the equivalent for the pud.

* Re: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
From: kernel test robot @ 2023-07-31 13:58 UTC
To: Kefeng Wang, Andrew Morton, Catalin Marinas, Will Deacon,
    Mike Kravetz, Muchun Song, Mina Almasry, kirill, joel,
    william.kucharski, kaleshsingh
Cc: oe-kbuild-all, Linux Memory Management List, linux-arm-kernel,
    linux-kernel, Kefeng Wang

Hi Kefeng,

kernel test robot noticed the following build errors:

[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on arm-perf/for-next/perf linus/master v6.5-rc4 next-20230731]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kefeng-Wang/mm-hugetlb-use-flush_hugetlb_tlb_range-in-move_hugetlb_page_tables/20230731-154016
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20230731074829.79309-3-wangkefeng.wang%40huawei.com
patch subject: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230731/202307312137.ormxuS5g-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20230731/202307312137.ormxuS5g-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new
version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202307312137.ormxuS5g-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from <command-line>:
   In function 'move_normal_pmd',
       inlined from 'move_pgt_entry' at mm/mremap.c:463:11,
       inlined from 'move_page_tables' at mm/mremap.c:565:8:
>> include/linux/compiler_types.h:397:45: error: call to '__compiletime_assert_338' declared with attribute error: BUILD_BUG failed
     397 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |                                             ^
   include/linux/compiler_types.h:378:25: note: in definition of macro '__compiletime_assert'
     378 |                 prefix ## suffix();                             \
         |                         ^~~~~~
   include/linux/compiler_types.h:397:9: note: in expansion of macro '_compiletime_assert'
     397 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ^~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG'
      59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
         |                     ^~~~~~~~~~~~~~~~
   include/linux/pgtable.h:1415:49: note: in expansion of macro 'BUILD_BUG'
    1415 | #define flush_pmd_tlb_range(vma, addr, end)     BUILD_BUG()
         |                                                 ^~~~~~~~~
   mm/mremap.c:287:9: note: in expansion of macro 'flush_pmd_tlb_range'
     287 |         flush_pmd_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
         |         ^~~~~~~~~~~~~~~~~~~

vim +/__compiletime_assert_338 +397 include/linux/compiler_types.h

eb5c2d4b45e3d2 Will Deacon 2020-07-21  383
eb5c2d4b45e3d2 Will Deacon 2020-07-21  384  #define _compiletime_assert(condition, msg, prefix, suffix) \
eb5c2d4b45e3d2 Will Deacon 2020-07-21  385  	__compiletime_assert(condition, msg, prefix, suffix)
eb5c2d4b45e3d2 Will Deacon 2020-07-21  386
eb5c2d4b45e3d2 Will Deacon 2020-07-21  387  /**
eb5c2d4b45e3d2 Will Deacon 2020-07-21  388   * compiletime_assert - break build and emit msg if condition is false
eb5c2d4b45e3d2 Will Deacon 2020-07-21  389   * @condition: a compile-time constant condition to check
eb5c2d4b45e3d2 Will Deacon 2020-07-21  390   * @msg:       a message to emit if condition is false
eb5c2d4b45e3d2 Will Deacon 2020-07-21  391   *
eb5c2d4b45e3d2 Will Deacon 2020-07-21  392   * In tradition of POSIX assert, this macro will break the build if the
eb5c2d4b45e3d2 Will Deacon 2020-07-21  393   * supplied condition is *false*, emitting the supplied error message if the
eb5c2d4b45e3d2 Will Deacon 2020-07-21  394   * compiler has support to do so.
eb5c2d4b45e3d2 Will Deacon 2020-07-21  395   */
eb5c2d4b45e3d2 Will Deacon 2020-07-21  396  #define compiletime_assert(condition, msg) \
eb5c2d4b45e3d2 Will Deacon 2020-07-21 @397  	_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
eb5c2d4b45e3d2 Will Deacon 2020-07-21  398

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
From: kernel test robot @ 2023-07-31 21:43 UTC
To: Kefeng Wang, Andrew Morton, Catalin Marinas, Will Deacon,
    Mike Kravetz, Muchun Song, Mina Almasry, kirill, joel,
    william.kucharski, kaleshsingh
Cc: llvm, oe-kbuild-all, Linux Memory Management List,
    linux-arm-kernel, linux-kernel, Kefeng Wang

Hi Kefeng,

kernel test robot noticed the following build errors:

[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on arm-perf/for-next/perf linus/master v6.5-rc4]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kefeng-Wang/mm-hugetlb-use-flush_hugetlb_tlb_range-in-move_hugetlb_page_tables/20230731-154016
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20230731074829.79309-3-wangkefeng.wang%40huawei.com
patch subject: [PATCH 2/4] mm: mremap: use flush_pmd_tlb_range() in move_normal_pmd()
config: x86_64-randconfig-x003-20230731 (https://download.01.org/0day-ci/archive/20230801/202308010553.KxefZFdO-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce: (https://download.01.org/0day-ci/archive/20230801/202308010553.KxefZFdO-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new
version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308010553.KxefZFdO-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: call to __compiletime_assert_860 marked "dontcall-error": BUILD_BUG failed

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
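
(Why these configs break: the generic flush_pmd/pud_tlb_range() helpers
only exist as real flushes when transparent hugepage support is built
in; otherwise include/linux/pgtable.h deliberately turns them into
compile-time errors, which is what both robot reports above hit. The
generic definitions are roughly the following sketch; verify against the
tree, but the BUILD_BUG() line is visible in the gcc report's expansion
trace:)

    #ifndef __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
    #ifdef CONFIG_TRANSPARENT_HUGEPAGE
    #define flush_pmd_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
    #define flush_pud_tlb_range(vma, addr, end)	flush_tlb_range(vma, addr, end)
    #else
    #define flush_pmd_tlb_range(vma, addr, end)	BUILD_BUG()
    #define flush_pud_tlb_range(vma, addr, end)	BUILD_BUG()
    #endif
    #endif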

* [PATCH 3/4] mm: mremap: use flush_pud_tlb_range in move_normal_pud()
From: Kefeng Wang @ 2023-07-31 7:48 UTC
To: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm
Cc: linux-arm-kernel, linux-kernel, Kefeng Wang

Archs may need to do special things when flushing thp tlb,
so use the more applicable flush_pud_tlb_range() instead of
flush_tlb_range().

Fixes: c49dd3401802 ("mm: speedup mremap on 1GB or larger regions")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/mremap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 1883205fa22b..25114e56901f 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -333,7 +333,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr,
 	VM_BUG_ON(!pud_none(*new_pud));
 
 	pud_populate(mm, new_pud, pud_pgtable(pud));
-	flush_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
+	flush_pud_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
 	spin_unlock(old_ptl);
-- 
2.41.0

* Re: [PATCH 3/4] mm: mremap: use flush_pud_tlb_range in move_normal_pud()
From: kernel test robot @ 2023-07-31 16:42 UTC
To: Kefeng Wang, Andrew Morton, Catalin Marinas, Will Deacon,
    Mike Kravetz, Muchun Song, Mina Almasry, kirill, joel,
    william.kucharski, kaleshsingh
Cc: oe-kbuild-all, Linux Memory Management List, linux-arm-kernel,
    linux-kernel, Kefeng Wang

Hi Kefeng,

kernel test robot noticed the following build errors:

[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on arm-perf/for-next/perf linus/master v6.5-rc4 next-20230731]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kefeng-Wang/mm-hugetlb-use-flush_hugetlb_tlb_range-in-move_hugetlb_page_tables/20230731-154016
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link:    https://lore.kernel.org/r/20230731074829.79309-4-wangkefeng.wang%40huawei.com
patch subject: [PATCH 3/4] mm: mremap: use flush_pud_tlb_range in move_normal_pud()
config: riscv-allmodconfig (https://download.01.org/0day-ci/archive/20230801/202308010022.uY01vAew-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230801/202308010022.uY01vAew-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new
version of the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308010022.uY01vAew-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/mremap.c: In function 'move_normal_pud':
>> mm/mremap.c:336:9: error: implicit declaration of function 'flush_pud_tlb_range'; did you mean 'flush_pmd_tlb_range'? [-Werror=implicit-function-declaration]
     336 |         flush_pud_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
         |         ^~~~~~~~~~~~~~~~~~~
         |         flush_pmd_tlb_range
   cc1: some warnings being treated as errors


vim +336 mm/mremap.c

   302	
   303	#if CONFIG_PGTABLE_LEVELS > 2 && defined(CONFIG_HAVE_MOVE_PUD)
   304	static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr,
   305			  unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
   306	{
   307		spinlock_t *old_ptl, *new_ptl;
   308		struct mm_struct *mm = vma->vm_mm;
   309		pud_t pud;
   310	
   311		if (!arch_supports_page_table_move())
   312			return false;
   313		/*
   314		 * The destination pud shouldn't be established, free_pgtables()
   315		 * should have released it.
   316		 */
   317		if (WARN_ON_ONCE(!pud_none(*new_pud)))
   318			return false;
   319	
   320		/*
   321		 * We don't have to worry about the ordering of src and dst
   322		 * ptlocks because exclusive mmap_lock prevents deadlock.
   323		 */
   324		old_ptl = pud_lock(vma->vm_mm, old_pud);
   325		new_ptl = pud_lockptr(mm, new_pud);
   326		if (new_ptl != old_ptl)
   327			spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
   328	
   329		/* Clear the pud */
   330		pud = *old_pud;
   331		pud_clear(old_pud);
   332	
   333		VM_BUG_ON(!pud_none(*new_pud));
   334	
   335		pud_populate(mm, new_pud, pud_pgtable(pud));
 > 336		flush_pud_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
   337		if (new_ptl != old_ptl)
   338			spin_unlock(new_ptl);
   339		spin_unlock(old_ptl);
   340	
   341		return true;
   342	}
   343	#else
   344	static inline bool move_normal_pud(struct vm_area_struct *vma,
   345			unsigned long old_addr, unsigned long new_addr, pud_t *old_pud,
   346			pud_t *new_pud)
   347	{
   348		return false;
   349	}
   350	#endif

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Kefeng Wang @ 2023-07-31 7:48 UTC
To: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm
Cc: linux-arm-kernel, linux-kernel, Kefeng Wang

It is better to use huge_page_size() for hugepage(HugeTLB) instead of
PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
it could reduce the loop in __flush_tlb_range().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..25e35e6f8093 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ish);
 }
 
-static inline void flush_tlb_range(struct vm_area_struct *vma,
-				   unsigned long start, unsigned long end)
-{
-	/*
-	 * We cannot use leaf-only invalidation here, since we may be invalidating
-	 * table entries as part of collapsing hugepages or moving page tables.
-	 * Set the tlb_level to 0 because we can not get enough information here.
-	 */
-	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
-}
+/*
+ * We cannot use leaf-only invalidation here, since we may be invalidating
+ * table entries as part of collapsing hugepages or moving page tables.
+ * Set the tlb_level to 0 because we can not get enough information here.
+ */
+#define flush_tlb_range(vma, start, end)			\
+	__flush_tlb_range(vma, start, end,			\
+			  ((vma)->vm_flags & VM_HUGETLB)	\
+			  ? huge_page_size(hstate_vma(vma))	\
+			  : PAGE_SIZE, false, 0)
+
 
 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-- 
2.41.0
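
(Rough arithmetic behind "reduce the loop", assuming a 4K base page size
and no range-based TLBI: __flush_tlb_range() steps through the range one
stride at a time, so invalidating a single 2M hugetlb page with a
PAGE_SIZE stride issues 2M / 4K = 512 TLBI operations, versus one with a
huge_page_size() stride; for a 1G (pud-level) hugetlb page the count
drops from 262144 to 1.)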

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Barry Song @ 2023-07-31 8:33 UTC
To: Kefeng Wang
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> it could reduce the loop in __flush_tlb_range().
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 412a3b9a3c25..25e35e6f8093 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>         dsb(ish);
>  }
>
> -static inline void flush_tlb_range(struct vm_area_struct *vma,
> -                                  unsigned long start, unsigned long end)
> -{
> -       /*
> -        * We cannot use leaf-only invalidation here, since we may be invalidating
> -        * table entries as part of collapsing hugepages or moving page tables.
> -        * Set the tlb_level to 0 because we can not get enough information here.
> -        */
> -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> -}
> +/*
> + * We cannot use leaf-only invalidation here, since we may be invalidating
> + * table entries as part of collapsing hugepages or moving page tables.
> + * Set the tlb_level to 0 because we can not get enough information here.
> + */
> +#define flush_tlb_range(vma, start, end)                       \
> +       __flush_tlb_range(vma, start, end,                      \
> +                         ((vma)->vm_flags & VM_HUGETLB)        \
> +                         ? huge_page_size(hstate_vma(vma))     \
> +                         : PAGE_SIZE, false, 0)
> +

seems like a good idea.

I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
we are going to support stride for other large folios as well, such as thp.

>
>  static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
>  {
> --
> 2.41.0
>

Thanks
Barry

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Barry Song @ 2023-07-31 8:43 UTC
To: Kefeng Wang
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >
> > It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> > PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> > it could reduce the loop in __flush_tlb_range().
> >
> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> > ---
> >  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
> >  1 file changed, 11 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index 412a3b9a3c25..25e35e6f8093 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >         dsb(ish);
> >  }
> >
> > -static inline void flush_tlb_range(struct vm_area_struct *vma,
> > -                                  unsigned long start, unsigned long end)
> > -{
> > -       /*
> > -        * We cannot use leaf-only invalidation here, since we may be invalidating
> > -        * table entries as part of collapsing hugepages or moving page tables.
> > -        * Set the tlb_level to 0 because we can not get enough information here.
> > -        */
> > -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> > -}
> > +/*
> > + * We cannot use leaf-only invalidation here, since we may be invalidating
> > + * table entries as part of collapsing hugepages or moving page tables.
> > + * Set the tlb_level to 0 because we can not get enough information here.
> > + */
> > +#define flush_tlb_range(vma, start, end)                       \
> > +       __flush_tlb_range(vma, start, end,                      \
> > +                         ((vma)->vm_flags & VM_HUGETLB)        \
> > +                         ? huge_page_size(hstate_vma(vma))     \
> > +                         : PAGE_SIZE, false, 0)
> > +
>
> seems like a good idea.
>
> I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
> we are going to support stride for other large folios as well, such as thp.
>

BTW, in most cases we have already had right stride:

arch/arm64/include/asm/tlb.h has already this to get stride:

static inline void tlb_flush(struct mmu_gather *tlb)
{
	struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
	bool last_level = !tlb->freed_tables;
	unsigned long stride = tlb_get_unmap_size(tlb);
	int tlb_level = tlb_get_level(tlb);

	/*
	 * If we're tearing down the address space then we only care about
	 * invalidating the walk-cache, since the ASID allocator won't
	 * reallocate our ASID without invalidating the entire TLB.
	 */
	if (tlb->fullmm) {
		if (!last_level)
			flush_tlb_mm(tlb->mm);
		return;
	}

	__flush_tlb_range(&vma, tlb->start, tlb->end, stride,
			  last_level, tlb_level);
}

> >
> >  static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> >  {
> > --
> > 2.41.0
> >
>
> Thanks
> Barry
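
(For reference, the stride in tlb_flush() above comes from the
mmu_gather: with CONFIG_MMU_GATHER_PAGE_SIZE, the unmap paths record the
size of the entries they tear down and tlb_get_unmap_size() hands it
back. The recording helper in include/asm-generic/tlb.h looks roughly
like the sketch below; a range mixing entry sizes forces a flush before
the recorded size changes:)

    static inline void tlb_change_page_size(struct mmu_gather *tlb,
    					unsigned int page_size)
    {
    #ifdef CONFIG_MMU_GATHER_PAGE_SIZE
    	/* Flush what we have gathered so far before the size changes. */
    	if (tlb->page_size && tlb->page_size != page_size) {
    		if (!tlb->fullmm && !tlb->need_flush_all)
    			tlb_flush_mmu(tlb);
    	}
    	tlb->page_size = page_size;
    #endif
    }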

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Kefeng Wang @ 2023-07-31 9:28 UTC
To: Barry Song
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm, linux-arm-kernel, linux-kernel

On 2023/7/31 16:43, Barry Song wrote:
> On Mon, Jul 31, 2023 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
>>
>> On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>
>>> It is better to use huge_page_size() for hugepage(HugeTLB) instead of
>>> PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
>>> it could reduce the loop in __flush_tlb_range().
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
>>>  1 file changed, 11 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>>> index 412a3b9a3c25..25e35e6f8093 100644
>>> --- a/arch/arm64/include/asm/tlbflush.h
>>> +++ b/arch/arm64/include/asm/tlbflush.h
>>> @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>>>         dsb(ish);
>>>  }
>>>
>>> -static inline void flush_tlb_range(struct vm_area_struct *vma,
>>> -                                  unsigned long start, unsigned long end)
>>> -{
>>> -       /*
>>> -        * We cannot use leaf-only invalidation here, since we may be invalidating
>>> -        * table entries as part of collapsing hugepages or moving page tables.
>>> -        * Set the tlb_level to 0 because we can not get enough information here.
>>> -        */
>>> -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>>> -}
>>> +/*
>>> + * We cannot use leaf-only invalidation here, since we may be invalidating
>>> + * table entries as part of collapsing hugepages or moving page tables.
>>> + * Set the tlb_level to 0 because we can not get enough information here.
>>> + */
>>> +#define flush_tlb_range(vma, start, end)                       \
>>> +       __flush_tlb_range(vma, start, end,                      \
>>> +                         ((vma)->vm_flags & VM_HUGETLB)        \
>>> +                         ? huge_page_size(hstate_vma(vma))     \
>>> +                         : PAGE_SIZE, false, 0)
>>> +
>>
>> seems like a good idea.
>>
>> I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
>> we are going to support stride for other large folios as well, such as thp.
>>
>
> BTW, in most cases we have already had right stride:
>
> arch/arm64/include/asm/tlb.h has already this to get stride:

MMU_GATHER_PAGE_SIZE works for tlb_flush(), but flush_tlb_range() is
called directly without an mmu_gather; the above 3 patches are about
using the correct flush_[hugetlb/pmd/pud]_tlb_range() (there are also
some other places, like get_clear_contig_flush()/clear_flush() on
arm64), so enabling MMU_GATHER_PAGE_SIZE for arm64 is an independent
thing, right?

> static inline void tlb_flush(struct mmu_gather *tlb)
> {
> 	struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
> 	bool last_level = !tlb->freed_tables;
> 	unsigned long stride = tlb_get_unmap_size(tlb);
> 	int tlb_level = tlb_get_level(tlb);
>
> 	/*
> 	 * If we're tearing down the address space then we only care about
> 	 * invalidating the walk-cache, since the ASID allocator won't
> 	 * reallocate our ASID without invalidating the entire TLB.
> 	 */
> 	if (tlb->fullmm) {
> 		if (!last_level)
> 			flush_tlb_mm(tlb->mm);
> 		return;
> 	}
>
> 	__flush_tlb_range(&vma, tlb->start, tlb->end, stride,
> 			  last_level, tlb_level);
> }
>
>>>
>>>  static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
>>>  {
>>> --
>>> 2.41.0
>>>
>>
>> Thanks
>> Barry

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Barry Song @ 2023-07-31 10:21 UTC
To: Kefeng Wang
Cc: Andrew Morton, Catalin Marinas, Will Deacon, Mike Kravetz,
    Muchun Song, Mina Almasry, kirill, joel, william.kucharski,
    kaleshsingh, linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 5:29 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> On 2023/7/31 16:43, Barry Song wrote:
> > On Mon, Jul 31, 2023 at 4:33 PM Barry Song <21cnbao@gmail.com> wrote:
> >>
> >> On Mon, Jul 31, 2023 at 4:14 PM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >>>
> >>> It is better to use huge_page_size() for hugepage(HugeTLB) instead of
> >>> PAGE_SIZE for stride, which has been done in flush_pmd/pud_tlb_range(),
> >>> it could reduce the loop in __flush_tlb_range().
> >>>
> >>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> >>> ---
> >>>  arch/arm64/include/asm/tlbflush.h | 21 +++++++++++----------
> >>>  1 file changed, 11 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> >>> index 412a3b9a3c25..25e35e6f8093 100644
> >>> --- a/arch/arm64/include/asm/tlbflush.h
> >>> +++ b/arch/arm64/include/asm/tlbflush.h
> >>> @@ -360,16 +360,17 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >>>         dsb(ish);
> >>>  }
> >>>
> >>> -static inline void flush_tlb_range(struct vm_area_struct *vma,
> >>> -                                  unsigned long start, unsigned long end)
> >>> -{
> >>> -       /*
> >>> -        * We cannot use leaf-only invalidation here, since we may be invalidating
> >>> -        * table entries as part of collapsing hugepages or moving page tables.
> >>> -        * Set the tlb_level to 0 because we can not get enough information here.
> >>> -        */
> >>> -       __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> >>> -}
> >>> +/*
> >>> + * We cannot use leaf-only invalidation here, since we may be invalidating
> >>> + * table entries as part of collapsing hugepages or moving page tables.
> >>> + * Set the tlb_level to 0 because we can not get enough information here.
> >>> + */
> >>> +#define flush_tlb_range(vma, start, end)                       \
> >>> +       __flush_tlb_range(vma, start, end,                      \
> >>> +                         ((vma)->vm_flags & VM_HUGETLB)        \
> >>> +                         ? huge_page_size(hstate_vma(vma))     \
> >>> +                         : PAGE_SIZE, false, 0)
> >>> +
> >>
> >> seems like a good idea.
> >>
> >> I wonder if a better implementation will be MMU_GATHER_PAGE_SIZE, in this case,
> >> we are going to support stride for other large folios as well, such as thp.
> >>
> >
> > BTW, in most cases we have already had right stride:
> >
> > arch/arm64/include/asm/tlb.h has already this to get stride:
>
> MMU_GATHER_PAGE_SIZE works for tlb_flush(), but flush_tlb_range() is
> called directly without an mmu_gather; the above 3 patches are about
> using the correct flush_[hugetlb/pmd/pud]_tlb_range() (there are also
> some other places, like get_clear_contig_flush()/clear_flush() on
> arm64), so enabling MMU_GATHER_PAGE_SIZE for arm64 is an independent
> thing, right?
>

You are right. I was thinking of those zap_pte/pmd_range cases,
especially for those vmas where large folios engage, but it is not very
relevant: in that case, one vma might have mixed folio sizes.

Your patch, for sure, will benefit hugetlb with arm64 contiguous bits.

Thanks
Barry

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Catalin Marinas @ 2023-07-31 11:11 UTC
To: Kefeng Wang
Cc: Andrew Morton, Will Deacon, Mike Kravetz, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
> +/*
> + * We cannot use leaf-only invalidation here, since we may be invalidating
> + * table entries as part of collapsing hugepages or moving page tables.
> + * Set the tlb_level to 0 because we can not get enough information here.
> + */
> +#define flush_tlb_range(vma, start, end)			\
> +	__flush_tlb_range(vma, start, end,			\
> +			  ((vma)->vm_flags & VM_HUGETLB)	\
> +			  ? huge_page_size(hstate_vma(vma))	\
> +			  : PAGE_SIZE, false, 0)

This won't work if we use the contiguous PTE to get 64K hugetlb pages on
a 4K base page configuration. The 16 base pages in the range would have
to be invalidated individually (the contig PTE bit is just a hint, the
hardware may or may not take it into account).

-- 
Catalin

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Kefeng Wang @ 2023-07-31 11:27 UTC
To: Catalin Marinas
Cc: Andrew Morton, Will Deacon, Mike Kravetz, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On 2023/7/31 19:11, Catalin Marinas wrote:
> On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
>> +/*
>> + * We cannot use leaf-only invalidation here, since we may be invalidating
>> + * table entries as part of collapsing hugepages or moving page tables.
>> + * Set the tlb_level to 0 because we can not get enough information here.
>> + */
>> +#define flush_tlb_range(vma, start, end)			\
>> +	__flush_tlb_range(vma, start, end,			\
>> +			  ((vma)->vm_flags & VM_HUGETLB)	\
>> +			  ? huge_page_size(hstate_vma(vma))	\
>> +			  : PAGE_SIZE, false, 0)
>
> This won't work if we use the contiguous PTE to get 64K hugetlb pages on
> a 4K base page configuration. The 16 base pages in the range would have
> to be invalidated individually (the contig PTE bit is just a hint, the
> hardware may or may not take it into account).

Got it, the contig huge page depends on the hardware implementation,
but for a normal hugepage (2M/1G), we could use this, right?

* Re: [PATCH 4/4] arm64: tlb: set huge page size to stride for hugepage
From: Catalin Marinas @ 2023-07-31 13:18 UTC
To: Kefeng Wang
Cc: Andrew Morton, Will Deacon, Mike Kravetz, Muchun Song,
    Mina Almasry, kirill, joel, william.kucharski, kaleshsingh,
    linux-mm, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 07:27:14PM +0800, Kefeng Wang wrote:
> On 2023/7/31 19:11, Catalin Marinas wrote:
> > On Mon, Jul 31, 2023 at 03:48:29PM +0800, Kefeng Wang wrote:
> > > +/*
> > > + * We cannot use leaf-only invalidation here, since we may be invalidating
> > > + * table entries as part of collapsing hugepages or moving page tables.
> > > + * Set the tlb_level to 0 because we can not get enough information here.
> > > + */
> > > +#define flush_tlb_range(vma, start, end)			\
> > > +	__flush_tlb_range(vma, start, end,			\
> > > +			  ((vma)->vm_flags & VM_HUGETLB)	\
> > > +			  ? huge_page_size(hstate_vma(vma))	\
> > > +			  : PAGE_SIZE, false, 0)
> >
> > This won't work if we use the contiguous PTE to get 64K hugetlb pages on
> > a 4K base page configuration. The 16 base pages in the range would have
> > to be invalidated individually (the contig PTE bit is just a hint, the
> > hardware may or may not take it into account).
>
> Got it, the contig huge page depends on the hardware implementation,
> but for a normal hugepage (2M/1G), we could use this, right?

Right. Only the pmd/pud cases.

-- 
Catalin
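
(Putting the two constraints together: the wider stride is only safe
when the hugetlb page is a block mapping at pmd/pud level, not one built
from contiguous-bit PTEs/PMDs. A minimal sketch of a guarded variant is
below; the helpers are real kernel ones, but this exact macro is an
illustration of the constraint discussed above, not the merged fix:)

    /*
     * Sketch: widen the stride only for block-sized hugetlb pages.
     * Contiguous-bit sizes (e.g. 64K built from 16 x 4K cont-PTEs)
     * still need per-base-page invalidation, since the contig bit
     * is only a hint to the hardware.
     */
    #define flush_tlb_range(vma, start, end)				\
    	do {								\
    		unsigned long __stride = PAGE_SIZE;			\
    		if (is_vm_hugetlb_page(vma)) {				\
    			unsigned long __hp = huge_page_size(hstate_vma(vma)); \
    			if (__hp == PMD_SIZE || __hp == PUD_SIZE)	\
    				__stride = __hp;			\
    		}							\
    		__flush_tlb_range(vma, start, end, __stride, false, 0);	\
    	} while (0)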