* [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
@ 2024-06-03  3:31 Lance Yang
  2024-06-03  3:36 ` Matthew Wilcox
  2024-06-03  4:14 ` Barry Song
  0 siblings, 2 replies; 7+ messages in thread
From: Lance Yang @ 2024-06-03  3:31 UTC (permalink / raw)
  To: akpm
  Cc: ryan.roberts, david, 21cnbao, baolin.wang, ziy, fengwei.yin,
      ying.huang, libang.li, linux-mm, linux-kernel, Lance Yang

Let's make folio_mlock_step() simply a wrapper around folio_pte_batch(),
which will greatly reduce the cost of ptep_get() when scanning a range of
contptes.

Signed-off-by: Lance Yang <ioworker0@gmail.com>
---
 mm/mlock.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 30b51cdea89d..1ae6232d38cf 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
 static inline unsigned int folio_mlock_step(struct folio *folio,
                pte_t *pte, unsigned long addr, unsigned long end)
 {
-       unsigned int count, i, nr = folio_nr_pages(folio);
-       unsigned long pfn = folio_pfn(folio);
-       pte_t ptent = ptep_get(pte);
-
-       if (!folio_test_large(folio))
+       if (likely(!folio_test_large(folio)))
                return 1;

-       count = pfn + nr - pte_pfn(ptent);
-       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
-
-       for (i = 0; i < count; i++, pte++) {
-               pte_t entry = ptep_get(pte);
-
-               if (!pte_present(entry))
-                       break;
-               if (pte_pfn(entry) - pfn >= nr)
-                       break;
-       }
+       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
+       int max_nr = (end - addr) / PAGE_SIZE;
+       pte_t ptent = ptep_get(pte);

-       return i;
+       return folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, NULL,
+                       NULL, NULL);
 }

 static inline bool allow_mlock_munlock(struct folio *folio,
--
2.33.1

^ permalink raw reply	[flat|nested] 7+ messages in thread
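For context when reading the patch: folio_pte_batch() is the existing batching
helper in mm core code that the new wrapper leans on. The sketch below only
illustrates the interface implied by the call site above; the parameter names
and the comment are assumptions inferred from that call and from how the return
value is used, not a copy of the in-tree declaration, which may differ between
kernel versions.

/*
 * Assumed shape of the helper, as implied by the call in the patch:
 * starting at @start_ptep (whose current value is @pte), count how many
 * consecutive present PTEs still map pages of @folio, up to @max_nr,
 * ignoring the PTE bits named by @flags.  The optional bool pointers
 * (passed as NULL in the patch) would report whether any PTE in the
 * batch was writable, young or dirty.  The return value is the batch
 * length, which folio_mlock_step() then uses as its step.
 */
int folio_pte_batch(struct folio *folio, unsigned long addr,
                pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
                bool *any_writable, bool *any_young, bool *any_dirty);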
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  3:31 [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch() Lance Yang
@ 2024-06-03  3:36 ` Matthew Wilcox
  2024-06-03  4:13   ` Lance Yang
  2024-06-03  4:14 ` Barry Song
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2024-06-03  3:36 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, ryan.roberts, david, 21cnbao, baolin.wang, ziy, fengwei.yin,
      ying.huang, libang.li, linux-mm, linux-kernel

On Mon, Jun 03, 2024 at 11:31:17AM +0800, Lance Yang wrote:
>  {
> -       unsigned int count, i, nr = folio_nr_pages(folio);
> -       unsigned long pfn = folio_pfn(folio);
> -       pte_t ptent = ptep_get(pte);

Please don't move type declarations later in the function.  Just because
you can doesn't mean you should.

> -       if (!folio_test_large(folio))
> +       if (likely(!folio_test_large(folio)))
>                 return 1;

How likely is this now?  How likely will it be in two years time?
Does this actually make any difference in either code generation or
performance?

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  3:36 ` Matthew Wilcox
@ 2024-06-03  4:13   ` Lance Yang
  0 siblings, 0 replies; 7+ messages in thread
From: Lance Yang @ 2024-06-03  4:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, ryan.roberts, david, 21cnbao, baolin.wang, ziy, fengwei.yin,
      ying.huang, libang.li, linux-mm, linux-kernel

Hi Matthew,

Thanks for taking the time to review!

On Mon, Jun 3, 2024 at 11:36 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Jun 03, 2024 at 11:31:17AM +0800, Lance Yang wrote:
> >  {
> > -       unsigned int count, i, nr = folio_nr_pages(folio);
> > -       unsigned long pfn = folio_pfn(folio);
> > -       pte_t ptent = ptep_get(pte);
>
> Please don't move type declarations later in the function.  Just because
> you can doesn't mean you should.

Thanks for pointing this out; I'll adjust as you suggested.

> > -       if (!folio_test_large(folio))
> > +       if (likely(!folio_test_large(folio)))
> >                 return 1;
>
> How likely is this now?  How likely will it be in two years time?
> Does this actually make any difference in either code generation or
> performance?

IMO, this hint could impact code generation and performance :)
But it seems that 'likely' is not necessary here. I'll remove it.

Thanks again for your time!
Lance

>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  3:31 [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch() Lance Yang
  2024-06-03  3:36 ` Matthew Wilcox
@ 2024-06-03  4:14 ` Barry Song
  2024-06-03  4:27   ` Lance Yang
  2024-06-03  8:58   ` Baolin Wang
  1 sibling, 2 replies; 7+ messages in thread
From: Barry Song @ 2024-06-03  4:14 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, ryan.roberts, david, baolin.wang, ziy, fengwei.yin,
      ying.huang, libang.li, linux-mm, linux-kernel

On Mon, Jun 3, 2024 at 3:31 PM Lance Yang <ioworker0@gmail.com> wrote:
>
> Let's make folio_mlock_step() simply a wrapper around folio_pte_batch(),
> which will greatly reduce the cost of ptep_get() when scanning a range of
> contptes.
>
> Signed-off-by: Lance Yang <ioworker0@gmail.com>
> ---
>  mm/mlock.c | 23 ++++++-----------------
>  1 file changed, 6 insertions(+), 17 deletions(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 30b51cdea89d..1ae6232d38cf 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>  static inline unsigned int folio_mlock_step(struct folio *folio,
>                 pte_t *pte, unsigned long addr, unsigned long end)
>  {
> -       unsigned int count, i, nr = folio_nr_pages(folio);
> -       unsigned long pfn = folio_pfn(folio);
> -       pte_t ptent = ptep_get(pte);
> -
> -       if (!folio_test_large(folio))
> +       if (likely(!folio_test_large(folio)))
>                 return 1;
>
> -       count = pfn + nr - pte_pfn(ptent);
> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
> -
> -       for (i = 0; i < count; i++, pte++) {
> -               pte_t entry = ptep_get(pte);
> -
> -               if (!pte_present(entry))
> -                       break;
> -               if (pte_pfn(entry) - pfn >= nr)
> -                       break;
> -       }
> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
> +       int max_nr = (end - addr) / PAGE_SIZE;
> +       pte_t ptent = ptep_get(pte);
>
> -       return i;
> +       return folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, NULL,
> +                       NULL, NULL);
>  }

what about a minimum change as below?

index 30b51cdea89d..e8b98f84fbd2 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
 static inline unsigned int folio_mlock_step(struct folio *folio,
                pte_t *pte, unsigned long addr, unsigned long end)
 {
-       unsigned int count, i, nr = folio_nr_pages(folio);
-       unsigned long pfn = folio_pfn(folio);
+       unsigned int count = (end - addr) >> PAGE_SHIFT;
        pte_t ptent = ptep_get(pte);
+       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;

        if (!folio_test_large(folio))
                return 1;

-       count = pfn + nr - pte_pfn(ptent);
-       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
-
-       for (i = 0; i < count; i++, pte++) {
-               pte_t entry = ptep_get(pte);
-
-               if (!pte_present(entry))
-                       break;
-               if (pte_pfn(entry) - pfn >= nr)
-                       break;
-       }
-
-       return i;
+       return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
+                       NULL, NULL);
 }

>
>  static inline bool allow_mlock_munlock(struct folio *folio,
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  4:14 ` Barry Song
@ 2024-06-03  4:27   ` Lance Yang
  0 siblings, 0 replies; 7+ messages in thread
From: Lance Yang @ 2024-06-03  4:27 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, ryan.roberts, david, baolin.wang, ziy, fengwei.yin,
      ying.huang, libang.li, linux-mm, linux-kernel

Hi Barry,

Thanks for taking the time to review!

On Mon, Jun 3, 2024 at 12:14 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Mon, Jun 3, 2024 at 3:31 PM Lance Yang <ioworker0@gmail.com> wrote:
> >
> > Let's make folio_mlock_step() simply a wrapper around folio_pte_batch(),
> > which will greatly reduce the cost of ptep_get() when scanning a range of
> > contptes.
> >
> > Signed-off-by: Lance Yang <ioworker0@gmail.com>
> > ---
> >  mm/mlock.c | 23 ++++++-----------------
> >  1 file changed, 6 insertions(+), 17 deletions(-)
> >
> > diff --git a/mm/mlock.c b/mm/mlock.c
> > index 30b51cdea89d..1ae6232d38cf 100644
> > --- a/mm/mlock.c
> > +++ b/mm/mlock.c
> > @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
> >  static inline unsigned int folio_mlock_step(struct folio *folio,
> >                 pte_t *pte, unsigned long addr, unsigned long end)
> >  {
> > -       unsigned int count, i, nr = folio_nr_pages(folio);
> > -       unsigned long pfn = folio_pfn(folio);
> > -       pte_t ptent = ptep_get(pte);
> > -
> > -       if (!folio_test_large(folio))
> > +       if (likely(!folio_test_large(folio)))
> >                 return 1;
> >
> > -       count = pfn + nr - pte_pfn(ptent);
> > -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
> > -
> > -       for (i = 0; i < count; i++, pte++) {
> > -               pte_t entry = ptep_get(pte);
> > -
> > -               if (!pte_present(entry))
> > -                       break;
> > -               if (pte_pfn(entry) - pfn >= nr)
> > -                       break;
> > -       }
> > +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
> > +       int max_nr = (end - addr) / PAGE_SIZE;
> > +       pte_t ptent = ptep_get(pte);
> >
> > -       return i;
> > +       return folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, NULL,
> > +                       NULL, NULL);
> >  }
>
> what about a minimum change as below?

Nice, that makes sense to me ;)
I'll adjust as you suggested.

Thanks again for your time!
Lance

> index 30b51cdea89d..e8b98f84fbd2 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>  static inline unsigned int folio_mlock_step(struct folio *folio,
>                 pte_t *pte, unsigned long addr, unsigned long end)
>  {
> -       unsigned int count, i, nr = folio_nr_pages(folio);
> -       unsigned long pfn = folio_pfn(folio);
> +       unsigned int count = (end - addr) >> PAGE_SHIFT;
>         pte_t ptent = ptep_get(pte);
> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>
>         if (!folio_test_large(folio))
>                 return 1;
>
> -       count = pfn + nr - pte_pfn(ptent);
> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
> -
> -       for (i = 0; i < count; i++, pte++) {
> -               pte_t entry = ptep_get(pte);
> -
> -               if (!pte_present(entry))
> -                       break;
> -               if (pte_pfn(entry) - pfn >= nr)
> -                       break;
> -       }
> -
> -       return i;
> +       return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
> +                       NULL, NULL);
> }
>
>
> >
> >  static inline bool allow_mlock_munlock(struct folio *folio,
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  4:14 ` Barry Song
  2024-06-03  4:27   ` Lance Yang
@ 2024-06-03  8:58   ` Baolin Wang
  2024-06-03  9:03     ` David Hildenbrand
  1 sibling, 1 reply; 7+ messages in thread
From: Baolin Wang @ 2024-06-03  8:58 UTC (permalink / raw)
  To: Barry Song, Lance Yang
  Cc: akpm, ryan.roberts, david, ziy, fengwei.yin, ying.huang,
      libang.li, linux-mm, linux-kernel

On 2024/6/3 12:14, Barry Song wrote:
> On Mon, Jun 3, 2024 at 3:31 PM Lance Yang <ioworker0@gmail.com> wrote:
>>
>> Let's make folio_mlock_step() simply a wrapper around folio_pte_batch(),
>> which will greatly reduce the cost of ptep_get() when scanning a range of
>> contptes.
>>
>> Signed-off-by: Lance Yang <ioworker0@gmail.com>
>> ---
>>  mm/mlock.c | 23 ++++++-----------------
>>  1 file changed, 6 insertions(+), 17 deletions(-)
>>
>> diff --git a/mm/mlock.c b/mm/mlock.c
>> index 30b51cdea89d..1ae6232d38cf 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>>  static inline unsigned int folio_mlock_step(struct folio *folio,
>>                 pte_t *pte, unsigned long addr, unsigned long end)
>>  {
>> -       unsigned int count, i, nr = folio_nr_pages(folio);
>> -       unsigned long pfn = folio_pfn(folio);
>> -       pte_t ptent = ptep_get(pte);
>> -
>> -       if (!folio_test_large(folio))
>> +       if (likely(!folio_test_large(folio)))
>>                 return 1;
>>
>> -       count = pfn + nr - pte_pfn(ptent);
>> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
>> -
>> -       for (i = 0; i < count; i++, pte++) {
>> -               pte_t entry = ptep_get(pte);
>> -
>> -               if (!pte_present(entry))
>> -                       break;
>> -               if (pte_pfn(entry) - pfn >= nr)
>> -                       break;
>> -       }
>> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>> +       int max_nr = (end - addr) / PAGE_SIZE;
>> +       pte_t ptent = ptep_get(pte);
>>
>> -       return i;
>> +       return folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, NULL,
>> +                       NULL, NULL);
>>  }
>
> what about a minimum change as below?
> index 30b51cdea89d..e8b98f84fbd2 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>  static inline unsigned int folio_mlock_step(struct folio *folio,
>                 pte_t *pte, unsigned long addr, unsigned long end)
>  {
> -       unsigned int count, i, nr = folio_nr_pages(folio);
> -       unsigned long pfn = folio_pfn(folio);
> +       unsigned int count = (end - addr) >> PAGE_SHIFT;
>         pte_t ptent = ptep_get(pte);
> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>
>         if (!folio_test_large(folio))
>                 return 1;
>
> -       count = pfn + nr - pte_pfn(ptent);
> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
> -
> -       for (i = 0; i < count; i++, pte++) {
> -               pte_t entry = ptep_get(pte);
> -
> -               if (!pte_present(entry))
> -                       break;
> -               if (pte_pfn(entry) - pfn >= nr)
> -                       break;
> -       }
> -
> -       return i;
> +       return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
> +                       NULL, NULL);
> }

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch()
  2024-06-03  8:58   ` Baolin Wang
@ 2024-06-03  9:03     ` David Hildenbrand
  0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2024-06-03  9:03 UTC (permalink / raw)
  To: Baolin Wang, Barry Song, Lance Yang
  Cc: akpm, ryan.roberts, ziy, fengwei.yin, ying.huang, libang.li,
      linux-mm, linux-kernel

On 03.06.24 10:58, Baolin Wang wrote:
>
>
> On 2024/6/3 12:14, Barry Song wrote:
>> On Mon, Jun 3, 2024 at 3:31 PM Lance Yang <ioworker0@gmail.com> wrote:
>>>
>>> Let's make folio_mlock_step() simply a wrapper around folio_pte_batch(),
>>> which will greatly reduce the cost of ptep_get() when scanning a range of
>>> contptes.
>>>
>>> Signed-off-by: Lance Yang <ioworker0@gmail.com>
>>> ---
>>>  mm/mlock.c | 23 ++++++-----------------
>>>  1 file changed, 6 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/mm/mlock.c b/mm/mlock.c
>>> index 30b51cdea89d..1ae6232d38cf 100644
>>> --- a/mm/mlock.c
>>> +++ b/mm/mlock.c
>>> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>>>  static inline unsigned int folio_mlock_step(struct folio *folio,
>>>                 pte_t *pte, unsigned long addr, unsigned long end)
>>>  {
>>> -       unsigned int count, i, nr = folio_nr_pages(folio);
>>> -       unsigned long pfn = folio_pfn(folio);
>>> -       pte_t ptent = ptep_get(pte);
>>> -
>>> -       if (!folio_test_large(folio))
>>> +       if (likely(!folio_test_large(folio)))
>>>                 return 1;
>>>
>>> -       count = pfn + nr - pte_pfn(ptent);
>>> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
>>> -
>>> -       for (i = 0; i < count; i++, pte++) {
>>> -               pte_t entry = ptep_get(pte);
>>> -
>>> -               if (!pte_present(entry))
>>> -                       break;
>>> -               if (pte_pfn(entry) - pfn >= nr)
>>> -                       break;
>>> -       }
>>> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>>> +       int max_nr = (end - addr) / PAGE_SIZE;
>>> +       pte_t ptent = ptep_get(pte);
>>>
>>> -       return i;
>>> +       return folio_pte_batch(folio, addr, pte, ptent, max_nr, fpb_flags, NULL,
>>> +                       NULL, NULL);
>>>  }
>>
>> what about a minimum change as below?
>> index 30b51cdea89d..e8b98f84fbd2 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -307,26 +307,15 @@ void munlock_folio(struct folio *folio)
>>  static inline unsigned int folio_mlock_step(struct folio *folio,
>>                 pte_t *pte, unsigned long addr, unsigned long end)
>>  {
>> -       unsigned int count, i, nr = folio_nr_pages(folio);
>> -       unsigned long pfn = folio_pfn(folio);
>> +       unsigned int count = (end - addr) >> PAGE_SHIFT;
>>         pte_t ptent = ptep_get(pte);
>> +       const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
>>
>>         if (!folio_test_large(folio))
>>                 return 1;
>>
>> -       count = pfn + nr - pte_pfn(ptent);
>> -       count = min_t(unsigned int, count, (end - addr) >> PAGE_SHIFT);
>> -
>> -       for (i = 0; i < count; i++, pte++) {
>> -               pte_t entry = ptep_get(pte);
>> -
>> -               if (!pte_present(entry))
>> -                       break;
>> -               if (pte_pfn(entry) - pfn >= nr)
>> -                       break;
>> -       }
>> -
>> -       return i;
>> +       return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
>> +                       NULL, NULL);
>> }
>
> LGTM.
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>

Acked-by: David Hildenbrand <david@redhat.com>

--
Cheers,

David / dhildenb

^ permalink raw reply	[flat|nested] 7+ messages in thread
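Putting the review feedback together (keep the declarations at the top of the
function, drop the likely() hint, and take Barry's minimal-change form), a
follow-up version of the helper would plausibly look like the sketch below.
It is assembled from the diffs quoted in this thread for illustration only;
it is not necessarily the exact version that was eventually picked up, and
the inline comments are added here purely for explanation.

static inline unsigned int folio_mlock_step(struct folio *folio,
                pte_t *pte, unsigned long addr, unsigned long end)
{
        const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
        /* Cap the batch at the number of PTEs left in this range. */
        unsigned int count = (end - addr) >> PAGE_SHIFT;
        pte_t ptent = ptep_get(pte);

        /* Small folios always advance by a single PTE. */
        if (!folio_test_large(folio))
                return 1;

        /*
         * For a large folio, let folio_pte_batch() count the consecutive
         * PTEs mapping it instead of issuing one ptep_get() per PTE.
         */
        return folio_pte_batch(folio, addr, pte, ptent, count, fpb_flags, NULL,
                               NULL, NULL);
}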
end of thread, other threads:[~2024-06-03  9:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-03  3:31 [PATCH 1/1] mm/mlock: implement folio_mlock_step() using folio_pte_batch() Lance Yang
2024-06-03  3:36 ` Matthew Wilcox
2024-06-03  4:13   ` Lance Yang
2024-06-03  4:14 ` Barry Song
2024-06-03  4:27   ` Lance Yang
2024-06-03  8:58   ` Baolin Wang
2024-06-03  9:03     ` David Hildenbrand