On 30/05/25 2:53 pm, lizhe.67@bytedance.com wrote: > From: Li Zhe > > In the current implementation of the longterm pin_user_pages() function, > we invoke the collect_longterm_unpinnable_folios() function. This function > iterates through the list to check whether each folio belongs to the > "longterm_unpinnabled" category. The folios in this list essentially > correspond to a contiguous region of user-space addresses, with each folio > representing a physical address in increments of PAGESIZE. If this > user-space address range is mapped with large folio, we can optimize the > performance of function pin_user_pages() by reducing the number of if-else > branches and the frequency of memory accesses using READ_ONCE. This patch > leverages this approach to achieve performance improvements. > > The performance test results obtained through the gup_test tool from the > kernel source tree are as follows. We achieve an improvement of over 75% > for large folio with pagesize=2M. For normal page, we have only observed > a very slight degradation in performance. Thanks for the patch! I have no idea on GUP but in my limited understanding the patch looks fine, let's wait for more comments. [----] > } > > +static struct folio *pofs_next_folio(struct folio *folio, > + struct pages_or_folios *pofs, long *index_ptr) > +{ > + long i = *index_ptr + 1; > + unsigned long nr_pages = folio_nr_pages(folio); > + > + if (!pofs->has_folios) > + while ((i < pofs->nr_entries) && > + /* Is this page part of this folio? */ > + (folio_page_idx(folio, pofs->pages[i]) < nr_pages)) > + i++; > + > + if (unlikely(i == pofs->nr_entries)) > + return NULL; > + *index_ptr = i; > + > + return pofs_get_folio(pofs, i); > +} > + > /* > * Returns the number of collected folios. Return value is always >= 0. > */ > @@ -2324,16 +2343,12 @@ static void collect_longterm_unpinnable_folios( > struct list_head *movable_folio_list, > struct pages_or_folios *pofs) > { > - struct folio *prev_folio = NULL; > bool drain_allow = true; > - unsigned long i; > - > - for (i = 0; i < pofs->nr_entries; i++) { > - struct folio *folio = pofs_get_folio(pofs, i); > + long i = 0; Why not unsigned long? > + struct folio *folio; > > - if (folio == prev_folio) > - continue; > - prev_folio = folio; > + for (folio = pofs_get_folio(pofs, 0); folio; > + folio = pofs_next_folio(folio, pofs, &i)) { > > if (folio_is_longterm_pinnable(folio)) > continue;