From: lizhe.67@bytedance.com
To: david@redhat.com
Cc: akpm@linux-foundation.org, alex.williamson@redhat.com,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, lizhe.67@bytedance.com, peterx@redhat.com
Subject: Re: [PATCH v4 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio
Date: Wed, 18 Jun 2025 17:39:06 +0800 [thread overview]
Message-ID: <20250618093906.23048-1-lizhe.67@bytedance.com> (raw)
In-Reply-To: <58f9d004-918d-4f66-85e7-d090db0af5de@redhat.com>
On Wed, 18 Jun 2025 10:54:38 +0200, david@redhat.com wrote:
> On 18.06.25 08:11, lizhe.67@bytedance.com wrote:
> > On Tue, 17 Jun 2025 15:47:09 +0200, david@redhat.com wrote:
> >
> >>> How do you think of this implementation?
> >>>
> >>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>> index 242b05671502..eb91f99ea973 100644
> >>> --- a/include/linux/mm.h
> >>> +++ b/include/linux/mm.h
> >>> @@ -2165,6 +2165,23 @@ static inline long folio_nr_pages(const struct folio *folio)
> >>> return folio_large_nr_pages(folio);
> >>> }
> >>>
> >>> +/*
> >>> + * folio_remaining_pages - Counts the number of pages from a given
> >>> + * start page to the end of the folio.
> >>> + *
> >>> + * @folio: Pointer to folio
> >>> + * @start_page: The starting page from which to begin counting.
> >>> + *
> >>> + * Returned number includes the provided start page.
> >>> + *
> >>> + * The caller must ensure that @start_page belongs to @folio.
> >>> + */
> >>> +static inline unsigned long folio_remaining_pages(struct folio *folio,
> >>> + struct page *start_page)
> >>> +{
> >>> + return folio_nr_pages(folio) - folio_page_idx(folio, start_page);
> >>> +}
> >>> +
> >>> /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
> >>> #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
> >>> #define MAX_FOLIO_NR_PAGES (1UL << PUD_ORDER)
> >>> diff --git a/mm/gup.c b/mm/gup.c
> >>> index 15debead5f5b..14ae2e3088b4 100644
> >>> --- a/mm/gup.c
> >>> +++ b/mm/gup.c
> >>> @@ -242,7 +242,7 @@ static inline struct folio *gup_folio_range_next(struct page *start,
> >>>
> >>> if (folio_test_large(folio))
> >>> nr = min_t(unsigned int, npages - i,
> >>> - folio_nr_pages(folio) - folio_page_idx(folio, next));
> >>> + folio_remaining_pages(folio, next));
> >>>
> >>> *ntails = nr;
> >>> return folio;
> >>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> >>> index b2fc5266e3d2..34e85258060c 100644
> >>> --- a/mm/page_isolation.c
> >>> +++ b/mm/page_isolation.c
> >>> @@ -96,7 +96,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
> >>> return page;
> >>> }
> >>>
> >>> - skip_pages = folio_nr_pages(folio) - folio_page_idx(folio, page);
> >>> + skip_pages = folio_remaining_pages(folio, page);
> >>> pfn += skip_pages - 1;
> >>> continue;
> >>> }
> >>> ---
> >>
> >> Guess I would have pulled the "min" in there, but passing something like
> >> ULONG_MAX for the page_isolation case also looks rather ugly.
> >
> > Yes, the page_isolation case does not require the 'min' logic. Since
> > there are already places in the current kernel code where
> > folio_remaining_pages() is used without needing min, we could simply
> > create a custom wrapper function based on folio_remaining_pages() only
> > in those specific scenarios where min is necessary.
>
> I would just do something like that, removing gup_folio_range_next() completely.
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 98a606908307b..6c224b1c6c169 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1995,6 +1995,32 @@ static inline long folio_nr_pages(const struct folio *folio)
> return folio_large_nr_pages(folio);
> }
>
> +/**
> + * folio_remaining_pages - The remaining number of pages in the folio.
> + * @folio: The folio.
> + * @page: The folio page to start the counting.
> + * @max_npages: Limit how far to count.
> + *
> + * The returned number will include the provided page.
> + *
> + * The caller must ensure that @page belongs to @folio. When setting
> + * @max_npages to ULONG_MAX, the parameter is effectively ignored.
> + *
> + * Return: The remaining number of pages, limited by @max_npages.
> + */
> +static inline unsigned long folio_remaining_pages(struct folio *folio,
> + struct page *page, unsigned long max_npages)
> +{
> + unsigned long nr;
> +
> + if (!folio_test_large(folio))
> + return 1;
> + nr = folio_large_nr_pages(folio) - folio_page_idx(folio, page);
> + if (__builtin_constant_p(max_npages) && max_npages == ULONG_MAX)
> + return nr;
> + return min_t(unsigned long, nr, max_npages);
> +}
> +
> /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
> #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
> #define MAX_FOLIO_NR_PAGES (1UL << PUD_ORDER)
> diff --git a/mm/gup.c b/mm/gup.c
> index 6888e871a74a9..3385428d028f6 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -234,21 +234,6 @@ void folio_add_pin(struct folio *folio)
> }
> }
>
> -static inline struct folio *gup_folio_range_next(struct page *start,
> - unsigned long npages, unsigned long i, unsigned int *ntails)
> -{
> - struct page *next = nth_page(start, i);
> - struct folio *folio = page_folio(next);
> - unsigned int nr = 1;
> -
> - if (folio_test_large(folio))
> - nr = min_t(unsigned int, npages - i,
> - folio_nr_pages(folio) - folio_page_idx(folio, next));
> -
> - *ntails = nr;
> - return folio;
> -}
> -
> static inline struct folio *gup_folio_next(struct page **list,
> unsigned long npages, unsigned long i, unsigned int *ntails)
> {
> @@ -356,11 +341,13 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
> bool make_dirty)
> {
> unsigned long i;
> - struct folio *folio;
> unsigned int nr;
>
> for (i = 0; i < npages; i += nr) {
> - folio = gup_folio_range_next(page, npages, i, &nr);
> + struct page *next = nth_page(page, i);
> + struct folio *folio = page_folio(next);
> +
> + nr = folio_remaining_pages(folio, next, npages - i);
> if (make_dirty && !folio_test_dirty(folio)) {
> folio_lock(folio);
> folio_mark_dirty(folio);
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index ece3bfc56bcd5..6f75defb38d73 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -96,7 +96,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
> return page;
> }
>
> - skip_pages = folio_nr_pages(folio) - folio_page_idx(folio, page);
> + skip_pages = folio_remaining_pages(folio, page, ULONG_MAX);
> pfn += skip_pages - 1;
> continue;
> }
> --
>
>
> But up to you, anything with such a helper function makes the code easier to digest.
Thank you. This implementation looks better. Please allow me to
integrate this patch into my patchset.
Thanks,
Zhe
prev parent reply other threads:[~2025-06-18 9:39 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-17 4:18 [PATCH v4 0/3] " lizhe.67
2025-06-17 4:18 ` [PATCH v4 1/3] vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote() lizhe.67
2025-06-17 4:18 ` [PATCH v4 2/3] gup: introduce unpin_user_folio_dirty_locked() lizhe.67
2025-06-17 7:35 ` David Hildenbrand
2025-06-17 13:42 ` Jason Gunthorpe
2025-06-17 13:45 ` David Hildenbrand
2025-06-17 13:58 ` David Hildenbrand
2025-06-17 14:04 ` David Hildenbrand
2025-06-17 15:22 ` Jason Gunthorpe
2025-06-18 6:28 ` lizhe.67
2025-06-18 8:20 ` David Hildenbrand
2025-06-18 11:36 ` Jason Gunthorpe
2025-06-18 11:40 ` David Hildenbrand
2025-06-18 11:42 ` David Hildenbrand
2025-06-18 11:46 ` Jason Gunthorpe
2025-06-18 11:52 ` David Hildenbrand
2025-06-18 11:56 ` Jason Gunthorpe
2025-06-18 12:19 ` lizhe.67
2025-06-18 13:23 ` Jason Gunthorpe
2025-06-19 9:05 ` lizhe.67
2025-06-19 12:35 ` Jason Gunthorpe
2025-06-19 12:49 ` lizhe.67
2025-06-17 4:18 ` [PATCH v4 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
2025-06-17 7:43 ` David Hildenbrand
2025-06-17 9:21 ` [PATCH v4 2/3] gup: introduce unpin_user_folio_dirty_locked() lizhe.67
2025-06-17 9:27 ` David Hildenbrand
2025-06-17 9:47 ` [PATCH v4 3/3] vfio/type1: optimize vfio_unpin_pages_remote() for large folio lizhe.67
2025-06-17 9:49 ` David Hildenbrand
2025-06-17 12:42 ` lizhe.67
2025-06-17 13:47 ` David Hildenbrand
2025-06-18 6:11 ` lizhe.67
2025-06-18 7:22 ` lizhe.67
2025-06-18 8:54 ` David Hildenbrand
2025-06-18 9:39 ` lizhe.67 [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250618093906.23048-1-lizhe.67@bytedance.com \
--to=lizhe.67@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=alex.williamson@redhat.com \
--cc=david@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=peterx@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox