linux-mm.kvack.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	John Hubbard <jhubbard@nvidia.com>,
	Mike Rapoport <rppt@kernel.org>,
	peterx@redhat.com, Hugh Dickins <hughd@google.com>,
	David Hildenbrand <david@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@surriel.com>,
	James Houghton <jthoughton@google.com>,
	Yang Shi <shy828301@gmail.com>, Jason Gunthorpe <jgg@nvidia.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH RFC 12/12] mm/gup: Merge hugetlb into generic mm code
Date: Wed, 15 Nov 2023 20:29:08 -0500	[thread overview]
Message-ID: <20231116012908.392077-13-peterx@redhat.com> (raw)
In-Reply-To: <20231116012908.392077-1-peterx@redhat.com>

Now follow_page() is ready to handle hugetlb pages in whatever form, on
all architectures.  Switch to the generic code path.

Time to retire hugetlb_follow_page_mask(), following the previous
retirement of follow_hugetlb_page() in 4849807114b8.

There may be a slight difference in how the loops run when processing GUP
over a large hugetlb range on either ARM64 (e.g. CONT_PMD) or RISC-V
(mostly its Svnapot extension on 64K huge pages): with the patch applied,
each iteration of __get_user_pages() resolves one pgtable entry, rather
than relying on the size of the hugetlb hstate, which may cover multiple
entries in one iteration.
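
For context, this is (simplified) how __get_user_pages() consumes the mask
that follow_page_mask() reports via ctx.page_mask: the number of pages
advanced per iteration is bounded by that mask, which after this patch
describes a single pgtable entry rather than a whole hstate-sized hugepage.
A simplified excerpt of the loop in mm/gup.c:

	/*
	 * Simplified excerpt of the __get_user_pages() loop.  page_increm
	 * is derived from ctx.page_mask, so a mask covering only one PMD
	 * (instead of e.g. a whole CONT_PMD hstate) means more iterations
	 * over the same hugetlb range.
	 */
	page = follow_page_mask(vma, start, foll_flags, &ctx);
	...
	page_increm = 1 + (~(start >> PAGE_SHIFT) & ctx.page_mask);
	if (page_increm > nr_pages)
		page_increm = nr_pages;
	i += page_increm;
	start += page_increm * PAGE_SIZE;
	nr_pages -= page_increm;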

However, the performance difference should hopefully not be a major
concern, considering that GUP only recently got 57edfcfd3419 ("mm/gup:
accelerate thp gup even for "pages != NULL""), and that this observation
is a side note rather than the result of a performance analysis.  If
performance does become a concern, we can consider handling CONT_PTE in
follow_page(), for example.

Until that is justified as necessary, keep everything clean and simple.
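
For reference only, a purely hypothetical sketch (not part of this patch)
of what such CONT_PTE handling could look like; pte_cont() and CONT_PTES
are arm64-specific names used here just for illustration, and a generic
arch hook would be needed:

	/*
	 * Hypothetical only: if the pte-level walk could detect a
	 * contiguous-PTE mapping, it could widen the reported mask so
	 * that __get_user_pages() skips the whole contiguous range in
	 * a single iteration.
	 */
	if (pte_cont(pte))
		ctx->page_mask = CONT_PTES - 1;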

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/hugetlb.h |  7 ----
 mm/gup.c                | 15 +++------
 mm/hugetlb.c            | 71 -----------------------------------------
 3 files changed, 5 insertions(+), 88 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index bb07279b8991..87630a185acf 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -332,13 +332,6 @@ static inline void hugetlb_zap_end(
 {
 }
 
-static inline struct page *hugetlb_follow_page_mask(
-    struct vm_area_struct *vma, unsigned long address, unsigned int flags,
-    unsigned int *page_mask)
-{
-	BUILD_BUG(); /* should never be compiled in if !CONFIG_HUGETLB_PAGE*/
-}
-
 static inline int copy_hugetlb_page_range(struct mm_struct *dst,
 					  struct mm_struct *src,
 					  struct vm_area_struct *dst_vma,
diff --git a/mm/gup.c b/mm/gup.c
index e635278f65f9..23fcac5aa3db 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -830,18 +830,11 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 {
 	pgd_t *pgd, pgdval;
 	struct mm_struct *mm = vma->vm_mm;
+	struct page *page;
 
-	ctx->page_mask = 0;
-
-	/*
-	 * Call hugetlb_follow_page_mask for hugetlb vmas as it will use
-	 * special hugetlb page table walking code.  This eliminates the
-	 * need to check for hugetlb entries in the general walking code.
-	 */
-	if (is_vm_hugetlb_page(vma))
-		return hugetlb_follow_page_mask(vma, address, flags,
-						&ctx->page_mask);
+	vma_pgtable_walk_begin(vma);
 
+	ctx->page_mask = 0;
 	pgd = pgd_offset(mm, address);
 	pgdval = *pgd;
 
@@ -853,6 +846,8 @@ static struct page *follow_page_mask(struct vm_area_struct *vma,
 	else
 		page = follow_p4d_mask(vma, address, pgd, flags, ctx);
 
+	vma_pgtable_walk_end(vma);
+
 	return page;
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 29705e5c6f40..3013122a739f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6783,77 +6783,6 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 }
 #endif /* CONFIG_USERFAULTFD */
 
-struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
-				      unsigned long address, unsigned int flags,
-				      unsigned int *page_mask)
-{
-	struct hstate *h = hstate_vma(vma);
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long haddr = address & huge_page_mask(h);
-	struct page *page = NULL;
-	spinlock_t *ptl;
-	pte_t *pte, entry;
-	int ret;
-
-	hugetlb_vma_lock_read(vma);
-	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
-	if (!pte)
-		goto out_unlock;
-
-	ptl = huge_pte_lock(h, mm, pte);
-	entry = huge_ptep_get(pte);
-	if (pte_present(entry)) {
-		page = pte_page(entry);
-
-		if (!huge_pte_write(entry)) {
-			if (flags & FOLL_WRITE) {
-				page = NULL;
-				goto out;
-			}
-
-			if (gup_must_unshare(vma, flags, page)) {
-				/* Tell the caller to do unsharing */
-				page = ERR_PTR(-EMLINK);
-				goto out;
-			}
-		}
-
-		page = nth_page(page, ((address & ~huge_page_mask(h)) >> PAGE_SHIFT));
-
-		/*
-		 * Note that page may be a sub-page, and with vmemmap
-		 * optimizations the page struct may be read only.
-		 * try_grab_page() will increase the ref count on the
-		 * head page, so this will be OK.
-		 *
-		 * try_grab_page() should always be able to get the page here,
-		 * because we hold the ptl lock and have verified pte_present().
-		 */
-		ret = try_grab_page(page, flags);
-
-		if (WARN_ON_ONCE(ret)) {
-			page = ERR_PTR(ret);
-			goto out;
-		}
-
-		*page_mask = (1U << huge_page_order(h)) - 1;
-	}
-out:
-	spin_unlock(ptl);
-out_unlock:
-	hugetlb_vma_unlock_read(vma);
-
-	/*
-	 * Fixup retval for dump requests: if pagecache doesn't exist,
-	 * don't try to allocate a new page but just skip it.
-	 */
-	if (!page && (flags & FOLL_DUMP) &&
-	    !hugetlbfs_pagecache_present(h, vma, address))
-		page = ERR_PTR(-EFAULT);
-
-	return page;
-}
-
 long hugetlb_change_protection(struct vm_area_struct *vma,
 		unsigned long address, unsigned long end,
 		pgprot_t newprot, unsigned long cp_flags)
-- 
2.41.0




Thread overview: 58+ messages
2023-11-16  1:28 [PATCH RFC 00/12] mm/gup: Unify hugetlb, part 2 Peter Xu
2023-11-16  1:28 ` [PATCH RFC 01/12] mm/hugetlb: Export hugetlbfs_pagecache_present() Peter Xu
2023-11-23  7:23   ` Christoph Hellwig
2023-11-23 16:05     ` Peter Xu
2023-11-16  1:28 ` [PATCH RFC 02/12] mm: Provide generic pmd_thp_or_huge() Peter Xu
2023-11-16  1:28 ` [PATCH RFC 03/12] mm: Export HPAGE_PXD_* macros even if !THP Peter Xu
2023-11-23  7:23   ` Christoph Hellwig
2023-11-23  9:53     ` Mike Rapoport
2023-11-23 15:27       ` Peter Xu
2023-11-16  1:29 ` [PATCH RFC 04/12] mm: Introduce vma_pgtable_walk_{begin|end}() Peter Xu
2023-11-23  7:24   ` Christoph Hellwig
2023-11-23 16:11     ` Peter Xu
2023-11-24  4:02   ` Aneesh Kumar K.V
2023-11-24 15:34     ` Peter Xu
2023-11-16  1:29 ` [PATCH RFC 05/12] mm/gup: Fix follow_devmap_p[mu]d() to return even if NULL Peter Xu
2023-11-23  7:25   ` Christoph Hellwig
2023-11-23 17:59     ` Peter Xu
2023-11-16  1:29 ` [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing Peter Xu
2023-11-20  8:26   ` Christoph Hellwig
2023-11-21 15:59     ` Peter Xu
2023-11-22  8:00       ` Christoph Hellwig
2023-11-22 15:22         ` Peter Xu
2023-11-23  7:21           ` Christoph Hellwig
2023-11-23 16:10             ` Peter Xu
2023-11-23 18:22           ` Christophe Leroy
2023-11-23 19:37             ` Peter Xu
2023-11-24  5:28               ` Aneesh Kumar K.V
2023-11-24  7:03               ` Christophe Leroy
2023-11-24  1:06           ` Michael Ellerman
2023-11-23 15:47         ` Matthew Wilcox
2023-11-23 17:22           ` Peter Xu
2023-11-23 19:11             ` Ryan Roberts
2023-11-23 19:46               ` Peter Xu
2023-11-24  9:06                 ` Ryan Roberts
2023-11-24 16:07                   ` Peter Xu
2023-11-30 21:30                     ` Peter Xu
2023-12-03 13:33                       ` Christophe Leroy
2023-12-04 11:11                         ` Ryan Roberts
2023-12-04 11:25                           ` Christophe Leroy
2023-12-04 11:46                             ` Ryan Roberts
2023-12-04 11:57                               ` Christophe Leroy
2023-12-04 12:02                                 ` Ryan Roberts
2023-12-04 16:48                           ` Peter Xu
2023-11-16  1:29 ` [PATCH RFC 07/12] mm/gup: Refactor record_subpages() to find 1st small page Peter Xu
2023-11-16 14:51   ` Matthew Wilcox
2023-11-16 19:40     ` Peter Xu
2023-11-16 19:41       ` Matthew Wilcox
2023-11-16  1:29 ` [PATCH RFC 08/12] mm/gup: Handle hugetlb for no_page_table() Peter Xu
2023-11-23  7:26   ` Christoph Hellwig
2023-11-16  1:29 ` [PATCH RFC 09/12] mm/gup: Handle huge pud for follow_pud_mask() Peter Xu
2023-11-23  7:28   ` Christoph Hellwig
2023-11-23 16:19     ` Peter Xu
2023-11-16  1:29 ` [PATCH RFC 10/12] mm/gup: Handle huge pmd for follow_pmd_mask() Peter Xu
2023-11-16  1:29 ` [PATCH RFC 11/12] mm/gup: Handle hugepd for follow_page() Peter Xu
2023-11-16  1:29 ` Peter Xu [this message]
2023-11-23  7:29   ` [PATCH RFC 12/12] mm/gup: Merge hugetlb into generic mm code Christoph Hellwig
2023-11-23 16:21     ` Peter Xu
2023-11-22 14:51 ` [PATCH RFC 00/12] mm/gup: Unify hugetlb, part 2 Jason Gunthorpe
