From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: "Mike Kravetz" <mike.kravetz@oracle.com>,
"Andrea Arcangeli" <aarcange@redhat.com>,
"Mika Penttilä" <mpenttil@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
peterx@redhat.com, "Axel Rasmussen" <axelrasmussen@google.com>,
"Nadav Amit" <nadav.amit@gmail.com>,
"David Hildenbrand" <david@redhat.com>,
linux-stable <stable@vger.kernel.org>
Subject: [PATCH v2 1/6] mm/hugetlb: Fix uffd-wp during fork()
Date: Mon, 17 Apr 2023 15:53:12 -0400 [thread overview]
Message-ID: <20230417195317.898696-2-peterx@redhat.com> (raw)
In-Reply-To: <20230417195317.898696-1-peterx@redhat.com>
There're a bunch of things that were wrong:
- Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
rather than huge_pte_uffd_wp().
- When copying over a pte, we should drop uffd-wp bit when
!EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
- When doing early CoW for private hugetlb (e.g. when the parent page was
pinned), uffd-wp bit should be properly carried over if necessary.
No bug reported probably because most people do not even care about these
corner cases, but they are still bugs and can be exposed by the recent unit
tests introduced, so fix all of them in one shot.
Cc: linux-stable <stable@vger.kernel.org>
Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
mm/hugetlb.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f16b25b1a6b9..0213efaf31be 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
static void
hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
- struct folio *new_folio)
+ struct folio *new_folio, pte_t old)
{
+ pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
+
__folio_mark_uptodate(new_folio);
hugepage_add_new_anon_rmap(new_folio, vma, addr);
- set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
+ if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
+ newpte = huge_pte_mkuffd_wp(newpte);
+ set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
folio_set_hugetlb_migratable(new_folio);
}
@@ -5032,14 +5036,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
*/
;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
- bool uffd_wp = huge_pte_uffd_wp(entry);
-
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
- bool uffd_wp = huge_pte_uffd_wp(entry);
+ bool uffd_wp = pte_swp_uffd_wp(entry);
if (!is_readable_migration_entry(swp_entry) && cow) {
/*
@@ -5050,10 +5052,10 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
swp_offset(swp_entry));
entry = swp_entry_to_pte(swp_entry);
if (userfaultfd_wp(src_vma) && uffd_wp)
- entry = huge_pte_mkuffd_wp(entry);
+ entry = pte_swp_mkuffd_wp(entry);
set_huge_pte_at(src, addr, src_pte, entry);
}
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_pte_marker(entry))) {
@@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
/* huge_ptep of dst_pte won't change as in child */
goto again;
}
- hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
+ hugetlb_install_folio(dst_vma, dst_pte, addr,
+ new_folio, src_pte_old);
spin_unlock(src_ptl);
spin_unlock(dst_ptl);
continue;
@@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
entry = huge_pte_wrprotect(entry);
}
+ if (!userfaultfd_wp(dst_vma))
+ entry = huge_pte_clear_uffd_wp(entry);
+
set_huge_pte_at(dst, addr, dst_pte, entry);
hugetlb_count_add(npages, dst);
}
--
2.39.1
next prev parent reply other threads:[~2023-04-17 19:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-17 19:53 [PATCH v2 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
2023-04-17 19:53 ` Peter Xu [this message]
2023-04-17 19:53 ` [PATCH v2 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
2023-04-17 23:48 ` Andrew Morton
2023-04-18 15:14 ` Peter Xu
2023-04-17 19:53 ` [PATCH v2 3/6] selftests/mm: Add a few options for uffd-unit-test Peter Xu
2023-04-18 16:08 ` David Hildenbrand
2023-04-17 19:53 ` [PATCH v2 4/6] selftests/mm: Extend and rename uffd pagemap test Peter Xu
2023-04-17 19:53 ` [PATCH v2 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS Peter Xu
2023-04-17 19:53 ` [PATCH v2 6/6] selftests/mm: Add tests for RO pinning vs fork() Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230417195317.898696-2-peterx@redhat.com \
--to=peterx@redhat.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=axelrasmussen@google.com \
--cc=david@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=mpenttil@redhat.com \
--cc=nadav.amit@gmail.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox