linux-mm.kvack.org archive mirror
* [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins
@ 2023-04-13 23:11 Peter Xu
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

This is a follow-up to the previous discussion here:

https://lore.kernel.org/r/20230324222707.GA3046@monkey

There, Mike correctly pointed out that the uffd-wp bit can also get lost
when Copy-On-Read triggers.  Last time we didn't have a reproducer; I have
now written one and attached it as the last patch.

While at it, I decided to add some more uffd-wp tests against fork(), and
I found more bugs.  None of them had been reported by anyone, probably
because nobody cares about these cases, but since they are still bugs and
can be reproduced by the unit tests, I fixed them too in another patch.

Patches 1-2 are the bug fixes, with stable copied.

The remaining patches 3-6 introduce unit tests to verify the fixes (based
on the recent rework of the uffd unit tests).  Note that not all of the bug
fixes in patch 1 are verified (e.g. the changes to hugetlb hwpoison /
migration entries), but I assume they can be reviewed with careful eyes.

Thanks,

Peter Xu (6):
  mm/hugetlb: Fix uffd-wp during fork()
  mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
  selftests/mm: Add a few options for uffd-unit-test
  selftests/mm: Extend and rename uffd pagemap test
  selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
  selftests/mm: Add tests for RO pinning vs fork()

 mm/hugetlb.c                                 |  33 +-
 tools/testing/selftests/mm/Makefile          |   8 +-
 tools/testing/selftests/mm/check_config.sh   |   4 +-
 tools/testing/selftests/mm/uffd-unit-tests.c | 318 +++++++++++++++++--
 4 files changed, 315 insertions(+), 48 deletions(-)

-- 
2.39.1




* [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
@ 2023-04-13 23:11 ` Peter Xu
  2023-04-14  9:37   ` David Hildenbrand
                     ` (2 more replies)
  2023-04-13 23:11 ` [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli, linux-stable

There're a bunch of things that were wrong:

  - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
    rather than huge_pte_uffd_wp().

  - When copying over a pte, we should drop uffd-wp bit when
    !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).

  - When doing early CoW for private hugetlb (e.g. when the parent page was
    pinned), uffd-wp bit should be properly carried over if necessary.

No bug has been reported, probably because most people do not care about
these corner cases, but they are still bugs and can be exposed by the
recently introduced unit tests, so fix all of them in one shot.
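
As a rough illustration of the three rules listed above, here is a toy
userspace model.  It is not kernel code: every name and bit position in it
(TOY_PRESENT, TOY_UFFD_WP, TOY_SWP_UFFD_WP, toy_copy_pte(), ...) is made up
for this sketch, and the boolean dst_vma_uffd_wp only stands in for
userfaultfd_wp(dst_vma).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy userspace model, not kernel code.  All names and bit positions
 * below are hypothetical and chosen only for illustration.
 */
#define TOY_PRESENT      (1ULL << 0)
#define TOY_UFFD_WP      (1ULL << 10)	/* uffd-wp bit of a present entry */
#define TOY_SWP_UFFD_WP  (1ULL << 2)	/* uffd-wp bit of a swap-style entry */

/* Read the uffd-wp state with the accessor that matches the entry type. */
static bool toy_uffd_wp(uint64_t pte)
{
	if (pte & TOY_PRESENT)
		return (pte & TOY_UFFD_WP) != 0;
	return (pte & TOY_SWP_UFFD_WP) != 0;
}

/* Clear the uffd-wp state, again depending on the entry type. */
static uint64_t toy_clear_uffd_wp(uint64_t pte)
{
	if (pte & TOY_PRESENT)
		return pte & ~TOY_UFFD_WP;
	return pte & ~TOY_SWP_UFFD_WP;
}

/* fork(): copy an existing entry into the child. */
static uint64_t toy_copy_pte(uint64_t src, bool dst_vma_uffd_wp)
{
	/* Child VMA not uffd-wp registered (no EVENT_FORK): drop the bit. */
	if (!dst_vma_uffd_wp)
		return toy_clear_uffd_wp(src);
	return src;
}

/* fork() early CoW: install a fresh page, carrying the bit if needed. */
static uint64_t toy_install_new(uint64_t old, bool dst_vma_uffd_wp)
{
	uint64_t newpte = TOY_PRESENT;

	if (dst_vma_uffd_wp && toy_uffd_wp(old))
		newpte |= TOY_UFFD_WP;
	return newpte;
}

/* A few checks showing how the helpers are meant to be used. */
static void toy_fork_selftest(void)
{
	uint64_t swp = TOY_SWP_UFFD_WP;	/* non-present, write-protected */

	assert(toy_uffd_wp(swp));
	assert(!(swp & TOY_UFFD_WP));	/* the present-entry bit misses it */
	assert(!toy_uffd_wp(toy_copy_pte(swp, false)));
	assert(toy_uffd_wp(toy_install_new(TOY_PRESENT | TOY_UFFD_WP, true)));
}
```

The point of the model is only that a swap-style entry keeps its uffd-wp
state in a different place than a present entry, so reading it with the
wrong accessor silently reports "not protected".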

Cc: linux-stable <stable@vger.kernel.org>
Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hugetlb.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f16b25b1a6b9..7320e64aacc6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
 
 static void
 hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
-		     struct folio *new_folio)
+		      struct folio *new_folio, pte_t old)
 {
+	pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
+
 	__folio_mark_uptodate(new_folio);
 	hugepage_add_new_anon_rmap(new_folio, vma, addr);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
+	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
+		newpte = huge_pte_mkuffd_wp(newpte);
+	set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
 	folio_set_hugetlb_migratable(new_folio);
 }
@@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 */
 			;
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
-			bool uffd_wp = huge_pte_uffd_wp(entry);
-
-			if (!userfaultfd_wp(dst_vma) && uffd_wp)
+			if (!userfaultfd_wp(dst_vma))
 				entry = huge_pte_clear_uffd_wp(entry);
 			set_huge_pte_at(dst, addr, dst_pte, entry);
 		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
 			swp_entry_t swp_entry = pte_to_swp_entry(entry);
-			bool uffd_wp = huge_pte_uffd_wp(entry);
 
 			if (!is_readable_migration_entry(swp_entry) && cow) {
 				/*
@@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				swp_entry = make_readable_migration_entry(
 							swp_offset(swp_entry));
 				entry = swp_entry_to_pte(swp_entry);
-				if (userfaultfd_wp(src_vma) && uffd_wp)
-					entry = huge_pte_mkuffd_wp(entry);
+				if (userfaultfd_wp(src_vma) &&
+				    pte_swp_uffd_wp(entry))
+					entry = pte_swp_mkuffd_wp(entry);
 				set_huge_pte_at(src, addr, src_pte, entry);
 			}
-			if (!userfaultfd_wp(dst_vma) && uffd_wp)
+			if (!userfaultfd_wp(dst_vma))
 				entry = huge_pte_clear_uffd_wp(entry);
 			set_huge_pte_at(dst, addr, dst_pte, entry);
 		} else if (unlikely(is_pte_marker(entry))) {
@@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 					/* huge_ptep of dst_pte won't change as in child */
 					goto again;
 				}
-				hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
+				hugetlb_install_folio(dst_vma, dst_pte, addr,
+						      new_folio, src_pte_old);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
@@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				entry = huge_pte_wrprotect(entry);
 			}
 
+			if (!userfaultfd_wp(dst_vma))
+				entry = huge_pte_clear_uffd_wp(entry);
+
 			set_huge_pte_at(dst, addr, dst_pte, entry);
 			hugetlb_count_add(npages, dst);
 		}
-- 
2.39.1




* [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
@ 2023-04-13 23:11 ` Peter Xu
  2023-04-14  9:23   ` David Hildenbrand
  2023-04-14 22:19   ` Mike Kravetz
  2023-04-13 23:11 ` [PATCH 3/6] selftests/mm: Add a few options for uffd-unit-test Peter Xu
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli, linux-stable

When we try to unshare a pinned page for a private hugetlb mapping, the
uffd-wp bit can get lost during unsharing.  Fix it by carrying it over.

This should be very rare: it only happens when unsharing is triggered on a
private hugetlb page that is uffd-wp protected (e.g. in a child which
shares the page with its parent, with UFFD_FEATURE_EVENT_FORK enabled).
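
The idea of "carrying it over" can be sketched with a small userspace
model.  This is not kernel code: struct toy_hpte and toy_replace_pte() are
hypothetical names, and the choice to also clear the writable flag when the
entry is write-protected is an assumption of this toy.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a huge page table entry; hypothetical, not kernel code. */
struct toy_hpte {
	bool writable;
	bool uffd_wp;
};

/* Build the entry that replaces 'old' after a CoW break or an unshare. */
static struct toy_hpte toy_replace_pte(struct toy_hpte old, bool unshare)
{
	/* Unshare keeps the mapping read-only; a CoW break makes it writable. */
	struct toy_hpte newpte = { .writable = !unshare, .uffd_wp = false };

	/* The fix: carry the uffd-wp state of the old entry over. */
	if (old.uffd_wp) {
		newpte.uffd_wp = true;
		/* Assumption of this toy: a protected entry is not writable. */
		newpte.writable = false;
	}
	return newpte;
}

/* Usage: an unshare of a protected page must stay protected. */
static void toy_unshare_selftest(void)
{
	struct toy_hpte old = { .writable = false, .uffd_wp = true };
	struct toy_hpte newpte = toy_replace_pte(old, true);

	assert(newpte.uffd_wp);
	assert(!newpte.writable);
}
```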

Cc: linux-stable <stable@vger.kernel.org>
Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hugetlb.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7320e64aacc6..083aae35bff8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5637,13 +5637,16 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	spin_lock(ptl);
 	ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
 	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
+		pte_t newpte = make_huge_pte(vma, &new_folio->page, !unshare);
+
 		/* Break COW or unshare */
 		huge_ptep_clear_flush(vma, haddr, ptep);
 		mmu_notifier_invalidate_range(mm, range.start, range.end);
 		page_remove_rmap(old_page, vma, true);
 		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
-		set_huge_pte_at(mm, haddr, ptep,
-				make_huge_pte(vma, &new_folio->page, !unshare));
+		if (huge_pte_uffd_wp(pte))
+			newpte = huge_pte_mkuffd_wp(newpte);
+		set_huge_pte_at(mm, haddr, ptep, newpte);
 		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
 		new_folio = page_folio(old_page);
-- 
2.39.1




* [PATCH 3/6] selftests/mm: Add a few options for uffd-unit-test
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
  2023-04-13 23:11 ` [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
@ 2023-04-13 23:11 ` Peter Xu
  2023-04-13 23:11 ` [PATCH 4/6] selftests/mm: Extend and rename uffd pagemap test Peter Xu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

Namely:

  "-f": add a filter (substring match on the test name) for tests to run
  "-l": list tests rather than running any
  "-h": help msg
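
For reference, the diff in this patch selects tests with strstr(), so the
"-f" argument behaves as a plain substring match.  A minimal standalone
sketch of that rule follows; toy_test_selected() is a hypothetical helper
written for this note, not a function from the selftest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if a test called 'name' should run under 'filter'. */
static bool toy_test_selected(const char *name, const char *filter)
{
	/* No filter given: every test is selected. */
	if (!filter)
		return true;
	/* Substring match; pattern characters such as '*' are not special. */
	return strstr(name, filter) != NULL;
}

/* Usage examples with test names taken from this series. */
static void toy_filter_selftest(void)
{
	assert(toy_test_selected("wp-fork-with-event", "event"));
	assert(!toy_test_selected("wp-unpopulated", "fork"));
}
```

Under this rule a filter like "wp-fork" would select every test whose name
contains that string.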

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/mm/uffd-unit-tests.c | 52 +++++++++++++++++---
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index d871bf732e62..452ca05a829d 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -909,28 +909,65 @@ uffd_test_case_t uffd_tests[] = {
 	},
 };
 
+static void usage(const char *prog)
+{
+	printf("usage: %s [-f TESTNAME]\n", prog);
+	puts("");
+	puts(" -f: test name to filter (e.g., event)");
+	puts(" -h: show the help msg");
+	puts(" -l: list tests only");
+	puts("");
+	exit(KSFT_FAIL);
+}
+
 int main(int argc, char *argv[])
 {
 	int n_tests = sizeof(uffd_tests) / sizeof(uffd_test_case_t);
 	int n_mems = sizeof(mem_types) / sizeof(mem_type_t);
+	const char *test_filter = NULL;
+	bool list_only = false;
 	uffd_test_case_t *test;
 	mem_type_t *mem_type;
 	uffd_test_args_t args;
 	char test_name[128];
 	const char *errmsg;
-	int has_uffd;
+	int has_uffd, opt;
 	int i, j;
 
-	has_uffd = test_uffd_api(false);
-	has_uffd |= test_uffd_api(true);
+	while ((opt = getopt(argc, argv, "f:hl")) != -1) {
+		switch (opt) {
+		case 'f':
+			test_filter = optarg;
+			break;
+		case 'l':
+			list_only = true;
+			break;
+		case 'h':
+		default:
+			/* Unknown */
+			usage(argv[0]);
+			break;
+		}
+	}
+
+	if (!test_filter && !list_only) {
+		has_uffd = test_uffd_api(false);
+		has_uffd |= test_uffd_api(true);
 
-	if (!has_uffd) {
-		printf("Userfaultfd not supported or unprivileged, skip all tests\n");
-		exit(KSFT_SKIP);
+		if (!has_uffd) {
+			printf("Userfaultfd not supported or unprivileged, skip all tests\n");
+			exit(KSFT_SKIP);
+		}
 	}
 
 	for (i = 0; i < n_tests; i++) {
 		test = &uffd_tests[i];
+		if (test_filter && !strstr(test->name, test_filter))
+			continue;
+		if (list_only) {
+			printf("%s\n", test->name);
+			continue;
+		}
 		for (j = 0; j < n_mems; j++) {
 			mem_type = &mem_types[j];
 			if (!(test->mem_targets & mem_type->mem_flag))
@@ -952,7 +989,8 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	uffd_test_report();
+	if (!list_only)
+		uffd_test_report();
 
 	return ksft_get_fail_cnt() ? KSFT_FAIL : KSFT_PASS;
 }
-- 
2.39.1




* [PATCH 4/6] selftests/mm: Extend and rename uffd pagemap test
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
                   ` (2 preceding siblings ...)
  2023-04-13 23:11 ` [PATCH 3/6] selftests/mm: Add a few options for uffd-unit-test Peter Xu
@ 2023-04-13 23:11 ` Peter Xu
  2023-04-13 23:11 ` [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS Peter Xu
  2023-04-13 23:12 ` [PATCH 6/6] selftests/mm: Add tests for RO pinning vs fork() Peter Xu
  5 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

Extend it to all types of memory, and add a parallel test for when
EVENT_FORK is enabled, where uffd-wp bits should persist rather than be
dropped.

While at it, rename the test to "wp-fork" to better show what it means,
and call the new test "wp-fork-with-event".

Before:

        Testing pagemap on anon... done

After:

        Testing wp-fork on anon... done
        Testing wp-fork on shmem... done
        Testing wp-fork on shmem-private... done
        Testing wp-fork on hugetlb... done
        Testing wp-fork on hugetlb-private... done
        Testing wp-fork-with-event on anon... done
        Testing wp-fork-with-event on shmem... done
        Testing wp-fork-with-event on shmem-private... done
        Testing wp-fork-with-event on hugetlb... done
        Testing wp-fork-with-event on hugetlb-private... done
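
The expectation these tests check in the child can be written down as a
small sketch.  It assumes that bit 57 of a /proc/<pid>/pagemap entry
reports uffd-wp, as described in the kernel's pagemap documentation; the
TOY_PM_UFFD_WP macro and both helper names are hypothetical and are not
taken from the selftest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed pagemap layout: bit 57 set means the pte is uffd-wp protected. */
#define TOY_PM_UFFD_WP (1ULL << 57)

/* Decode the uffd-wp state from one 64-bit pagemap entry. */
static bool toy_pagemap_uffd_wp(uint64_t entry)
{
	return (entry & TOY_PM_UFFD_WP) != 0;
}

/*
 * Expected state in the child after fork(): the bit persists only when
 * the parent had it set and EVENT_FORK is in use; otherwise it is dropped.
 */
static bool toy_expected_child_wp(bool parent_wp, bool with_event)
{
	return parent_wp && with_event;
}

/* Usage: the two flavours of the test differ only in 'with_event'. */
static void toy_pagemap_selftest(void)
{
	assert(toy_expected_child_wp(true, true));	/* wp-fork-with-event */
	assert(!toy_expected_child_wp(true, false));	/* wp-fork */
}
```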

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/mm/uffd-unit-tests.c | 130 +++++++++++++++----
 1 file changed, 106 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 452ca05a829d..739fc4d30342 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -227,25 +227,65 @@ static int pagemap_open(void)
 			err("pagemap uffd-wp bit error: 0x%"PRIx64, value); \
 	} while (0)
 
-static int pagemap_test_fork(bool present)
+typedef struct {
+	int parent_uffd, child_uffd;
+} fork_event_args;
+
+static void *fork_event_consumer(void *data)
 {
-	pid_t child = fork();
+	fork_event_args *args = data;
+	struct uffd_msg msg = { 0 };
+
+	/* Read until a full msg received */
+	while (uffd_read_msg(args->parent_uffd, &msg));
+
+	if (msg.event != UFFD_EVENT_FORK)
+		err("wrong message: %u\n", msg.event);
+
+	/* Just to be properly freed later */
+	args->child_uffd = msg.arg.fork.ufd;
+	return NULL;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event)
+{
+	fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
+	pthread_t thread;
+	pid_t child;
 	uint64_t value;
 	int fd, result;
 
+	/* Prepare a thread to resolve EVENT_FORK */
+	if (with_event) {
+		if (pthread_create(&thread, NULL, fork_event_consumer, &args))
+			err("pthread_create()");
+	}
+
+	child = fork();
 	if (!child) {
 		/* Open the pagemap fd of the child itself */
 		fd = pagemap_open();
 		value = pagemap_get_entry(fd, area_dst);
 		/*
-		 * After fork() uffd-wp bit should be gone as long as we're
-		 * without UFFD_FEATURE_EVENT_FORK
+		 * After fork(), we should handle uffd-wp bit differently:
+		 *
+		 * (1) when with EVENT_FORK, it should persist
+		 * (2) when without EVENT_FORK, it should be dropped
 		 */
-		pagemap_check_wp(value, false);
+		pagemap_check_wp(value, with_event);
 		/* Succeed */
 		exit(0);
 	}
 	waitpid(child, &result, 0);
+
+	if (with_event) {
+		if (pthread_join(thread, NULL))
+			err("pthread_join()");
+		if (args.child_uffd < 0)
+			err("Didn't receive child uffd");
+		close(args.child_uffd);
+	}
+
 	return result;
 }
 
@@ -295,7 +335,8 @@ static void uffd_wp_unpopulated_test(uffd_test_args_t *args)
 	uffd_test_pass();
 }
 
-static void uffd_pagemap_test(uffd_test_args_t *args)
+static void uffd_wp_fork_test_common(uffd_test_args_t *args,
+				     bool with_event)
 {
 	int pagemap_fd;
 	uint64_t value;
@@ -311,23 +352,42 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
 	wp_range(uffd, (uint64_t)area_dst, page_size, true);
 	value = pagemap_get_entry(pagemap_fd, area_dst);
 	pagemap_check_wp(value, true);
-	/* Make sure uffd-wp bit dropped when fork */
-	if (pagemap_test_fork(true))
-		err("Detected stall uffd-wp bit in child");
-
-	/* Exclusive required or PAGEOUT won't work */
-	if (!(value & PM_MMAP_EXCLUSIVE))
-		err("multiple mapping detected: 0x%"PRIx64, value);
+	if (pagemap_test_fork(uffd, with_event)) {
+		uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
+			       with_event ? "missing" : "stall");
+		goto out;
+	}
 
-	if (madvise(area_dst, page_size, MADV_PAGEOUT))
-		err("madvise(MADV_PAGEOUT) failed");
+	/*
+	 * This is an attempt for zapping the pgtable so as to test the
+	 * markers.
+	 *
+	 * For private mappings, PAGEOUT will only work on exclusive ptes
+	 * (PM_MMAP_EXCLUSIVE) which we should satisfy.
+	 *
+	 * For shared, PAGEOUT may not work.  Use DONTNEED instead which
+	 * plays a similar role of zapping (rather than freeing the page)
+	 * to expose pte markers.
+	 */
+	if (args->mem_type->shared) {
+		if (madvise(area_dst, page_size, MADV_DONTNEED))
+			err("MADV_DONTNEED");
+	} else {
+		/*
+		 * NOTE: ignore retval because private-hugetlb doesn't yet
+		 * support swapping, so it could fail.
+		 */
+		madvise(area_dst, page_size, MADV_PAGEOUT);
+	}
 
 	/* Uffd-wp should persist even swapped out */
 	value = pagemap_get_entry(pagemap_fd, area_dst);
 	pagemap_check_wp(value, true);
-	/* Make sure uffd-wp bit dropped when fork */
-	if (pagemap_test_fork(false))
-		err("Detected stall uffd-wp bit in child");
+	if (pagemap_test_fork(uffd, with_event)) {
+		uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
+			       with_event ? "missing" : "stall");
+		goto out;
+	}
 
 	/* Unprotect; this tests swap pte modifications */
 	wp_range(uffd, (uint64_t)area_dst, page_size, false);
@@ -338,9 +398,21 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
 	*area_dst = 2;
 	value = pagemap_get_entry(pagemap_fd, area_dst);
 	pagemap_check_wp(value, false);
-
-	close(pagemap_fd);
 	uffd_test_pass();
+out:
+	if (uffd_unregister(uffd, area_dst, nr_pages * page_size))
+		err("unregister failed");
+	close(pagemap_fd);
+}
+
+static void uffd_wp_fork_test(uffd_test_args_t *args)
+{
+	uffd_wp_fork_test_common(args, false);
+}
+
+static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
+{
+	uffd_wp_fork_test_common(args, true);
 }
 
 static void check_memory_contents(char *p)
@@ -836,10 +908,20 @@ uffd_test_case_t uffd_tests[] = {
 		.uffd_feature_required = 0,
 	},
 	{
-		.name = "pagemap",
-		.uffd_fn = uffd_pagemap_test,
-		.mem_targets = MEM_ANON,
-		.uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+		.name = "wp-fork",
+		.uffd_fn = uffd_wp_fork_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+		UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+	},
+	{
+		.name = "wp-fork-with-event",
+		.uffd_fn = uffd_wp_fork_with_event_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+		UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+		/* when set, child process should inherit uffd-wp bits */
+		UFFD_FEATURE_EVENT_FORK,
 	},
 	{
 		.name = "wp-unpopulated",
-- 
2.39.1




* [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
                   ` (3 preceding siblings ...)
  2023-04-13 23:11 ` [PATCH 4/6] selftests/mm: Extend and rename uffd pagemap test Peter Xu
@ 2023-04-13 23:11 ` Peter Xu
  2023-04-14  9:52   ` David Hildenbrand
  2023-04-13 23:12 ` [PATCH 6/6] selftests/mm: Add tests for RO pinning vs fork() Peter Xu
  5 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, peterx,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

The macro and facility can be reused in other tests too.  Make it general.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/mm/Makefile        | 8 ++++----
 tools/testing/selftests/mm/check_config.sh | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 5a3434419403..9ffce175d5e6 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -161,8 +161,8 @@ warn_32bit_failure:
 endif
 endif
 
-# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
-$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
+# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
+$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
 
 $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
 
@@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
 
 EXTRA_CLEAN += local_config.mk local_config.h
 
-ifeq ($(COW_EXTRA_LIBS),)
+ifeq ($(IOURING_EXTRA_LIBS),)
 all: warn_missing_liburing
 
 warn_missing_liburing:
 	@echo ; \
-	echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
+	echo "Warning: missing liburing support. Some tests will be skipped." ; \
 	echo
 endif
diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
index bcba3af0acea..3954f4746161 100644
--- a/tools/testing/selftests/mm/check_config.sh
+++ b/tools/testing/selftests/mm/check_config.sh
@@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
 
 if [ -f $tmpfile_o ]; then
     echo "#define LOCAL_CONFIG_HAVE_LIBURING 1"  > $OUTPUT_H_FILE
-    echo "COW_EXTRA_LIBS = -luring"              > $OUTPUT_MKFILE
+    echo "IOURING_EXTRA_LIBS = -luring"          > $OUTPUT_MKFILE
 else
     echo "// No liburing support found"          > $OUTPUT_H_FILE
     echo "# No liburing support found, so:"      > $OUTPUT_MKFILE
-    echo "COW_EXTRA_LIBS = "                    >> $OUTPUT_MKFILE
+    echo "IOURING_EXTRA_LIBS = "                >> $OUTPUT_MKFILE
 fi
 
 rm ${tmpname}.*
-- 
2.39.1




* [PATCH 6/6] selftests/mm: Add tests for RO pinning vs fork()
  2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
                   ` (4 preceding siblings ...)
  2023-04-13 23:11 ` [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS Peter Xu
@ 2023-04-13 23:12 ` Peter Xu
  5 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-13 23:12 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Nadav Amit, peterx, Mike Kravetz, Andrew Morton,
	Andrea Arcangeli, Axel Rasmussen, David Hildenbrand

Add two more tests (10 runs in total across all memory types) to cover RO
pinning against fork() over uffd-wp.  They cover both:

  (1) Early CoW test in fork() when page pinned,
  (2) page unshare due to RO longterm pin.

They are:

Testing wp-fork-pin on anon... done
Testing wp-fork-pin on shmem... done
Testing wp-fork-pin on shmem-private... done
Testing wp-fork-pin on hugetlb... done
Testing wp-fork-pin on hugetlb-private... done
Testing wp-fork-pin-with-event on anon... done
Testing wp-fork-pin-with-event on shmem... done
Testing wp-fork-pin-with-event on shmem-private... done
Testing wp-fork-pin-with-event on hugetlb... done
Testing wp-fork-pin-with-event on hugetlb-private... done

CONFIG_GUP_TEST is needed, otherwise they'll be skipped:

Testing wp-fork-pin on anon... skipped [reason: Possibly CONFIG_GUP_TEST missing or unprivileged]

Note that only private pages matter here, but it does no harm to also run
all of them over shared memory.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/mm/uffd-unit-tests.c | 144 ++++++++++++++++++-
 1 file changed, 141 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 739fc4d30342..269c86768a02 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -7,6 +7,8 @@
 
 #include "uffd-common.h"
 
+#include "../../../../mm/gup_test.h"
+
 #ifdef __NR_userfaultfd
 
 /* The unit test doesn't need a large or random size, make it 32MB for now */
@@ -247,7 +249,53 @@ static void *fork_event_consumer(void *data)
 	return NULL;
 }
 
-static int pagemap_test_fork(int uffd, bool with_event)
+typedef struct {
+	int gup_fd;
+	bool pinned;
+} pin_args;
+
+/*
+ * Returns 0 if succeed, <0 for errors.  pin_pages() needs to be paired
+ * with unpin_pages().  Currently it needs to be RO longterm pin to satisfy
+ * all needs of the test cases (e.g., trigger unshare, trigger fork() early
+ * CoW, etc.).
+ */
+static int pin_pages(pin_args *args, void *buffer, size_t size)
+{
+	struct pin_longterm_test test = {
+		.addr = (uintptr_t)buffer,
+		.size = size,
+		/* Read-only pins */
+		.flags = 0,
+	};
+
+	if (args->pinned)
+		err("already pinned");
+
+	args->gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+	if (args->gup_fd < 0)
+		return -errno;
+
+	if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_START, &test)) {
+		/* Even if gup_test existed, can be an old gup_test / kernel */
+		close(args->gup_fd);
+		return -errno;
+	}
+	args->pinned = true;
+	return 0;
+}
+
+static void unpin_pages(pin_args *args)
+{
+	if (!args->pinned)
+		err("unpin without pin first");
+	if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_STOP))
+		err("PIN_LONGTERM_TEST_STOP");
+	close(args->gup_fd);
+	args->pinned = false;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event, bool test_pin)
 {
 	fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
 	pthread_t thread;
@@ -264,7 +312,17 @@ static int pagemap_test_fork(int uffd, bool with_event)
 	child = fork();
 	if (!child) {
 		/* Open the pagemap fd of the child itself */
+		pin_args args = {};
+
 		fd = pagemap_open();
+
+		if (test_pin && pin_pages(&args, area_dst, page_size))
+			/*
+			 * Normally when reach here we have pinned in
+			 * previous tests, so shouldn't fail anymore
+			 */
+			err("pin page failed in child");
+
 		value = pagemap_get_entry(fd, area_dst);
 		/*
 		 * After fork(), we should handle uffd-wp bit differently:
@@ -273,6 +331,8 @@ static int pagemap_test_fork(int uffd, bool with_event)
 		 * (2) when without EVENT_FORK, it should be dropped
 		 */
 		pagemap_check_wp(value, with_event);
+		if (test_pin)
+			unpin_pages(&args);
 		/* Succeed */
 		exit(0);
 	}
@@ -352,7 +412,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
 	wp_range(uffd, (uint64_t)area_dst, page_size, true);
 	value = pagemap_get_entry(pagemap_fd, area_dst);
 	pagemap_check_wp(value, true);
-	if (pagemap_test_fork(uffd, with_event)) {
+	if (pagemap_test_fork(uffd, with_event, false)) {
 		uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
 			       with_event ? "missing" : "stall");
 		goto out;
@@ -383,7 +443,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
 	/* Uffd-wp should persist even swapped out */
 	value = pagemap_get_entry(pagemap_fd, area_dst);
 	pagemap_check_wp(value, true);
-	if (pagemap_test_fork(uffd, with_event)) {
+	if (pagemap_test_fork(uffd, with_event, false)) {
 		uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
 			       with_event ? "missing" : "stall");
 		goto out;
@@ -415,6 +475,68 @@ static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
 	uffd_wp_fork_test_common(args, true);
 }
 
+static void uffd_wp_fork_pin_test_common(uffd_test_args_t *args,
+					 bool with_event)
+{
+	int pagemap_fd;
+	pin_args pin_args = {};
+
+	if (uffd_register(uffd, area_dst, page_size, false, true, false))
+		err("register failed");
+
+	pagemap_fd = pagemap_open();
+
+	/* Touch the page */
+	*area_dst = 1;
+	wp_range(uffd, (uint64_t)area_dst, page_size, true);
+
+	/*
+	 * 1. First pin, then fork().  This tests fork() special path when
+	 * doing early CoW if the page is private.
+	 */
+	if (pin_pages(&pin_args, area_dst, page_size)) {
+		uffd_test_skip("Possibly CONFIG_GUP_TEST missing "
+			       "or unprivileged");
+		close(pagemap_fd);
+		uffd_unregister(uffd, area_dst, page_size);
+		return;
+	}
+
+	if (pagemap_test_fork(uffd, with_event, false)) {
+		uffd_test_fail("Detected %s uffd-wp bit in early CoW of fork()",
+			       with_event ? "missing" : "stall");
+		unpin_pages(&pin_args);
+		goto out;
+	}
+
+	unpin_pages(&pin_args);
+
+	/*
+	 * 2. First fork(), then pin (in the child, where test_pin==true).
+	 * This tests COR, aka, page unsharing on private memories.
+	 */
+	if (pagemap_test_fork(uffd, with_event, true)) {
+		uffd_test_fail("Detected %s uffd-wp bit when RO pin",
+			       with_event ? "missing" : "stall");
+		goto out;
+	}
+	uffd_test_pass();
+out:
+	if (uffd_unregister(uffd, area_dst, page_size))
+		err("register failed");
+	close(pagemap_fd);
+}
+
+static void uffd_wp_fork_pin_test(uffd_test_args_t *args)
+{
+	uffd_wp_fork_pin_test_common(args, false);
+}
+
+static void uffd_wp_fork_pin_with_event_test(uffd_test_args_t *args)
+{
+	uffd_wp_fork_pin_test_common(args, true);
+}
+
 static void check_memory_contents(char *p)
 {
 	unsigned long i, j;
@@ -923,6 +1045,22 @@ uffd_test_case_t uffd_tests[] = {
 		/* when set, child process should inherit uffd-wp bits */
 		UFFD_FEATURE_EVENT_FORK,
 	},
+	{
+		.name = "wp-fork-pin",
+		.uffd_fn = uffd_wp_fork_pin_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+		UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+	},
+	{
+		.name = "wp-fork-pin-with-event",
+		.uffd_fn = uffd_wp_fork_pin_with_event_test,
+		.mem_targets = MEM_ALL,
+		.uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+		UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+		/* when set, child process should inherit uffd-wp bits */
+		UFFD_FEATURE_EVENT_FORK,
+	},
 	{
 		.name = "wp-unpopulated",
 		.uffd_fn = uffd_wp_unpopulated_test,
-- 
2.39.1




* Re: [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
  2023-04-13 23:11 ` [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
@ 2023-04-14  9:23   ` David Hildenbrand
  2023-04-14 22:19   ` Mike Kravetz
  1 sibling, 0 replies; 18+ messages in thread
From: David Hildenbrand @ 2023-04-14  9:23 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, Mike Kravetz, Nadav Amit,
	Andrea Arcangeli, linux-stable

On 14.04.23 01:11, Peter Xu wrote:
> When we try to unshare a pinned page for a private hugetlb, uffd-wp bit can
> get lost during unsharing.  Fix it by carrying it over.
> 
> This should be very rare, only if an unsharing happened on a private
> hugetlb page with uffd-wp protected (e.g. in a child which shares the same
> page with parent with UFFD_FEATURE_EVENT_FORK enabled).
> 
> Cc: linux-stable <stable@vger.kernel.org>
> Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
> Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   mm/hugetlb.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7320e64aacc6..083aae35bff8 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5637,13 +5637,16 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
>   	spin_lock(ptl);
>   	ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
>   	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
> +		pte_t newpte = make_huge_pte(vma, &new_folio->page, !unshare);
> +
>   		/* Break COW or unshare */
>   		huge_ptep_clear_flush(vma, haddr, ptep);
>   		mmu_notifier_invalidate_range(mm, range.start, range.end);
>   		page_remove_rmap(old_page, vma, true);
>   		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
> -		set_huge_pte_at(mm, haddr, ptep,
> -				make_huge_pte(vma, &new_folio->page, !unshare));
> +		if (huge_pte_uffd_wp(pte))
> +			newpte = huge_pte_mkuffd_wp(newpte);
> +		set_huge_pte_at(mm, haddr, ptep, newpte);
>   		folio_set_hugetlb_migratable(new_folio);
>   		/* Make the old page be freed below */
>   		new_folio = page_folio(old_page);

LGTM, thanks

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
@ 2023-04-14  9:37   ` David Hildenbrand
  2023-04-14  9:45   ` Mika Penttilä
  2023-04-14 22:17   ` Mike Kravetz
  2 siblings, 0 replies; 18+ messages in thread
From: David Hildenbrand @ 2023-04-14  9:37 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, Mike Kravetz, Nadav Amit,
	Andrea Arcangeli, linux-stable

On 14.04.23 01:11, Peter Xu wrote:
> There're a bunch of things that were wrong:
> 
>    - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
>      rather than huge_pte_uffd_wp().
> 
>    - When copying over a pte, we should drop uffd-wp bit when
>      !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
> 
>    - When doing early CoW for private hugetlb (e.g. when the parent page was
>      pinned), uffd-wp bit should be properly carried over if necessary.
> 
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
> 
> Cc: linux-stable <stable@vger.kernel.org>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   mm/hugetlb.c | 26 ++++++++++++++++----------
>   1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f16b25b1a6b9..7320e64aacc6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
>   
>   static void
>   hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
> -		     struct folio *new_folio)
> +		      struct folio *new_folio, pte_t old)
>   {

Nit: The function now expects old to be !swap_pte. Which works perfectly 
fine with existing code -- the function name is a bit generic and 
misleading, unfortunately. IMHO, instead of factoring that functionality 
out to desperately try keeping copy_hugetlb_page_range() somewhat 
readable, we should just have factored out the complete copy+replace 
into a copy_hugetlb_page() function -- similar to the ordinary page 
handling -- which would have made copy_hugetlb_page_range() more 
readable eventually.

Anyhow, unrelated.

> +	pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
> +
>   	__folio_mark_uptodate(new_folio);
>   	hugepage_add_new_anon_rmap(new_folio, vma, addr);
> -	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
> +	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
> +		newpte = huge_pte_mkuffd_wp(newpte);
> +	set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
>   	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
>   	folio_set_hugetlb_migratable(new_folio);
>   }
> @@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   			 */
>   			;
>   		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
> -			bool uffd_wp = huge_pte_uffd_wp(entry);
> -
> -			if (!userfaultfd_wp(dst_vma) && uffd_wp)
> +			if (!userfaultfd_wp(dst_vma))
>   				entry = huge_pte_clear_uffd_wp(entry);
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
>   			swp_entry_t swp_entry = pte_to_swp_entry(entry);
> -			bool uffd_wp = huge_pte_uffd_wp(entry);
>   
>   			if (!is_readable_migration_entry(swp_entry) && cow) {
>   				/*
> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   				swp_entry = make_readable_migration_entry(
>   							swp_offset(swp_entry));
>   				entry = swp_entry_to_pte(swp_entry);
> -				if (userfaultfd_wp(src_vma) && uffd_wp)
> -					entry = huge_pte_mkuffd_wp(entry);
> +				if (userfaultfd_wp(src_vma) &&
> +				    pte_swp_uffd_wp(entry))
> +					entry = pte_swp_mkuffd_wp(entry);
>   				set_huge_pte_at(src, addr, src_pte, entry);
>   			}
> -			if (!userfaultfd_wp(dst_vma) && uffd_wp)
> +			if (!userfaultfd_wp(dst_vma))
>   				entry = huge_pte_clear_uffd_wp(entry);
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   		} else if (unlikely(is_pte_marker(entry))) {
> @@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   					/* huge_ptep of dst_pte won't change as in child */
>   					goto again;
>   				}
> -				hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
> +				hugetlb_install_folio(dst_vma, dst_pte, addr,
> +						      new_folio, src_pte_old);
>   				spin_unlock(src_ptl);
>   				spin_unlock(dst_ptl);
>   				continue;
> @@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   				entry = huge_pte_wrprotect(entry);
>   			}
>   
> +			if (!userfaultfd_wp(dst_vma))
> +				entry = huge_pte_clear_uffd_wp(entry);
> +
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   			hugetlb_count_add(npages, dst);
>   		}

LGTM

Reviewed-by: David Hildenbrand <david@redhat.com>
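
For readers following along, the policy from the commit message can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: model_pte_t, MODEL_UFFD_WP and both function names are hypothetical stand-ins, not kernel API, and the single flag bit does not reflect any real pte layout. The boolean dst_vma_uffd_wp stands in for userfaultfd_wp(dst_vma).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-bit model of a pte; not the kernel's real layout. */
typedef uint64_t model_pte_t;
#define MODEL_UFFD_WP (1ULL << 0)

/*
 * Models copying a parent entry into the child: the uffd-wp bit is
 * dropped unless the child's vma still tracks uffd-wp (in the patch,
 * that is the !userfaultfd_wp(dst_vma) check).
 */
model_pte_t model_copy_to_child(model_pte_t src, bool dst_vma_uffd_wp)
{
	if (!dst_vma_uffd_wp)
		src &= ~MODEL_UFFD_WP;
	return src;
}

/*
 * Models the early-CoW path: a fresh entry is built for the new page,
 * and the bit is carried over only if the old entry had it and the
 * child's vma tracks uffd-wp.
 */
model_pte_t model_install_new(model_pte_t fresh, model_pte_t old,
			      bool dst_vma_uffd_wp)
{
	if (dst_vma_uffd_wp && (old & MODEL_UFFD_WP))
		fresh |= MODEL_UFFD_WP;
	return fresh;
}
```

Both helpers take the old entry by value, which mirrors the nit above: the caller is expected to pass a present (non-swap) entry.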

-- 
Thanks,

David / dhildenb




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
  2023-04-14  9:37   ` David Hildenbrand
@ 2023-04-14  9:45   ` Mika Penttilä
  2023-04-14 14:09     ` Peter Xu
  2023-04-14 22:17   ` Mike Kravetz
  2 siblings, 1 reply; 18+ messages in thread
From: Mika Penttilä @ 2023-04-14  9:45 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, David Hildenbrand, Mike Kravetz,
	Nadav Amit, Andrea Arcangeli, linux-stable



On 14.4.2023 2.11, Peter Xu wrote:
> There're a bunch of things that were wrong:
> 
>    - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
>      rather than huge_pte_uffd_wp().
> 
>    - When copying over a pte, we should drop uffd-wp bit when
>      !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
> 
>    - When doing early CoW for private hugetlb (e.g. when the parent page was
>      pinned), uffd-wp bit should be properly carried over if necessary.
> 
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
> 
> Cc: linux-stable <stable@vger.kernel.org>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   mm/hugetlb.c | 26 ++++++++++++++++----------
>   1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f16b25b1a6b9..7320e64aacc6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
>   
>   static void
>   hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
> -		     struct folio *new_folio)
> +		      struct folio *new_folio, pte_t old)
>   {
> +	pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
> +
>   	__folio_mark_uptodate(new_folio);
>   	hugepage_add_new_anon_rmap(new_folio, vma, addr);
> -	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
> +	if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
> +		newpte = huge_pte_mkuffd_wp(newpte);
> +	set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
>   	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
>   	folio_set_hugetlb_migratable(new_folio);
>   }
> @@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   			 */
>   			;
>   		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
> -			bool uffd_wp = huge_pte_uffd_wp(entry);
> -
> -			if (!userfaultfd_wp(dst_vma) && uffd_wp)
> +			if (!userfaultfd_wp(dst_vma))
>   				entry = huge_pte_clear_uffd_wp(entry);
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
>   			swp_entry_t swp_entry = pte_to_swp_entry(entry);
> -			bool uffd_wp = huge_pte_uffd_wp(entry);
>   
>   			if (!is_readable_migration_entry(swp_entry) && cow) {
>   				/*
> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   				swp_entry = make_readable_migration_entry(
>   							swp_offset(swp_entry));
>   				entry = swp_entry_to_pte(swp_entry);
> -				if (userfaultfd_wp(src_vma) && uffd_wp)
> -					entry = huge_pte_mkuffd_wp(entry);
> +				if (userfaultfd_wp(src_vma) &&
> +				    pte_swp_uffd_wp(entry))
> +					entry = pte_swp_mkuffd_wp(entry);


This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?


>   				set_huge_pte_at(src, addr, src_pte, entry);
>   			}
> -			if (!userfaultfd_wp(dst_vma) && uffd_wp)
> +			if (!userfaultfd_wp(dst_vma))
>   				entry = huge_pte_clear_uffd_wp(entry);
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   		} else if (unlikely(is_pte_marker(entry))) {
> @@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   					/* huge_ptep of dst_pte won't change as in child */
>   					goto again;
>   				}
> -				hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
> +				hugetlb_install_folio(dst_vma, dst_pte, addr,
> +						      new_folio, src_pte_old);
>   				spin_unlock(src_ptl);
>   				spin_unlock(dst_ptl);
>   				continue;
> @@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>   				entry = huge_pte_wrprotect(entry);
>   			}
>   
> +			if (!userfaultfd_wp(dst_vma))
> +				entry = huge_pte_clear_uffd_wp(entry);
> +
>   			set_huge_pte_at(dst, addr, dst_pte, entry);
>   			hugetlb_count_add(npages, dst);
>   		}


--Mika





* Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
  2023-04-13 23:11 ` [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS Peter Xu
@ 2023-04-14  9:52   ` David Hildenbrand
  2023-04-14 13:56     ` Peter Xu
  0 siblings, 1 reply; 18+ messages in thread
From: David Hildenbrand @ 2023-04-14  9:52 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Axel Rasmussen, Andrew Morton, Mike Kravetz, Nadav Amit,
	Andrea Arcangeli

On 14.04.23 01:11, Peter Xu wrote:
> The macro and facility can be reused in other tests too.  Make it general.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   tools/testing/selftests/mm/Makefile        | 8 ++++----
>   tools/testing/selftests/mm/check_config.sh | 4 ++--
>   2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index 5a3434419403..9ffce175d5e6 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -161,8 +161,8 @@ warn_32bit_failure:
>   endif
>   endif
>   
> -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
> +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
>   
>   $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
>   
> @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
>   
>   EXTRA_CLEAN += local_config.mk local_config.h
>   
> -ifeq ($(COW_EXTRA_LIBS),)
> +ifeq ($(IOURING_EXTRA_LIBS),)
>   all: warn_missing_liburing
>   
>   warn_missing_liburing:
>   	@echo ; \
> -	echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
> +	echo "Warning: missing liburing support. Some tests will be skipped." ; \
>   	echo
>   endif
> diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
> index bcba3af0acea..3954f4746161 100644
> --- a/tools/testing/selftests/mm/check_config.sh
> +++ b/tools/testing/selftests/mm/check_config.sh
> @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
>   
>   if [ -f $tmpfile_o ]; then
>       echo "#define LOCAL_CONFIG_HAVE_LIBURING 1"  > $OUTPUT_H_FILE
> -    echo "COW_EXTRA_LIBS = -luring"              > $OUTPUT_MKFILE
> +    echo "IOURING_EXTRA_LIBS = -luring"          > $OUTPUT_MKFILE
>   else
>       echo "// No liburing support found"          > $OUTPUT_H_FILE
>       echo "# No liburing support found, so:"      > $OUTPUT_MKFILE
> -    echo "COW_EXTRA_LIBS = "                    >> $OUTPUT_MKFILE
> +    echo "IOURING_EXTRA_LIBS = "                >> $OUTPUT_MKFILE
>   fi
>   
>   rm ${tmpname}.*

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb




* Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
  2023-04-14  9:52   ` David Hildenbrand
@ 2023-04-14 13:56     ` Peter Xu
  2023-04-14 14:29       ` David Hildenbrand
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2023-04-14 13:56 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

On Fri, Apr 14, 2023 at 11:52:40AM +0200, David Hildenbrand wrote:
> On 14.04.23 01:11, Peter Xu wrote:
> > The macro and facility can be reused in other tests too.  Make it general.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   tools/testing/selftests/mm/Makefile        | 8 ++++----
> >   tools/testing/selftests/mm/check_config.sh | 4 ++--
> >   2 files changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> > index 5a3434419403..9ffce175d5e6 100644
> > --- a/tools/testing/selftests/mm/Makefile
> > +++ b/tools/testing/selftests/mm/Makefile
> > @@ -161,8 +161,8 @@ warn_32bit_failure:
> >   endif
> >   endif
> > -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> > -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
> > +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> > +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
> >   $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
> > @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
> >   EXTRA_CLEAN += local_config.mk local_config.h
> > -ifeq ($(COW_EXTRA_LIBS),)
> > +ifeq ($(IOURING_EXTRA_LIBS),)
> >   all: warn_missing_liburing
> >   warn_missing_liburing:
> >   	@echo ; \
> > -	echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
> > +	echo "Warning: missing liburing support. Some tests will be skipped." ; \
> >   	echo
> >   endif
> > diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
> > index bcba3af0acea..3954f4746161 100644
> > --- a/tools/testing/selftests/mm/check_config.sh
> > +++ b/tools/testing/selftests/mm/check_config.sh
> > @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
> >   if [ -f $tmpfile_o ]; then
> >       echo "#define LOCAL_CONFIG_HAVE_LIBURING 1"  > $OUTPUT_H_FILE
> > -    echo "COW_EXTRA_LIBS = -luring"              > $OUTPUT_MKFILE
> > +    echo "IOURING_EXTRA_LIBS = -luring"          > $OUTPUT_MKFILE
> >   else
> >       echo "// No liburing support found"          > $OUTPUT_H_FILE
> >       echo "# No liburing support found, so:"      > $OUTPUT_MKFILE
> > -    echo "COW_EXTRA_LIBS = "                    >> $OUTPUT_MKFILE
> > +    echo "IOURING_EXTRA_LIBS = "                >> $OUTPUT_MKFILE
> >   fi
> >   rm ${tmpname}.*
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>

Oops, I planned to drop this patch but forgot.  I was planning to use
io_uring, but later found that it cannot take RO pins, so I switched to
gup_test as in your cow test.  Hence this patch is not needed anymore.

But since it's already there and still looks good to have, let me keep it
around with your R-b then.

Thanks,

-- 
Peter Xu




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-14  9:45   ` Mika Penttilä
@ 2023-04-14 14:09     ` Peter Xu
  2023-04-14 14:23       ` Mika Penttilä
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2023-04-14 14:09 UTC (permalink / raw)
  To: Mika Penttilä
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	David Hildenbrand, Mike Kravetz, Nadav Amit, Andrea Arcangeli,
	linux-stable

On Fri, Apr 14, 2023 at 12:45:29PM +0300, Mika Penttilä wrote:
> >   		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
> >   			swp_entry_t swp_entry = pte_to_swp_entry(entry);
> > -			bool uffd_wp = huge_pte_uffd_wp(entry);

[1]

> >   			if (!is_readable_migration_entry(swp_entry) && cow) {
> >   				/*
> > @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> >   				swp_entry = make_readable_migration_entry(
> >   							swp_offset(swp_entry));
> >   				entry = swp_entry_to_pte(swp_entry);

[2]

> > -				if (userfaultfd_wp(src_vma) && uffd_wp)
> > -					entry = huge_pte_mkuffd_wp(entry);
> > +				if (userfaultfd_wp(src_vma) &&
> > +				    pte_swp_uffd_wp(entry))
> > +					entry = pte_swp_mkuffd_wp(entry);
> 
> 
> This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?

Could you explain what you mean?

I think these helpers are the right ones to use, as afaict hugetlb
migration entries should follow the same pte format as !hugetlb.  However,
I noticed I got it wrong when dropping the temp var: at [1], "entry" still
holds the src entry, but at [2] it already holds the newly created one.
So I think I can't drop the var; a fixup should look like:

===8<===
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 083aae35bff8..cd3a9d8f4b70 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5041,6 +5041,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
                        set_huge_pte_at(dst, addr, dst_pte, entry);
                } else if (unlikely(is_hugetlb_entry_migration(entry))) {
                        swp_entry_t swp_entry = pte_to_swp_entry(entry);
+                       bool uffd_wp = pte_swp_uffd_wp(entry);

                        if (!is_readable_migration_entry(swp_entry) && cow) {
                                /*
@@ -5050,8 +5051,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
                                swp_entry = make_readable_migration_entry(
                                                        swp_offset(swp_entry));
                                entry = swp_entry_to_pte(swp_entry);
-                               if (userfaultfd_wp(src_vma) &&
-                                   pte_swp_uffd_wp(entry))
+                               if (userfaultfd_wp(src_vma) && uffd_wp)
                                        entry = pte_swp_mkuffd_wp(entry);
                                set_huge_pte_at(src, addr, src_pte, entry);
===8<===
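
To illustrate the ordering problem outside the kernel, here is a minimal userspace sketch.  Everything in it is an assumption made for illustration: the model_* names, the two flag bits and their positions are hypothetical and do not match the real swap-pte layout.  The one property it relies on is that rebuilding the entry (as make_readable_migration_entry() plus swp_entry_to_pte() do) yields an entry without the uffd-wp bit, so the bit has to be sampled before "entry" is overwritten.  The userfaultfd_wp(src_vma) check is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; not the kernel's pte format. */
typedef uint64_t model_pte_t;
#define MODEL_SWP_UFFD_WP (1ULL << 0)	/* models the swap-pte uffd-wp bit */
#define MODEL_WRITABLE    (1ULL << 1)	/* models a writable migration entry */

static inline bool model_swp_uffd_wp(model_pte_t e)
{
	return (e & MODEL_SWP_UFFD_WP) != 0;
}

static inline model_pte_t model_swp_mkuffd_wp(model_pte_t e)
{
	return e | MODEL_SWP_UFFD_WP;
}

/* Models rebuilding a readable entry: offset bits survive, flag bits do not. */
static inline model_pte_t model_make_readable(model_pte_t e)
{
	return e & ~(MODEL_WRITABLE | MODEL_SWP_UFFD_WP);
}

/* Order from the posted patch: the bit is tested on the rebuilt entry. */
model_pte_t convert_buggy(model_pte_t entry)
{
	entry = model_make_readable(entry);
	if (model_swp_uffd_wp(entry))		/* never true at this point */
		entry = model_swp_mkuffd_wp(entry);
	return entry;
}

/* Order from the fixup: sample the bit first, then rebuild, then restore. */
model_pte_t convert_fixed(model_pte_t entry)
{
	bool uffd_wp = model_swp_uffd_wp(entry);

	entry = model_make_readable(entry);
	if (uffd_wp)
		entry = model_swp_mkuffd_wp(entry);
	return entry;
}
```

In this model, an input with both MODEL_WRITABLE and MODEL_SWP_UFFD_WP set comes back from convert_buggy() with neither bit, while convert_fixed() returns it with only MODEL_SWP_UFFD_WP set.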

Besides, did I miss something else?

Thanks,

-- 
Peter Xu




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-14 14:09     ` Peter Xu
@ 2023-04-14 14:23       ` Mika Penttilä
  2023-04-14 15:21         ` Peter Xu
  0 siblings, 1 reply; 18+ messages in thread
From: Mika Penttilä @ 2023-04-14 14:23 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	David Hildenbrand, Mike Kravetz, Nadav Amit, Andrea Arcangeli,
	linux-stable



On 14.4.2023 17.09, Peter Xu wrote:
> On Fri, Apr 14, 2023 at 12:45:29PM +0300, Mika Penttilä wrote:
>>>    		} else if (unlikely(is_hugetlb_entry_migration(entry))) {
>>>    			swp_entry_t swp_entry = pte_to_swp_entry(entry);
>>> -			bool uffd_wp = huge_pte_uffd_wp(entry);
> 
> [1]
> 
>>>    			if (!is_readable_migration_entry(swp_entry) && cow) {
>>>    				/*
>>> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>>>    				swp_entry = make_readable_migration_entry(
>>>    							swp_offset(swp_entry));
>>>    				entry = swp_entry_to_pte(swp_entry);
> 
> [2]
> 
>>> -				if (userfaultfd_wp(src_vma) && uffd_wp)
>>> -					entry = huge_pte_mkuffd_wp(entry);
>>> +				if (userfaultfd_wp(src_vma) &&
>>> +				    pte_swp_uffd_wp(entry))
>>> +					entry = pte_swp_mkuffd_wp(entry);
>>
>>
>> This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?
> 
> Could you explain what you mean?
> 

Yes, as you also noticed, you called pte_swp_mkuffd_wp(entry) iff
pte_swp_uffd_wp(entry), which is of course a nop.

But the fixup that keeps the temp var should work.

> I think these helpers are the right ones to use, as afaict hugetlb
> migration entries should follow the same pte format as !hugetlb.  However,
> I noticed I got it wrong when dropping the temp var: at [1], "entry" still
> holds the src entry, but at [2] it already holds the newly created one.
> So I think I can't drop the var; a fixup should look like:
> 
> ===8<===
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 083aae35bff8..cd3a9d8f4b70 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5041,6 +5041,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>                          set_huge_pte_at(dst, addr, dst_pte, entry);
>                  } else if (unlikely(is_hugetlb_entry_migration(entry))) {
>                          swp_entry_t swp_entry = pte_to_swp_entry(entry);
> +                       bool uffd_wp = pte_swp_uffd_wp(entry);
> 
>                          if (!is_readable_migration_entry(swp_entry) && cow) {
>                                  /*
> @@ -5050,8 +5051,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>                                  swp_entry = make_readable_migration_entry(
>                                                          swp_offset(swp_entry));
>                                  entry = swp_entry_to_pte(swp_entry);
> -                               if (userfaultfd_wp(src_vma) &&
> -                                   pte_swp_uffd_wp(entry))
> +                               if (userfaultfd_wp(src_vma) && uffd_wp)
>                                          entry = pte_swp_mkuffd_wp(entry);
>                                  set_huge_pte_at(src, addr, src_pte, entry);
> ===8<===
> 
> Besides, did I miss something else?
> 
> Thanks,
> 

--Mika




* Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
  2023-04-14 13:56     ` Peter Xu
@ 2023-04-14 14:29       ` David Hildenbrand
  0 siblings, 0 replies; 18+ messages in thread
From: David Hildenbrand @ 2023-04-14 14:29 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	Mike Kravetz, Nadav Amit, Andrea Arcangeli

On 14.04.23 15:56, Peter Xu wrote:
> On Fri, Apr 14, 2023 at 11:52:40AM +0200, David Hildenbrand wrote:
>> On 14.04.23 01:11, Peter Xu wrote:
>>> The macro and facility can be reused in other tests too.  Make it general.
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    tools/testing/selftests/mm/Makefile        | 8 ++++----
>>>    tools/testing/selftests/mm/check_config.sh | 4 ++--
>>>    2 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
>>> index 5a3434419403..9ffce175d5e6 100644
>>> --- a/tools/testing/selftests/mm/Makefile
>>> +++ b/tools/testing/selftests/mm/Makefile
>>> @@ -161,8 +161,8 @@ warn_32bit_failure:
>>>    endif
>>>    endif
>>> -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
>>> -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
>>> +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
>>> +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
>>>    $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
>>> @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
>>>    EXTRA_CLEAN += local_config.mk local_config.h
>>> -ifeq ($(COW_EXTRA_LIBS),)
>>> +ifeq ($(IOURING_EXTRA_LIBS),)
>>>    all: warn_missing_liburing
>>>    warn_missing_liburing:
>>>    	@echo ; \
>>> -	echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
>>> +	echo "Warning: missing liburing support. Some tests will be skipped." ; \
>>>    	echo
>>>    endif
>>> diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
>>> index bcba3af0acea..3954f4746161 100644
>>> --- a/tools/testing/selftests/mm/check_config.sh
>>> +++ b/tools/testing/selftests/mm/check_config.sh
>>> @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
>>>    if [ -f $tmpfile_o ]; then
>>>        echo "#define LOCAL_CONFIG_HAVE_LIBURING 1"  > $OUTPUT_H_FILE
>>> -    echo "COW_EXTRA_LIBS = -luring"              > $OUTPUT_MKFILE
>>> +    echo "IOURING_EXTRA_LIBS = -luring"          > $OUTPUT_MKFILE
>>>    else
>>>        echo "// No liburing support found"          > $OUTPUT_H_FILE
>>>        echo "# No liburing support found, so:"      > $OUTPUT_MKFILE
>>> -    echo "COW_EXTRA_LIBS = "                    >> $OUTPUT_MKFILE
>>> +    echo "IOURING_EXTRA_LIBS = "                >> $OUTPUT_MKFILE
>>>    fi
>>>    rm ${tmpname}.*
>>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> Oops, I planned to drop this patch but forgot.  I was planning to use
> io_uring, but later found that it cannot take RO pins, so I switched to
> gup_test as in your cow test.  Hence this patch is not needed anymore.
> 

Yeah, it's unfortunate ... I briefly thought about adding R/O fixed 
buffer support, but it looked like more work than eventual benefit.

> But since it's already there and still looks good to have, let me keep it
> around with your R-b then.

Yes, makes sense to me.

-- 
Thanks,

David / dhildenb




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-14 14:23       ` Mika Penttilä
@ 2023-04-14 15:21         ` Peter Xu
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Xu @ 2023-04-14 15:21 UTC (permalink / raw)
  To: Mika Penttilä
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	David Hildenbrand, Mike Kravetz, Nadav Amit, Andrea Arcangeli,
	linux-stable

On Fri, Apr 14, 2023 at 05:23:12PM +0300, Mika Penttilä wrote:
> But the fixup not dropping the temp var should work.

Ok, I see.  I'll wait a few more days before respinning.  Thanks,

-- 
Peter Xu




* Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()
  2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
  2023-04-14  9:37   ` David Hildenbrand
  2023-04-14  9:45   ` Mika Penttilä
@ 2023-04-14 22:17   ` Mike Kravetz
  2 siblings, 0 replies; 18+ messages in thread
From: Mike Kravetz @ 2023-04-14 22:17 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	David Hildenbrand, Nadav Amit, Andrea Arcangeli, linux-stable

On 04/13/23 19:11, Peter Xu wrote:
> There're a bunch of things that were wrong:
> 
>   - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
>     rather than huge_pte_uffd_wp().

That was/is quite confusing to me at least.

> 
>   - When copying over a pte, we should drop uffd-wp bit when
>     !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
> 
>   - When doing early CoW for private hugetlb (e.g. when the parent page was
>     pinned), uffd-wp bit should be properly carried over if necessary.
> 
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
> 
> Cc: linux-stable <stable@vger.kernel.org>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/hugetlb.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)

No issues, except for losing information in the pte entry as pointed out by Mika.

-- 
Mike Kravetz



* Re: [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
  2023-04-13 23:11 ` [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
  2023-04-14  9:23   ` David Hildenbrand
@ 2023-04-14 22:19   ` Mike Kravetz
  1 sibling, 0 replies; 18+ messages in thread
From: Mike Kravetz @ 2023-04-14 22:19 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-kernel, linux-mm, Axel Rasmussen, Andrew Morton,
	David Hildenbrand, Nadav Amit, Andrea Arcangeli, linux-stable

On 04/13/23 19:11, Peter Xu wrote:
> When we try to unshare a pinned page for a private hugetlb, uffd-wp bit can
> get lost during unsharing.  Fix it by carrying it over.
> 
> This should be very rare, only if an unsharing happened on a private
> hugetlb page with uffd-wp protected (e.g. in a child which shares the same
> page with parent with UFFD_FEATURE_EVENT_FORK enabled).
> 
> Cc: linux-stable <stable@vger.kernel.org>
> Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
> Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/hugetlb.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 7320e64aacc6..083aae35bff8 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5637,13 +5637,16 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
>  	spin_lock(ptl);
>  	ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
>  	if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
> +		pte_t newpte = make_huge_pte(vma, &new_folio->page, !unshare);
> +
>  		/* Break COW or unshare */
>  		huge_ptep_clear_flush(vma, haddr, ptep);
>  		mmu_notifier_invalidate_range(mm, range.start, range.end);
>  		page_remove_rmap(old_page, vma, true);
>  		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
> -		set_huge_pte_at(mm, haddr, ptep,
> -				make_huge_pte(vma, &new_folio->page, !unshare));
> +		if (huge_pte_uffd_wp(pte))
> +			newpte = huge_pte_mkuffd_wp(newpte);
> +		set_huge_pte_at(mm, haddr, ptep, newpte);
>  		folio_set_hugetlb_migratable(new_folio);
>  		/* Make the old page be freed below */
>  		new_folio = page_folio(old_page);
> -- 
> 2.39.1
> 

Thanks!  Looks good,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

-- 
Mike Kravetz



end of thread, other threads:[~2023-04-14 22:19 UTC | newest]

Thread overview: 18+ messages
2023-04-13 23:11 [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins Peter Xu
2023-04-13 23:11 ` [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork() Peter Xu
2023-04-14  9:37   ` David Hildenbrand
2023-04-14  9:45   ` Mika Penttilä
2023-04-14 14:09     ` Peter Xu
2023-04-14 14:23       ` Mika Penttilä
2023-04-14 15:21         ` Peter Xu
2023-04-14 22:17   ` Mike Kravetz
2023-04-13 23:11 ` [PATCH 2/6] mm/hugetlb: Fix uffd-wp bit lost when unsharing happens Peter Xu
2023-04-14  9:23   ` David Hildenbrand
2023-04-14 22:19   ` Mike Kravetz
2023-04-13 23:11 ` [PATCH 3/6] selftests/mm: Add a few options for uffd-unit-test Peter Xu
2023-04-13 23:11 ` [PATCH 4/6] selftests/mm: Extend and rename uffd pagemap test Peter Xu
2023-04-13 23:11 ` [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS Peter Xu
2023-04-14  9:52   ` David Hildenbrand
2023-04-14 13:56     ` Peter Xu
2023-04-14 14:29       ` David Hildenbrand
2023-04-13 23:12 ` [PATCH 6/6] selftests/mm: Add tests for RO pinning vs fork() Peter Xu
