linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] MDWE without inheritance
@ 2023-05-17 15:03 Florent Revest
  2023-05-17 15:03 ` [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

Joey recently introduced a Memory-Deny-Write-Executable (MDWE) prctl which tags
current with a flag that prevents pages that were previously not executable from
becoming executable.
This tag always gets inherited by children tasks. (it's in MMF_INIT_MASK)

At Google, we've been using a somewhat similar downstream patch for a few years
now. To make the adoption of this feature easier, we've had it support a mode in
which the W^X flag does not propagate to children. For example, this is handy if
a C process which wants W^X protection suspects it could start children
processes that would use a JIT.

I'd like to align our features with the upstream prctl. This series proposes a
new NO_INHERIT flag to the MDWE prctl to make this kind of adoption easier. It
sets a different flag in current that is not in MMF_INIT_MASK and which does not
propagate.

As part of looking into MDWE, I also fixed a couple of things in the MDWE test.

This series applies on the mm-everything-2023-05-16-23-30 tag of the mm tree:
  https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/

Diff since v1:
- MMF_HAS_MDWE_NO_INHERIT clears MMF_HAS_MDWE in the fork path as part of a
  MMF_INIT_FLAGS macro (suggested by Catalin)
- PR_MDWE_* are defined as unsigned long rather than int (suggested by Andrey)

Florent Revest (5):
  kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test
  kselftest: vm: Fix mdwe's mmap_FIXED test case
  mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  kselftest: vm: Add tests for no-inherit memory-deny-write-execute

 include/linux/sched/coredump.h         |  10 +++
 include/uapi/linux/prctl.h             |   3 +-
 kernel/fork.c                          |   2 +-
 kernel/sys.c                           |  24 +++++-
 tools/include/uapi/linux/prctl.h       |   3 +-
 tools/testing/selftests/mm/mdwe_test.c | 110 +++++++++++++++++++++----
 6 files changed, 131 insertions(+), 21 deletions(-)

-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test
  2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
@ 2023-05-17 15:03 ` Florent Revest
  2023-05-22  8:52   ` David Hildenbrand
  2023-05-17 15:03 ` [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/testing/selftests/mm/mdwe_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c
index bc91bef5d254..d0954c657feb 100644
--- a/tools/testing/selftests/mm/mdwe_test.c
+++ b/tools/testing/selftests/mm/mdwe_test.c
@@ -49,19 +49,19 @@ FIXTURE_VARIANT(mdwe)
 
 FIXTURE_VARIANT_ADD(mdwe, stock)
 {
-        .enabled = false,
+	.enabled = false,
 	.forked = false,
 };
 
 FIXTURE_VARIANT_ADD(mdwe, enabled)
 {
-        .enabled = true,
+	.enabled = true,
 	.forked = false,
 };
 
 FIXTURE_VARIANT_ADD(mdwe, forked)
 {
-        .enabled = true,
+	.enabled = true,
 	.forked = true,
 };
 
-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case
  2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
  2023-05-17 15:03 ` [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest
@ 2023-05-17 15:03 ` Florent Revest
  2023-05-22  8:53   ` David Hildenbrand
  2023-05-17 15:03 ` [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Florent Revest
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

I checked with the original author, the mmap_FIXED test case wasn't
properly tested and fails. Currently, it maps two consecutive (non
overlapping) pages and expects the second mapping to be denied by MDWE
but these two pages have nothing to do with each other so MDWE is
actually out of the picture here.

What the test actually intended to do was to remap a virtual address
using MAP_FIXED. However, this operation unmaps the existing mapping and
creates a new one so the va is backed by a new page and MDWE is again
out of the picture, all remappings should succeed.

This patch keeps the test case to make it clear that this situation is
expected to work.

Signed-off-by: Florent Revest <revest@chromium.org>
Fixes: 4cf1fe34fd18 ("kselftest: vm: add tests for memory-deny-write-execute")
---
 tools/testing/selftests/mm/mdwe_test.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c
index d0954c657feb..91aa9c3099e7 100644
--- a/tools/testing/selftests/mm/mdwe_test.c
+++ b/tools/testing/selftests/mm/mdwe_test.c
@@ -168,13 +168,10 @@ TEST_F(mdwe, mmap_FIXED)
 	self->p = mmap(NULL, self->size, PROT_READ, self->flags, 0, 0);
 	ASSERT_NE(self->p, MAP_FAILED);
 
-	p = mmap(self->p + self->size, self->size, PROT_READ | PROT_EXEC,
+	/* MAP_FIXED unmaps the existing page before mapping which is allowed */
+	p = mmap(self->p, self->size, PROT_READ | PROT_EXEC,
 		 self->flags | MAP_FIXED, 0, 0);
-	if (variant->enabled) {
-		EXPECT_EQ(p, MAP_FAILED);
-	} else {
-		EXPECT_EQ(p, self->p);
-	}
+	EXPECT_EQ(p, self->p);
 }
 
 TEST_F(mdwe, arm64_BTI)
-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
  2023-05-17 15:03 ` [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest
  2023-05-17 15:03 ` [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest
@ 2023-05-17 15:03 ` Florent Revest
  2023-05-22  8:55   ` David Hildenbrand
  2023-05-23 14:11   ` Catalin Marinas
  2023-05-17 15:03 ` [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest
  2023-05-17 15:03 ` [PATCH v2 5/5] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest
  4 siblings, 2 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

Alexey pointed out that defining a prctl flag as an int is a footgun
because, under some circumstances, when used as a flag to prctl, it can
be casted to long with garbage upper bits which would result in
unexpected behaviors.

This patch changes the constant to a UL to eliminate these
possibilities.

Signed-off-by: Florent Revest <revest@chromium.org>
Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
---
 include/uapi/linux/prctl.h       | 2 +-
 tools/include/uapi/linux/prctl.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index f23d9a16507f..6e9af6cbc950 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -283,7 +283,7 @@ struct prctl_mm_map {
 
 /* Memory deny write / execute */
 #define PR_SET_MDWE			65
-# define PR_MDWE_REFUSE_EXEC_GAIN	1
+# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
 
 #define PR_GET_MDWE			66
 
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index 759b3f53e53f..6e6563e97fef 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -283,7 +283,7 @@ struct prctl_mm_map {
 
 /* Memory deny write / execute */
 #define PR_SET_MDWE			65
-# define PR_MDWE_REFUSE_EXEC_GAIN	1
+# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
 
 #define PR_GET_MDWE			66
 
-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
                   ` (2 preceding siblings ...)
  2023-05-17 15:03 ` [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Florent Revest
@ 2023-05-17 15:03 ` Florent Revest
  2023-05-22  9:01   ` David Hildenbrand
  2023-05-23 16:36   ` Catalin Marinas
  2023-05-17 15:03 ` [PATCH v2 5/5] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest
  4 siblings, 2 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

This extends the current PR_SET_MDWE prctl arg with a bit to indicate
that the process doesn't want MDWE protection to propagate to children.

To implement this no-inherit mode, the tag in current->mm->flags must be
absent from MMF_INIT_MASK. This means that the encoding for "MDWE but
without inherit" is different in the prctl than in the mm flags. This
leads to a bit of bit-mangling in the prctl implementation.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/sched/coredump.h   | 10 ++++++++++
 include/uapi/linux/prctl.h       |  1 +
 kernel/fork.c                    |  2 +-
 kernel/sys.c                     | 24 +++++++++++++++++++++---
 tools/include/uapi/linux/prctl.h |  1 +
 5 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 0ee96ea7a0e9..11f5e3dacb4e 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -91,4 +91,14 @@ static inline int get_dumpable(struct mm_struct *mm)
 				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK)
 
 #define MMF_VM_MERGE_ANY	29
+#define MMF_HAS_MDWE_NO_INHERIT	30
+
+#define MMF_INIT_FLAGS(flags)	({					\
+	unsigned long new_flags = flags;				\
+	if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT))		\
+		new_flags &= ~((1UL << MMF_HAS_MDWE) |			\
+				(1UL << MMF_HAS_MDWE_NO_INHERIT));	\
+	new_flags & MMF_INIT_MASK;					\
+})
+
 #endif /* _LINUX_SCHED_COREDUMP_H */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 6e9af6cbc950..dacbe824e7c3 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -284,6 +284,7 @@ struct prctl_mm_map {
 /* Memory deny write / execute */
 #define PR_SET_MDWE			65
 # define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
+# define PR_MDWE_NO_INHERIT		(1UL << 1)
 
 #define PR_GET_MDWE			66
 
diff --git a/kernel/fork.c b/kernel/fork.c
index d17995934eb4..62d52ad99937 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1284,7 +1284,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	hugetlb_count_init(mm);
 
 	if (current->mm) {
-		mm->flags = current->mm->flags & MMF_INIT_MASK;
+		mm->flags = MMF_INIT_FLAGS(current->mm->flags);
 		mm->def_flags = current->mm->def_flags & VM_INIT_DEF_MASK;
 	} else {
 		mm->flags = default_dump_filter;
diff --git a/kernel/sys.c b/kernel/sys.c
index 339fee3eff6a..320eae3b12ab 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2368,9 +2368,25 @@ static inline int prctl_set_mdwe(unsigned long bits, unsigned long arg3,
 	if (arg3 || arg4 || arg5)
 		return -EINVAL;
 
-	if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN))
+	if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT))
 		return -EINVAL;
 
+	/* NO_INHERIT only makes sense with REFUSE_EXEC_GAIN */
+	if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
+		return -EINVAL;
+
+	/* Can't gain NO_INHERIT from !NO_INHERIT */
+	if (bits & PR_MDWE_NO_INHERIT &&
+	    test_bit(MMF_HAS_MDWE, &current->mm->flags) &&
+	    !test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags))
+		return -EPERM;
+
+	if (bits & PR_MDWE_NO_INHERIT)
+		set_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags);
+	else if (test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags)
+		 && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
+		return -EPERM; /* Cannot unset the flag */
+
 	if (bits & PR_MDWE_REFUSE_EXEC_GAIN)
 		set_bit(MMF_HAS_MDWE, &current->mm->flags);
 	else if (test_bit(MMF_HAS_MDWE, &current->mm->flags))
@@ -2385,8 +2401,10 @@ static inline int prctl_get_mdwe(unsigned long arg2, unsigned long arg3,
 	if (arg2 || arg3 || arg4 || arg5)
 		return -EINVAL;
 
-	return test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
-		PR_MDWE_REFUSE_EXEC_GAIN : 0;
+	return (test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
+		PR_MDWE_REFUSE_EXEC_GAIN : 0) |
+		(test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags) ?
+		PR_MDWE_NO_INHERIT : 0);
 }
 
 static int prctl_get_auxv(void __user *addr, unsigned long len)
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index 6e6563e97fef..f7448d99520c 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -284,6 +284,7 @@ struct prctl_mm_map {
 /* Memory deny write / execute */
 #define PR_SET_MDWE			65
 # define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
+# define PR_MDWE_NO_INHERIT		(1UL << 1)
 
 #define PR_GET_MDWE			66
 
-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 5/5] kselftest: vm: Add tests for no-inherit memory-deny-write-execute
  2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
                   ` (3 preceding siblings ...)
  2023-05-17 15:03 ` [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest
@ 2023-05-17 15:03 ` Florent Revest
  4 siblings, 0 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-17 15:03 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, david, peterx, izbyshev, broonie, szabolcs.nagy,
	kpsingh, gthelen, toiwoton, Florent Revest

Add some tests to cover the new PR_MDWE_NO_INHERIT flag of the
PR_SET_MDWE prctl.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/testing/selftests/mm/mdwe_test.c | 95 ++++++++++++++++++++++++--
 1 file changed, 89 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c
index 91aa9c3099e7..9f08ed1b99ae 100644
--- a/tools/testing/selftests/mm/mdwe_test.c
+++ b/tools/testing/selftests/mm/mdwe_test.c
@@ -22,6 +22,8 @@
 
 TEST(prctl_flags)
 {
+	EXPECT_LT(prctl(PR_SET_MDWE, PR_MDWE_NO_INHERIT, 0L, 0L, 7L), 0);
+
 	EXPECT_LT(prctl(PR_SET_MDWE, 7L, 0L, 0L, 0L), 0);
 	EXPECT_LT(prctl(PR_SET_MDWE, 0L, 7L, 0L, 0L), 0);
 	EXPECT_LT(prctl(PR_SET_MDWE, 0L, 0L, 7L, 0L), 0);
@@ -33,6 +35,66 @@ TEST(prctl_flags)
 	EXPECT_LT(prctl(PR_GET_MDWE, 0L, 0L, 0L, 7L), 0);
 }
 
+FIXTURE(consecutive_prctl_flags) {};
+FIXTURE_SETUP(consecutive_prctl_flags) {}
+FIXTURE_TEARDOWN(consecutive_prctl_flags) {}
+
+FIXTURE_VARIANT(consecutive_prctl_flags)
+{
+	unsigned long first_flags;
+	unsigned long second_flags;
+	bool should_work;
+};
+
+FIXTURE_VARIANT_ADD(consecutive_prctl_flags, same)
+{
+	.first_flags = PR_MDWE_REFUSE_EXEC_GAIN,
+	.second_flags = PR_MDWE_REFUSE_EXEC_GAIN,
+	.should_work = true,
+};
+
+FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_disable_mdwe)
+{
+	.first_flags = PR_MDWE_REFUSE_EXEC_GAIN,
+	.second_flags = 0,
+	.should_work = false,
+};
+
+FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_disable_mdwe_no_inherit)
+{
+	.first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT,
+	.second_flags = 0,
+	.should_work = false,
+};
+
+FIXTURE_VARIANT_ADD(consecutive_prctl_flags, can_lower_privileges)
+{
+	.first_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT,
+	.second_flags = PR_MDWE_REFUSE_EXEC_GAIN,
+	.should_work = true,
+};
+
+FIXTURE_VARIANT_ADD(consecutive_prctl_flags, cant_gain_privileges)
+{
+	.first_flags = PR_MDWE_REFUSE_EXEC_GAIN,
+	.second_flags = PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT,
+	.should_work = false,
+};
+
+TEST_F(consecutive_prctl_flags, two_prctls)
+{
+	int ret;
+
+	EXPECT_EQ(prctl(PR_SET_MDWE, variant->first_flags, 0L, 0L, 0L), 0);
+
+	ret = prctl(PR_SET_MDWE, variant->second_flags, 0L, 0L, 0L);
+	if (variant->should_work) {
+		EXPECT_EQ(ret, 0);
+	} else {
+		EXPECT_NE(ret, 0);
+	}
+}
+
 FIXTURE(mdwe)
 {
 	void *p;
@@ -45,28 +107,45 @@ FIXTURE_VARIANT(mdwe)
 {
 	bool enabled;
 	bool forked;
+	bool inherit;
 };
 
 FIXTURE_VARIANT_ADD(mdwe, stock)
 {
 	.enabled = false,
 	.forked = false,
+	.inherit = false,
 };
 
 FIXTURE_VARIANT_ADD(mdwe, enabled)
 {
 	.enabled = true,
 	.forked = false,
+	.inherit = true,
 };
 
-FIXTURE_VARIANT_ADD(mdwe, forked)
+FIXTURE_VARIANT_ADD(mdwe, inherited)
 {
 	.enabled = true,
 	.forked = true,
+	.inherit = true,
 };
 
+FIXTURE_VARIANT_ADD(mdwe, not_inherited)
+{
+	.enabled = true,
+	.forked = true,
+	.inherit = false,
+};
+
+static bool executable_map_should_fail(const FIXTURE_VARIANT(mdwe) *variant)
+{
+	return variant->enabled && (!variant->forked || variant->inherit);
+}
+
 FIXTURE_SETUP(mdwe)
 {
+	unsigned long mdwe_flags;
 	int ret, status;
 
 	self->p = NULL;
@@ -76,13 +155,17 @@ FIXTURE_SETUP(mdwe)
 	if (!variant->enabled)
 		return;
 
-	ret = prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN, 0L, 0L, 0L);
+	mdwe_flags = PR_MDWE_REFUSE_EXEC_GAIN;
+	if (!variant->inherit)
+		mdwe_flags |= PR_MDWE_NO_INHERIT;
+
+	ret = prctl(PR_SET_MDWE, mdwe_flags, 0L, 0L, 0L);
 	ASSERT_EQ(ret, 0) {
 		TH_LOG("PR_SET_MDWE failed or unsupported");
 	}
 
 	ret = prctl(PR_GET_MDWE, 0L, 0L, 0L, 0L);
-	ASSERT_EQ(ret, 1);
+	ASSERT_EQ(ret, mdwe_flags);
 
 	if (variant->forked) {
 		self->pid = fork();
@@ -113,7 +196,7 @@ TEST_F(mdwe, mmap_READ_EXEC)
 TEST_F(mdwe, mmap_WRITE_EXEC)
 {
 	self->p = mmap(NULL, self->size, PROT_WRITE | PROT_EXEC, self->flags, 0, 0);
-	if (variant->enabled) {
+	if (executable_map_should_fail(variant)) {
 		EXPECT_EQ(self->p, MAP_FAILED);
 	} else {
 		EXPECT_NE(self->p, MAP_FAILED);
@@ -139,7 +222,7 @@ TEST_F(mdwe, mprotect_add_EXEC)
 	ASSERT_NE(self->p, MAP_FAILED);
 
 	ret = mprotect(self->p, self->size, PROT_READ | PROT_EXEC);
-	if (variant->enabled) {
+	if (executable_map_should_fail(variant)) {
 		EXPECT_LT(ret, 0);
 	} else {
 		EXPECT_EQ(ret, 0);
@@ -154,7 +237,7 @@ TEST_F(mdwe, mprotect_WRITE_EXEC)
 	ASSERT_NE(self->p, MAP_FAILED);
 
 	ret = mprotect(self->p, self->size, PROT_WRITE | PROT_EXEC);
-	if (variant->enabled) {
+	if (executable_map_should_fail(variant)) {
 		EXPECT_LT(ret, 0);
 	} else {
 		EXPECT_EQ(ret, 0);
-- 
2.40.1.606.ga4b1b128d6-goog



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test
  2023-05-17 15:03 ` [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest
@ 2023-05-22  8:52   ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2023-05-22  8:52 UTC (permalink / raw)
  To: Florent Revest, linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, peterx, izbyshev, broonie, szabolcs.nagy, kpsingh,
	gthelen, toiwoton

On 17.05.23 17:03, Florent Revest wrote:
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>   tools/testing/selftests/mm/mdwe_test.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c
> index bc91bef5d254..d0954c657feb 100644
> --- a/tools/testing/selftests/mm/mdwe_test.c
> +++ b/tools/testing/selftests/mm/mdwe_test.c
> @@ -49,19 +49,19 @@ FIXTURE_VARIANT(mdwe)
>   
>   FIXTURE_VARIANT_ADD(mdwe, stock)
>   {
> -        .enabled = false,
> +	.enabled = false,
>   	.forked = false,
>   };
>   
>   FIXTURE_VARIANT_ADD(mdwe, enabled)
>   {
> -        .enabled = true,
> +	.enabled = true,
>   	.forked = false,
>   };
>   
>   FIXTURE_VARIANT_ADD(mdwe, forked)
>   {
> -        .enabled = true,
> +	.enabled = true,
>   	.forked = true,
>   };
>   

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case
  2023-05-17 15:03 ` [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest
@ 2023-05-22  8:53   ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2023-05-22  8:53 UTC (permalink / raw)
  To: Florent Revest, linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, peterx, izbyshev, broonie, szabolcs.nagy, kpsingh,
	gthelen, toiwoton

On 17.05.23 17:03, Florent Revest wrote:
> I checked with the original author, the mmap_FIXED test case wasn't
> properly tested and fails. Currently, it maps two consecutive (non
> overlapping) pages and expects the second mapping to be denied by MDWE
> but these two pages have nothing to do with each other so MDWE is
> actually out of the picture here.
> 
> What the test actually intended to do was to remap a virtual address
> using MAP_FIXED. However, this operation unmaps the existing mapping and
> creates a new one so the va is backed by a new page and MDWE is again
> out of the picture, all remappings should succeed.
> 
> This patch keeps the test case to make it clear that this situation is
> expected to work.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> Fixes: 4cf1fe34fd18 ("kselftest: vm: add tests for memory-deny-write-execute")
> ---
>   tools/testing/selftests/mm/mdwe_test.c | 9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/mm/mdwe_test.c b/tools/testing/selftests/mm/mdwe_test.c
> index d0954c657feb..91aa9c3099e7 100644
> --- a/tools/testing/selftests/mm/mdwe_test.c
> +++ b/tools/testing/selftests/mm/mdwe_test.c
> @@ -168,13 +168,10 @@ TEST_F(mdwe, mmap_FIXED)
>   	self->p = mmap(NULL, self->size, PROT_READ, self->flags, 0, 0);
>   	ASSERT_NE(self->p, MAP_FAILED);
>   
> -	p = mmap(self->p + self->size, self->size, PROT_READ | PROT_EXEC,
> +	/* MAP_FIXED unmaps the existing page before mapping which is allowed */
> +	p = mmap(self->p, self->size, PROT_READ | PROT_EXEC,
>   		 self->flags | MAP_FIXED, 0, 0);
> -	if (variant->enabled) {
> -		EXPECT_EQ(p, MAP_FAILED);
> -	} else {
> -		EXPECT_EQ(p, self->p);
> -	}
> +	EXPECT_EQ(p, self->p);
>   }
>   
>   TEST_F(mdwe, arm64_BTI)

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-17 15:03 ` [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Florent Revest
@ 2023-05-22  8:55   ` David Hildenbrand
       [not found]     ` <884d131bbc28ebfa0b729176e6415269@ispras.ru>
  2023-05-23 14:11   ` Catalin Marinas
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2023-05-22  8:55 UTC (permalink / raw)
  To: Florent Revest, linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, peterx, izbyshev, broonie, szabolcs.nagy, kpsingh,
	gthelen, toiwoton

On 17.05.23 17:03, Florent Revest wrote:
> Alexey pointed out that defining a prctl flag as an int is a footgun
> because, under some circumstances, when used as a flag to prctl, it can
> be casted to long with garbage upper bits which would result in
> unexpected behaviors.
> 
> This patch changes the constant to a UL to eliminate these
> possibilities.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
> ---
>   include/uapi/linux/prctl.h       | 2 +-
>   tools/include/uapi/linux/prctl.h | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index f23d9a16507f..6e9af6cbc950 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>   
>   /* Memory deny write / execute */
>   #define PR_SET_MDWE			65
> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>   
>   #define PR_GET_MDWE			66
>   
> diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
> index 759b3f53e53f..6e6563e97fef 100644
> --- a/tools/include/uapi/linux/prctl.h
> +++ b/tools/include/uapi/linux/prctl.h
> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>   
>   /* Memory deny write / execute */
>   #define PR_SET_MDWE			65
> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>   
>   #define PR_GET_MDWE			66
>   

Both are changing existing uapi, so you'll already have existing user 
space using the old values, that your kernel code has to deal with no?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  2023-05-17 15:03 ` [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest
@ 2023-05-22  9:01   ` David Hildenbrand
  2023-05-22 16:11     ` Florent Revest
  2023-05-23 16:36   ` Catalin Marinas
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2023-05-22  9:01 UTC (permalink / raw)
  To: Florent Revest, linux-kernel, linux-mm
  Cc: akpm, catalin.marinas, anshuman.khandual, joey.gouly, mhocko,
	keescook, peterx, izbyshev, broonie, szabolcs.nagy, kpsingh,
	gthelen, toiwoton

On 17.05.23 17:03, Florent Revest wrote:
> This extends the current PR_SET_MDWE prctl arg with a bit to indicate
> that the process doesn't want MDWE protection to propagate to children.
> 
> To implement this no-inherit mode, the tag in current->mm->flags must be
> absent from MMF_INIT_MASK. This means that the encoding for "MDWE but
> without inherit" is different in the prctl than in the mm flags. This
> leads to a bit of bit-mangling in the prctl implementation.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>   include/linux/sched/coredump.h   | 10 ++++++++++
>   include/uapi/linux/prctl.h       |  1 +
>   kernel/fork.c                    |  2 +-
>   kernel/sys.c                     | 24 +++++++++++++++++++++---
>   tools/include/uapi/linux/prctl.h |  1 +
>   5 files changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index 0ee96ea7a0e9..11f5e3dacb4e 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -91,4 +91,14 @@ static inline int get_dumpable(struct mm_struct *mm)
>   				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK)
>   
>   #define MMF_VM_MERGE_ANY	29
> +#define MMF_HAS_MDWE_NO_INHERIT	30
> +
> +#define MMF_INIT_FLAGS(flags)	({					\
> +	unsigned long new_flags = flags;				\
> +	if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT))		\
> +		new_flags &= ~((1UL << MMF_HAS_MDWE) |			\
> +				(1UL << MMF_HAS_MDWE_NO_INHERIT));	\
> +	new_flags & MMF_INIT_MASK;					\
> +})

Why the desire for macros here? :)

We have a single user of MMF_INIT_FLAGS, why not inline or use a proper 
inline function?


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  2023-05-22  9:01   ` David Hildenbrand
@ 2023-05-22 16:11     ` Florent Revest
  0 siblings, 0 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-22 16:11 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, akpm, catalin.marinas, anshuman.khandual,
	joey.gouly, mhocko, keescook, peterx, izbyshev, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Mon, May 22, 2023 at 11:01 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 17.05.23 17:03, Florent Revest wrote:
> > +#define MMF_INIT_FLAGS(flags)        ({                                      \
> > +     unsigned long new_flags = flags;                                \
> > +     if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT))               \
> > +             new_flags &= ~((1UL << MMF_HAS_MDWE) |                  \
> > +                             (1UL << MMF_HAS_MDWE_NO_INHERIT));      \
> > +     new_flags & MMF_INIT_MASK;                                      \
> > +})
>
> Why the desire for macros here? :)

I just thought that's what the cool kids do nowadays ?! :)

Eh, I'm joking, I completely agree. Somehow this was suggested to me
in v1 as a macro and I didn't think of questioning that, but a static
inline function should be more readable indeed! I will fix this in v3.

> We have a single user of MMF_INIT_FLAGS, why not inline or use a proper
> inline function?

I have a slight preference for a separate function, so we don't spill
over too much of this logic in fork.c.

Thanks for the review David!


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
       [not found]     ` <884d131bbc28ebfa0b729176e6415269@ispras.ru>
@ 2023-05-22 16:22       ` David Hildenbrand
       [not found]         ` <3c2e210b75bd56909322e8a3e5086d91@ispras.ru>
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2023-05-22 16:22 UTC (permalink / raw)
  To: Alexey Izbyshev
  Cc: Florent Revest, linux-kernel, linux-mm, akpm, catalin.marinas,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On 22.05.23 12:35, Alexey Izbyshev wrote:
> On 2023-05-22 11:55, David Hildenbrand wrote:
>> On 17.05.23 17:03, Florent Revest wrote:
>>> Alexey pointed out that defining a prctl flag as an int is a footgun
>>> because, under some circumstances, when used as a flag to prctl, it
>>> can
>>> be casted to long with garbage upper bits which would result in
>>> unexpected behaviors.
>>>
>>> This patch changes the constant to a UL to eliminate these
>>> possibilities.
>>>
>>> Signed-off-by: Florent Revest <revest@chromium.org>
>>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
>>> ---
>>>    include/uapi/linux/prctl.h       | 2 +-
>>>    tools/include/uapi/linux/prctl.h | 2 +-
>>>    2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
>>> index f23d9a16507f..6e9af6cbc950 100644
>>> --- a/include/uapi/linux/prctl.h
>>> +++ b/include/uapi/linux/prctl.h
>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>      /* Memory deny write / execute */
>>>    #define PR_SET_MDWE			65
>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>      #define PR_GET_MDWE			66
>>>    diff --git a/tools/include/uapi/linux/prctl.h
>>> b/tools/include/uapi/linux/prctl.h
>>> index 759b3f53e53f..6e6563e97fef 100644
>>> --- a/tools/include/uapi/linux/prctl.h
>>> +++ b/tools/include/uapi/linux/prctl.h
>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>      /* Memory deny write / execute */
>>>    #define PR_SET_MDWE			65
>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>      #define PR_GET_MDWE			66
>>>
>>
>> Both are changing existing uapi, so you'll already have existing user
>> space using the old values, that your kernel code has to deal with no?
> 
> I'm the one who suggested this change, so I feel the need to clarify.
> 
> For any existing 64-bit user space code using the kernel and the uapi
> headers before this patch and doing the wrong prctl(PR_SET_MDWE,
> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct prctl(PR_SET_MDWE,
> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities
> when prctl() implementation extracts the second argument via va_arg(op,
> unsigned long):
> 
> * It gets lucky, and the upper 32 bits of the argument are zero. The
> call does what is expected by the user.
> 
> * The upper 32 bits are non-zero junk. The flags argument is rejected by
> the kernel, and the call fails with EINVAL (unexpectedly for the user).
> 
> This change is intended to affect only the second case, and only after
> the program is recompiled with the new uapi headers. The currently
> wrong, but naturally-looking prctl(PR_SET_MDWE,
> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct.
> 
> The kernel ABI is unaffected by this change, since it has been defined
> in terms of unsigned long from the start.

The thing I'm concerned about is the following: old user space (that 
would fail) on new kernel where we define some upper 32bit to actually 
have a meaning (where it would succeed with wrong semantics).

IOW, can we ever really "use" these upper 32bit, or should we instead 
only consume the lower 32bit in the kernel and effectively ignore the 
upper 32bit?

I guess the feature is not that old, so having many existing user space 
applications is unlikely.

Which raises the question if we want to tag this here with a "Fixes" and 
eventually cc stable (hmm ...)?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
       [not found]         ` <3c2e210b75bd56909322e8a3e5086d91@ispras.ru>
@ 2023-05-23  9:12           ` David Hildenbrand
  2023-05-23 13:07             ` Catalin Marinas
       [not found]             ` <7c572622c0d8e283fc880fe3f4ffac27@ispras.ru>
  2023-05-26 19:02           ` Florent Revest
  1 sibling, 2 replies; 22+ messages in thread
From: David Hildenbrand @ 2023-05-23  9:12 UTC (permalink / raw)
  To: Alexey Izbyshev
  Cc: Florent Revest, linux-kernel, linux-mm, akpm, catalin.marinas,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On 22.05.23 20:58, Alexey Izbyshev wrote:
> On 2023-05-22 19:22, David Hildenbrand wrote:
>> On 22.05.23 12:35, Alexey Izbyshev wrote:
>>> On 2023-05-22 11:55, David Hildenbrand wrote:
>>>> On 17.05.23 17:03, Florent Revest wrote:
>>>>> Alexey pointed out that defining a prctl flag as an int is a footgun
>>>>> because, under some circumstances, when used as a flag to prctl, it
>>>>> can
>>>>> be casted to long with garbage upper bits which would result in
>>>>> unexpected behaviors.
>>>>>
>>>>> This patch changes the constant to a UL to eliminate these
>>>>> possibilities.
>>>>>
>>>>> Signed-off-by: Florent Revest <revest@chromium.org>
>>>>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
>>>>> ---
>>>>>     include/uapi/linux/prctl.h       | 2 +-
>>>>>     tools/include/uapi/linux/prctl.h | 2 +-
>>>>>     2 files changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
>>>>> index f23d9a16507f..6e9af6cbc950 100644
>>>>> --- a/include/uapi/linux/prctl.h
>>>>> +++ b/include/uapi/linux/prctl.h
>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>>>       /* Memory deny write / execute */
>>>>>     #define PR_SET_MDWE			65
>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>>>       #define PR_GET_MDWE			66
>>>>>     diff --git a/tools/include/uapi/linux/prctl.h
>>>>> b/tools/include/uapi/linux/prctl.h
>>>>> index 759b3f53e53f..6e6563e97fef 100644
>>>>> --- a/tools/include/uapi/linux/prctl.h
>>>>> +++ b/tools/include/uapi/linux/prctl.h
>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map {
>>>>>       /* Memory deny write / execute */
>>>>>     #define PR_SET_MDWE			65
>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN	1
>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN	(1UL << 0)
>>>>>       #define PR_GET_MDWE			66
>>>>>
>>>>
>>>> Both are changing existing uapi, so you'll already have existing user
>>>> space using the old values, that your kernel code has to deal with
>>>> no?
>>>
>>> I'm the one who suggested this change, so I feel the need to clarify.
>>>
>>> For any existing 64-bit user space code using the kernel and the uapi
>>> headers before this patch and doing the wrong prctl(PR_SET_MDWE,
>>> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct
>>> prctl(PR_SET_MDWE,
>>> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities
>>> when prctl() implementation extracts the second argument via
>>> va_arg(op,
>>> unsigned long):
>>>
>>> * It gets lucky, and the upper 32 bits of the argument are zero. The
>>> call does what is expected by the user.
>>>
>>> * The upper 32 bits are non-zero junk. The flags argument is rejected
>>> by
>>> the kernel, and the call fails with EINVAL (unexpectedly for the
>>> user).
>>>
>>> This change is intended to affect only the second case, and only after
>>> the program is recompiled with the new uapi headers. The currently
>>> wrong, but naturally-looking prctl(PR_SET_MDWE,
>>> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct.
>>>
>>> The kernel ABI is unaffected by this change, since it has been defined
>>> in terms of unsigned long from the start.
>>
>> The thing I'm concerned about is the following: old user space (that
>> would fail) on new kernel where we define some upper 32bit to actually
>> have a meaning (where it would succeed with wrong semantics).
>>
>> IOW, can we ever really "use" these upper 32bit, or should we instead
>> only consume the lower 32bit in the kernel and effectively ignore the
>> upper 32bit?
>>
> I see, thanks. But I think this question is mostly independent from this
> patch. The patch removes a footgun, but it doesn't change the flags
> check in the kernel, and nothing stops the user from doing
> 
> int flags = PR_MDWE_REFUSE_EXEC_GAIN;
> prctl(PR_SET_MDWE, flags);
> 
> So we have to decide whether to ignore the upper 32 bits or not even if
> this patch is not applied (actually *had to* when PR_SET_MDWE op was
> being added).

Well, an alternative to this patch would be to say "well, for this prctl 
we ignore any upper 32bit. Then, this change would not be needed. Yes, I 
also don't like that :)

Bu unrelated, I looked at some other random prctl.

#define SUID_DUMP_USER           1

And in kernel/sys.c:

	case PR_SET_DUMPABLE:
		if (arg2 != SUID_DUMP_DISABLE && arg2 != SUID_DUMP_USER)
	...

Wouldn't that also suffer from the same issue, or how is this different?

Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We need 
arg2 -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ?

prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)


I'm easily confused by such (va_args) things, so sorry for the dummy 
questions.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-23  9:12           ` David Hildenbrand
@ 2023-05-23 13:07             ` Catalin Marinas
       [not found]               ` <f47d587fe5a6285f88191fbb13f367c7@ispras.ru>
       [not found]             ` <7c572622c0d8e283fc880fe3f4ffac27@ispras.ru>
  1 sibling, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2023-05-23 13:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Alexey Izbyshev, Florent Revest, linux-kernel, linux-mm, akpm,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote:
> Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We need arg2
> -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ?
> 
> prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)
> 
> I'm easily confused by such (va_args) things, so sorry for the dummy
> questions.

Isn't the prctl() prototype in the user headers defined with the first
argument as int while the rest as unsigned long? At least from the man
page:

int prctl(int option, unsigned long arg2, unsigned long arg3,
	  unsigned long arg4, unsigned long arg5);

So there are no va_args tricks (which confuse me as well).

Any int passed to arg[2-5] would be converted by the compiler to an
unsigned long before being passed to the kernel. So I think the change
in this patch is harmless as the conversion is happening anyway.

(well, unless I completely missed what the problem is)

-- 
Catalin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
       [not found]               ` <f47d587fe5a6285f88191fbb13f367c7@ispras.ru>
@ 2023-05-23 14:09                 ` Catalin Marinas
  2023-05-23 15:01                   ` Szabolcs Nagy
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2023-05-23 14:09 UTC (permalink / raw)
  To: Alexey Izbyshev
  Cc: David Hildenbrand, Florent Revest, linux-kernel, linux-mm, akpm,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Tue, May 23, 2023 at 04:25:45PM +0300, Alexey Izbyshev wrote:
> On 2023-05-23 16:07, Catalin Marinas wrote:
> > On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote:
> > > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We
> > > need arg2
> > > -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ?
> > > 
> > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)
> > > 
> > > I'm easily confused by such (va_args) things, so sorry for the dummy
> > > questions.
> > 
> > Isn't the prctl() prototype in the user headers defined with the first
> > argument as int while the rest as unsigned long? At least from the man
> > page:
> > 
> > int prctl(int option, unsigned long arg2, unsigned long arg3,
> > 	  unsigned long arg4, unsigned long arg5);
> > 
> > So there are no va_args tricks (which confuse me as well).
> > 
> I have explicitly mentioned the problem with man pages in my response to
> David[1]. Quoting myself:
> 
> > This stuff *is* confusing, and note that Linux man pages don't even tell
> that prctl() is actually declared as a variadic function (and for
> ptrace() this is mentioned only in the notes, but not in its signature).

Ah, thanks for the clarification (I somehow missed your reply).

> The reality:
> 
> * glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sys/prctl.h;h=821aeefc1339b35210e8918ecfe9833ed2792626;hb=glibc-2.37#l42
> 
> * musl:
> https://git.musl-libc.org/cgit/musl/tree/include/sys/prctl.h?h=v1.2.4#n180
> 
> Though there is a test in the kernel that does define its own prototype,
> avoiding the issue: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/sched/cs_prctl_test.c?h=v6.3#n77

At least for glibc, it seems that there is a conversion to unsigned
long:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28

unsigned long int arg2 = va_arg (arg, unsigned long int);

(does va_arg expand to an actual cast?)

If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I think
it's a user-space bug and not a kernel ABI issue.

-- 
Catalin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
       [not found]             ` <7c572622c0d8e283fc880fe3f4ffac27@ispras.ru>
@ 2023-05-23 14:10               ` David Hildenbrand
  2023-05-26 19:04                 ` Florent Revest
  0 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2023-05-23 14:10 UTC (permalink / raw)
  To: Alexey Izbyshev
  Cc: Florent Revest, linux-kernel, linux-mm, akpm, catalin.marinas,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

>> Wouldn't that also suffer from the same issue, or how is this
>> different?
>>
> Yes, it is the same issue, so e.g. prctl(PR_SET_DUMPABLE,
> SUID_DUMP_DISABLE ) may wrongly fail with EINVAL on 64-bit targets.
> 
>> Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We
>> need arg2 -> arg5 to be 0. But wouldn't the following also just pass a
>> 0 "int" ?
>>
>> prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0)
>>
> Yes, this is not reliable on 64-bit targets too. The simplest fix is to
> use "0L", as done in MDWE self-tests (but many other tests get this
> wrong).

Oh, it's even worse than I thought, then. :)

Even in our selftest most of
	$ git grep prctl tools/testing/selftests/ | grep "0"

gets it wrong.

> 
> Florent also expressed surprise[1] that we don't see a lot of failures
> due to such issues, and I tried to provide some reasons. To elaborate on

Yes, I'm also surprised!

> the x86-64 thing, for prctl(PR_SET_DUMPABLE, 0) the compiler will likely
> generate "xorl %esi, %esi" to pass zero, but this instruction will also
> clear the upper 32 bits of %rsi, so the problem is masked (and I believe
> CPU vendors are motivated to do such zeroing to reduce false
> dependencies). But this zeroing is not required by the ABI, so in a more
> complex situation junk might get through.

:/

> 
> Real-world examples of very similar breakage in variadic functions
> involving NULL sentinels are mentioned in [2] (the musl bug report is
> [3]). In short, musl defined NULL as plain 0 for C++, so when people do
> e.g. execl("/bin/true", "true", NULL), junk might prevent detection of
> the sentinel in execl() impl. (Though if the sentinel is passed via
> stack because there are a lot of preceding arguments, the breakage
> becomes more apparent because auto-zeroing of registers doesn't come
> into play anymore.)

Yes, I heard about the "fun" with NULL already. Thanks for the musl 
pointer. And thanks for the confirmation/explanation.

> 
>>
>> I'm easily confused by such (va_args) things, so sorry for the dummy
>> questions.
> 
> This stuff *is* confusing, and note that Linux man pages don't even tell
> that prctl() is actually declared as a variadic function (and for
> ptrace() this is mentioned only in the notes, but not in its signature).

Agreed, that's easy to miss (and probably many people missed it).


Anyhow, for this patch as is (although it feels like drops in the ocean 
after our discussion)

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-17 15:03 ` [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Florent Revest
  2023-05-22  8:55   ` David Hildenbrand
@ 2023-05-23 14:11   ` Catalin Marinas
  1 sibling, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2023-05-23 14:11 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly,
	mhocko, keescook, david, peterx, izbyshev, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Wed, May 17, 2023 at 05:03:19PM +0200, Florent Revest wrote:
> Alexey pointed out that defining a prctl flag as an int is a footgun
> because, under some circumstances, when used as a flag to prctl, it can
> be casted to long with garbage upper bits which would result in
> unexpected behaviors.
> 
> This patch changes the constant to a UL to eliminate these
> possibilities.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>
> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>

FWIW, I'm fine with this patch and I don't think it introduces an ABI
change.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-23 14:09                 ` Catalin Marinas
@ 2023-05-23 15:01                   ` Szabolcs Nagy
  0 siblings, 0 replies; 22+ messages in thread
From: Szabolcs Nagy @ 2023-05-23 15:01 UTC (permalink / raw)
  To: Catalin Marinas, Alexey Izbyshev
  Cc: David Hildenbrand, Florent Revest, linux-kernel, linux-mm, akpm,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	kpsingh, gthelen, toiwoton

The 05/23/2023 15:09, Catalin Marinas wrote:
> At least for glibc, it seems that there is a conversion to unsigned
> long:
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28
>
> unsigned long int arg2 = va_arg (arg, unsigned long int);
>
> (does va_arg expand to an actual cast?)

this is not a conversion.

at this point the argument is already corrupt since
an int was passed instead of unsigned long, usually
it's just wrong top 32bit, but in theory arg passing
for int and long could be completely different.

>
> If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I think
> it's a user-space bug and not a kernel ABI issue.

this is point 3 in an earlier mail i wrote about varargs
(we ran into vararg issues during morello porting but it
affects all 64bit targets):

https://lkml.org/lkml/2022/10/31/386
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  2023-05-17 15:03 ` [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest
  2023-05-22  9:01   ` David Hildenbrand
@ 2023-05-23 16:36   ` Catalin Marinas
  2023-05-26 19:05     ` Florent Revest
  1 sibling, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2023-05-23 16:36 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly,
	mhocko, keescook, david, peterx, izbyshev, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Wed, May 17, 2023 at 05:03:20PM +0200, Florent Revest wrote:
> diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
> index 0ee96ea7a0e9..11f5e3dacb4e 100644
> --- a/include/linux/sched/coredump.h
> +++ b/include/linux/sched/coredump.h
> @@ -91,4 +91,14 @@ static inline int get_dumpable(struct mm_struct *mm)
>  				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK)
>  
>  #define MMF_VM_MERGE_ANY	29
> +#define MMF_HAS_MDWE_NO_INHERIT	30
> +
> +#define MMF_INIT_FLAGS(flags)	({					\
> +	unsigned long new_flags = flags;				\
> +	if (new_flags & (1UL << MMF_HAS_MDWE_NO_INHERIT))		\
> +		new_flags &= ~((1UL << MMF_HAS_MDWE) |			\
> +				(1UL << MMF_HAS_MDWE_NO_INHERIT));	\
> +	new_flags & MMF_INIT_MASK;					\
> +})

A function is better indeed, not sure who came up with this macro idea ;).

> diff --git a/kernel/sys.c b/kernel/sys.c
> index 339fee3eff6a..320eae3b12ab 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2368,9 +2368,25 @@ static inline int prctl_set_mdwe(unsigned long bits, unsigned long arg3,
>  	if (arg3 || arg4 || arg5)
>  		return -EINVAL;
>  
> -	if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN))
> +	if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT))
>  		return -EINVAL;
>  
> +	/* NO_INHERIT only makes sense with REFUSE_EXEC_GAIN */
> +	if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
> +		return -EINVAL;
> +
> +	/* Can't gain NO_INHERIT from !NO_INHERIT */
> +	if (bits & PR_MDWE_NO_INHERIT &&
> +	    test_bit(MMF_HAS_MDWE, &current->mm->flags) &&
> +	    !test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags))
> +		return -EPERM;
> +
> +	if (bits & PR_MDWE_NO_INHERIT)
> +		set_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags);
> +	else if (test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags)
> +		 && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
> +		return -EPERM; /* Cannot unset the flag */

Is this about not unsetting the MMF_HAS_MDWE bit? We already have a
check further down that covers this case.

Related to this, do we want to allow unsetting MMF_HAS_MDWE_NO_INHERIT?
It looks like it can't be unset but no error either. The above check,
IIUC, looks more like ensuring we don't clear MMF_HAS_MDWE.

Maybe we should tighten the logic here a bit and not allow any changes
after the initial flag setting:

current->mm->flags == 0, we allow:
	bits == 0 or
	bits == PR_MDWE_REFUSE_EXEC_GAIN or
	bits == PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT

current->mm->flags != 0 (some bits were set), we only allow the exactly
the same bit combination or -EPERM.

So basically build the flags based on the PR_* input bits and compare
them with current->mm->flags when not 0, return -EPERM if different. I
think this preserves the ABI as we only have a single bit currently and
hopefully makes the logic here easier to parse.

> +
>  	if (bits & PR_MDWE_REFUSE_EXEC_GAIN)
>  		set_bit(MMF_HAS_MDWE, &current->mm->flags);
>  	else if (test_bit(MMF_HAS_MDWE, &current->mm->flags))
> @@ -2385,8 +2401,10 @@ static inline int prctl_get_mdwe(unsigned long arg2, unsigned long arg3,
>  	if (arg2 || arg3 || arg4 || arg5)
>  		return -EINVAL;
>  
> -	return test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
> -		PR_MDWE_REFUSE_EXEC_GAIN : 0;
> +	return (test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
> +		PR_MDWE_REFUSE_EXEC_GAIN : 0) |
> +		(test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags) ?
> +		PR_MDWE_NO_INHERIT : 0);
>  }

Just personal preference, use explicit 'if' blocks and add bits to a
local variable variable than multiple ternary operators.

-- 
Catalin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
       [not found]         ` <3c2e210b75bd56909322e8a3e5086d91@ispras.ru>
  2023-05-23  9:12           ` David Hildenbrand
@ 2023-05-26 19:02           ` Florent Revest
  1 sibling, 0 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-26 19:02 UTC (permalink / raw)
  To: Alexey Izbyshev
  Cc: David Hildenbrand, linux-kernel, linux-mm, akpm, catalin.marinas,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Mon, May 22, 2023 at 8:58 PM Alexey Izbyshev <izbyshev@ispras.ru> wrote:
>
> My preference would be to keep checking the upper 32 bits. Florent, what
> do you think?

I don't mind. I can do this as part of v3. :)

> > Which raises the question if we want to tag this here with a "Fixes"
> > and eventually cc stable (hmm ...)?
>
> Yes, IMO the faster we propagate this change, the better.

Okay, will do


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
  2023-05-23 14:10               ` David Hildenbrand
@ 2023-05-26 19:04                 ` Florent Revest
  0 siblings, 0 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-26 19:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Alexey Izbyshev, linux-kernel, linux-mm, akpm, catalin.marinas,
	anshuman.khandual, joey.gouly, mhocko, keescook, peterx, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Tue, May 23, 2023 at 4:10 PM David Hildenbrand <david@redhat.com> wrote:
>
> >> I'm easily confused by such (va_args) things, so sorry for the dummy
> >> questions.
> >
> > This stuff *is* confusing, and note that Linux man pages don't even tell
> > that prctl() is actually declared as a variadic function (and for
> > ptrace() this is mentioned only in the notes, but not in its signature).
>
> Agreed, that's easy to miss (and probably many people missed it).
>
>
> Anyhow, for this patch as is (although it feels like drops in the ocean
> after our discussion)
>
> Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks everyone for the review and the exchange ! :)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl
  2023-05-23 16:36   ` Catalin Marinas
@ 2023-05-26 19:05     ` Florent Revest
  0 siblings, 0 replies; 22+ messages in thread
From: Florent Revest @ 2023-05-26 19:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, linux-mm, akpm, anshuman.khandual, joey.gouly,
	mhocko, keescook, david, peterx, izbyshev, broonie,
	szabolcs.nagy, kpsingh, gthelen, toiwoton

On Tue, May 23, 2023 at 6:36 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Wed, May 17, 2023 at 05:03:20PM +0200, Florent Revest wrote:
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index 339fee3eff6a..320eae3b12ab 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2368,9 +2368,25 @@ static inline int prctl_set_mdwe(unsigned long bits, unsigned long arg3,
> >       if (arg3 || arg4 || arg5)
> >               return -EINVAL;
> >
> > -     if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN))
> > +     if (bits & ~(PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT))
> >               return -EINVAL;
> >
> > +     /* NO_INHERIT only makes sense with REFUSE_EXEC_GAIN */
> > +     if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
> > +             return -EINVAL;
> > +
> > +     /* Can't gain NO_INHERIT from !NO_INHERIT */
> > +     if (bits & PR_MDWE_NO_INHERIT &&
> > +         test_bit(MMF_HAS_MDWE, &current->mm->flags) &&
> > +         !test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags))
> > +             return -EPERM;
> > +
> > +     if (bits & PR_MDWE_NO_INHERIT)
> > +             set_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags);
> > +     else if (test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags)
> > +              && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
> > +             return -EPERM; /* Cannot unset the flag */

Ugh, I had to proofread this 3 times to figure out what I was trying
to do, not great. :(

> Is this about not unsetting the MMF_HAS_MDWE bit? We already have a
> check further down that covers this case.

Actually, this was about not being able to unset _both_ NO_INHERIT and
HAS_MDWE (this would gain capabilities)... While still being able to
unset NO_INHERIT only (this reduces capabilities)

> Related to this, do we want to allow unsetting MMF_HAS_MDWE_NO_INHERIT?
> It looks like it can't be unset but no error either.

But - sigh - as you point out, that second part wasn't implemented. It
looks like I intended to add an:

else
  clear_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags);

after that block but forgot... The idea was that:
- setting no_inherit is always allowed
- unsetting it is allowed iff we don't also unset REFUSE_EXEC_GAIN

The "consecutive_prctl_flags" selftests in patch 5 were supposed to
make the assumptions here easier to read and verify. This logic was
meant to be covered by  "cant_disable_mdwe_no_inherit" and
"can_lower_privileges" but these tests only check that the "set" prctl
succeeds and not that the end result is the expected one so I missed
this. I'll improve the selftest in the next revision too so we can
catch this more easily next time.

> The above check,
> IIUC, looks more like ensuring we don't clear MMF_HAS_MDWE.

Indeed, without the else, it was useless.

> Maybe we should tighten the logic here a bit and not allow any changes
> after the initial flag setting:
>
> current->mm->flags == 0, we allow:
>         bits == 0 or
>         bits == PR_MDWE_REFUSE_EXEC_GAIN or
>         bits == PR_MDWE_REFUSE_EXEC_GAIN | PR_MDWE_NO_INHERIT
>
> current->mm->flags != 0 (some bits were set), we only allow the exactly
> the same bit combination or -EPERM.
>
> So basically build the flags based on the PR_* input bits and compare
> them with current->mm->flags when not 0, return -EPERM if different. I
> think this preserves the ABI as we only have a single bit currently and
> hopefully makes the logic here easier to parse.

On one hand, I thought it would be nice to have the ability to
transition from NO_INHERIT | REFUSE_EXEC_GAIN to REFUSE_EXEC_GAIN
because, conceptually, it seems to me that we shouldn't prevent a
process from dropping further capabilities.

On the other hand, I don't think I have an actual need for that
transition and I agree that if we don't try to handle it, the code
here would become a lot easier to reason about. I'd certainly sleep
better at night with the smaller likelihood of having screwed
something up... :)

Unless someone feels strongly otherwise, I'll do what you suggest in v3

> > @@ -2385,8 +2401,10 @@ static inline int prctl_get_mdwe(unsigned long arg2, unsigned long arg3,
> >       if (arg2 || arg3 || arg4 || arg5)
> >               return -EINVAL;
> >
> > -     return test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
> > -             PR_MDWE_REFUSE_EXEC_GAIN : 0;
> > +     return (test_bit(MMF_HAS_MDWE, &current->mm->flags) ?
> > +             PR_MDWE_REFUSE_EXEC_GAIN : 0) |
> > +             (test_bit(MMF_HAS_MDWE_NO_INHERIT, &current->mm->flags) ?
> > +             PR_MDWE_NO_INHERIT : 0);
> >  }
>
> Just personal preference, use explicit 'if' blocks and add bits to a
> local variable variable than multiple ternary operators.

Will do!


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-05-26 19:05 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-17 15:03 [PATCH v2 0/5] MDWE without inheritance Florent Revest
2023-05-17 15:03 ` [PATCH v2 1/5] kselftest: vm: Fix tabs/spaces inconsistency in the mdwe test Florent Revest
2023-05-22  8:52   ` David Hildenbrand
2023-05-17 15:03 ` [PATCH v2 2/5] kselftest: vm: Fix mdwe's mmap_FIXED test case Florent Revest
2023-05-22  8:53   ` David Hildenbrand
2023-05-17 15:03 ` [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Florent Revest
2023-05-22  8:55   ` David Hildenbrand
     [not found]     ` <884d131bbc28ebfa0b729176e6415269@ispras.ru>
2023-05-22 16:22       ` David Hildenbrand
     [not found]         ` <3c2e210b75bd56909322e8a3e5086d91@ispras.ru>
2023-05-23  9:12           ` David Hildenbrand
2023-05-23 13:07             ` Catalin Marinas
     [not found]               ` <f47d587fe5a6285f88191fbb13f367c7@ispras.ru>
2023-05-23 14:09                 ` Catalin Marinas
2023-05-23 15:01                   ` Szabolcs Nagy
     [not found]             ` <7c572622c0d8e283fc880fe3f4ffac27@ispras.ru>
2023-05-23 14:10               ` David Hildenbrand
2023-05-26 19:04                 ` Florent Revest
2023-05-26 19:02           ` Florent Revest
2023-05-23 14:11   ` Catalin Marinas
2023-05-17 15:03 ` [PATCH v2 4/5] mm: Add a NO_INHERIT flag to the PR_SET_MDWE prctl Florent Revest
2023-05-22  9:01   ` David Hildenbrand
2023-05-22 16:11     ` Florent Revest
2023-05-23 16:36   ` Catalin Marinas
2023-05-26 19:05     ` Florent Revest
2023-05-17 15:03 ` [PATCH v2 5/5] kselftest: vm: Add tests for no-inherit memory-deny-write-execute Florent Revest

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox