linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mateusz Guzik <mjguzik@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: torvalds@linux-foundation.org, brauner@kernel.org,
	ebiederm@xmission.com, david@redhat.com,
	akpm@linux-foundation.org, linux-mm@kvack.org, koct9i@gmail.com,
	oleg@redhat.com, dave@stgolabs.net,
	Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH v2] kernel/fork: stop playing lockless games for exe_file replacement
Date: Mon, 14 Aug 2023 19:21:40 +0200	[thread overview]
Message-ID: <20230814172140.1777161-1-mjguzik@gmail.com> (raw)

xchg originated in 6e399cd144d8 ("prctl: avoid using mmap_sem for
exe_file serialization"). While the commit message does not explain
*why* the change, I found the original submission [1] which ultimately
claims it cleans things up by removing dependency of exe_file on the
semaphore.

However, fe69d560b5bd ("kernel/fork: always deny write access to current
MM exe_file") added a semaphore up/down cycle to synchronize the state
of exe_file against fork, defeating the point of the original change.

This is on top of semaphore trips already present both in the replacing
function and prctl (the only consumer).

Normally replacing exe_file does not happen for busy processes, thus
write-locking is not an impediment to performance in the intended use
case.  If someone keeps invoking the routine for a busy processes they
are trying to play dirty and that's another reason to avoid any
trickery.

As such I think the atomic here only adds complexity for no benefit.

Just write-lock around the replacement.

I also note that replacement races against the mapping check loop as
nothing synchronizes actual assignment with with said checks but I am
not addressing it in this patch. (Is the loop of any use to begin with?)

V2:
- fix up comments
- tweak commit message

Link: https://lore.kernel.org/linux-mm/1424979417.10344.14.camel@stgolabs.net/ [1]
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
 fs/exec.c     |  4 ++--
 kernel/fork.c | 22 +++++++++-------------
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 1a827d55ba94..dc41180d4e70 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1276,8 +1276,8 @@ int begin_new_exec(struct linux_binprm * bprm)
 
 	/*
 	 * Must be called _before_ exec_mmap() as bprm->mm is
-	 * not visible until then. This also enables the update
-	 * to be lockless.
+	 * not visible until then. Doing it here also ensures
+	 * we don't race against replace_mm_exe_file().
 	 */
 	retval = set_mm_exe_file(bprm->mm, bprm->file);
 	if (retval)
diff --git a/kernel/fork.c b/kernel/fork.c
index d2e12b6d2b18..7b8b63fb0438 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1396,8 +1396,8 @@ EXPORT_SYMBOL_GPL(mmput_async);
  * This changes mm's executable file (shown as symlink /proc/[pid]/exe).
  *
  * Main users are mmput() and sys_execve(). Callers prevent concurrent
- * invocations: in mmput() nobody alive left, in execve task is single
- * threaded.
+ * invocations: in mmput() nobody alive left, in execve it happens before
+ * the new mm is made visible to anyone.
  *
  * Can only fail if new_exe_file != NULL.
  */
@@ -1432,9 +1432,7 @@ int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 /**
  * replace_mm_exe_file - replace a reference to the mm's executable file
  *
- * This changes mm's executable file (shown as symlink /proc/[pid]/exe),
- * dealing with concurrent invocation and without grabbing the mmap lock in
- * write mode.
+ * This changes mm's executable file (shown as symlink /proc/[pid]/exe).
  *
  * Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE).
  */
@@ -1464,22 +1462,20 @@ int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 			return ret;
 	}
 
-	/* set the new file, lockless */
 	ret = deny_write_access(new_exe_file);
 	if (ret)
 		return -EACCES;
 	get_file(new_exe_file);
 
-	old_exe_file = xchg(&mm->exe_file, new_exe_file);
+	/* set the new file */
+	mmap_write_lock(mm);
+	old_exe_file = rcu_dereference_raw(mm->exe_file);
+	rcu_assign_pointer(mm->exe_file, new_exe_file);
+	mmap_write_unlock(mm);
+
 	if (old_exe_file) {
-		/*
-		 * Don't race with dup_mmap() getting the file and disallowing
-		 * write access while someone might open the file writable.
-		 */
-		mmap_read_lock(mm);
 		allow_write_access(old_exe_file);
 		fput(old_exe_file);
-		mmap_read_unlock(mm);
 	}
 	return 0;
 }
-- 
2.39.2



             reply	other threads:[~2023-08-14 17:21 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-14 17:21 Mateusz Guzik [this message]
2023-08-14 18:11 ` Oleg Nesterov
2023-08-15  7:29 ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230814172140.1777161-1-mjguzik@gmail.com \
    --to=mjguzik@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=koct9i@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox