linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/9] Improve the copy of task comm
@ 2024-08-12  2:29 Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 1/9] Get rid of __get_task_comm() Yafang Shao
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao

Using {memcpy,strncpy,strcpy,kstrdup} to copy the task comm relies on the
length of task comm. Changes in the task comm could result in a destination
string that is overflow. Therefore, we should explicitly ensure the
destination string is always NUL-terminated, regardless of the task comm.
This approach will facilitate future extensions to the task comm.

As suggested by Linus [0], we can identify all relevant code with the
following git grep command:

  git grep 'memcpy.*->comm\>'
  git grep 'kstrdup.*->comm\>'
  git grep 'strncpy.*->comm\>'
  git grep 'strcpy.*->comm\>'

PATCH #2~#4:   memcpy
PATCH #5~#6:   kstrdup
PATCH #7:      strncpy
PATCH #8~#9:   strcpy

In this series, we have removed __get_task_comm() because the task_lock()
and BUILD_BUG_ON() within it are unnecessary.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjAmmHUg6vho1KjzQi2=psR30+CogFd4aXrThr2gsiS4g@mail.gmail.com/ [0]

Changes:
v5->v6:
- Get rid of __get_task_comm() (Linus)
- Use ARRAY_SIZE() in get_task_comm() (Alejandro)

v4->v5: https://lore.kernel.org/all/20240804075619.20804-1-laoar.shao@gmail.com/
- Drop changes in the mm/kmemleak.c as it was fixed by
  commit 0b84780134fb ("mm/kmemleak: replace strncpy() with strscpy()")
- Drop changes in kernel/tsacct.c as it was fixed by
  commmit 0fe2356434e ("tsacct: replace strncpy() with strscpy()")

v3->v4: https://lore.kernel.org/linux-mm/20240729023719.1933-1-laoar.shao@gmail.com/
- Rename __kstrndup() to __kmemdup_nul() and define it inside mm/util.c
  (Matthew)
- Remove unused local varaible (Simon)

v2->v3: https://lore.kernel.org/all/20240621022959.9124-1-laoar.shao@gmail.com/
- Deduplicate code around kstrdup (Andrew)
- Add commit log for dropping task_lock (Catalin)

v1->v2: https://lore.kernel.org/bpf/20240613023044.45873-1-laoar.shao@gmail.com/
- Add comment for dropping task_lock() in __get_task_comm() (Alexei)
- Drop changes in trace event (Steven)
- Fix comment on task comm (Matus)

Yafang Shao (9):
  Get rid of __get_task_comm()
  auditsc: Replace memcpy() with strscpy()
  security: Replace memcpy() with get_task_comm()
  bpftool: Ensure task comm is always NUL-terminated
  mm/util: Fix possible race condition in kstrdup()
  mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  tracing: Replace strncpy() with strscpy()
  net: Replace strcpy() with strscpy()
  drm: Replace strcpy() with strscpy()

 drivers/gpu/drm/drm_framebuffer.c     |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +-
 fs/exec.c                             | 10 -----
 fs/proc/array.c                       |  2 +-
 include/linux/sched.h                 | 31 +++++++++++---
 kernel/auditsc.c                      |  6 +--
 kernel/kthread.c                      |  2 +-
 kernel/trace/trace.c                  |  2 +-
 kernel/trace/trace_events_hist.c      |  2 +-
 mm/util.c                             | 61 ++++++++++++---------------
 net/ipv6/ndisc.c                      |  2 +-
 security/lsm_audit.c                  |  4 +-
 security/selinux/selinuxfs.c          |  2 +-
 tools/bpf/bpftool/pids.c              |  2 +
 14 files changed, 66 insertions(+), 64 deletions(-)

-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 1/9] Get rid of __get_task_comm()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  8:05   ` Alejandro Colomar
  2024-08-12  2:29 ` [PATCH v6 2/9] auditsc: Replace memcpy() with strscpy() Yafang Shao
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Alejandro Colomar, Alexander Viro,
	Christian Brauner, Jan Kara, Kees Cook, Matus Jokay,
	Serge E. Hallyn

We want to eliminate the use of __get_task_comm() for the following
reasons:

- The task_lock() is unnecessary
  Quoted from Linus [0]:
  : Since user space can randomly change their names anyway, using locking
  : was always wrong for readers (for writers it probably does make sense
  : to have some lock - although practically speaking nobody cares there
  : either, but at least for a writer some kind of race could have
  : long-term mixed results

- The BUILD_BUG_ON() doesn't add any value
  The only requirement is to ensure that the destination buffer is a valid
  array.

- Zeroing is not necessary in current use cases
  To avoid confusion, we should remove it. Moreover, not zeroing could
  potentially make it easier to uncover bugs. If the caller needs a
  zero-padded task name, it should be explicitly handled at the call site.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com [0]
Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/
Suggested-by: Alejandro Colomar <alx@kernel.org>
Link: https://lore.kernel.org/all/2jxak5v6dfxlpbxhpm3ey7oup4g2lnr3ueurfbosf5wdo65dk4@srb3hsk72zwq
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Matus Jokay <matus.jokay@stuba.sk>
Cc: Alejandro Colomar <alx@kernel.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
---
 fs/exec.c             | 10 ----------
 fs/proc/array.c       |  2 +-
 include/linux/sched.h | 31 +++++++++++++++++++++++++------
 kernel/kthread.c      |  2 +-
 4 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a47d0e4c54f6..2e468ddd203a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1264,16 +1264,6 @@ static int unshare_sighand(struct task_struct *me)
 	return 0;
 }
 
-char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
-{
-	task_lock(tsk);
-	/* Always NUL terminated and zero-padded */
-	strscpy_pad(buf, tsk->comm, buf_size);
-	task_unlock(tsk);
-	return buf;
-}
-EXPORT_SYMBOL_GPL(__get_task_comm);
-
 /*
  * These functions flushes out all traces of the currently running executable
  * so that a new one can be started
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 34a47fb0c57f..55ed3510d2bb 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 	else if (p->flags & PF_KTHREAD)
 		get_kthread_comm(tcomm, sizeof(tcomm), p);
 	else
-		__get_task_comm(tcomm, sizeof(tcomm), p);
+		get_task_comm(tcomm, p);
 
 	if (escape)
 		seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 33dd8d9d2b85..e0e26edbda61 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1096,9 +1096,11 @@ struct task_struct {
 	/*
 	 * executable name, excluding path.
 	 *
-	 * - normally initialized setup_new_exec()
-	 * - access it with [gs]et_task_comm()
-	 * - lock it with task_lock()
+	 * - normally initialized begin_new_exec()
+	 * - set it with set_task_comm()
+	 *   - strscpy_pad() to ensure it is always NUL-terminated
+	 *   - task_lock() to ensure the operation is atomic and the name is
+	 *     fully updated.
 	 */
 	char				comm[TASK_COMM_LEN];
 
@@ -1912,10 +1914,27 @@ static inline void set_task_comm(struct task_struct *tsk, const char *from)
 	__set_task_comm(tsk, from, false);
 }
 
-extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
+/*
+ * - Why not use task_lock()?
+ *   User space can randomly change their names anyway, so locking for readers
+ *   doesn't make sense. For writers, locking is probably necessary, as a race
+ *   condition could lead to long-term mixed results.
+ *   The strscpy_pad() in __set_task_comm() can ensure that the task comm is
+ *   always NUL-terminated. Therefore the race condition between reader and
+ *   writer is not an issue.
+ *
+ * - Why not use strscpy_pad()?
+ *   While strscpy_pad() prevents writing garbage past the NUL terminator, which
+ *   is useful when using the task name as a key in a hash map, most use cases
+ *   don't require this. Zero-padding might confuse users if it’s unnecessary,
+ *   and not zeroing might even make it easier to expose bugs. If you need a
+ *   zero-padded task name, please handle that explicitly at the call site.
+ *
+ * - ARRAY_SIZE() can help ensure that @buf is indeed an array.
+ */
 #define get_task_comm(buf, tsk) ({			\
-	BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
-	__get_task_comm(buf, sizeof(buf), tsk);		\
+	strscpy(buf, (tsk)->comm, ARRAY_SIZE(buf));	\
+	buf;						\
 })
 
 #ifdef CONFIG_SMP
diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..7d001d033cf9 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -101,7 +101,7 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
 	struct kthread *kthread = to_kthread(tsk);
 
 	if (!kthread || !kthread->full_name) {
-		__get_task_comm(buf, buf_size, tsk);
+		strscpy(buf, tsk->comm, buf_size);
 		return;
 	}
 
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 2/9] auditsc: Replace memcpy() with strscpy()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 1/9] Get rid of __get_task_comm() Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 3/9] security: Replace memcpy() with get_task_comm() Yafang Shao
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Paul Moore, Eric Paris

Using strscpy() to read the task comm ensures that the name is
always NUL-terminated, regardless of the source string. This approach also
facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@redhat.com>
---
 kernel/auditsc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 6f0d6fb6523f..7cbcf3327409 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2730,7 +2730,7 @@ void __audit_ptrace(struct task_struct *t)
 	context->target_uid = task_uid(t);
 	context->target_sessionid = audit_get_sessionid(t);
 	security_task_getsecid_obj(t, &context->target_sid);
-	memcpy(context->target_comm, t->comm, TASK_COMM_LEN);
+	strscpy(context->target_comm, t->comm, TASK_COMM_LEN);
 }
 
 /**
@@ -2757,7 +2757,7 @@ int audit_signal_info_syscall(struct task_struct *t)
 		ctx->target_uid = t_uid;
 		ctx->target_sessionid = audit_get_sessionid(t);
 		security_task_getsecid_obj(t, &ctx->target_sid);
-		memcpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
+		strscpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
 		return 0;
 	}
 
@@ -2778,7 +2778,7 @@ int audit_signal_info_syscall(struct task_struct *t)
 	axp->target_uid[axp->pid_count] = t_uid;
 	axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
 	security_task_getsecid_obj(t, &axp->target_sid[axp->pid_count]);
-	memcpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
+	strscpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
 	axp->pid_count++;
 
 	return 0;
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 3/9] security: Replace memcpy() with get_task_comm()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 1/9] Get rid of __get_task_comm() Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 2/9] auditsc: Replace memcpy() with strscpy() Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 4/9] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Paul Moore, James Morris, Serge E. Hallyn,
	Stephen Smalley, Ondrej Mosnacek

Quoted from Linus [0]:

  selinux never wanted a lock, and never wanted any kind of *consistent*
  result, it just wanted a *stable* result.

Using get_task_comm() to read the task comm ensures that the name is
always NUL-terminated, regardless of the source string. This approach also
facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
LINK: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com/ [0]
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
---
 security/lsm_audit.c         | 4 ++--
 security/selinux/selinuxfs.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 849e832719e2..9a8352972086 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -207,7 +207,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
 	BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);
 
 	audit_log_format(ab, " pid=%d comm=", task_tgid_nr(current));
-	audit_log_untrustedstring(ab, memcpy(comm, current->comm, sizeof(comm)));
+	audit_log_untrustedstring(ab, get_task_comm(comm, current));
 
 	switch (a->type) {
 	case LSM_AUDIT_DATA_NONE:
@@ -302,7 +302,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
 				char comm[sizeof(tsk->comm)];
 				audit_log_format(ab, " opid=%d ocomm=", pid);
 				audit_log_untrustedstring(ab,
-				    memcpy(comm, tsk->comm, sizeof(comm)));
+				    get_task_comm(comm, tsk));
 			}
 		}
 		break;
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index e172f182b65c..57e014ff3076 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -708,7 +708,7 @@ static ssize_t sel_write_checkreqprot(struct file *file, const char __user *buf,
 	if (new_value) {
 		char comm[sizeof(current->comm)];
 
-		memcpy(comm, current->comm, sizeof(comm));
+		strscpy(comm, current->comm, sizeof(comm));
 		pr_err("SELinux: %s (%d) set checkreqprot to 1. This is no longer supported.\n",
 		       comm, current->pid);
 	}
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 4/9] bpftool: Ensure task comm is always NUL-terminated
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (2 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 3/9] security: Replace memcpy() with get_task_comm() Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 5/9] mm/util: Fix possible race condition in kstrdup() Yafang Shao
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Quentin Monnet

Let's explicitly ensure the destination string is NUL-terminated. This way,
it won't be affected by changes to the source string.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
---
 tools/bpf/bpftool/pids.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
index 9b898571b49e..23f488cf1740 100644
--- a/tools/bpf/bpftool/pids.c
+++ b/tools/bpf/bpftool/pids.c
@@ -54,6 +54,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
 		ref = &refs->refs[refs->ref_cnt];
 		ref->pid = e->pid;
 		memcpy(ref->comm, e->comm, sizeof(ref->comm));
+		ref->comm[sizeof(ref->comm) - 1] = '\0';
 		refs->ref_cnt++;
 
 		return;
@@ -77,6 +78,7 @@ static void add_ref(struct hashmap *map, struct pid_iter_entry *e)
 	ref = &refs->refs[0];
 	ref->pid = e->pid;
 	memcpy(ref->comm, e->comm, sizeof(ref->comm));
+	ref->comm[sizeof(ref->comm) - 1] = '\0';
 	refs->ref_cnt = 1;
 	refs->has_bpf_cookie = e->has_bpf_cookie;
 	refs->bpf_cookie = e->bpf_cookie;
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 5/9] mm/util: Fix possible race condition in kstrdup()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (3 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 4/9] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 6/9] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao

In kstrdup(), it is critical to ensure that the dest string is always
NUL-terminated. However, potential race condidtion can occur between a
writer and a reader.

Consider the following scenario involving task->comm:

    reader                    writer

  len = strlen(s) + 1;
                             strlcpy(tsk->comm, buf, sizeof(tsk->comm));
  memcpy(buf, s, len);

In this case, there is a race condition between the reader and the
writer. The reader calculate the length of the string `s` based on the
old value of task->comm. However, during the memcpy(), the string `s`
might be updated by the writer to a new value of task->comm.

If the new task->comm is larger than the old one, the `buf` might not be
NUL-terminated. This can lead to undefined behavior and potential
security vulnerabilities.

Let's fix it by explicitly adding a NUL-terminator.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/util.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 983baf2bd675..4542d8a800d9 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp)
 
 	len = strlen(s) + 1;
 	buf = kmalloc_track_caller(len, gfp);
-	if (buf)
+	if (buf) {
 		memcpy(buf, s, len);
+		/* During memcpy(), the string might be updated to a new value,
+		 * which could be longer than the string when strlen() is
+		 * called. Therefore, we need to add a null termimator.
+		 */
+		buf[len - 1] = '\0';
+	}
 	return buf;
 }
 EXPORT_SYMBOL(kstrdup);
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 6/9] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul}
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (4 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 5/9] mm/util: Fix possible race condition in kstrdup() Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 7/9] tracing: Replace strncpy() with strscpy() Yafang Shao
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Simon Horman, Matthew Wilcox

These three functions follow the same pattern. To deduplicate the code,
let's introduce a common helper __kmemdup_nul().

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
---
 mm/util.c | 67 +++++++++++++++++++++----------------------------------
 1 file changed, 26 insertions(+), 41 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 4542d8a800d9..310c7735c617 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -45,33 +45,40 @@ void kfree_const(const void *x)
 EXPORT_SYMBOL(kfree_const);
 
 /**
- * kstrdup - allocate space for and copy an existing string
- * @s: the string to duplicate
+ * __kmemdup_nul - Create a NUL-terminated string from @s, which might be unterminated.
+ * @s: The data to copy
+ * @len: The size of the data, including the null terminator
  * @gfp: the GFP mask used in the kmalloc() call when allocating memory
  *
- * Return: newly allocated copy of @s or %NULL in case of error
+ * Return: newly allocated copy of @s with NUL-termination or %NULL in
+ * case of error
  */
-noinline
-char *kstrdup(const char *s, gfp_t gfp)
+static __always_inline char *__kmemdup_nul(const char *s, size_t len, gfp_t gfp)
 {
-	size_t len;
 	char *buf;
 
-	if (!s)
+	buf = kmalloc_track_caller(len, gfp);
+	if (!buf)
 		return NULL;
 
-	len = strlen(s) + 1;
-	buf = kmalloc_track_caller(len, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		/* During memcpy(), the string might be updated to a new value,
-		 * which could be longer than the string when strlen() is
-		 * called. Therefore, we need to add a null termimator.
-		 */
-		buf[len - 1] = '\0';
-	}
+	memcpy(buf, s, len);
+	/* Ensure the buf is always NUL-terminated, regardless of @s. */
+	buf[len - 1] = '\0';
 	return buf;
 }
+
+/**
+ * kstrdup - allocate space for and copy an existing string
+ * @s: the string to duplicate
+ * @gfp: the GFP mask used in the kmalloc() call when allocating memory
+ *
+ * Return: newly allocated copy of @s or %NULL in case of error
+ */
+noinline
+char *kstrdup(const char *s, gfp_t gfp)
+{
+	return s ? __kmemdup_nul(s, strlen(s) + 1, gfp) : NULL;
+}
 EXPORT_SYMBOL(kstrdup);
 
 /**
@@ -106,19 +113,7 @@ EXPORT_SYMBOL(kstrdup_const);
  */
 char *kstrndup(const char *s, size_t max, gfp_t gfp)
 {
-	size_t len;
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	len = strnlen(s, max);
-	buf = kmalloc_track_caller(len+1, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		buf[len] = '\0';
-	}
-	return buf;
+	return s ? __kmemdup_nul(s, strnlen(s, max) + 1, gfp) : NULL;
 }
 EXPORT_SYMBOL(kstrndup);
 
@@ -192,17 +187,7 @@ EXPORT_SYMBOL(kvmemdup);
  */
 char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
 {
-	char *buf;
-
-	if (!s)
-		return NULL;
-
-	buf = kmalloc_track_caller(len + 1, gfp);
-	if (buf) {
-		memcpy(buf, s, len);
-		buf[len] = '\0';
-	}
-	return buf;
+	return s ? __kmemdup_nul(s, len + 1, gfp) : NULL;
 }
 EXPORT_SYMBOL(kmemdup_nul);
 
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 7/9] tracing: Replace strncpy() with strscpy()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (5 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 6/9] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-13 22:19   ` Justin Stitt
  2024-08-12  2:29 ` [PATCH v6 8/9] net: Replace strcpy() " Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 9/9] drm: " Yafang Shao
  8 siblings, 1 reply; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Masami Hiramatsu, Mathieu Desnoyers

Using strscpy() to read the task comm ensures that the name is
always NUL-terminated, regardless of the source string. This approach also
facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 kernel/trace/trace.c             | 2 +-
 kernel/trace/trace_events_hist.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 578a49ff5c32..1b2577f9d734 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1907,7 +1907,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
 	max_data->critical_start = data->critical_start;
 	max_data->critical_end = data->critical_end;
 
-	strncpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
+	strscpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
 	max_data->pid = tsk->pid;
 	/*
 	 * If tsk == current, then use current_uid(), as that does not use
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 6ece1308d36a..4cd24c25ce05 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1599,7 +1599,7 @@ static inline void save_comm(char *comm, struct task_struct *task)
 		return;
 	}
 
-	strncpy(comm, task->comm, TASK_COMM_LEN);
+	strscpy(comm, task->comm, TASK_COMM_LEN);
 }
 
 static void hist_elt_data_free(struct hist_elt_data *elt_data)
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 8/9] net: Replace strcpy() with strscpy()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (6 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 7/9] tracing: Replace strncpy() with strscpy() Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  2024-08-12  2:29 ` [PATCH v6 9/9] drm: " Yafang Shao
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni

To prevent errors from occurring when the src string is longer than the dst
string in strcpy(), we should use strscpy() instead. This approach
also facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Ahern <dsahern@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv6/ndisc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 254b192c5705..bf969a4773c0 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1942,7 +1942,7 @@ static void ndisc_warn_deprecated_sysctl(const struct ctl_table *ctl,
 	static char warncomm[TASK_COMM_LEN];
 	static int warned;
 	if (strcmp(warncomm, current->comm) && warned < 5) {
-		strcpy(warncomm, current->comm);
+		strscpy(warncomm, current->comm, sizeof(warncomm));
 		pr_warn("process `%s' is using deprecated sysctl (%s) net.ipv6.neigh.%s.%s - use net.ipv6.neigh.%s.%s_ms instead\n",
 			warncomm, func,
 			dev_name, ctl->procname,
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v6 9/9] drm: Replace strcpy() with strscpy()
  2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
                   ` (7 preceding siblings ...)
  2024-08-12  2:29 ` [PATCH v6 8/9] net: Replace strcpy() " Yafang Shao
@ 2024-08-12  2:29 ` Yafang Shao
  8 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12  2:29 UTC (permalink / raw)
  To: akpm
  Cc: torvalds, ebiederm, alexei.starovoitov, rostedt, catalin.marinas,
	penguin-kernel, linux-mm, linux-fsdevel, linux-trace-kernel,
	audit, linux-security-module, selinux, bpf, netdev, dri-devel,
	Yafang Shao, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie

To prevent erros from occurring when the src string is longer than the
dst string in strcpy(), we should use strscpy() instead. This
approach also facilitates future extensions to the task comm.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
---
 drivers/gpu/drm/drm_framebuffer.c     | 2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index 888aadb6a4ac..71bf8997eddf 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -868,7 +868,7 @@ int drm_framebuffer_init(struct drm_device *dev, struct drm_framebuffer *fb,
 	INIT_LIST_HEAD(&fb->filp_head);
 
 	fb->funcs = funcs;
-	strcpy(fb->comm, current->comm);
+	strscpy(fb->comm, current->comm, sizeof(fb->comm));
 
 	ret = __drm_mode_object_add(dev, &fb->base, DRM_MODE_OBJECT_FB,
 				    false, drm_framebuffer_free);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 625b3c024540..97424a53bf9e 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1411,7 +1411,7 @@ static bool record_context(struct i915_gem_context_coredump *e,
 	rcu_read_lock();
 	task = pid_task(ctx->pid, PIDTYPE_PID);
 	if (task) {
-		strcpy(e->comm, task->comm);
+		strscpy(e->comm, task->comm, sizeof(e->comm));
 		e->pid = task->pid;
 	}
 	rcu_read_unlock();
-- 
2.43.5



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 1/9] Get rid of __get_task_comm()
  2024-08-12  2:29 ` [PATCH v6 1/9] Get rid of __get_task_comm() Yafang Shao
@ 2024-08-12  8:05   ` Alejandro Colomar
  2024-08-12 13:20     ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Alejandro Colomar @ 2024-08-12  8:05 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Alexander Viro, Christian Brauner, Jan Kara,
	Kees Cook, Matus Jokay, Serge E. Hallyn

[-- Attachment #1: Type: text/plain, Size: 6420 bytes --]

Hi Yafang,

On Mon, Aug 12, 2024 at 10:29:25AM GMT, Yafang Shao wrote:
> We want to eliminate the use of __get_task_comm() for the following
> reasons:
> 
> - The task_lock() is unnecessary
>   Quoted from Linus [0]:
>   : Since user space can randomly change their names anyway, using locking
>   : was always wrong for readers (for writers it probably does make sense
>   : to have some lock - although practically speaking nobody cares there
>   : either, but at least for a writer some kind of race could have
>   : long-term mixed results
> 
> - The BUILD_BUG_ON() doesn't add any value
>   The only requirement is to ensure that the destination buffer is a valid
>   array.
> 
> - Zeroing is not necessary in current use cases
>   To avoid confusion, we should remove it. Moreover, not zeroing could
>   potentially make it easier to uncover bugs. If the caller needs a
>   zero-padded task name, it should be explicitly handled at the call site.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Link: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com [0]
> Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/
> Suggested-by: Alejandro Colomar <alx@kernel.org>
> Link: https://lore.kernel.org/all/2jxak5v6dfxlpbxhpm3ey7oup4g2lnr3ueurfbosf5wdo65dk4@srb3hsk72zwq
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Cc: Matus Jokay <matus.jokay@stuba.sk>
> Cc: Alejandro Colomar <alx@kernel.org>
> Cc: "Serge E. Hallyn" <serge@hallyn.com>
> ---
>  fs/exec.c             | 10 ----------
>  fs/proc/array.c       |  2 +-
>  include/linux/sched.h | 31 +++++++++++++++++++++++++------
>  kernel/kthread.c      |  2 +-
>  4 files changed, 27 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index a47d0e4c54f6..2e468ddd203a 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1264,16 +1264,6 @@ static int unshare_sighand(struct task_struct *me)
>  	return 0;
>  }
>  
> -char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
> -{
> -	task_lock(tsk);
> -	/* Always NUL terminated and zero-padded */
> -	strscpy_pad(buf, tsk->comm, buf_size);

This comment is correct (see other comments below).

(Except that pedantically, I'd write it as NUL-terminated with a hyphen,
 just like zero-padded.)

> -	task_unlock(tsk);
> -	return buf;
> -}
> -EXPORT_SYMBOL_GPL(__get_task_comm);
> -
>  /*
>   * These functions flushes out all traces of the currently running executable
>   * so that a new one can be started
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 34a47fb0c57f..55ed3510d2bb 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
>  	else if (p->flags & PF_KTHREAD)
>  		get_kthread_comm(tcomm, sizeof(tcomm), p);
>  	else
> -		__get_task_comm(tcomm, sizeof(tcomm), p);
> +		get_task_comm(tcomm, p);

LGTM.  (This would have been good even if not removing the helper.)

>  
>  	if (escape)
>  		seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 33dd8d9d2b85..e0e26edbda61 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1096,9 +1096,11 @@ struct task_struct {
>  	/*
>  	 * executable name, excluding path.
>  	 *
> -	 * - normally initialized setup_new_exec()
> -	 * - access it with [gs]et_task_comm()
> -	 * - lock it with task_lock()
> +	 * - normally initialized begin_new_exec()
> +	 * - set it with set_task_comm()
> +	 *   - strscpy_pad() to ensure it is always NUL-terminated

The comment above is inmprecise.
It should say either
"strscpy() to ensure it is always NUL-terminated", or
"strscpy_pad() to ensure it is NUL-terminated and zero-padded".

> +	 *   - task_lock() to ensure the operation is atomic and the name is
> +	 *     fully updated.
>  	 */
>  	char				comm[TASK_COMM_LEN];
>  
> @@ -1912,10 +1914,27 @@ static inline void set_task_comm(struct task_struct *tsk, const char *from)
>  	__set_task_comm(tsk, from, false);
>  }
>  
> -extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
> +/*
> + * - Why not use task_lock()?
> + *   User space can randomly change their names anyway, so locking for readers
> + *   doesn't make sense. For writers, locking is probably necessary, as a race
> + *   condition could lead to long-term mixed results.
> + *   The strscpy_pad() in __set_task_comm() can ensure that the task comm is
> + *   always NUL-terminated.

This comment has the same imprecission that I noted above.

> Therefore the race condition between reader and
> + *   writer is not an issue.
> + *
> + * - Why not use strscpy_pad()?
> + *   While strscpy_pad() prevents writing garbage past the NUL terminator, which
> + *   is useful when using the task name as a key in a hash map, most use cases
> + *   don't require this. Zero-padding might confuse users if it’s unnecessary,
> + *   and not zeroing might even make it easier to expose bugs. If you need a
> + *   zero-padded task name, please handle that explicitly at the call site.
> + *
> + * - ARRAY_SIZE() can help ensure that @buf is indeed an array.
> + */
>  #define get_task_comm(buf, tsk) ({			\
> -	BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
> -	__get_task_comm(buf, sizeof(buf), tsk);		\
> +	strscpy(buf, (tsk)->comm, ARRAY_SIZE(buf));	\
> +	buf;						\
>  })
>  
>  #ifdef CONFIG_SMP
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index f7be976ff88a..7d001d033cf9 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -101,7 +101,7 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
>  	struct kthread *kthread = to_kthread(tsk);
>  
>  	if (!kthread || !kthread->full_name) {
> -		__get_task_comm(buf, buf_size, tsk);
> +		strscpy(buf, tsk->comm, buf_size);
>  		return;
>  	}

Other than that, LGTM.

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 1/9] Get rid of __get_task_comm()
  2024-08-12  8:05   ` Alejandro Colomar
@ 2024-08-12 13:20     ` Yafang Shao
  0 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-12 13:20 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: akpm, torvalds, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Alexander Viro, Christian Brauner, Jan Kara,
	Kees Cook, Matus Jokay, Serge E. Hallyn

On Mon, Aug 12, 2024 at 4:05 PM Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi Yafang,
>
> On Mon, Aug 12, 2024 at 10:29:25AM GMT, Yafang Shao wrote:
> > We want to eliminate the use of __get_task_comm() for the following
> > reasons:
> >
> > - The task_lock() is unnecessary
> >   Quoted from Linus [0]:
> >   : Since user space can randomly change their names anyway, using locking
> >   : was always wrong for readers (for writers it probably does make sense
> >   : to have some lock - although practically speaking nobody cares there
> >   : either, but at least for a writer some kind of race could have
> >   : long-term mixed results
> >
> > - The BUILD_BUG_ON() doesn't add any value
> >   The only requirement is to ensure that the destination buffer is a valid
> >   array.
> >
> > - Zeroing is not necessary in current use cases
> >   To avoid confusion, we should remove it. Moreover, not zeroing could
> >   potentially make it easier to uncover bugs. If the caller needs a
> >   zero-padded task name, it should be explicitly handled at the call site.
> >
> > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Link: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npJooP8chLPEfaFV0oNYTTBA@mail.gmail.com [0]
> > Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/
> > Suggested-by: Alejandro Colomar <alx@kernel.org>
> > Link: https://lore.kernel.org/all/2jxak5v6dfxlpbxhpm3ey7oup4g2lnr3ueurfbosf5wdo65dk4@srb3hsk72zwq
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Eric Biederman <ebiederm@xmission.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Cc: Matus Jokay <matus.jokay@stuba.sk>
> > Cc: Alejandro Colomar <alx@kernel.org>
> > Cc: "Serge E. Hallyn" <serge@hallyn.com>
> > ---
> >  fs/exec.c             | 10 ----------
> >  fs/proc/array.c       |  2 +-
> >  include/linux/sched.h | 31 +++++++++++++++++++++++++------
> >  kernel/kthread.c      |  2 +-
> >  4 files changed, 27 insertions(+), 18 deletions(-)
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index a47d0e4c54f6..2e468ddd203a 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -1264,16 +1264,6 @@ static int unshare_sighand(struct task_struct *me)
> >       return 0;
> >  }
> >
> > -char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
> > -{
> > -     task_lock(tsk);
> > -     /* Always NUL terminated and zero-padded */
> > -     strscpy_pad(buf, tsk->comm, buf_size);
>
> This comment is correct (see other comments below).
>
> (Except that pedantically, I'd write it as NUL-terminated with a hyphen,
>  just like zero-padded.)
>
> > -     task_unlock(tsk);
> > -     return buf;
> > -}
> > -EXPORT_SYMBOL_GPL(__get_task_comm);
> > -
> >  /*
> >   * These functions flushes out all traces of the currently running executable
> >   * so that a new one can be started
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index 34a47fb0c57f..55ed3510d2bb 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
> >       else if (p->flags & PF_KTHREAD)
> >               get_kthread_comm(tcomm, sizeof(tcomm), p);
> >       else
> > -             __get_task_comm(tcomm, sizeof(tcomm), p);
> > +             get_task_comm(tcomm, p);
>
> LGTM.  (This would have been good even if not removing the helper.)
>
> >
> >       if (escape)
> >               seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 33dd8d9d2b85..e0e26edbda61 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1096,9 +1096,11 @@ struct task_struct {
> >       /*
> >        * executable name, excluding path.
> >        *
> > -      * - normally initialized setup_new_exec()
> > -      * - access it with [gs]et_task_comm()
> > -      * - lock it with task_lock()
> > +      * - normally initialized begin_new_exec()
> > +      * - set it with set_task_comm()
> > +      *   - strscpy_pad() to ensure it is always NUL-terminated
>
> The comment above is inmprecise.
> It should say either
> "strscpy() to ensure it is always NUL-terminated", or
> "strscpy_pad() to ensure it is NUL-terminated and zero-padded".

will change it.

>
> > +      *   - task_lock() to ensure the operation is atomic and the name is
> > +      *     fully updated.
> >        */
> >       char                            comm[TASK_COMM_LEN];
> >
> > @@ -1912,10 +1914,27 @@ static inline void set_task_comm(struct task_struct *tsk, const char *from)
> >       __set_task_comm(tsk, from, false);
> >  }
> >
> > -extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
> > +/*
> > + * - Why not use task_lock()?
> > + *   User space can randomly change their names anyway, so locking for readers
> > + *   doesn't make sense. For writers, locking is probably necessary, as a race
> > + *   condition could lead to long-term mixed results.
> > + *   The strscpy_pad() in __set_task_comm() can ensure that the task comm is
> > + *   always NUL-terminated.
>
> This comment has the same imprecission that I noted above.

will change it.

>
> > Therefore the race condition between reader and
> > + *   writer is not an issue.
> > + *
> > + * - Why not use strscpy_pad()?
> > + *   While strscpy_pad() prevents writing garbage past the NUL terminator, which
> > + *   is useful when using the task name as a key in a hash map, most use cases
> > + *   don't require this. Zero-padding might confuse users if it’s unnecessary,
> > + *   and not zeroing might even make it easier to expose bugs. If you need a
> > + *   zero-padded task name, please handle that explicitly at the call site.
> > + *
> > + * - ARRAY_SIZE() can help ensure that @buf is indeed an array.
> > + */
> >  #define get_task_comm(buf, tsk) ({                   \
> > -     BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);     \
> > -     __get_task_comm(buf, sizeof(buf), tsk);         \
> > +     strscpy(buf, (tsk)->comm, ARRAY_SIZE(buf));     \
> > +     buf;                                            \
> >  })
> >
> >  #ifdef CONFIG_SMP
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index f7be976ff88a..7d001d033cf9 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -101,7 +101,7 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
> >       struct kthread *kthread = to_kthread(tsk);
> >
> >       if (!kthread || !kthread->full_name) {
> > -             __get_task_comm(buf, buf_size, tsk);
> > +             strscpy(buf, tsk->comm, buf_size);
> >               return;
> >       }
>
> Other than that, LGTM.

Thanks for your review.

-- 
Regards
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 7/9] tracing: Replace strncpy() with strscpy()
  2024-08-12  2:29 ` [PATCH v6 7/9] tracing: Replace strncpy() with strscpy() Yafang Shao
@ 2024-08-13 22:19   ` Justin Stitt
  2024-08-13 22:31     ` Justin Stitt
  0 siblings, 1 reply; 15+ messages in thread
From: Justin Stitt @ 2024-08-13 22:19 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Masami Hiramatsu, Mathieu Desnoyers

Hi,

On Mon, Aug 12, 2024 at 10:29:31AM GMT, Yafang Shao wrote:
> Using strscpy() to read the task comm ensures that the name is
> always NUL-terminated, regardless of the source string. This approach also
> facilitates future extensions to the task comm.

Thanks for sending patches replacing str{n}cpy's!

I believe there's at least two more instances of strncpy in trace.c as
well as in trace_events_hist.c (for a grand total of 6 instances in the
files you've touched in this specific patch).

It'd be great if you could replace those instances in this patch as well :>)

This would help greatly with [1].

> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  kernel/trace/trace.c             | 2 +-
>  kernel/trace/trace_events_hist.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 578a49ff5c32..1b2577f9d734 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1907,7 +1907,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
>  	max_data->critical_start = data->critical_start;
>  	max_data->critical_end = data->critical_end;
>  
> -	strncpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
> +	strscpy(max_data->comm, tsk->comm, TASK_COMM_LEN);

If max_data->comm wants to be NUL-terminated then this is the right
replacement. Without knowing how the trace stack works at all, it's hard
for me to tell if that is the case.

There's a length-supplied format specifier for which this comm field is
used with; Either this is just another safeguard against spilling over
the buffer or this field really doesn't care about NUL-termination.
| seq_printf(m, "#    | task: %.16s-%d "
|       "(uid:%d nice:%ld policy:%ld rt_prio:%ld)\n",
|       data->comm, data->pid,

In the event this field doesn't need to be NUL-terminated then we are
introducing an off-by-one error where we are copying one less useful
byte with strscpy -- Linus pointed out earlier [2] that these things all
just want to be c-strings so this is probably the right change :>)

>  	max_data->pid = tsk->pid;
>  	/*
>  	 * If tsk == current, then use current_uid(), as that does not use
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 6ece1308d36a..4cd24c25ce05 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -1599,7 +1599,7 @@ static inline void save_comm(char *comm, struct task_struct *task)
>  		return;
>  	}
>  
> -	strncpy(comm, task->comm, TASK_COMM_LEN);
> +	strscpy(comm, task->comm, TASK_COMM_LEN);
>  }
>  
>  static void hist_elt_data_free(struct hist_elt_data *elt_data)
> -- 
> 2.43.5
> 

Link: https://github.com/KSPP/linux/issues/90 [1]
Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/ [2]

Thanks
Justin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 7/9] tracing: Replace strncpy() with strscpy()
  2024-08-13 22:19   ` Justin Stitt
@ 2024-08-13 22:31     ` Justin Stitt
  2024-08-14  2:34       ` Yafang Shao
  0 siblings, 1 reply; 15+ messages in thread
From: Justin Stitt @ 2024-08-13 22:31 UTC (permalink / raw)
  To: Yafang Shao
  Cc: akpm, torvalds, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Masami Hiramatsu, Mathieu Desnoyers

On Tue, Aug 13, 2024 at 3:19 PM Justin Stitt <justinstitt@google.com> wrote:
>
> Hi,
>
> On Mon, Aug 12, 2024 at 10:29:31AM GMT, Yafang Shao wrote:
> > Using strscpy() to read the task comm ensures that the name is
> > always NUL-terminated, regardless of the source string. This approach also
> > facilitates future extensions to the task comm.
>
> Thanks for sending patches replacing str{n}cpy's!
>
> I believe there's at least two more instances of strncpy in trace.c as
> well as in trace_events_hist.c (for a grand total of 6 instances in the
> files you've touched in this specific patch).
>
> It'd be great if you could replace those instances in this patch as well :>)
>
> This would help greatly with [1].
>

I just saw that Jinjie Ruan sent replacements for these strncpy's too
and tracked down and replaced an instance of strscpy() that was
present in trace.c but was moved to trace_sched_switch.c during a
refactor.

They even used the new 2-argument strscpy which is pretty neat.

See their patch here:
https://lore.kernel.org/all/20240731075058.617588-1-ruanjinjie@huawei.com/

> Link: https://github.com/KSPP/linux/issues/90 [1]
> Link: https://lore.kernel.org/all/CAHk-=whWtUC-AjmGJveAETKOMeMFSTwKwu99v7+b6AyHMmaDFA@mail.gmail.com/ [2]
>
> Thanks
> Justin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v6 7/9] tracing: Replace strncpy() with strscpy()
  2024-08-13 22:31     ` Justin Stitt
@ 2024-08-14  2:34       ` Yafang Shao
  0 siblings, 0 replies; 15+ messages in thread
From: Yafang Shao @ 2024-08-14  2:34 UTC (permalink / raw)
  To: Justin Stitt, ruanjinjie
  Cc: akpm, torvalds, ebiederm, alexei.starovoitov, rostedt,
	catalin.marinas, penguin-kernel, linux-mm, linux-fsdevel,
	linux-trace-kernel, audit, linux-security-module, selinux, bpf,
	netdev, dri-devel, Masami Hiramatsu, Mathieu Desnoyers

On Wed, Aug 14, 2024 at 6:31 AM Justin Stitt <justinstitt@google.com> wrote:
>
> On Tue, Aug 13, 2024 at 3:19 PM Justin Stitt <justinstitt@google.com> wrote:
> >
> > Hi,
> >
> > On Mon, Aug 12, 2024 at 10:29:31AM GMT, Yafang Shao wrote:
> > > Using strscpy() to read the task comm ensures that the name is
> > > always NUL-terminated, regardless of the source string. This approach also
> > > facilitates future extensions to the task comm.
> >
> > Thanks for sending patches replacing str{n}cpy's!
> >
> > I believe there's at least two more instances of strncpy in trace.c as
> > well as in trace_events_hist.c (for a grand total of 6 instances in the
> > files you've touched in this specific patch).
> >
> > It'd be great if you could replace those instances in this patch as well :>)
> >
> > This would help greatly with [1].
> >
>
> I just saw that Jinjie Ruan sent replacements for these strncpy's too
> and tracked down and replaced an instance of strscpy() that was
> present in trace.c but was moved to trace_sched_switch.c during a
> refactor.
>
> They even used the new 2-argument strscpy which is pretty neat.
>
> See their patch here:
> https://lore.kernel.org/all/20240731075058.617588-1-ruanjinjie@huawei.com/

+ Jinjie

That sounds good. Since this change can be handled as a separate
patch, I will drop it from the next version and leave it to Jinjie.
Please note that Steven might have a better solution for handling
task->comm in trace events, so it’s probably best to leave any changes
related to trace events to him [0].

[0] https://lore.kernel.org/all/20240603184016.3374559f@gandalf.local.home/#t

--
Regards
Yafang


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-08-14  2:35 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-12  2:29 [PATCH v6 0/9] Improve the copy of task comm Yafang Shao
2024-08-12  2:29 ` [PATCH v6 1/9] Get rid of __get_task_comm() Yafang Shao
2024-08-12  8:05   ` Alejandro Colomar
2024-08-12 13:20     ` Yafang Shao
2024-08-12  2:29 ` [PATCH v6 2/9] auditsc: Replace memcpy() with strscpy() Yafang Shao
2024-08-12  2:29 ` [PATCH v6 3/9] security: Replace memcpy() with get_task_comm() Yafang Shao
2024-08-12  2:29 ` [PATCH v6 4/9] bpftool: Ensure task comm is always NUL-terminated Yafang Shao
2024-08-12  2:29 ` [PATCH v6 5/9] mm/util: Fix possible race condition in kstrdup() Yafang Shao
2024-08-12  2:29 ` [PATCH v6 6/9] mm/util: Deduplicate code in {kstrdup,kstrndup,kmemdup_nul} Yafang Shao
2024-08-12  2:29 ` [PATCH v6 7/9] tracing: Replace strncpy() with strscpy() Yafang Shao
2024-08-13 22:19   ` Justin Stitt
2024-08-13 22:31     ` Justin Stitt
2024-08-14  2:34       ` Yafang Shao
2024-08-12  2:29 ` [PATCH v6 8/9] net: Replace strcpy() " Yafang Shao
2024-08-12  2:29 ` [PATCH v6 9/9] drm: " Yafang Shao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox