* [PATCH 0/4 v4] exec: inherit HWCAPs from the parent process
@ 2026-02-17 18:01 Andrei Vagin
2026-02-17 18:01 ` [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4 Andrei Vagin
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Andrei Vagin @ 2026-02-17 18:01 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny
This patch series introduces a mechanism to inherit hardware capabilities
(AT_HWCAP, AT_HWCAP2, etc.) from a parent process when they have been
modified via prctl.
To support C/R operations (snapshots, live migration) in heterogeneous
clusters, we must ensure that processes utilize CPU features available
on all potential target nodes. To solve this, we need to advertise a
common feature set across the cluster.
Initially, a cgroup-based approach was considered, but it was decided
that inheriting HWCAPs from a parent process that has set its own
auxiliary vector via prctl is a simpler and more flexible solution.
This implementation adds a new mm flag MMF_USER_HWCAP, which is set when the
auxiliary vector is modified via prctl(PR_SET_MM_AUXV). When execve() is
called, if the current process has MMF_USER_HWCAP set, the HWCAP values are
extracted from the current auxiliary vector and inherited by the new process.
The first patch fixes AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4
in binfmt_elf_fdpic and updates AT_VECTOR_SIZE_BASE.
The second patch implements the core inheritance logic in execve().
The third patch synchronizes saved_auxv access with arg_lock.
The fourth patch adds a selftest to verify that HWCAPs are correctly
inherited across execve().
v4: minor fixes based on feedback from the previous version.
v3: synchronize saved_auxv access with arg_lock
v1: https://lkml.org/lkml/2025/12/5/65
v2: https://lkml.org/lkml/2026/1/8/219
v3: https://lkml.org/lkml/2026/2/9/1233
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chen Ridong <chenridong@huawei.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Andrei Vagin (4):
binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4
exec: inherit HWCAPs from the parent process
mm: synchronize saved_auxv access with arg_lock
selftests/exec: add test for HWCAP inheritance
fs/binfmt_elf.c | 8 +-
fs/binfmt_elf_fdpic.c | 14 ++-
fs/exec.c | 64 ++++++++++++
fs/proc/base.c | 12 ++-
include/linux/auxvec.h | 2 +-
include/linux/binfmts.h | 11 ++
include/linux/mm_types.h | 2 +
kernel/fork.c | 8 ++
kernel/sys.c | 30 +++---
tools/testing/selftests/exec/.gitignore | 1 +
tools/testing/selftests/exec/Makefile | 1 +
tools/testing/selftests/exec/hwcap_inherit.c | 104 +++++++++++++++++++
12 files changed, 231 insertions(+), 26 deletions(-)
create mode 100644 tools/testing/selftests/exec/hwcap_inherit.c
--
2.52.0.351.gbe84eed79e-goog
* [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4
2026-02-17 18:01 [PATCH 0/4 v4] exec: inherit HWCAPs from the parent process Andrei Vagin
@ 2026-02-17 18:01 ` Andrei Vagin
2026-02-17 18:01 ` [PATCH 2/4] exec: inherit HWCAPs from the parent process Andrei Vagin
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Andrei Vagin @ 2026-02-17 18:01 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Andrei Vagin, Mark Brown,
Max Filippov, Alexander Mikhalitsyn
Commit 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4") added
support for AT_HWCAP3 and AT_HWCAP4, but it missed updating the AUX
vector size calculation in create_elf_fdpic_tables() and
AT_VECTOR_SIZE_BASE in include/linux/auxvec.h.
Similar to the fix for AT_HWCAP2 in commit c6a09e342f8e ("binfmt_elf_fdpic:
fix AUXV size calculation when ELF_HWCAP2 is defined"), this omission
leads to a mismatch between the reserved space and the actual number of
AUX entries, eventually triggering a kernel BUG_ON(csp != sp).
Fix this by incrementing nitems when ELF_HWCAP3 or ELF_HWCAP4 are
defined and updating AT_VECTOR_SIZE_BASE.
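The size mismatch described above can be illustrated with a minimal
userspace sketch. It is not the kernel code: the sk_ helpers are
hypothetical names introduced here, and only the expression
nitems * 2 * sizeof(unsigned long) is taken from
create_elf_fdpic_tables().

```c
#include <assert.h>
#include <stddef.h>

/* Bytes reserved on the stack for nitems (type, value) auxv pairs. */
static size_t sk_auxv_reserved_bytes(size_t nitems)
{
	return nitems * 2 * sizeof(unsigned long);
}

/*
 * Bytes by which the writer runs past the reservation when it emits
 * `written` entries but only `counted` entries were reserved.
 */
static size_t sk_auxv_overrun_bytes(size_t counted, size_t written)
{
	assert(written >= counted);
	return sk_auxv_reserved_bytes(written) -
	       sk_auxv_reserved_bytes(counted);
}
```

With AT_HWCAP3 and AT_HWCAP4 both emitted but not counted, the overrun
is two entries, i.e. four unsigned longs, which is the kind of
discrepancy that the csp != sp check catches.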
Cc: Mark Brown <broonie@kernel.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Fixes: 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4")
Signed-off-by: Andrei Vagin <avagin@google.com>
---
fs/binfmt_elf_fdpic.c | 6 ++++++
include/linux/auxvec.h | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 48fd2de3bca0..a3d4e6973b29 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -595,6 +595,12 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
#ifdef ELF_HWCAP2
nitems++;
#endif
+#ifdef ELF_HWCAP3
+ nitems++;
+#endif
+#ifdef ELF_HWCAP4
+ nitems++;
+#endif
csp = sp;
sp -= nitems * 2 * sizeof(unsigned long);
diff --git a/include/linux/auxvec.h b/include/linux/auxvec.h
index 407f7005e6d6..8bcb9b726262 100644
--- a/include/linux/auxvec.h
+++ b/include/linux/auxvec.h
@@ -4,6 +4,6 @@
#include <uapi/linux/auxvec.h>
-#define AT_VECTOR_SIZE_BASE 22 /* NEW_AUX_ENT entries in auxiliary table */
+#define AT_VECTOR_SIZE_BASE 24 /* NEW_AUX_ENT entries in auxiliary table */
/* number of "#define AT_.*" above, minus {AT_NULL, AT_IGNORE, AT_NOTELF} */
#endif /* _LINUX_AUXVEC_H */
--
2.53.0.310.g728cabbaf7-goog
* [PATCH 2/4] exec: inherit HWCAPs from the parent process
2026-02-17 18:01 [PATCH 0/4 v4] exec: inherit HWCAPs from the parent process Andrei Vagin
2026-02-17 18:01 ` [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4 Andrei Vagin
@ 2026-02-17 18:01 ` Andrei Vagin
2026-02-17 18:01 ` [PATCH 3/4] mm: synchronize saved_auxv access with arg_lock Andrei Vagin
2026-02-17 18:01 ` [PATCH 4/4] selftests/exec: add test for HWCAP inheritance Andrei Vagin
3 siblings, 0 replies; 7+ messages in thread
From: Andrei Vagin @ 2026-02-17 18:01 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Andrei Vagin,
Alexander Mikhalitsyn
Introduce a mechanism to inherit hardware capabilities (AT_HWCAP,
AT_HWCAP2, etc.) from a parent process when they have been modified via
prctl.
To support C/R operations (snapshots, live migration) in heterogeneous
clusters, we must ensure that processes utilize CPU features available
on all potential target nodes. To solve this, we need to advertise a
common feature set across the cluster.
This patch adds a new mm flag MMF_USER_HWCAP, which is set when the
auxiliary vector is modified via prctl(PR_SET_MM, PR_SET_MM_AUXV). When
execve() is called, if the current process has MMF_USER_HWCAP set, the
HWCAP values are extracted from the current auxiliary vector and stored
in the linux_binprm structure. These values are then used to populate
the auxiliary vector of the new process, effectively inheriting the
hardware capabilities.
The inherited HWCAPs are masked with the hardware capabilities supported
by the current kernel to ensure that we don't report more features than
actually supported. This is important to avoid unexpected behavior,
especially for processes with additional privileges.
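The masking step can be sketched in userspace as follows. This is an
illustration under assumptions, not the code from this patch: struct
sk_hwcaps, sk_inherit_hwcap() and the SK_AT_* names are hypothetical,
the numeric values mirror AT_NULL, AT_HWCAP and AT_HWCAP2 from <elf.h>,
and only two HWCAP words are handled to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local names; values match AT_NULL, AT_HWCAP, AT_HWCAP2. */
enum { SK_AT_NULL = 0, SK_AT_HWCAP = 16, SK_AT_HWCAP2 = 26 };

struct sk_hwcaps {
	unsigned long hwcap;
	unsigned long hwcap2;
};

/*
 * Walk the (type, value) pairs of a saved auxv and compute the HWCAPs to
 * advertise: the user-supplied value ANDed with what is supported.
 * Entries missing from auxv keep the supported default.
 */
static void sk_inherit_hwcap(const unsigned long *auxv, size_t nwords,
			     const struct sk_hwcaps *supported,
			     struct sk_hwcaps *out)
{
	assert(auxv && supported && out);

	/* Default: advertise everything that is supported. */
	*out = *supported;

	for (size_t i = 0; i + 1 < nwords; i += 2) {
		unsigned long type = auxv[i];
		unsigned long val = auxv[i + 1];

		if (type == SK_AT_NULL)
			break;
		if (type == SK_AT_HWCAP)
			out->hwcap = val & supported->hwcap;
		else if (type == SK_AT_HWCAP2)
			out->hwcap2 = val & supported->hwcap2;
	}
}
```

Because of the AND, a bit that the parent set but the hardware lacks is
dropped, while a bit the parent cleared stays cleared in the new
process.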
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Signed-off-by: Andrei Vagin <avagin@google.com>
---
fs/binfmt_elf.c | 8 ++---
fs/binfmt_elf_fdpic.c | 8 ++---
fs/exec.c | 63 ++++++++++++++++++++++++++++++++++++++++
include/linux/binfmts.h | 11 +++++++
include/linux/mm_types.h | 2 ++
kernel/fork.c | 3 ++
kernel/sys.c | 5 +++-
7 files changed, 91 insertions(+), 9 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 3eb734c192e9..aec129e33f0b 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -246,7 +246,7 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
*/
ARCH_DLINFO;
#endif
- NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
+ NEW_AUX_ENT(AT_HWCAP, bprm->hwcap);
NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC);
NEW_AUX_ENT(AT_PHDR, phdr_addr);
@@ -264,13 +264,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
NEW_AUX_ENT(AT_SECURE, bprm->secureexec);
NEW_AUX_ENT(AT_RANDOM, (elf_addr_t)(unsigned long)u_rand_bytes);
#ifdef ELF_HWCAP2
- NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2);
+ NEW_AUX_ENT(AT_HWCAP2, bprm->hwcap2);
#endif
#ifdef ELF_HWCAP3
- NEW_AUX_ENT(AT_HWCAP3, ELF_HWCAP3);
+ NEW_AUX_ENT(AT_HWCAP3, bprm->hwcap3);
#endif
#ifdef ELF_HWCAP4
- NEW_AUX_ENT(AT_HWCAP4, ELF_HWCAP4);
+ NEW_AUX_ENT(AT_HWCAP4, bprm->hwcap4);
#endif
NEW_AUX_ENT(AT_EXECFN, bprm->exec);
if (k_platform) {
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index a3d4e6973b29..55b482f03c82 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -629,15 +629,15 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
*/
ARCH_DLINFO;
#endif
- NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
+ NEW_AUX_ENT(AT_HWCAP, bprm->hwcap);
#ifdef ELF_HWCAP2
- NEW_AUX_ENT(AT_HWCAP2, ELF_HWCAP2);
+ NEW_AUX_ENT(AT_HWCAP2, bprm->hwcap2);
#endif
#ifdef ELF_HWCAP3
- NEW_AUX_ENT(AT_HWCAP3, ELF_HWCAP3);
+ NEW_AUX_ENT(AT_HWCAP3, bprm->hwcap3);
#endif
#ifdef ELF_HWCAP4
- NEW_AUX_ENT(AT_HWCAP4, ELF_HWCAP4);
+ NEW_AUX_ENT(AT_HWCAP4, bprm->hwcap4);
#endif
NEW_AUX_ENT(AT_PAGESZ, PAGE_SIZE);
NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC);
diff --git a/fs/exec.c b/fs/exec.c
index 2e3a6593c6fd..9c70776fca9e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1454,6 +1454,17 @@ static struct linux_binprm *alloc_bprm(int fd, struct filename *filename, int fl
*/
bprm->is_check = !!(flags & AT_EXECVE_CHECK);
+ bprm->hwcap = ELF_HWCAP;
+#ifdef ELF_HWCAP2
+ bprm->hwcap2 = ELF_HWCAP2;
+#endif
+#ifdef ELF_HWCAP3
+ bprm->hwcap3 = ELF_HWCAP3;
+#endif
+#ifdef ELF_HWCAP4
+ bprm->hwcap4 = ELF_HWCAP4;
+#endif
+
retval = bprm_mm_init(bprm);
if (!retval)
return bprm;
@@ -1775,6 +1786,55 @@ static int bprm_execve(struct linux_binprm *bprm)
return retval;
}
+static void inherit_hwcap(struct linux_binprm *bprm)
+{
+ struct mm_struct *mm = current->mm;
+ int i, n;
+
+#ifdef ELF_HWCAP4
+ n = 4;
+#elif defined(ELF_HWCAP3)
+ n = 3;
+#elif defined(ELF_HWCAP2)
+ n = 2;
+#else
+ n = 1;
+#endif
+
+ for (i = 0; n && i < AT_VECTOR_SIZE; i += 2) {
+ unsigned long type = mm->saved_auxv[i];
+ unsigned long val = mm->saved_auxv[i + 1];
+
+ switch (type) {
+ case AT_NULL:
+ goto done;
+ case AT_HWCAP:
+ bprm->hwcap = val & ELF_HWCAP;
+ break;
+#ifdef ELF_HWCAP2
+ case AT_HWCAP2:
+ bprm->hwcap2 = val & ELF_HWCAP2;
+ break;
+#endif
+#ifdef ELF_HWCAP3
+ case AT_HWCAP3:
+ bprm->hwcap3 = val & ELF_HWCAP3;
+ break;
+#endif
+#ifdef ELF_HWCAP4
+ case AT_HWCAP4:
+ bprm->hwcap4 = val & ELF_HWCAP4;
+ break;
+#endif
+ default:
+ continue;
+ }
+ n--;
+ }
+done:
+ mm_flags_set(MMF_USER_HWCAP, bprm->mm);
+}
+
static int do_execveat_common(int fd, struct filename *filename,
struct user_arg_ptr argv,
struct user_arg_ptr envp,
@@ -1843,6 +1903,9 @@ static int do_execveat_common(int fd, struct filename *filename,
current->comm, bprm->filename);
}
+ if (mm_flags_test(MMF_USER_HWCAP, current->mm))
+ inherit_hwcap(bprm);
+
return bprm_execve(bprm);
}
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 65abd5ab8836..94a3dcf9b1d2 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -2,6 +2,7 @@
#ifndef _LINUX_BINFMTS_H
#define _LINUX_BINFMTS_H
+#include <linux/elf.h>
#include <linux/sched.h>
#include <linux/unistd.h>
#include <asm/exec.h>
@@ -67,6 +68,16 @@ struct linux_binprm {
unsigned long exec;
struct rlimit rlim_stack; /* Saved RLIMIT_STACK used during exec. */
+ unsigned long hwcap;
+#ifdef ELF_HWCAP2
+ unsigned long hwcap2;
+#endif
+#ifdef ELF_HWCAP3
+ unsigned long hwcap3;
+#endif
+#ifdef ELF_HWCAP4
+ unsigned long hwcap4;
+#endif
char buf[BINPRM_BUF_SIZE];
} __randomize_layout;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8731606d8d36..2f3c6ad48c0a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1918,6 +1918,8 @@ enum {
#define MMF_TOPDOWN 31 /* mm searches top down by default */
#define MMF_TOPDOWN_MASK BIT(MMF_TOPDOWN)
+#define MMF_USER_HWCAP 32 /* user-defined HWCAPs */
+
#define MMF_INIT_LEGACY_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
diff --git a/kernel/fork.c b/kernel/fork.c
index e832da9d15a4..4c92a2bc3cbb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1104,6 +1104,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
__mm_flags_overwrite_word(mm, mmf_init_legacy_flags(flags));
mm->def_flags = current->mm->def_flags & VM_INIT_DEF_MASK;
+
+ if (mm_flags_test(MMF_USER_HWCAP, current->mm))
+ mm_flags_set(MMF_USER_HWCAP, mm);
} else {
__mm_flags_overwrite_word(mm, default_dump_filter);
mm->def_flags = 0;
diff --git a/kernel/sys.c b/kernel/sys.c
index cdbf8513caf6..e4b0fa2f6845 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2157,8 +2157,10 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
* not introduce additional locks here making the kernel
* more complex.
*/
- if (prctl_map.auxv_size)
+ if (prctl_map.auxv_size) {
memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));
+ mm_flags_set(MMF_USER_HWCAP, mm);
+ }
mmap_read_unlock(mm);
return 0;
@@ -2190,6 +2192,7 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr,
task_lock(current);
memcpy(mm->saved_auxv, user_auxv, len);
+ mm_flags_set(MMF_USER_HWCAP, mm);
task_unlock(current);
return 0;
--
2.53.0.310.g728cabbaf7-goog
* [PATCH 3/4] mm: synchronize saved_auxv access with arg_lock
2026-02-17 18:01 [PATCH 0/4 v4] exec: inherit HWCAPs from the parent process Andrei Vagin
2026-02-17 18:01 ` [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4 Andrei Vagin
2026-02-17 18:01 ` [PATCH 2/4] exec: inherit HWCAPs from the parent process Andrei Vagin
@ 2026-02-17 18:01 ` Andrei Vagin
2026-02-17 18:01 ` [PATCH 4/4] selftests/exec: add test for HWCAP inheritance Andrei Vagin
3 siblings, 0 replies; 7+ messages in thread
From: Andrei Vagin @ 2026-02-17 18:01 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Andrei Vagin,
Alexander Mikhalitsyn
The mm->saved_auxv array stores the auxiliary vector, which can be
modified via prctl(PR_SET_MM_AUXV) or prctl(PR_SET_MM_MAP). Previously,
accesses to saved_auxv were not synchronized. This was an intentional
trade-off, as the vector was only used to provide information to
userspace via /proc/PID/auxv or prctl(PR_GET_AUXV), and consistency
between the auxv values was left to userspace.
With the introduction of hardware capability (HWCAP) inheritance during
execve, the kernel now relies on the contents of saved_auxv to configure
the execution environment of new processes. An unsynchronized read
during execve could result in a new process inheriting an inconsistent
set of capabilities if the parent process updates its auxiliary vector
concurrently.
While it is still not strictly required to guarantee the consistency of
auxv values on the kernel side, doing so is relatively straightforward.
This change implements synchronization using arg_lock.
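The locking pattern used by the readers (copy under the lock, then work
on the private copy) can be sketched in userspace like this. It is only
an analogy: struct sk_mm and the sk_ helpers are hypothetical, and a
C11 atomic_flag spinlock stands in for the kernel's arg_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SK_AUXV_WORDS 8	/* hypothetical size, stands in for AT_VECTOR_SIZE */

struct sk_mm {
	atomic_flag lock;	/* stands in for mm->arg_lock */
	unsigned long saved_auxv[SK_AUXV_WORDS];
};

static void sk_mm_init(struct sk_mm *mm)
{
	atomic_flag_clear(&mm->lock);
	memset(mm->saved_auxv, 0, sizeof(mm->saved_auxv));
}

static void sk_lock(struct sk_mm *mm)
{
	while (atomic_flag_test_and_set_explicit(&mm->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the flag */
}

static void sk_unlock(struct sk_mm *mm)
{
	atomic_flag_clear_explicit(&mm->lock, memory_order_release);
}

/* Writer: replace the whole vector while holding the lock. */
static void sk_set_auxv(struct sk_mm *mm, const unsigned long *src)
{
	sk_lock(mm);
	memcpy(mm->saved_auxv, src, sizeof(mm->saved_auxv));
	sk_unlock(mm);
}

/*
 * Reader: take a consistent snapshot under the lock, then count words
 * up to and including the terminating AT_NULL pair on the private copy.
 */
static size_t sk_snapshot_auxv(struct sk_mm *mm, unsigned long *dst)
{
	size_t nwords = 0;

	assert(mm && dst);
	sk_lock(mm);
	memcpy(dst, mm->saved_auxv, sizeof(mm->saved_auxv));
	sk_unlock(mm);

	do {
		nwords += 2;
	} while (nwords < SK_AUXV_WORDS && dst[nwords - 2] != 0);
	return nwords;
}
```

The lock is held only for the memcpy, so anything that may block (such
as copying to userspace in the real code) happens on the local copy
after the lock has been dropped.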
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Andrei Vagin <avagin@google.com>
---
fs/exec.c | 2 ++
fs/proc/base.c | 12 +++++++++---
include/linux/mm_types.h | 1 -
kernel/fork.c | 7 ++++++-
kernel/sys.c | 29 ++++++++++++++---------------
5 files changed, 31 insertions(+), 20 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 9c70776fca9e..8f5fba06aff8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1801,6 +1801,7 @@ static void inherit_hwcap(struct linux_binprm *bprm)
n = 1;
#endif
+ spin_lock(&mm->arg_lock);
for (i = 0; n && i < AT_VECTOR_SIZE; i += 2) {
unsigned long type = mm->saved_auxv[i];
unsigned long val = mm->saved_auxv[i + 1];
@@ -1832,6 +1833,7 @@ static void inherit_hwcap(struct linux_binprm *bprm)
n--;
}
done:
+ spin_unlock(&mm->arg_lock);
mm_flags_set(MMF_USER_HWCAP, bprm->mm);
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4eec684baca9..09d887741268 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1083,14 +1083,20 @@ static ssize_t auxv_read(struct file *file, char __user *buf,
{
struct mm_struct *mm = file->private_data;
unsigned int nwords = 0;
+ unsigned long saved_auxv[AT_VECTOR_SIZE];
if (!mm)
return 0;
+
+ spin_lock(&mm->arg_lock);
+ memcpy(saved_auxv, mm->saved_auxv, sizeof(saved_auxv));
+ spin_unlock(&mm->arg_lock);
+
do {
nwords += 2;
- } while (mm->saved_auxv[nwords - 2] != 0); /* AT_NULL */
- return simple_read_from_buffer(buf, count, ppos, mm->saved_auxv,
- nwords * sizeof(mm->saved_auxv[0]));
+ } while (saved_auxv[nwords - 2] != 0); /* AT_NULL */
+ return simple_read_from_buffer(buf, count, ppos, saved_auxv,
+ nwords * sizeof(saved_auxv[0]));
}
static const struct file_operations proc_auxv_operations = {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2f3c6ad48c0a..d1a95b90e448 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1254,7 +1254,6 @@ struct mm_struct {
unsigned long start_code, end_code, start_data, end_data;
unsigned long start_brk, brk, start_stack;
unsigned long arg_start, arg_end, env_start, env_end;
-
unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */
#ifdef CONFIG_ARCH_HAS_ELF_CORE_EFLAGS
diff --git a/kernel/fork.c b/kernel/fork.c
index 4c92a2bc3cbb..e17e57e29b6a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1105,8 +1105,13 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
__mm_flags_overwrite_word(mm, mmf_init_legacy_flags(flags));
mm->def_flags = current->mm->def_flags & VM_INIT_DEF_MASK;
- if (mm_flags_test(MMF_USER_HWCAP, current->mm))
+ if (mm_flags_test(MMF_USER_HWCAP, current->mm)) {
+ spin_lock(&current->mm->arg_lock);
mm_flags_set(MMF_USER_HWCAP, mm);
+ memcpy(mm->saved_auxv, current->mm->saved_auxv,
+ sizeof(mm->saved_auxv));
+ spin_unlock(&current->mm->arg_lock);
+ }
} else {
__mm_flags_overwrite_word(mm, default_dump_filter);
mm->def_flags = 0;
diff --git a/kernel/sys.c b/kernel/sys.c
index e4b0fa2f6845..c679b5797e73 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2147,20 +2147,11 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
mm->arg_end = prctl_map.arg_end;
mm->env_start = prctl_map.env_start;
mm->env_end = prctl_map.env_end;
- spin_unlock(&mm->arg_lock);
-
- /*
- * Note this update of @saved_auxv is lockless thus
- * if someone reads this member in procfs while we're
- * updating -- it may get partly updated results. It's
- * known and acceptable trade off: we leave it as is to
- * not introduce additional locks here making the kernel
- * more complex.
- */
if (prctl_map.auxv_size) {
- memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));
mm_flags_set(MMF_USER_HWCAP, mm);
+ memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));
}
+ spin_unlock(&mm->arg_lock);
mmap_read_unlock(mm);
return 0;
@@ -2190,10 +2181,10 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr,
BUILD_BUG_ON(sizeof(user_auxv) != sizeof(mm->saved_auxv));
- task_lock(current);
- memcpy(mm->saved_auxv, user_auxv, len);
+ spin_lock(&mm->arg_lock);
mm_flags_set(MMF_USER_HWCAP, mm);
- task_unlock(current);
+ memcpy(mm->saved_auxv, user_auxv, len);
+ spin_unlock(&mm->arg_lock);
return 0;
}
@@ -2481,9 +2472,17 @@ static inline int prctl_get_mdwe(unsigned long arg2, unsigned long arg3,
static int prctl_get_auxv(void __user *addr, unsigned long len)
{
struct mm_struct *mm = current->mm;
+ unsigned long auxv[AT_VECTOR_SIZE];
unsigned long size = min_t(unsigned long, sizeof(mm->saved_auxv), len);
- if (size && copy_to_user(addr, mm->saved_auxv, size))
+ if (!size)
+ return sizeof(mm->saved_auxv);
+
+ spin_lock(&mm->arg_lock);
+ memcpy(auxv, mm->saved_auxv, size);
+ spin_unlock(&mm->arg_lock);
+
+ if (copy_to_user(addr, auxv, size))
return -EFAULT;
return sizeof(mm->saved_auxv);
}
--
2.53.0.310.g728cabbaf7-goog
* [PATCH 4/4] selftests/exec: add test for HWCAP inheritance
2026-02-17 18:01 [PATCH 0/4 v4] exec: inherit HWCAPs from the parent process Andrei Vagin
` (2 preceding siblings ...)
2026-02-17 18:01 ` [PATCH 3/4] mm: synchronize saved_auxv access with arg_lock Andrei Vagin
@ 2026-02-17 18:01 ` Andrei Vagin
3 siblings, 0 replies; 7+ messages in thread
From: Andrei Vagin @ 2026-02-17 18:01 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Andrei Vagin,
Alexander Mikhalitsyn
Verify that HWCAPs are correctly inherited/preserved across execve() when
modified via prctl(PR_SET_MM_AUXV).
The test performs the following steps:
* reads the current AUXV using prctl(PR_GET_AUXV);
* finds an HWCAP entry and toggles its most significant bit;
* replaces the AUXV of the current process with the modified one using
prctl(PR_SET_MM, PR_SET_MM_AUXV);
* executes itself to verify that the new program sees the modified HWCAP
value.
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
---
tools/testing/selftests/exec/.gitignore | 1 +
tools/testing/selftests/exec/Makefile | 1 +
tools/testing/selftests/exec/hwcap_inherit.c | 105 +++++++++++++++++++
3 files changed, 107 insertions(+)
create mode 100644 tools/testing/selftests/exec/hwcap_inherit.c
diff --git a/tools/testing/selftests/exec/.gitignore b/tools/testing/selftests/exec/.gitignore
index 7f3d1ae762ec..2ff245fd0ba6 100644
--- a/tools/testing/selftests/exec/.gitignore
+++ b/tools/testing/selftests/exec/.gitignore
@@ -19,3 +19,4 @@ null-argv
xxxxxxxx*
pipe
S_I*.test
+hwcap_inherit
\ No newline at end of file
diff --git a/tools/testing/selftests/exec/Makefile b/tools/testing/selftests/exec/Makefile
index 45a3cfc435cf..e73005965e05 100644
--- a/tools/testing/selftests/exec/Makefile
+++ b/tools/testing/selftests/exec/Makefile
@@ -20,6 +20,7 @@ TEST_FILES := Makefile
TEST_GEN_PROGS += recursion-depth
TEST_GEN_PROGS += null-argv
TEST_GEN_PROGS += check-exec
+TEST_GEN_PROGS += hwcap_inherit
EXTRA_CLEAN := $(OUTPUT)/subdir.moved $(OUTPUT)/execveat.moved $(OUTPUT)/xxxxx* \
$(OUTPUT)/S_I*.test
diff --git a/tools/testing/selftests/exec/hwcap_inherit.c b/tools/testing/selftests/exec/hwcap_inherit.c
new file mode 100644
index 000000000000..1b43b2dbb1d0
--- /dev/null
+++ b/tools/testing/selftests/exec/hwcap_inherit.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <sys/wait.h>
+#include <linux/prctl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>
+#include <elf.h>
+#include <linux/auxvec.h>
+
+#include "../kselftest.h"
+
+static int find_msb(unsigned long v)
+{
+ return sizeof(v)*8 - __builtin_clzl(v) - 1;
+}
+
+int main(int argc, char *argv[])
+{
+ unsigned long auxv[1024], hwcap, new_hwcap, hwcap_idx;
+ int size, hwcap_type = 0, hwcap_feature, count, status;
+ char hwcap_str[32], hwcap_type_str[32];
+ pid_t pid;
+
+ if (argc > 3 && strcmp(argv[1], "verify") == 0) {
+ unsigned long type = strtoul(argv[2], NULL, 16);
+ unsigned long expected = strtoul(argv[3], NULL, 16);
+ unsigned long hwcap = getauxval(type);
+
+ if (hwcap != expected) {
+ ksft_print_msg("HWCAP mismatch: type %lx, expected %lx, got %lx\n",
+ type, expected, hwcap);
+ return 1;
+ }
+ ksft_print_msg("HWCAP matched: %lx\n", hwcap);
+ return 0;
+ }
+
+ ksft_print_header();
+ ksft_set_plan(1);
+
+ size = prctl(PR_GET_AUXV, auxv, sizeof(auxv), 0, 0);
+ if (size == -1)
+ ksft_exit_fail_perror("prctl(PR_GET_AUXV)");
+
+ count = size / sizeof(unsigned long);
+
+ /* Find the "latest" feature and try to mask it out. */
+ for (int i = 0; i < count - 1; i += 2) {
+ hwcap = auxv[i + 1];
+ if (hwcap == 0)
+ continue;
+ switch (auxv[i]) {
+ case AT_HWCAP4:
+ case AT_HWCAP3:
+ case AT_HWCAP2:
+ case AT_HWCAP:
+ hwcap_type = auxv[i];
+ hwcap_feature = find_msb(hwcap);
+ hwcap_idx = i + 1;
+ break;
+ default:
+ continue;
+ }
+ }
+ if (hwcap_type == 0)
+ ksft_exit_skip("No features found, skipping test\n");
+ hwcap = auxv[hwcap_idx];
+ new_hwcap = hwcap ^ (1UL << hwcap_feature);
+ auxv[hwcap_idx] = new_hwcap;
+
+ if (prctl(PR_SET_MM, PR_SET_MM_AUXV, auxv, size, 0) < 0) {
+ if (errno == EPERM)
+ ksft_exit_skip("prctl(PR_SET_MM_AUXV) requires CAP_SYS_RESOURCE\n");
+ ksft_exit_fail_perror("prctl(PR_SET_MM_AUXV)");
+ }
+
+ pid = fork();
+ if (pid < 0)
+ ksft_exit_fail_perror("fork");
+ if (pid == 0) {
+ char *new_argv[] = { argv[0], "verify", hwcap_type_str, hwcap_str, NULL };
+
+ snprintf(hwcap_str, sizeof(hwcap_str), "%lx", new_hwcap);
+ snprintf(hwcap_type_str, sizeof(hwcap_type_str), "%x", hwcap_type);
+
+ execv(argv[0], new_argv);
+ perror("execv");
+ exit(1);
+ }
+
+ if (waitpid(pid, &status, 0) == -1)
+ ksft_exit_fail_perror("waitpid");
+ if (status != 0)
+ ksft_exit_fail_msg("HWCAP inheritance failed (status %d)\n", status);
+
+ ksft_test_result_pass("HWCAP inheritance succeeded\n");
+ ksft_exit_pass();
+ return 0;
+}
--
2.53.0.310.g728cabbaf7-goog
* Re: [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4
2026-02-09 19:06 ` [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4 Andrei Vagin
@ 2026-02-10 19:59 ` Alexander Mikhalitsyn
0 siblings, 0 replies; 7+ messages in thread
From: Alexander Mikhalitsyn @ 2026-02-10 19:59 UTC (permalink / raw)
To: Andrei Vagin
Cc: Kees Cook, Andrew Morton, Cyrill Gorcunov, Mike Rapoport,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Mark Brown, Max Filippov
On Mon, Feb 9, 2026 at 20:06, Andrei Vagin <avagin@google.com> wrote:
>
> Commit 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4") added
> support for AT_HWCAP3 and AT_HWCAP4, but it missed updating the AUX
> vector size calculation in create_elf_fdpic_tables() and
> AT_VECTOR_SIZE_BASE in include/linux/auxvec.h.
>
> Similar to the fix for AT_HWCAP2 in commit c6a09e342f8e ("binfmt_elf_fdpic:
> fix AUXV size calculation when ELF_HWCAP2 is defined"), this omission
> leads to a mismatch between the reserved space and the actual number of
> AUX entries, eventually triggering a kernel BUG_ON(csp != sp).
>
> Fix this by incrementing nitems when ELF_HWCAP3 or ELF_HWCAP4 are
> defined and updating AT_VECTOR_SIZE_BASE.
>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Max Filippov <jcmvbkbc@gmail.com>
> Reviewed-by: Michal Koutný <mkoutny@suse.com>
> Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@futurfusion.io>
> Fixes: 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4")
> Signed-off-by: Andrei Vagin <avagin@google.com>
> ---
> fs/binfmt_elf_fdpic.c | 6 ++++++
> include/linux/auxvec.h | 2 +-
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index 48fd2de3bca0..a3d4e6973b29 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -595,6 +595,12 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
> #ifdef ELF_HWCAP2
> nitems++;
> #endif
> +#ifdef ELF_HWCAP3
> + nitems++;
> +#endif
> +#ifdef ELF_HWCAP4
> + nitems++;
> +#endif
>
> csp = sp;
> sp -= nitems * 2 * sizeof(unsigned long);
> diff --git a/include/linux/auxvec.h b/include/linux/auxvec.h
> index 407f7005e6d6..8bcb9b726262 100644
> --- a/include/linux/auxvec.h
> +++ b/include/linux/auxvec.h
> @@ -4,6 +4,6 @@
>
> #include <uapi/linux/auxvec.h>
>
> -#define AT_VECTOR_SIZE_BASE 22 /* NEW_AUX_ENT entries in auxiliary table */
> +#define AT_VECTOR_SIZE_BASE 24 /* NEW_AUX_ENT entries in auxiliary table */
> /* number of "#define AT_.*" above, minus {AT_NULL, AT_IGNORE, AT_NOTELF} */
> #endif /* _LINUX_AUXVEC_H */
> --
> 2.53.0.239.g8d8fc8a987-goog
>
* [PATCH 1/4] binfmt_elf_fdpic: fix AUXV size calculation for ELF_HWCAP3 and ELF_HWCAP4
2026-02-09 19:06 [PATCH 0/4 v3] exec: inherit HWCAPs from the parent process Andrei Vagin
@ 2026-02-09 19:06 ` Andrei Vagin
2026-02-10 19:59 ` Alexander Mikhalitsyn
0 siblings, 1 reply; 7+ messages in thread
From: Andrei Vagin @ 2026-02-09 19:06 UTC (permalink / raw)
To: Kees Cook, Andrew Morton
Cc: Cyrill Gorcunov, Mike Rapoport, Alexander Mikhalitsyn,
linux-kernel, linux-fsdevel, linux-mm, criu, Chen Ridong,
Christian Brauner, David Hildenbrand, Eric Biederman,
Lorenzo Stoakes, Michal Koutny, Andrei Vagin, Mark Brown,
Max Filippov
Commit 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4") added
support for AT_HWCAP3 and AT_HWCAP4, but it missed updating the AUX
vector size calculation in create_elf_fdpic_tables() and
AT_VECTOR_SIZE_BASE in include/linux/auxvec.h.
Similar to the fix for AT_HWCAP2 in commit c6a09e342f8e ("binfmt_elf_fdpic:
fix AUXV size calculation when ELF_HWCAP2 is defined"), this omission
leads to a mismatch between the reserved space and the actual number of
AUX entries, eventually triggering a kernel BUG_ON(csp != sp).
Fix this by incrementing nitems when ELF_HWCAP3 or ELF_HWCAP4 are
defined and updating AT_VECTOR_SIZE_BASE.
Cc: Mark Brown <broonie@kernel.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Fixes: 4e6e8c2b757f ("binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4")
Signed-off-by: Andrei Vagin <avagin@google.com>
---
fs/binfmt_elf_fdpic.c | 6 ++++++
include/linux/auxvec.h | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 48fd2de3bca0..a3d4e6973b29 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -595,6 +595,12 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
#ifdef ELF_HWCAP2
nitems++;
#endif
+#ifdef ELF_HWCAP3
+ nitems++;
+#endif
+#ifdef ELF_HWCAP4
+ nitems++;
+#endif
csp = sp;
sp -= nitems * 2 * sizeof(unsigned long);
diff --git a/include/linux/auxvec.h b/include/linux/auxvec.h
index 407f7005e6d6..8bcb9b726262 100644
--- a/include/linux/auxvec.h
+++ b/include/linux/auxvec.h
@@ -4,6 +4,6 @@
#include <uapi/linux/auxvec.h>
-#define AT_VECTOR_SIZE_BASE 22 /* NEW_AUX_ENT entries in auxiliary table */
+#define AT_VECTOR_SIZE_BASE 24 /* NEW_AUX_ENT entries in auxiliary table */
/* number of "#define AT_.*" above, minus {AT_NULL, AT_IGNORE, AT_NOTELF} */
#endif /* _LINUX_AUXVEC_H */
--
2.53.0.239.g8d8fc8a987-goog