From: Fangrui Song <maskray@sourceware.org>
To: Jens Remus <jremus@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
bpf@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org,
Steven Rostedt <rostedt@kernel.org>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andrii Nakryiko <andrii@kernel.org>,
Indu Bhagat <indu.bhagat@oracle.com>,
"Jose E. Marchesi" <jemarch@gnu.org>,
Beau Belgrave <beaub@linux.microsoft.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Florian Weimer <fweimer@redhat.com>, Kees Cook <kees@kernel.org>,
Carlos O'Donell <codonell@redhat.com>,
Sam James <sam@gentoo.org>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH v11 00/15] unwind_deferred: Implement sframe handling
Date: Thu, 23 Oct 2025 01:09:02 -0700
Message-ID: <mzqhpduikzpeczmmxh5uyfzjpvdeae3ityqyp2rno4myujzb6y@ey346eksvoyf>
In-Reply-To: <20251022144326.4082059-1-jremus@linux.ibm.com>
On 2025-10-22, Jens Remus wrote:
>This is the implementation of parsing the SFrame section in an ELF file.
>It's a continuation of Josh's and Steve's last work that can be found
>here:
>
> https://lore.kernel.org/all/cover.1737511963.git.jpoimboe@kernel.org/
> https://lore.kernel.org/all/20250827201548.448472904@kernel.org/
>
>Currently the only way to get a user space stack trace from a stack
>walk (and not just copying large amount of user stack into the kernel
>ring buffer) is to use frame pointers. This has a few issues. The biggest
>one is that compiling frame pointers into every application and library
>has been shown to cause performance overhead.
>
>Another issue is that the format of the frames may not always be consistent
>between different compilers and some architectures (s390) have no defined
>format to do a reliable stack walk. The only way to perform user space
>profiling on these architectures is to copy the user stack into the kernel
>buffer.
>
>SFrames[1] is now supported in gcc binutils and soon will also be supported
>by LLVM.
Please consider dropping the statement "soon will also be supported by LLVM."
Speaking as the maintainer of LLVM's MC, lld/ELF, and binary utilities, I have
significant concerns about the v2 format, specifically its apparent disregard
for standard ELF and linker conventions
(https://maskray.me/blog/2025-09-28-remarks-on-sframe#linking-and-execution-views).
To the arm64 maintainers: this is a critical time to revisit the unwind
information format, as I have outlined in my blog post:
A sorted address table like .eh_frame_hdr might still be needed, but the
design could be very different for arm64.
I am curious whether anyone has considered a library that parses .eh_frame and
generates SFrame. If objtool integrated such a library, it could generate
SFrame for vmlinux and modules without relying on the assembler or linker.
Assemblers and linkers require a level of format stability that is currently
lacking, which is concerning on the toolchain side.
(https://sourceware.org/pipermail/binutils/2025-October/144974.html
"This "linker will DTRT" assertion glosses over significant
implementation complexity. Each version needs not just a reader but
version-specific *merging* logic in every linker—fundamentally different
from simply reading a format.")
>SFrame acts more like ORC, and lives in the ELF executable
>file as its own section. Like ORC it has two tables where the first table
>is sorted by instruction pointers (IP) and using the current IP and finding
>its entry in the first table, it will take you to the second table which
>will tell you where the return address of the current function is located
>and then you can use that address to look it up in the first table to find
>the return address of that function, and so on. This performs a user
>space stack walk.
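
For readers unfamiliar with this two-table scheme, the lookup described above
can be sketched roughly as follows. This is an illustration only, not the
kernel's or binutils' code: struct func_entry, find_func() and ra_slot() are
hypothetical names, and the sketch collapses the second table to a single row
per function, whereas real SFrame keeps several frame row entries per function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry: one unwind row per function. */
struct func_entry {
	uint64_t start;    /* function start address (table is sorted by this) */
	uint64_t size;     /* function size in bytes */
	int32_t  cfa_off;  /* CFA = SP + cfa_off */
	int32_t  ra_off;   /* return address is stored at CFA + ra_off */
};

/* Binary search for the entry whose [start, start + size) covers ip. */
static const struct func_entry *find_func(const struct func_entry *tab,
					  size_t n, uint64_t ip)
{
	size_t lo = 0, hi = n;

	/* Find the first entry with start > ip; the candidate precedes it. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].start <= ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;
	if (ip - tab[lo - 1].start >= tab[lo - 1].size)
		return NULL;	/* ip falls in a gap between functions */
	return &tab[lo - 1];
}

/* Address of the stack slot that holds the caller's return address. */
static uint64_t ra_slot(const struct func_entry *f, uint64_t sp)
{
	return sp + (uint64_t)(int64_t)f->cfa_off + (uint64_t)(int64_t)f->ra_off;
}
```

A walker would read the return address from ra_slot(), set SP to the CFA, and
repeat find_func() with the new IP until no entry is found.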
>
>Now because the SFrame section lives in the ELF file it needs to be faulted
>into memory when it is used. This means that walking the user space stack
>requires being in a faultable context. As profilers like perf request a stack
>trace in interrupt or NMI context, it cannot do the walking when it is
>requested. Instead it must be deferred until it is safe to fault in user
>space. One place this is known to be safe is when the task is about to return
>back to user space.
>
>This series makes the deferred unwind code implement SFrames.
>
>[1] https://sourceware.org/binutils/wiki/sframe
>
>Changes since v10:
>- Rebase on v6.17-rc1 with Peter's unwind user fixes and x86 support
> series [2] and Steve's support for the deferred unwinding infrastructure
> series in perf [3] and perf tool [4] on top.
>- Support for SFrame V2 PC-relative FDE function start address. (Jens)
>- Support for SFrame V2 representing RA undefined as indication for
> outermost frames. (Jens)
>
>[2]: [PATCH 00/12] Various fixes and x86 support,
> https://lore.kernel.org/all/20250924075948.579302904@infradead.org/
>[3]: [PATCH v16 0/4] perf: Support the deferred unwinding infrastructure,
> https://lore.kernel.org/all/20251007214008.080852573@kernel.org/
>[4]: [PATCH v16 0/4] perf tool: Support the deferred unwinding infrastructure,
> https://lore.kernel.org/all/20250908175319.841517121@kernel.org/
>
>Patches 1 and 2 are suggested fixups to patches from Peter's unwind user
>fixes and x86 support series. They keep the factoring out of the word
>size from the frame's CFA, FP, and RA offsets local to unwind user fp, as
>unwind user sframe does use absolute offsets.
>
>Patches 3, 6, and 14 have been updated to exclusively support the recent
>PC-relative SFrame FDE function start address encoding. With Binutils 2.45
>the SFrame V2 FDE function start address field value is an offset from the
>field (i.e. PC-relative) instead of from the .sframe section start. This
>is indicated by the new SFrame header flag SFRAME_F_FDE_FUNC_START_PCREL.
>Old SFrame V2 sections get rejected with dynamic debug message
>"bad/unsupported sframe header".
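
To make the difference between the two encodings concrete, here is a minimal
sketch. It is not taken from the patches; fde_func_start() and its parameters
are hypothetical names, and the only assumption carried over from the format
is that the FDE start address field is a signed 32-bit value.

```c
#include <stdint.h>

/*
 * Hypothetical helper: resolve the function start address of an FDE.
 *
 * sec_start:  runtime address of the .sframe section
 * field_addr: runtime address of the FDE start-address field itself
 * raw:        signed 32-bit value stored in that field
 * pcrel:      nonzero if SFRAME_F_FDE_FUNC_START_PCREL is set in the header
 */
static uint64_t fde_func_start(uint64_t sec_start, uint64_t field_addr,
			       int32_t raw, int pcrel)
{
	/* New encoding: offset from the field; old: offset from the section. */
	uint64_t base = pcrel ? field_addr : sec_start;

	return base + (uint64_t)(int64_t)raw;
}
```

With the section at 0x5000 and the field at 0x5040, a function at 0x4000 is
encoded as -0x1040 in the PC-relative form and as -0x1000 in the old form.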
>
>Patches 9 and 10 add support to unwind user and unwind user sframe for
>a recent change of the SFrame V2 format to represent an undefined
>return address as an SFrame FRE without any offsets, which is used as
>indication for outermost frames. Note that currently only a development
>build of Binutils mainline generates SFrame information including this
>new indication for outermost frames. SFrame information without the new
>indication is still supported. Without these patches unwind user sframe
>would identify such new SFrame FREs without any offsets as corrupted and
>remove the .sframe section, causing any further stack tracing using
>sframe to fail.
>
>Regards,
>Jens
>
>
>Jens Remus (4):
> fixup! unwind: Implement compat fp unwind
> fixup! unwind_user/x86: Enable frame pointer unwinding on x86
> unwind_user: Stop when reaching an outermost frame
> unwind_user/sframe: Add support for outermost frame indication
>
>Josh Poimboeuf (11):
> unwind_user/sframe: Add support for reading .sframe headers
> unwind_user/sframe: Store sframe section data in per-mm maple tree
> x86/uaccess: Add unsafe_copy_from_user() implementation
> unwind_user/sframe: Add support for reading .sframe contents
> unwind_user/sframe: Detect .sframe sections in executables
> unwind_user/sframe: Wire up unwind_user to sframe
> unwind_user/sframe/x86: Enable sframe unwinding on x86
> unwind_user/sframe: Remove .sframe section on detected corruption
> unwind_user/sframe: Show file name in debug output
> unwind_user/sframe: Add .sframe validation option
> unwind_user/sframe: Add prctl() interface for registering .sframe
> sections
>
> MAINTAINERS | 1 +
> arch/Kconfig | 23 ++
> arch/x86/Kconfig | 1 +
> arch/x86/include/asm/mmu.h | 2 +-
> arch/x86/include/asm/uaccess.h | 39 +-
> arch/x86/include/asm/unwind_user.h | 11 +-
> fs/binfmt_elf.c | 49 ++-
> include/linux/mm_types.h | 3 +
> include/linux/sframe.h | 60 +++
> include/linux/unwind_user_types.h | 5 +-
> include/uapi/linux/elf.h | 1 +
> include/uapi/linux/prctl.h | 6 +-
> kernel/fork.c | 10 +
> kernel/sys.c | 9 +
> kernel/unwind/Makefile | 3 +-
> kernel/unwind/sframe.c | 615 +++++++++++++++++++++++++++++
> kernel/unwind/sframe.h | 72 ++++
> kernel/unwind/sframe_debug.h | 68 ++++
> kernel/unwind/user.c | 56 ++-
> mm/init-mm.c | 2 +
> 20 files changed, 1004 insertions(+), 32 deletions(-)
> create mode 100644 include/linux/sframe.h
> create mode 100644 kernel/unwind/sframe.c
> create mode 100644 kernel/unwind/sframe.h
> create mode 100644 kernel/unwind/sframe_debug.h
>
>--
>2.48.1
>