On 03.03.26 13:58, James Bottomley wrote:
> On Tue, 2026-03-03 at 07:47 -0500, Sasha Levin wrote:
>> On Tue, Mar 03, 2026 at 10:31:46AM +0100, Jiri Slaby wrote:
>>> On 03. 03. 26, 9:11, Geert Uytterhoeven wrote:
>>>> On Tue, 3 Mar 2026 at 07:26, Richard Weinberger wrote:
>>>>>> From: "Sasha Levin"
>>>>>> Add CONFIG_KALLSYMS_LINEINFO, which embeds a compact
>>>>>> address-to-line lookup table in the kernel image so stack
>>>>>> traces directly print source file and line number information:
>>>>
>>>>>> Memory footprint measured with a simple KVM guest x86_64
>>>>>> config:
>>>>>>
>>>>>>  Table: 4,597,583 entries from 4,841 source files
>>>>>>    lineinfo_addrs[]     4,597,583 x u32  = 17.5 MiB
>>>>>>    lineinfo_file_ids[]  4,597,583 x u16  =  8.8 MiB
>>>>>>    lineinfo_lines[]     4,597,583 x u32  = 17.5 MiB
>>>>>>    file_offsets + filenames              ~  0.1 MiB
>>>>>>    Total .rodata increase:               ~ 44.0 MiB
>>>>>>
>>>>>>  vmlinux (stripped):  529 MiB -> 573 MiB  (+44 MiB / +8.3%)
>>>>>
>>>>> Hm, that's a significant increase.
>>>>
>>>> Other random idea: this data is only needed in case of a crash.
>>>> Perhaps it can be stored compressed, and only be decompressed
>>>> when needed, or even during look-up?
>>>
>>> But obviously not when dumping OOM stack traces :P.
>>
>> Right - I really wanted to avoid memory allocations or disk I/O here.
>>
>> I'm sure we can come up with more efficient ways to store this
>> information - I wanted to keep the initial version simple and easy
>> to review.
>
> When the system is crashing, efficiency (at least as long as the user
> doesn't notice) isn't typically required. So if you did a linear
> search instead of a binary one, you could use compressed data that's
> amenable to decompression with a stream algorithm (i.e. one that only
> requires a fixed-length buffer, not decompression of the entire
> thing), and then stream through the compressed data a chunk at a time
> looking for the match.
And for this data the compression algorithm could be quite simple:
build chunks of e.g. 1000 entries, allowing a quick search to find the
correct chunk, then scan through the chunk to find the entry. Put the
start address at the beginning of each chunk, then use a LEB128-coded
offset for each entry (each offset relative to the previous entry, so
most entries would need only 1 byte of additional address
information). I guess file ids are fine as u16. Line numbers could be
LEB128-encoded, too, limiting most entries to 2 bytes of additional
information.

This simple scheme would already save roughly 50% of the needed space.

Juergen