* KASAN vs realloc
@ 2026-01-06 12:42 Maciej Żenczykowski
2026-01-07 20:28 ` Kees Cook
0 siblings, 1 reply; 6+ messages in thread
From: Maciej Żenczykowski @ 2026-01-06 12:42 UTC (permalink / raw)
To: Maciej Wieczor-Retman
Cc: joonki.min, Kees Cook, Andrew Morton, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton, Uladzislau Rezki,
Danilo Krummrich, Kees Cook, jiayuan.chen,
syzbot+997752115a851cb0cf36, Maciej Wieczor-Retman, kasan-dev,
Kernel hackers, linux-mm
We've got internal reports (b/467571011 - from CC'ed Samsung
developer) that kasan realloc is broken for sizes that are not a
multiple of the granule. This appears to be triggered during Android
bootup by some ebpf program loading operations (a struct is 88 bytes
in size, which is a multiple of 8, but not 16, which is the granule
size).
(this is on 6.18 with
https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
already included)
joonki.min@samsung-slsi.corp-partner.google.com summarized it as
"When newly requested size is not bigger than allocated size and old
size was not 16 byte aligned, it failed to unpoison extended area."
and *very* rough comment:
Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
I think we may unpoison 16-byte alignment size, but it allowed more
than requested :(
I'm not sure that's right approach.
if (size <= alloced_size) {
- kasan_unpoison_vmalloc(p + old_size, size - old_size,
+ kasan_unpoison_vmalloc(p + old_size, round_up(size -
old_size, KASAN_GRANULE_SIZE),
KASAN_VMALLOC_PROT_NORMAL |
KASAN_VMALLOC_VM_ALLOC |
KASAN_VMALLOC_KEEP_TAG);
/*
* No need to zero memory here, as unused memory will have
* already been zeroed at initial allocation time or during
* realloc shrink time.
*/
- vm->requested_size = size;
+ vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
my personal guess is that
But just above the code you quoted in mm/vmalloc.c I see:
if (size <= old_size) {
...
kasan_poison_vmalloc(p + size, old_size - size);
is also likely wrong?? Considering:
mm/kasan/shadow.c
void __kasan_poison_vmalloc(const void *start, unsigned long size)
{
if (!is_vmalloc_or_module_addr(start))
return;
size = round_up(size, KASAN_GRANULE_SIZE);
kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
}
This doesn't look right - if start isn't a multiple of the granule.
--
Maciej Żenczykowski, Kernel Networking Developer @ Google
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: KASAN vs realloc
2026-01-06 12:42 KASAN vs realloc Maciej Żenczykowski
@ 2026-01-07 20:28 ` Kees Cook
2026-01-07 20:47 ` Maciej Wieczor-Retman
0 siblings, 1 reply; 6+ messages in thread
From: Kees Cook @ 2026-01-07 20:28 UTC (permalink / raw)
To: Maciej Żenczykowski
Cc: Maciej Wieczor-Retman, joonki.min, Andrew Morton,
Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Marco Elver, Andrew Morton,
Uladzislau Rezki, Danilo Krummrich, jiayuan.chen,
syzbot+997752115a851cb0cf36, Maciej Wieczor-Retman, kasan-dev,
Kernel hackers, linux-mm
On Tue, Jan 06, 2026 at 01:42:45PM +0100, Maciej Żenczykowski wrote:
> We've got internal reports (b/467571011 - from CC'ed Samsung
> developer) that kasan realloc is broken for sizes that are not a
> multiple of the granule. This appears to be triggered during Android
> bootup by some ebpf program loading operations (a struct is 88 bytes
> in size, which is a multiple of 8, but not 16, which is the granule
> size).
>
> (this is on 6.18 with
> https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
> already included)
>
> joonki.min@samsung-slsi.corp-partner.google.com summarized it as
> "When newly requested size is not bigger than allocated size and old
> size was not 16 byte aligned, it failed to unpoison extended area."
>
> and *very* rough comment:
>
> Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
>
> I think we may unpoison 16-byte alignment size, but it allowed more
> than requested :(
>
> I'm not sure that's right approach.
>
> if (size <= alloced_size) {
> - kasan_unpoison_vmalloc(p + old_size, size - old_size,
> + kasan_unpoison_vmalloc(p + old_size, round_up(size -
> old_size, KASAN_GRANULE_SIZE),
> KASAN_VMALLOC_PROT_NORMAL |
> KASAN_VMALLOC_VM_ALLOC |
> KASAN_VMALLOC_KEEP_TAG);
> /*
> * No need to zero memory here, as unused memory will have
> * already been zeroed at initial allocation time or during
> * realloc shrink time.
> */
> - vm->requested_size = size;
> + vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
>
> my personal guess is that
>
> But just above the code you quoted in mm/vmalloc.c I see:
> if (size <= old_size) {
> ...
> kasan_poison_vmalloc(p + size, old_size - size);
>
> is also likely wrong?? Considering:
>
> mm/kasan/shadow.c
>
> void __kasan_poison_vmalloc(const void *start, unsigned long size)
> {
> if (!is_vmalloc_or_module_addr(start))
> return;
>
> size = round_up(size, KASAN_GRANULE_SIZE);
> kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
> }
>
> This doesn't look right - if start isn't a multiple of the granule.
I don't think we can ever have the start not be a granule multiple, can
we?
I'm not sure how any of this is supposed to be handled by KASAN.
It does seem like a round_up() is missing, though?
-Kees
--
Kees Cook
* Re: KASAN vs realloc
2026-01-07 20:28 ` Kees Cook
@ 2026-01-07 20:47 ` Maciej Wieczor-Retman
2026-01-07 21:47 ` Maciej Żenczykowski
0 siblings, 1 reply; 6+ messages in thread
From: Maciej Wieczor-Retman @ 2026-01-07 20:47 UTC (permalink / raw)
To: Kees Cook
Cc: Maciej Żenczykowski, joonki.min, Andrew Morton,
Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Marco Elver, Andrew Morton,
Uladzislau Rezki, Danilo Krummrich, jiayuan.chen,
syzbot+997752115a851cb0cf36, Maciej Wieczor-Retman, kasan-dev,
Kernel hackers, linux-mm
On 2026-01-07 at 12:28:27 -0800, Kees Cook wrote:
>On Tue, Jan 06, 2026 at 01:42:45PM +0100, Maciej Żenczykowski wrote:
>> We've got internal reports (b/467571011 - from CC'ed Samsung
>> developer) that kasan realloc is broken for sizes that are not a
>> multiple of the granule. This appears to be triggered during Android
>> bootup by some ebpf program loading operations (a struct is 88 bytes
>> in size, which is a multiple of 8, but not 16, which is the granule
>> size).
>>
>> (this is on 6.18 with
>> https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
>> already included)
>>
>> joonki.min@samsung-slsi.corp-partner.google.com summarized it as
>> "When newly requested size is not bigger than allocated size and old
>> size was not 16 byte aligned, it failed to unpoison extended area."
>>
>> and *very* rough comment:
>>
>> Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
>>
>> I think we may unpoison 16-byte alignment size, but it allowed more
>> than requested :(
>>
>> I'm not sure that's right approach.
>>
>> if (size <= alloced_size) {
>> - kasan_unpoison_vmalloc(p + old_size, size - old_size,
>> + kasan_unpoison_vmalloc(p + old_size, round_up(size -
>> old_size, KASAN_GRANULE_SIZE),
>> KASAN_VMALLOC_PROT_NORMAL |
>> KASAN_VMALLOC_VM_ALLOC |
>> KASAN_VMALLOC_KEEP_TAG);
>> /*
>> * No need to zero memory here, as unused memory will have
>> * already been zeroed at initial allocation time or during
>> * realloc shrink time.
>> */
>> - vm->requested_size = size;
>> + vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
>>
>> my personal guess is that
>>
>> But just above the code you quoted in mm/vmalloc.c I see:
>> if (size <= old_size) {
>> ...
>> kasan_poison_vmalloc(p + size, old_size - size);
>>
>> is also likely wrong?? Considering:
>>
>> mm/kasan/shadow.c
>>
>> void __kasan_poison_vmalloc(const void *start, unsigned long size)
>> {
>> if (!is_vmalloc_or_module_addr(start))
>> return;
>>
>> size = round_up(size, KASAN_GRANULE_SIZE);
>> kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
>> }
>>
>> This doesn't look right - if start isn't a multiple of the granule.
>
>I don't think we can ever have the start not be a granule multiple, can
>we?
>
>I'm not sure how any of this is supposed to be handled by KASAN, though.
>It does seem like a round_up() is missing, though?
>
>-Kees
>
>--
>Kees Cook
I assume the error happens in hw-tags mode? And this used to work because
KASAN_VMALLOC_VM_ALLOC was missing and kasan_unpoison_vmalloc() used to do an
early return, while now it's actually doing the unpoisoning here?
If that's the case, I agree, the round up seems to be missing; I can add it and
send a patch later.
--
Kind regards
Maciej Wieczór-Retman
* Re: KASAN vs realloc
2026-01-07 20:47 ` Maciej Wieczor-Retman
@ 2026-01-07 21:47 ` Maciej Żenczykowski
2026-01-07 21:50 ` Maciej Żenczykowski
0 siblings, 1 reply; 6+ messages in thread
From: Maciej Żenczykowski @ 2026-01-07 21:47 UTC (permalink / raw)
To: Maciej Wieczor-Retman
Cc: Kees Cook, joonki.min, Andrew Morton, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Marco Elver, Andrew Morton, Uladzislau Rezki,
Danilo Krummrich, jiayuan.chen, syzbot+997752115a851cb0cf36,
Maciej Wieczor-Retman, kasan-dev, Kernel hackers, linux-mm
On Wed, Jan 7, 2026 at 9:47 PM Maciej Wieczor-Retman
<m.wieczorretman@pm.me> wrote:
>
> On 2026-01-07 at 12:28:27 -0800, Kees Cook wrote:
> >On Tue, Jan 06, 2026 at 01:42:45PM +0100, Maciej Żenczykowski wrote:
> >> We've got internal reports (b/467571011 - from CC'ed Samsung
> >> developer) that kasan realloc is broken for sizes that are not a
> >> multiple of the granule. This appears to be triggered during Android
> >> bootup by some ebpf program loading operations (a struct is 88 bytes
> >> in size, which is a multiple of 8, but not 16, which is the granule
> >> size).
> >>
> >> (this is on 6.18 with
> >> https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
> >> already included)
> >>
> >> joonki.min@samsung-slsi.corp-partner.google.com summarized it as
> >> "When newly requested size is not bigger than allocated size and old
> >> size was not 16 byte aligned, it failed to unpoison extended area."
> >>
> >> and *very* rough comment:
> >>
> >> Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
> >>
> >> I think we may unpoison 16-byte alignment size, but it allowed more
> >> than requested :(
> >>
> >> I'm not sure that's right approach.
> >>
> >> if (size <= alloced_size) {
> >> - kasan_unpoison_vmalloc(p + old_size, size - old_size,
> >> + kasan_unpoison_vmalloc(p + old_size, round_up(size -
> >> old_size, KASAN_GRANULE_SIZE),
> >> KASAN_VMALLOC_PROT_NORMAL |
> >> KASAN_VMALLOC_VM_ALLOC |
> >> KASAN_VMALLOC_KEEP_TAG);
> >> /*
> >> * No need to zero memory here, as unused memory will have
> >> * already been zeroed at initial allocation time or during
> >> * realloc shrink time.
> >> */
> >> - vm->requested_size = size;
> >> + vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
> >>
> >> my personal guess is that
> >>
> >> But just above the code you quoted in mm/vmalloc.c I see:
> >> if (size <= old_size) {
> >> ...
> >> kasan_poison_vmalloc(p + size, old_size - size);
I assume p is 16-byte aligned, but size (i.e. the new size) and
old_size can presumably be anything, so neither need be a granule multiple.
This means the first argument passed to kasan_poison_vmalloc() is
potentially completely unaligned.
> >> is also likely wrong?? Considering:
> >>
> >> mm/kasan/shadow.c
> >>
> >> void __kasan_poison_vmalloc(const void *start, unsigned long size)
> >> {
> >> if (!is_vmalloc_or_module_addr(start))
> >> return;
> >>
> >> size = round_up(size, KASAN_GRANULE_SIZE);
> >> kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
> >> }
> >>
> >> This doesn't look right - if start isn't a multiple of the granule.
> >
> >I don't think we can ever have the start not be a granule multiple, can
> >we?
See above for why I think we can...
I fully admit though I have no idea how this works, KASAN is not
something I really work with.
> >I'm not sure how any of this is supposed to be handled by KASAN, though.
> >It does seem like a round_up() is missing, though?
perhaps add a:
BUG_ON(start & 15)
BUG_ON(start & (GRANULE_SIZE-1))
if you think it shouldn't trigger?
and/or comments/documentation about the expected alignment of the
pointers and sizes if it cannot be arbitrary?
> I assume the error happens in hw-tags mode? And this used to work because
> KASAN_VMALLOC_VM_ALLOC was missing and kasan_unpoison_vmalloc() used to do an
> early return, while now it's actually doing the unpoisoning here?
I was under the impression this was triggering with software tags.
However, a reproduction attempt on a Pixel 6 done by another Google
engineer did indeed fail.
It is failing on some Samsung device, but I'm not sure what that is using...
Maybe a Pixel 8+ would use MTE???
So perhaps it is only hw tags??? Sorry, no idea.
I'm not sure, this is way way lower than I've wandered in the past
years, lately I mostly write userspace & ebpf code...
Would a stack trace help?
[ 22.280856][ T762]
==================================================================
[ 22.280866][ T762] BUG: KASAN: invalid-access in
bpf_patch_insn_data+0x25c/0x378
[ 22.280880][ T762] Write of size 27896 at addr 43ffffc08baf14d0 by
task netbpfload/762
[ 22.280888][ T762] Pointer tag: [43], memory tag: [54]
[ 22.280893][ T762]
[ 22.280900][ T762] CPU: 9 UID: 0 PID: 762 Comm: netbpfload
Tainted: G OE 6.18.0-android17-0-gef2f661f7812-4k #1
PREEMPT 5f8baed9473d1315a96dec60171cddf4b0b35487
[ 22.280907][ T762] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 22.280909][ T762] Hardware name: Samsung xxxxxxxxx
[ 22.280912][ T762] Call trace:
[ 22.280914][ T762] show_stack+0x18/0x28 (C)
[ 22.280922][ T762] __dump_stack+0x28/0x3c
[ 22.280930][ T762] dump_stack_lvl+0x7c/0xa8
[ 22.280934][ T762] print_address_description+0x7c/0x20c
[ 22.280941][ T762] print_report+0x70/0x8c
[ 22.280945][ T762] kasan_report+0xb4/0x114
[ 22.280952][ T762] kasan_check_range+0x94/0xa0
[ 22.280956][ T762] __asan_memmove+0x54/0x88
[ 22.280960][ T762] bpf_patch_insn_data+0x25c/0x378
[ 22.280965][ T762] bpf_check+0x25a4/0x8ef0
[ 22.280971][ T762] bpf_prog_load+0x8dc/0x990
[ 22.280976][ T762] __sys_bpf+0x340/0x524
[ 22.280980][ T762] __arm64_sys_bpf+0x48/0x64
[ 22.280984][ T762] invoke_syscall+0x6c/0x13c
[ 22.280990][ T762] el0_svc_common+0xf8/0x138
[ 22.280994][ T762] do_el0_svc+0x30/0x40
[ 22.280999][ T762] el0_svc+0x38/0x90
[ 22.281007][ T762] el0t_64_sync_handler+0x68/0xdc
[ 22.281012][ T762] el0t_64_sync+0x1b8/0x1bc
[ 22.281015][ T762]
[ 22.281063][ T762] The buggy address belongs to a 8-page vmalloc
region starting at 0x43ffffc08baf1000 allocated at
bpf_patch_insn_data+0xb0/0x378
[ 22.281088][ T762] The buggy address belongs to the physical page:
[ 22.281093][ T762] page: refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x8ce792
[ 22.281099][ T762] memcg:f0ffff88354e7e42
[ 22.281104][ T762] flags: 0x4300000000000000(zone=1|kasantag=0xc)
[ 22.281113][ T762] raw: 4300000000000000 0000000000000000
dead000000000122 0000000000000000
[ 22.281119][ T762] raw: 0000000000000000 0000000000000000
00000001ffffffff f0ffff88354e7e42
[ 22.281125][ T762] page dumped because: kasan: bad access detected
[ 22.281129][ T762]
[ 22.281134][ T762] Memory state around the buggy address:
[ 22.281139][ T762] ffffffc08baf7f00: 43 43 43 43 43 43 43 43 43
43 43 43 43 43 43 43
[ 22.281144][ T762] ffffffc08baf8000: 43 43 43 43 43 43 43 43 43
43 43 43 43 43 43 43
[ 22.281150][ T762] >ffffffc08baf8100: 43 43 43 43 43 43 43 54 54
54 54 54 54 fe fe fe
[ 22.281155][ T762] ^
[ 22.281160][ T762] ffffffc08baf8200: fe fe fe fe fe fe fe fe fe
fe fe fe fe fe fe fe
[ 22.281165][ T762] ffffffc08baf8300: fe fe fe fe fe fe fe fe fe
fe fe fe fe fe fe fe
[ 22.281170][ T762]
==================================================================
[ 22.281199][ T762] Kernel panic - not syncing: KASAN: panic_on_warn set ...
> If that's the case, I agree, the round up seems to be missing; I can add it and
> send a patch later.
* Re: KASAN vs realloc
2026-01-07 21:47 ` Maciej Żenczykowski
@ 2026-01-07 21:50 ` Maciej Żenczykowski
2026-01-07 21:55 ` Maciej Żenczykowski
0 siblings, 1 reply; 6+ messages in thread
From: Maciej Żenczykowski @ 2026-01-07 21:50 UTC (permalink / raw)
To: Maciej Wieczor-Retman
Cc: Kees Cook, joonki.min, Andrew Morton, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Marco Elver, Andrew Morton, Uladzislau Rezki,
Danilo Krummrich, jiayuan.chen, syzbot+997752115a851cb0cf36,
Maciej Wieczor-Retman, kasan-dev, Kernel hackers, linux-mm
On Wed, Jan 7, 2026 at 10:47 PM Maciej Żenczykowski <maze@google.com> wrote:
>
> On Wed, Jan 7, 2026 at 9:47 PM Maciej Wieczor-Retman
> <m.wieczorretman@pm.me> wrote:
> >
> > On 2026-01-07 at 12:28:27 -0800, Kees Cook wrote:
> > >On Tue, Jan 06, 2026 at 01:42:45PM +0100, Maciej Żenczykowski wrote:
> > >> We've got internal reports (b/467571011 - from CC'ed Samsung
> > >> developer) that kasan realloc is broken for sizes that are not a
> > >> multiple of the granule. This appears to be triggered during Android
> > >> bootup by some ebpf program loading operations (a struct is 88 bytes
> > >> in size, which is a multiple of 8, but not 16, which is the granule
> > >> size).
> > >>
> > >> (this is on 6.18 with
> > >> https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
> > >> already included)
> > >>
> > >> joonki.min@samsung-slsi.corp-partner.google.com summarized it as
> > >> "When newly requested size is not bigger than allocated size and old
> > >> size was not 16 byte aligned, it failed to unpoison extended area."
> > >>
> > >> and *very* rough comment:
> > >>
> > >> Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
> > >>
> > >> I think we may unpoison 16-byte alignment size, but it allowed more
> > >> than requested :(
> > >>
> > >> I'm not sure that's right approach.
> > >>
> > >> if (size <= alloced_size) {
> > >> - kasan_unpoison_vmalloc(p + old_size, size - old_size,
> > >> + kasan_unpoison_vmalloc(p + old_size, round_up(size -
> > >> old_size, KASAN_GRANULE_SIZE),
> > >> KASAN_VMALLOC_PROT_NORMAL |
> > >> KASAN_VMALLOC_VM_ALLOC |
> > >> KASAN_VMALLOC_KEEP_TAG);
> > >> /*
> > >> * No need to zero memory here, as unused memory will have
> > >> * already been zeroed at initial allocation time or during
> > >> * realloc shrink time.
> > >> */
> > >> - vm->requested_size = size;
> > >> + vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
> > >>
> > >> my personal guess is that
> > >>
> > >> But just above the code you quoted in mm/vmalloc.c I see:
> > >> if (size <= old_size) {
> > >> ...
> > >> kasan_poison_vmalloc(p + size, old_size - size);
>
> I assume p is presumably 16-byte aligned, but size (ie. new size) /
> old_size can presumably be odd.
>
> This means the first argument passed to kasan_poison_vmalloc() is
> potentially utterly unaligned.
>
> > >> is also likely wrong?? Considering:
> > >>
> > >> mm/kasan/shadow.c
> > >>
> > >> void __kasan_poison_vmalloc(const void *start, unsigned long size)
> > >> {
> > >> if (!is_vmalloc_or_module_addr(start))
> > >> return;
> > >>
> > >> size = round_up(size, KASAN_GRANULE_SIZE);
> > >> kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
> > >> }
> > >>
> > >> This doesn't look right - if start isn't a multiple of the granule.
> > >
> > >I don't think we can ever have the start not be a granule multiple, can
> > >we?
>
> See above for why I think we can...
> I fully admit though I have no idea how this works, KASAN is not
> something I really work with.
>
> > >I'm not sure how any of this is supposed to be handled by KASAN, though.
> > >It does seem like a round_up() is missing, though?
>
> perhaps add a:
> BUG_ON(start & 15)
> BUG_ON(start & (GRANULE_SIZE-1))
>
> if you think it shouldn't trigger?
>
> and/or comments/documentation about the expected alignment of the
> pointers and sizes if it cannot be arbitrary?
>
> > I assume the error happens in hw-tags mode? And this used to work because
> > KASAN_VMALLOC_VM_ALLOC was missing and kasan_unpoison_vmalloc() used to do an
> > early return, while now it's actually doing the unpoisoning here?
>
> I was under the impression this was triggering with software tags.
> However, reproduction on a pixel 6 done by another Google engineer did
> indeed fail.
> It is failing on some Samsung device, but not sure what that is using...
> Maybe a Pixel 8+ would use MTE???
> So perhaps it is only hw tags??? Sorry, no idea.
> I'm not sure, this is way way lower than I've wandered in the past
> years, lately I mostly write userspace & ebpf code...
>
> Would a stack trace help?
>
> [ 22.280856][ T762]
> ==================================================================
> [ 22.280866][ T762] BUG: KASAN: invalid-access in
> bpf_patch_insn_data+0x25c/0x378
> [ 22.280880][ T762] Write of size 27896 at addr 43ffffc08baf14d0 by
> task netbpfload/762
> [ 22.280888][ T762] Pointer tag: [43], memory tag: [54]
> [ 22.280893][ T762]
> [ 22.280900][ T762] CPU: 9 UID: 0 PID: 762 Comm: netbpfload
> Tainted: G OE 6.18.0-android17-0-gef2f661f7812-4k #1
> PREEMPT 5f8baed9473d1315a96dec60171cddf4b0b35487
> [ 22.280907][ T762] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> [ 22.280909][ T762] Hardware name: Samsung xxxxxxxxx
> [ 22.280912][ T762] Call trace:
> [ 22.280914][ T762] show_stack+0x18/0x28 (C)
> [ 22.280922][ T762] __dump_stack+0x28/0x3c
> [ 22.280930][ T762] dump_stack_lvl+0x7c/0xa8
> [ 22.280934][ T762] print_address_description+0x7c/0x20c
> [ 22.280941][ T762] print_report+0x70/0x8c
> [ 22.280945][ T762] kasan_report+0xb4/0x114
> [ 22.280952][ T762] kasan_check_range+0x94/0xa0
> [ 22.280956][ T762] __asan_memmove+0x54/0x88
> [ 22.280960][ T762] bpf_patch_insn_data+0x25c/0x378
> [ 22.280965][ T762] bpf_check+0x25a4/0x8ef0
> [ 22.280971][ T762] bpf_prog_load+0x8dc/0x990
> [ 22.280976][ T762] __sys_bpf+0x340/0x524
> [ 22.280980][ T762] __arm64_sys_bpf+0x48/0x64
> [ 22.280984][ T762] invoke_syscall+0x6c/0x13c
> [ 22.280990][ T762] el0_svc_common+0xf8/0x138
> [ 22.280994][ T762] do_el0_svc+0x30/0x40
> [ 22.280999][ T762] el0_svc+0x38/0x90
> [ 22.281007][ T762] el0t_64_sync_handler+0x68/0xdc
> [ 22.281012][ T762] el0t_64_sync+0x1b8/0x1bc
> [ 22.281015][ T762]
> [ 22.281063][ T762] The buggy address belongs to a 8-page vmalloc
> region starting at 0x43ffffc08baf1000 allocated at
> bpf_patch_insn_data+0xb0/0x378
> [ 22.281088][ T762] The buggy address belongs to the physical page:
> [ 22.281093][ T762] page: refcount:1 mapcount:0
> mapping:0000000000000000 index:0x0 pfn:0x8ce792
> [ 22.281099][ T762] memcg:f0ffff88354e7e42
> [ 22.281104][ T762] flags: 0x4300000000000000(zone=1|kasantag=0xc)
> [ 22.281113][ T762] raw: 4300000000000000 0000000000000000
> dead000000000122 0000000000000000
> [ 22.281119][ T762] raw: 0000000000000000 0000000000000000
> 00000001ffffffff f0ffff88354e7e42
> [ 22.281125][ T762] page dumped because: kasan: bad access detected
> [ 22.281129][ T762]
> [ 22.281134][ T762] Memory state around the buggy address:
> [ 22.281139][ T762] ffffffc08baf7f00: 43 43 43 43 43 43 43 43 43
> 43 43 43 43 43 43 43
> [ 22.281144][ T762] ffffffc08baf8000: 43 43 43 43 43 43 43 43 43
> 43 43 43 43 43 43 43
> [ 22.281150][ T762] >ffffffc08baf8100: 43 43 43 43 43 43 43 54 54
> 54 54 54 54 fe fe fe
> [ 22.281155][ T762] ^
> [ 22.281160][ T762] ffffffc08baf8200: fe fe fe fe fe fe fe fe fe
> fe fe fe fe fe fe fe
> [ 22.281165][ T762] ffffffc08baf8300: fe fe fe fe fe fe fe fe fe
> fe fe fe fe fe fe fe
> [ 22.281170][ T762]
> ==================================================================
> [ 22.281199][ T762] Kernel panic - not syncing: KASAN: panic_on_warn set ...
>
> > If that's the case, I agree, the round up seems to be missing; I can add it and
> > send a patch later.
WARNING: Actually I'm not sure if this is the *right* stack trace.
This might be on a bare 6.18 without the latest extra 4 patches.
I'm not finding a more recent stack trace.
* Re: KASAN vs realloc
2026-01-07 21:50 ` Maciej Żenczykowski
@ 2026-01-07 21:55 ` Maciej Żenczykowski
0 siblings, 0 replies; 6+ messages in thread
From: Maciej Żenczykowski @ 2026-01-07 21:55 UTC (permalink / raw)
To: Maciej Wieczor-Retman
Cc: Kees Cook, joonki.min, Andrew Morton, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Marco Elver, Andrew Morton, Uladzislau Rezki,
Danilo Krummrich, jiayuan.chen, syzbot+997752115a851cb0cf36,
Maciej Wieczor-Retman, kasan-dev, Kernel hackers, linux-mm
> WARNING: Actually I'm not sure if this is the *right* stack trace.
> This might be on a bare 6.18 without the latest extra 4 patches.
> I'm not finding a more recent stack trace.
Found comments from the Samsung dev:
But another panic came after those fixes [i.e. the 4 patches] were applied.
struct bpf_insn_aux_data is 88 bytes, so the panic_on_warn fires when
old_size ends with 0x8.
It seems like vrealloc cannot handle that case.
[   84.536021] [4:     netbpfload:  771] ------------[ cut here ]------------
[ 84.536196] [4: netbpfload: 771] WARNING: CPU: 4 PID: 771 at
mm/kasan/shadow.c:174 __kasan_unpoison_vmalloc+0x94/0xa0
....
[ 84.773445] [4: netbpfload: 771] CPU: 4 UID: 0 PID: 771 Comm:
netbpfload Tainted: G OE
6.18.1-android17-0-g41be44edb8d5-4k #1 PREEMPT
70442b615e7d1d560808f482eb5d71810120225e
[ 84.789323] [4: netbpfload: 771] Tainted: [O]=OOT_MODULE,
[E]=UNSIGNED_MODULE
[ 84.795311] [4: netbpfload: 771] Hardware name: Samsung xxxx
[ 84.802519] [4: netbpfload: 771] pstate: 03402005 (nzcv daif
+PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 84.810152] [4: netbpfload: 771] pc : __kasan_unpoison_vmalloc+0x94/0xa0
[ 84.815708] [4: netbpfload: 771] lr : __kasan_unpoison_vmalloc+0x24/0xa0
[ 84.821264] [4: netbpfload: 771] sp : ffffffc0a97e77a0
[ 84.825256] [4: netbpfload: 771] x29: ffffffc0a97e77a0 x28:
3bffff8837198670 x27: 0000000000008000
[ 84.833069] [4: netbpfload: 771] x26: 41ffff8837ef8e00 x25:
ffffffffffffffa8 x24: 00000000000071c8
[ 84.840880] [4: netbpfload: 771] x23: 0000000000000001 x22:
00000000ffffffff x21: 000000000000000e
[ 84.848694] [4: netbpfload: 771] x20: 0000000000000058 x19:
c3ffffc0a8f271c8 x18: ffffffc082f1c100
[ 84.856504] [4: netbpfload: 771] x17: 000000003688d116 x16:
000000003688d116 x15: ffffff8837efff80
[ 84.864317] [4: netbpfload: 771] x14: 0000000000000180 x13:
0000000000000000 x12: e6ffff8837eff700
[ 84.872129] [4: netbpfload: 771] x11: 0000000000000041 x10:
0000000000000000 x9 : fffffffebf800000
[ 84.879941] [4: netbpfload: 771] x8 : ffffffc0a8f271c8 x7 :
0000000000000000 x6 : ffffffc0805bef3c
[ 84.887754] [4: netbpfload: 771] x5 : 0000000000000000 x4 :
0000000000000000 x3 : ffffffc080234b6c
[ 84.895566] [4: netbpfload: 771] x2 : 000000000000000e x1 :
0000000000000058 x0 : 0000000000000001
[ 84.903377] [4: netbpfload: 771] Call trace:
[ 84.906502] [4: netbpfload: 771] __kasan_unpoison_vmalloc+0x94/0xa0 (P)
[ 84.912058] [4: netbpfload: 771] vrealloc_node_align_noprof+0xdc/0x2e4
[ 84.917525] [4: netbpfload: 771] bpf_patch_insn_data+0xb0/0x378
[ 84.922384] [4: netbpfload: 771] bpf_check+0x25a4/0x8ef0
[ 84.926638] [4: netbpfload: 771] bpf_prog_load+0x8dc/0x990
[ 84.931065] [4: netbpfload: 771] __sys_bpf+0x340/0x524
[ 79.334574][ T827] bpf_patch_insn_data: insn_aux_data size realloc
at abffffc08ef41000 to 330
[ 79.334919][ T827] bpf_patch_insn_data: insn_aux_data at 55ffffc0a9c00000
[ 79.335151][ T827] bpf_patch_insn_data: insn_aux_data size realloc
at 55ffffc0a9c00000 to 331
[ 79.336331][ T827] vrealloc_node_align_noprof: p=55ffffc0a9c00000
old_size=7170
[ 79.343898][ T827] vrealloc_node_align_noprof: size=71c8 alloced_size=8000
[ 79.350782][ T827] bpf_patch_insn_data: insn_aux_data at 55ffffc0a9c00000
[ 79.357591][ T827] bpf_patch_insn_data: insn_aux_data size realloc
at 55ffffc0a9c00000 to 332
[ 79.366174][ T827] vrealloc_node_align_noprof: p=55ffffc0a9c00000
old_size=71c8
[ 79.373588][ T827] vrealloc_node_align_noprof: size=7220 alloced_size=8000
[ 79.380485][ T827] kasan_unpoison: after kasan_reset_tag
addr=ffffffc0a9c071c8(granule mask=f)
I added 8 bytes of dummy data so that "p + old_size" no longer ends
with 8, and it booted fine.
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 4c497e839526..f9d3448321e8 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -581,6 +581,7 @@ struct bpf_insn_aux_data {
u32 scc;
/* registers alive before this instruction. */
u16 live_regs_before;
+ u16 buf[4]; // TEST
};
maze: Likely if 8 bytes worked then 'u8 buf[7]' would too?
It would be 88 bytes + 7 bytes = 95 bytes (= 0x5f), which still has bits
set within the granule mask (= 0xf).
I don't think it should work, but it does.