From: "Maciej Żenczykowski" <maze@google.com>
To: Maciej Wieczor-Retman <m.wieczorretman@pm.me>
Cc: Kees Cook <kees@kernel.org>,
	joonki.min@samsung-slsi.corp-partner.google.com,
	 Andrew Morton <akpm@google.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	 Alexander Potapenko <glider@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	 Dmitry Vyukov <dvyukov@google.com>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	 Marco Elver <elver@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Uladzislau Rezki <urezki@gmail.com>,
	Danilo Krummrich <dakr@kernel.org>,
	jiayuan.chen@linux.dev,
	 syzbot+997752115a851cb0cf36@syzkaller.appspotmail.com,
	 Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>,
	kasan-dev@googlegroups.com,
	 Kernel hackers <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: KASAN vs realloc
Date: Wed, 7 Jan 2026 22:50:16 +0100
Message-ID: <CANP3RGeaEQipgRvk2FedpN54Rrq=fKdLs3PN4_+DThpeqQmTXA@mail.gmail.com>
In-Reply-To: <CANP3RGfLXptZp6widUEyvVzicAB=dwcSx3k7MLtQozhO0NuxZw@mail.gmail.com>

On Wed, Jan 7, 2026 at 10:47 PM Maciej Żenczykowski <maze@google.com> wrote:
>
> On Wed, Jan 7, 2026 at 9:47 PM Maciej Wieczor-Retman
> <m.wieczorretman@pm.me> wrote:
> >
> > On 2026-01-07 at 12:28:27 -0800, Kees Cook wrote:
> > >On Tue, Jan 06, 2026 at 01:42:45PM +0100, Maciej Żenczykowski wrote:
> > >> We've got internal reports (b/467571011 - from CC'ed Samsung
> > >> developer) that kasan realloc is broken for sizes that are not a
> > >> multiple of the granule.  This appears to be triggered during Android
> > >> bootup by some ebpf program loading operations (a struct is 88 bytes
> > >> in size, which is a multiple of 8, but not 16, which is the granule
> > >> size).
> > >>
> > >> (this is on 6.18 with
> > >> https://lore.kernel.org/all/38dece0a4074c43e48150d1e242f8242c73bf1a5.1764874575.git.m.wieczorretman@pm.me/
> > >> already included)
> > >>
> > >> joonki.min@samsung-slsi.corp-partner.google.com summarized it as
> > >> "When newly requested size is not bigger than allocated size and old
> > >> size was not 16 byte aligned, it failed to unpoison extended area."
> > >>
> > >> and *very* rough comment:
> > >>
> > >> Right. "size - old_size" is not guaranteed 16-byte alignment in this case.
> > >>
> > >> I think we may unpoison 16-byte alignment size, but it allowed more
> > >> than requested :(
> > >>
> > >> I'm not sure that's right approach.
> > >>
> > >> if (size <= alloced_size) {
> > >> -       kasan_unpoison_vmalloc(p + old_size, size - old_size,
> > >> +       kasan_unpoison_vmalloc(p + old_size,
> > >> +                              round_up(size - old_size, KASAN_GRANULE_SIZE),
> > >>                                KASAN_VMALLOC_PROT_NORMAL |
> > >>                                KASAN_VMALLOC_VM_ALLOC |
> > >>                                KASAN_VMALLOC_KEEP_TAG);
> > >>         /*
> > >>          * No need to zero memory here, as unused memory will have
> > >>          * already been zeroed at initial allocation time or during
> > >>          * realloc shrink time.
> > >>          */
> > >> -       vm->requested_size = size;
> > >> +       vm->requested_size = round_up(size, KASAN_GRANULE_SIZE);
> > >>
> > >> my personal guess is that
> > >>
> > >> But just above the code you quoted in mm/vmalloc.c I see:
> > >>         if (size <= old_size) {
> > >> ...
> > >>                 kasan_poison_vmalloc(p + size, old_size - size);
>
> I assume p is 16-byte aligned, but size (i.e. the new size) and
> old_size can presumably be odd.
>
> This means the first argument passed to kasan_poison_vmalloc(),
> p + size, is potentially not granule-aligned at all.
>
> > >> is also likely wrong?? Considering:
> > >>
> > >> mm/kasan/shadow.c
> > >>
> > >> void __kasan_poison_vmalloc(const void *start, unsigned long size)
> > >> {
> > >>         if (!is_vmalloc_or_module_addr(start))
> > >>                 return;
> > >>
> > >>         size = round_up(size, KASAN_GRANULE_SIZE);
> > >>         kasan_poison(start, size, KASAN_VMALLOC_INVALID, false);
> > >> }
> > >>
> > >> This doesn't look right - if start isn't a multiple of the granule.
> > >
> > >I don't think we can ever have the start not be a granule multiple, can
> > >we?
>
> See above for why I think we can...
> I fully admit though I have no idea how this works, KASAN is not
> something I really work with.
>
> > >I'm not sure how any of this is supposed to be handled by KASAN, though.
> > >It does seem like a round_up() is missing, though?
>
> perhaps add a:
>  BUG_ON((unsigned long)start & 15)
> or, more generally:
>  BUG_ON((unsigned long)start & (KASAN_GRANULE_SIZE - 1))
>
> if you think it shouldn't trigger?
>
> and/or comments/documentation about the expected alignment of the
> pointers and sizes if it cannot be arbitrary?
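
To make the alignment question concrete, here is a minimal userspace sketch of that check. It assumes a 16-byte granule; GRANULE_SIZE, is_granule_aligned() and shrink_poison_start() are hypothetical names made up for this illustration, not kernel symbols, and this is not how KASAN itself validates its arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: a 16-byte granule. */
#define GRANULE_SIZE 16u

/* Hypothetical helper: true if addr is a multiple of the granule size. */
static bool is_granule_aligned(uint64_t addr)
{
	/* The mask trick only works for power-of-two granule sizes. */
	assert((GRANULE_SIZE & (GRANULE_SIZE - 1)) == 0);
	return (addr & (GRANULE_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper: the start address a shrink path would hand to the
 * poisoning routine when it poisons from "p + size" onwards.
 */
static uint64_t shrink_poison_start(uint64_t p, size_t new_size)
{
	return p + new_size;
}
```

With a granule-aligned p and the 88-byte size from the report, shrink_poison_start(p, 88) is a multiple of 8 but not of 16, so is_granule_aligned() returns false for it.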
>
> > I assume the error happens in hw-tags mode? And this used to work because
> > KASAN_VMALLOC_VM_ALLOC was missing and kasan_unpoison_vmalloc() used to do an
> > early return, while now it's actually doing the unpoisoning here?
>
> I was under the impression this was triggering with software tags.
> However, an attempt by another Google engineer to reproduce it on a
> Pixel 6 did indeed fail.
> It is failing on some Samsung device, but not sure what that is using...
> Maybe a Pixel 8+ would use MTE???
> So perhaps it is only hw tags???  Sorry, no idea.
> I'm not sure, this is way way lower than I've wandered in the past
> years, lately I mostly write userspace & ebpf code...
>
> Would a stack trace help?
>
> [   22.280856][  T762] ==================================================================
> [   22.280866][  T762] BUG: KASAN: invalid-access in bpf_patch_insn_data+0x25c/0x378
> [   22.280880][  T762] Write of size 27896 at addr 43ffffc08baf14d0 by task netbpfload/762
> [   22.280888][  T762] Pointer tag: [43], memory tag: [54]
> [   22.280893][  T762]
> [   22.280900][  T762] CPU: 9 UID: 0 PID: 762 Comm: netbpfload Tainted: G           OE       6.18.0-android17-0-gef2f661f7812-4k #1 PREEMPT  5f8baed9473d1315a96dec60171cddf4b0b35487
> [   22.280907][  T762] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> [   22.280909][  T762] Hardware name: Samsung xxxxxxxxx
> [   22.280912][  T762] Call trace:
> [   22.280914][  T762]  show_stack+0x18/0x28 (C)
> [   22.280922][  T762]  __dump_stack+0x28/0x3c
> [   22.280930][  T762]  dump_stack_lvl+0x7c/0xa8
> [   22.280934][  T762]  print_address_description+0x7c/0x20c
> [   22.280941][  T762]  print_report+0x70/0x8c
> [   22.280945][  T762]  kasan_report+0xb4/0x114
> [   22.280952][  T762]  kasan_check_range+0x94/0xa0
> [   22.280956][  T762]  __asan_memmove+0x54/0x88
> [   22.280960][  T762]  bpf_patch_insn_data+0x25c/0x378
> [   22.280965][  T762]  bpf_check+0x25a4/0x8ef0
> [   22.280971][  T762]  bpf_prog_load+0x8dc/0x990
> [   22.280976][  T762]  __sys_bpf+0x340/0x524
> [   22.280980][  T762]  __arm64_sys_bpf+0x48/0x64
> [   22.280984][  T762]  invoke_syscall+0x6c/0x13c
> [   22.280990][  T762]  el0_svc_common+0xf8/0x138
> [   22.280994][  T762]  do_el0_svc+0x30/0x40
> [   22.280999][  T762]  el0_svc+0x38/0x90
> [   22.281007][  T762]  el0t_64_sync_handler+0x68/0xdc
> [   22.281012][  T762]  el0t_64_sync+0x1b8/0x1bc
> [   22.281015][  T762]
> [   22.281063][  T762] The buggy address belongs to a 8-page vmalloc region starting at 0x43ffffc08baf1000 allocated at bpf_patch_insn_data+0xb0/0x378
> [   22.281088][  T762] The buggy address belongs to the physical page:
> [   22.281093][  T762] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8ce792
> [   22.281099][  T762] memcg:f0ffff88354e7e42
> [   22.281104][  T762] flags: 0x4300000000000000(zone=1|kasantag=0xc)
> [   22.281113][  T762] raw: 4300000000000000 0000000000000000 dead000000000122 0000000000000000
> [   22.281119][  T762] raw: 0000000000000000 0000000000000000 00000001ffffffff f0ffff88354e7e42
> [   22.281125][  T762] page dumped because: kasan: bad access detected
> [   22.281129][  T762]
> [   22.281134][  T762] Memory state around the buggy address:
> [   22.281139][  T762]  ffffffc08baf7f00: 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43
> [   22.281144][  T762]  ffffffc08baf8000: 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43
> [   22.281150][  T762] >ffffffc08baf8100: 43 43 43 43 43 43 43 54 54 54 54 54 54 fe fe fe
> [   22.281155][  T762]                                         ^
> [   22.281160][  T762]  ffffffc08baf8200: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> [   22.281165][  T762]  ffffffc08baf8300: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
> [   22.281170][  T762] ==================================================================
> [   22.281199][  T762] Kernel panic - not syncing: KASAN: panic_on_warn set ...
>
> > If that's the case, I agree, the round up seems to be missing; I can add it and
> > send a patch later.

WARNING: Actually I'm not sure if this is the *right* stack trace.
This might be on a bare 6.18 without the latest extra 4 patches.
I'm not finding a more recent stack trace.
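
For reference, a rough userspace sketch of the arithmetic being discussed for the grow-in-place case. It assumes a 16-byte granule; GRANULE_SIZE, struct span and covering_span() are hypothetical names for this illustration only. It is not the kernel code and not a proposed fix, just the offsets involved when old_size is not a granule multiple.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: a 16-byte granule. */
#define GRANULE_SIZE 16u

/* Round x down / up to a multiple of the (power-of-two) granule size. */
static size_t round_down_granule(size_t x)
{
	return x & ~(size_t)(GRANULE_SIZE - 1);
}

static size_t round_up_granule(size_t x)
{
	return round_down_granule(x + GRANULE_SIZE - 1);
}

/* Half-open range [start, end) of byte offsets relative to p. */
struct span {
	size_t start;
	size_t end;
};

/*
 * Hypothetical helper: the smallest granule-aligned span that covers the
 * newly exposed bytes [old_size, new_size) of an in-place grow.
 */
static struct span covering_span(size_t old_size, size_t new_size)
{
	struct span s;

	assert(old_size <= new_size);
	s.start = round_down_granule(old_size);
	s.end = round_up_granule(new_size);
	return s;
}
```

Taking the 88-byte size from the report and a made-up new size of 100: the extended area starts at offset 88 and is 12 bytes long. Rounding only the length up gives 16, but the start offset 88 is still not a granule multiple; the granule-aligned span covering those bytes is [80, 112).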

