linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>,
	Alexander Potapenko <glider@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	Feng Tang <feng.79.tang@gmail.com>,
	Christoph Lameter <cl@gentwo.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com, stable@vger.kernel.org
Subject: Re: [PATCH] mm/slab: ensure all metadata in slab object are word-aligned
Date: Fri, 24 Oct 2025 09:40:55 +0900	[thread overview]
Message-ID: <aPrLF0OUK651M4dk@hyeyoo> (raw)
In-Reply-To: <20251023131600.1103431-1-harry.yoo@oracle.com>

On Thu, Oct 23, 2025 at 10:16:00PM +0900, Harry Yoo wrote:
> When the SLAB_STORE_USER debug flag is used, any metadata placed after
> the original kmalloc request size (orig_size) is not properly aligned
> on 64-bit architectures because its type is unsigned int. When both KASAN
> and SLAB_STORE_USER are enabled, kasan_alloc_meta is misaligned.
> 
> Because not all architectures support unaligned memory accesses,
> ensure that all metadata (track, orig_size, kasan_{alloc,free}_meta)
> in a slab object are word-aligned. struct track, kasan_{alloc,free}_meta
> are aligned by adding __aligned(sizeof(unsigned long)).
> 
> For orig_size, use ALIGN(sizeof(unsigned int), sizeof(unsigned long)) to
> make clear that its size remains unsigned int but it must be aligned to
> a word boundary. On 64-bit architectures, this reserves 8 bytes for
> orig_size, which is acceptable since kmalloc's original request size
> tracking is intended for debugging rather than production use.
> 
> Cc: <stable@vger.kernel.org>
> Fixes: 6edf2576a6cc ("mm/slub: enable debugging memory wasting of kmalloc")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---

Adding more details on how I discovered this and why I care:

I was developing a feature that uses unused bytes in s->size as the
slabobj_ext metadata. Unlike other metadata where slab disables KASAN
when accessing it, this should be unpoisoned to avoid adding complexity
and overhead when accessing it.

So I wrote a code snippet to unpoison the metdata on slab allocation,
and then encountered a warning from KASAN:

[    4.951021] WARNING: CPU: 0 PID: 1 at mm/kasan/shadow.c:174 kasan_unpoison+0x6d/0x80
[    4.952021] Modules linked in:
[    4.953021] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W           6.17.0-rc3-slab-memopt-debug+ #22 PREEMPT(voluntary)
[    4.954021] Tainted: [W]=WARN
[    4.954495] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[    4.955021] RIP: 0010:kasan_unpoison+0x6d/0x80
[    4.955724] Code: 84 02 4c 89 e0 83 e0 07 74 0b 4c 01 e3 48 c1 eb 03 42 88 04 2b 5b 41 5c 41 5d 5d 31 c0 31 d2 31 c9 31 f6 31 ff c3 cc cc cc cc <0f> 0b 31 c0 31 d20
[    4.956021] RSP: 0018:ffff888007c57a90 EFLAGS: 00010202
[    4.957021] RAX: 0000000000000000 RBX: 0000000000000074 RCX: 0000000000000080
[    4.958021] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888007d7ae74
[    4.959021] RBP: ffff888007c57a98 R08: 0000000000000000 R09: 0000000000000000
[    4.960021] R10: ffffed1000faf400 R11: 0000000000000000 R12: ffffea00001f5e80
[    4.961021] R13: ffff888007466dc0 R14: ffff888007d7ae00 R15: ffff888007d7ae74
[    4.962023] FS:  0000000000000000(0000) GS:ffff8880e1a15000(0000) knlGS:0000000000000000
[    4.963021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.964021] CR2: ffff888007201000 CR3: 00000000056a0000 CR4: 00000000000006f0
[    4.965023] Call Trace:
[    4.965413]  <TASK>
[    4.966021]  ? __kasan_unpoison_range+0x26/0x50
[    4.966759]  alloc_slab_obj_exts_early.constprop.0+0x136/0x240
[    4.967021]  allocate_slab+0x107/0x4b0
[    4.968021]  ___slab_alloc+0x8f6/0xec0
[    4.969021]  ? kstrdup_const+0x2c/0x40
[    4.969615]  ? __xa_alloc+0x227/0x320
[    4.970021]  __slab_alloc.isra.0+0x35/0x90
[    4.970663]  __kmalloc_node_track_caller_noprof+0x4e2/0x7a0
[    4.971021]  ? kstrdup_const+0x2c/0x40
[    4.972021]  kstrdup+0x48/0xf0
[    4.972505]  ? kstrdup+0x48/0xf0
[    4.973021]  kstrdup_const+0x2c/0x40
[    4.973589]  alloc_vfsmnt+0xd5/0x680
[    4.974021]  vfs_create_mount.part.0+0x42/0x3e0
[    4.975021]  vfs_kern_mount.part.0+0x10c/0x150
[    4.975722]  vfs_kern_mount+0x13/0x40
[    4.976021]  devtmpfs_init+0xa8/0x430
[    4.977021]  ? __percpu_counter_init_many+0x199/0x360
[    4.977812]  ? __pfx_devtmpfs_init+0x10/0x10
[    4.978021]  ? page_offline_thaw+0x5/0x20
[    4.979021]  ? __kasan_check_write+0x14/0x30
[    4.979694]  driver_init+0x1a/0x60
[    4.980021]  kernel_init_freeable+0x7de/0xeb0
[    4.981021]  ? __pfx_kernel_init+0x10/0x10
[    4.981667]  kernel_init+0x1f/0x220
[    4.982021]  ? __pfx_kernel_init+0x10/0x10
[    4.983021]  ret_from_fork+0x2b8/0x3b0
[    4.983618]  ? __pfx_kernel_init+0x10/0x10
[    4.984021]  ret_from_fork_asm+0x1a/0x30
[    4.984639] RIP: 2e66:0x0
[    4.985021] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[    4.986021] RSP: 0084:0000000000000000 EFLAGS: 841f0f2e660000 ORIG_RAX: 2e66000000000084
[    4.987021] RAX: 0000000000000000 RBX: 2e66000000000084 RCX: 0000000000841f0f
[    4.988021] RDX: 000000841f0f2e66 RSI: 00841f0f2e660000 RDI: 1f0f2e6600000000
[    4.989021] RBP: 1f0f2e6600000000 R08: 1f0f2e6600000000 R09: 00841f0f2e660000
[    4.990021] R10: 000000841f0f2e66 R11: 0000000000841f0f R12: 00841f0f2e660000
[    4.991021] R13: 000000841f0f2e66 R14: 0000000000841f0f R15: 2e66000000000084
[    4.992022]  </TASK>
[    4.992372] ---[ end trace 0000000000000000 ]---

This warning is from kasan_unpoison():
	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
                return;

on x86_64, the address passed to kasan_{poison,unpoison}() should be at
least aligned with 8 bytes.

After manual investigation it turns out when the SLAB_STORE_USER flag is
specified, any metadata after the original kmalloc request size is
misaligned.

Questions:
- Could it cause any issues other than the one described above?
- Does KASAN even support architectures that have issues with unaligned
  accesses?
- How come we haven't seen any issues regarding this so far? :/

>  mm/kasan/kasan.h |  4 ++--
>  mm/slub.c        | 16 +++++++++++-----
>  2 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index 129178be5e64..d4ea7ecc20c3 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -265,7 +265,7 @@ struct kasan_alloc_meta {
>  	struct kasan_track alloc_track;
>  	/* Free track is stored in kasan_free_meta. */
>  	depot_stack_handle_t aux_stack[2];
> -};
> +} __aligned(sizeof(unsigned long));
>  
>  struct qlist_node {
>  	struct qlist_node *next;
> @@ -289,7 +289,7 @@ struct qlist_node {
>  struct kasan_free_meta {
>  	struct qlist_node quarantine_link;
>  	struct kasan_track free_track;
> -};
> +} __aligned(sizeof(unsigned long));
>  
>  #endif /* CONFIG_KASAN_GENERIC */
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index a585d0ac45d4..b921f91723c2 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -344,7 +344,7 @@ struct track {
>  	int cpu;		/* Was running on cpu */
>  	int pid;		/* Pid context */
>  	unsigned long when;	/* When did the operation occur */
> -};
> +} __aligned(sizeof(unsigned long));
>  
>  enum track_item { TRACK_ALLOC, TRACK_FREE };
>  
> @@ -1196,7 +1196,7 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
>  		off += 2 * sizeof(struct track);
>  
>  	if (slub_debug_orig_size(s))
> -		off += sizeof(unsigned int);
> +		off += ALIGN(sizeof(unsigned int), sizeof(unsigned long));
>  
>  	off += kasan_metadata_size(s, false);
>  
> @@ -1392,7 +1392,8 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
>  		off += 2 * sizeof(struct track);
>  
>  		if (s->flags & SLAB_KMALLOC)
> -			off += sizeof(unsigned int);
> +			off += ALIGN(sizeof(unsigned int),
> +				     sizeof(unsigned long));
>  	}
>  
>  	off += kasan_metadata_size(s, false);
> @@ -7820,9 +7821,14 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
>  		 */
>  		size += 2 * sizeof(struct track);
>  
> -		/* Save the original kmalloc request size */
> +		/*
> +		 * Save the original kmalloc request size.
> +		 * Although the request size is an unsigned int,
> +		 * make sure that is aligned to word boundary.
> +		 */
>  		if (flags & SLAB_KMALLOC)
> -			size += sizeof(unsigned int);
> +			size += ALIGN(sizeof(unsigned int),
> +				      sizeof(unsigned long));
>  	}
>  #endif
>  
> -- 
> 2.43.0
> 

-- 
Cheers,
Harry / Hyeonggon


  reply	other threads:[~2025-10-24  0:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-23 13:16 Harry Yoo
2025-10-24  0:40 ` Harry Yoo [this message]
2025-10-24  1:19   ` Andrey Konovalov
2025-10-24  1:35     ` Andrey Konovalov
2025-10-24  1:56     ` Andrey Konovalov
2025-10-24  7:55       ` Harry Yoo
2025-10-24  8:35     ` Harry Yoo
2025-10-24 14:17       ` Andrey Konovalov
2025-10-24  1:19 ` Andrey Konovalov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPrLF0OUK651M4dk@hyeyoo \
    --to=harry.yoo@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=cl@gentwo.org \
    --cc=dvyukov@google.com \
    --cc=feng.79.tang@gmail.com \
    --cc=glider@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=ryabinin.a.a@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@suse.cz \
    --cc=vincenzo.frascino@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox