linux-mm.kvack.org archive mirror
* [BUG] maple_tree: maple_node slab object corruption via out-of-bounds write during VMA operations
@ 2026-02-07  4:53 psg
  2026-02-07 12:35 ` Liam R. Howlett
  0 siblings, 1 reply; 2+ messages in thread
From: psg @ 2026-02-07  4:53 UTC
  To: linux-mm, linux-kernel; +Cc: Liam.Howlett, lorenzo.stoakes, vbabka, akpm, willy



Hi,


We are hitting a reproducible maple_node slab corruption on 6.18.0-rc6
(ARM64) during early boot. The corruption manifests as a left redzone
overwrite detected by slub_debug, followed by a kernel panic due to
panic_on_taint. We have captured two independent crash dumps showing the
exact same corruption pattern.


We have verified that the CVE-2024-50200 fix (commit bea07fd63192
"maple_tree: correct tree corruption on spanning store") IS present in
our kernel -- the r_mas.max > r_mas.last comparison uses 64-bit registers
as expected. We have also ruled out CVE-2025-38364 (MA_STATE_PREALLOC
flag), which causes a NULL pointer dereference rather than an
out-of-bounds write.


Environment
-----------
  Kernel: 6.18.0-rc6 (mainline, ARM64)
  Config: SMP PREEMPT, slub_debug=FZP enabled for maple_node
  Cmdline includes: panic_on_taint=0x20 slub_debug=FZP,zs_handle,...
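
For reference, the panic_on_taint=0x20 mask above can be decoded with a
small sketch. The bit positions follow the kernel's tainted-kernels
documentation as we understand it. The table and the helper name
decode_taint are illustrative, and only the flags seen in this report are
listed.

```python
# Taint bit positions, limited to the flags that appear in this report.
# Assumption: positions match Documentation/admin-guide/tainted-kernels.
TAINT_BITS = {
    5: ("B", "bad page referenced or unexpected page flags"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "out-of-tree module loaded"),
    13: ("E", "unsigned module loaded"),
}


def decode_taint(mask: int) -> list[str]:
    """Return the taint letters whose bits are set in mask."""
    return [TAINT_BITS[bit][0] for bit in sorted(TAINT_BITS) if mask & (1 << bit)]


print(decode_taint(0x20))  # ['B']
```

Bit 5 is the bad-page taint, which SLUB is understood to set when it
reports a corrupt object, so the first detected redzone overwrite is
enough to trigger the panic.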


Crash #1 (mmap path)
---------------------
  BUG maple_node (Tainted: G        W  OE): Object corrupt
  [Left Redzone overwritten] 0xffffff8828948300-0xffffff88289483ff
    @offset=768. First byte 0x1 instead of 0xcc


  Allocated in kmem_cache_prefill_sheaf+0x308/0x33c age=291 cpu=2 pid=87171
    kmem_cache_prefill_sheaf+0x308/0x33c
    mas_alloc_nodes+0x98/0xf0
    mas_preallocate+0x234/0x33c
    __split_vma+0x11c/0x364
    vms_gather_munmap_vmas+0x118/0x310
    mmap_region+0x2a8/0xae4
    do_mmap+0x470/0x578
    vm_mmap_pgoff+0x1e8/0x264
    ksys_mmap_pgoff+0xa4/0xf0
    __arm64_sys_mmap+0x34/0x44


  Freed in mt_destroy_walk+0x16c/0x344 age=391 cpu=6 pid=83838
    kmem_cache_free_bulk+0x3c4/0x9f8
    mt_destroy_walk+0x16c/0x344
    __mt_destroy+0x40/0x80
    exit_mmap+0x2ac/0x4b0
    __mmput+0x38/0x16c
    mmput+0x44/0x7c
    exec_mmap+0x208/0x2ac
    begin_new_exec+0x188/0x46c
    load_elf_binary+0x434/0xc68


  Slab 0xfffffffee0a25200 objects=21 used=16
    fp=0xffffff882894a200 flags=0x4000000000000240(workingset|head|zone=1)
  Object 0xffffff8828948400 @offset=1024 fp=0xffffff8828949c00


  Panic call trace (detected during RCU free):
    check_bytes_and_report+0x104/0x31c
    check_object+0x98/0x3c8
    free_to_partial_list+0x174/0x638
    __slab_free+0x204/0x248
    kmem_cache_free_bulk+0x3c4/0x9f8
    kvfree_rcu_bulk+0x17c/0x320
    kfree_rcu_work+0xb8/0x144


Crash #2 (mprotect path)
-------------------------
  BUG maple_node (Tainted: G        W  OE): Object corrupt
  [Left Redzone overwritten] 0xffffff88184b8300-0xffffff88184b83ff
    @offset=768. First byte 0x1 instead of 0xbb


  Allocated in mas_alloc_nodes+0xcc/0xf0 age=343 cpu=3 pid=89696
    kmem_cache_alloc_noprof+0x3fc/0x55c
    mas_alloc_nodes+0xcc/0xf0
    mas_preallocate+0x234/0x33c
    __split_vma+0x11c/0x364
    vma_modify+0x424/0x4dc
    vma_modify_flags+0x74/0xa0
    mprotect_fixup+0x154/0x28c
    do_mprotect_pkey+0x410/0x5b0
    __arm64_sys_mprotect+0x20/0x34


  Freed in kvfree_rcu_bulk+0x17c/0x320 age=335 cpu=7 pid=9090
    kmem_cache_free_bulk+0x3c4/0x9f8
    kvfree_rcu_bulk+0x17c/0x320
    kfree_rcu_work+0xb8/0x144


  Slab 0xfffffffee0612e00 objects=21 used=8
    fp=0xffffff88184b8100 flags=0x4000000000000240(workingset|head|zone=1)
  Object 0xffffff88184b8400 @offset=1024 fp=0xffffff88184b8100


  Panic call trace (detected during sheaf prefill alloc):
    check_bytes_and_report+0x104/0x31c
    check_object+0x98/0x3c8
    alloc_debug_processing+0x104/0x1b8
    ___slab_alloc+0xb10/0x1314
    __kmem_cache_alloc_bulk+0x1d0/0x460
    kmem_cache_prefill_sheaf+0x308/0x33c
    mas_alloc_nodes+0x98/0xf0
    mas_preallocate+0x234/0x33c
    mmap_region+0x548/0xae4
    do_mmap+0x470/0x578


Redzone corruption pattern analysis
------------------------------------
Both crashes show IDENTICAL structured data in the left redzone of the
object at slot 1 (offset 1024). The redzone occupies bytes 768-1023
(256 bytes). The corruption originates from the PREVIOUS maple_node
(slot 0, offset 0-255) writing past its 256-byte boundary.
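
The offsets quoted above can be cross-checked from the addresses in the
SLUB report. This is a minimal sketch using only the values printed for
crash #1. The function name redzone_span is illustrative, and the slab
base is inferred as object address minus @offset.

```python
def redzone_span(object_addr: int, object_offset: int,
                 rz_start: int, rz_end: int) -> tuple[int, int, int, int]:
    """Return (slab base, first redzone offset, last redzone offset, length).

    rz_end is inclusive, as printed in the "[Left Redzone overwritten]" line.
    """
    slab_base = object_addr - object_offset
    return (slab_base,
            rz_start - slab_base,
            rz_end - slab_base,
            rz_end - rz_start + 1)


# Values from crash #1: "Object 0xffffff8828948400 @offset=1024" and
# "[Left Redzone overwritten] 0xffffff8828948300-0xffffff88289483ff".
base, first, last, length = redzone_span(
    0xffffff8828948400, 1024, 0xffffff8828948300, 0xffffff88289483ff)
print(hex(base), first, last, length)  # 0xffffff8828948000 768 1023 256
```

The reported redzone ends one byte before the object at offset 1024, and
the same arithmetic applied to the crash #2 addresses gives the same
768-1023 span.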


Corrupted left redzone dump (crash #1, 0xcc = SLUB_RED_ACTIVE):


  Redzone  ffffff8828948300: 01 00 00 00 cc cc cc cc cc cc cc cc 78 59 ef ff
  Redzone  ffffff8828948310: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
  Redzone  ffffff8828948320: cc cc cc cc cc cc cc cc 80 59 ef ff 04 00 00 00
  Redzone  ffffff8828948330: 00 00 00 00 00 00 00 00 01 00 00 00 cc cc cc cc
  Redzone  ffffff8828948340: cc cc cc cc 88 59 ef ff 08 00 00 00 00 00 00 00
  Redzone  ffffff8828948350: 00 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc
  Redzone  ffffff8828948360: 90 59 ef ff 04 00 00 00 00 00 00 00 00 00 00 00
  Redzone  ffffff8828948370: 01 00 00 00 cc cc cc cc cc cc cc cc 98 59 ef ff
  Redzone  ffffff8828948380: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
  Redzone  ffffff8828948390: cc cc cc cc cc cc cc cc a0 59 ef ff 18 00 00 00
  Redzone  ffffff88289483a0: 00 00 00 00 00 00 00 00 01 00 00 00 cc cc cc cc
  Redzone  ffffff88289483b0: cc cc cc cc b8 59 ef ff 18 00 00 00 00 00 00 00
  Redzone  ffffff88289483c0: 00 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc
  Redzone  ffffff88289483d0: d0 59 ef ff 18 00 00 00 00 00 00 00 00 00 00 00
  Redzone  ffffff88289483e0: 01 00 00 00 cc cc cc cc cc cc cc cc e8 59 ef ff
  Redzone  ffffff88289483f0: 18 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00


Corrupted left redzone dump (crash #2, 0xbb = SLUB_RED_INACTIVE):


  Redzone  ffffff88184b8300: 01 00 00 00 bb bb bb bb bb bb bb bb 78 59 ef ff
  Redzone  ffffff88184b8310: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
  Redzone  ffffff88184b8320: bb bb bb bb bb bb bb bb 80 59 ef ff 04 00 00 00
  Redzone  ffffff88184b8330: 00 00 00 00 00 00 00 00 01 00 00 00 bb bb bb bb
  Redzone  ffffff88184b8340: bb bb bb bb 88 59 ef ff 08 00 00 00 00 00 00 00
  Redzone  ffffff88184b8350: 00 00 00 00 01 00 00 00 bb bb bb bb bb bb bb bb
  Redzone  ffffff88184b8360: 90 59 ef ff 04 00 00 00 00 00 00 00 00 00 00 00
  Redzone  ffffff88184b8370: 01 00 00 00 bb bb bb bb bb bb bb bb 98 59 ef ff
  Redzone  ffffff88184b8380: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
  Redzone  ffffff88184b8390: bb bb bb bb bb bb bb bb a0 59 ef ff 18 00 00 00
  Redzone  ffffff88184b83a0: 00 00 00 00 00 00 00 00 01 00 00 00 bb bb bb bb
  Redzone  ffffff88184b83b0: bb bb bb bb b8 59 ef ff 18 00 00 00 00 00 00 00
  Redzone  ffffff88184b83c0: 00 00 00 00 01 00 00 00 bb bb bb bb bb bb bb bb
  Redzone  ffffff88184b83d0: d0 59 ef ff 18 00 00 00 00 00 00 00 00 00 00 00
  Redzone  ffffff88184b83e0: 01 00 00 00 bb bb bb bb bb bb bb bb e8 59 ef ff
  Redzone  ffffff88184b83f0: 18 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00


The corruption data is interleaved with the original redzone poison
bytes (0xcc or 0xbb): each record leaves one 8-byte run of poison intact.
The corrupted bytes form a repeating 28-byte structure that resembles
maple_range_64 pivot entries containing VMA page-boundary addresses:


  Bytes 0-3:   flags/refcount (0x00000001)
  Bytes 4-11:  [original redzone poison - NOT overwritten]
  Bytes 12-15: VMA address fragment (e.g., 0xffef5978, incrementing)
  Bytes 16-19: size/length field (0x04, 0x08, or 0x18 pages)
  Bytes 20-27: zero padding


The VMA addresses form a sequential series:
  0x????ffef5978, 0x????ffef5980, 0x????ffef5988, 0x????ffef5990,
  0x????ffef5998, 0x????ffef59a0, 0x????ffef59b8, 0x????ffef59d0,
  0x????ffef59e8
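
The 28-byte layout and the address series above can be reproduced by
decoding the crash #1 dump directly. This is a sketch, not the tooling
used for the original analysis. The record format string and the field
names (flag, poison, frag, size, pad) are our own labels for the byte
ranges listed above.

```python
import struct

DUMP = """\
Redzone  ffffff8828948300: 01 00 00 00 cc cc cc cc cc cc cc cc 78 59 ef ff
Redzone  ffffff8828948310: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
Redzone  ffffff8828948320: cc cc cc cc cc cc cc cc 80 59 ef ff 04 00 00 00
Redzone  ffffff8828948330: 00 00 00 00 00 00 00 00 01 00 00 00 cc cc cc cc
Redzone  ffffff8828948340: cc cc cc cc 88 59 ef ff 08 00 00 00 00 00 00 00
Redzone  ffffff8828948350: 00 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc
Redzone  ffffff8828948360: 90 59 ef ff 04 00 00 00 00 00 00 00 00 00 00 00
Redzone  ffffff8828948370: 01 00 00 00 cc cc cc cc cc cc cc cc 98 59 ef ff
Redzone  ffffff8828948380: 08 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
Redzone  ffffff8828948390: cc cc cc cc cc cc cc cc a0 59 ef ff 18 00 00 00
Redzone  ffffff88289483a0: 00 00 00 00 00 00 00 00 01 00 00 00 cc cc cc cc
Redzone  ffffff88289483b0: cc cc cc cc b8 59 ef ff 18 00 00 00 00 00 00 00
Redzone  ffffff88289483c0: 00 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc
Redzone  ffffff88289483d0: d0 59 ef ff 18 00 00 00 00 00 00 00 00 00 00 00
Redzone  ffffff88289483e0: 01 00 00 00 cc cc cc cc cc cc cc cc e8 59 ef ff
Redzone  ffffff88289483f0: 18 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
"""

# Little-endian, no padding: u32 flag, 8 poison bytes, u32 address
# fragment, u32 size, 8 zero bytes = 28 bytes per record.
RECORD = struct.Struct("<I8sII8s")


def parse_dump(text: str) -> bytes:
    """Concatenate the hex bytes that follow the address on each line."""
    return b"".join(bytes.fromhex(line.split(":", 1)[1])
                    for line in text.splitlines() if line.strip())


def decode(raw: bytes) -> list[tuple]:
    """Split raw into complete 28-byte records; trailing bytes are dropped."""
    return [RECORD.unpack_from(raw, off)
            for off in range(0, len(raw) - RECORD.size + 1, RECORD.size)]


raw = parse_dump(DUMP)
records = decode(raw)
for flag, poison, frag, size, pad in records:
    print(f"flag={flag} poison={poison.hex()} frag={frag:#010x} size={size:#04x}")
```

The 256-byte redzone holds nine complete records plus the first four
bytes (01 00 00 00) of a tenth, which matches the nine address fragments
listed above.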


This pattern is consistent with a maple_range_64 node's pivot/slot data
being written beyond the 256-byte maple_node allocation boundary,
overflowing into the right redzone of slot 0 and the left redzone of
slot 1.
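
For comparison, the maple_range_64 layout on a 64-bit kernel can be
modelled with ctypes. This is a sketch from our reading of
include/linux/maple_tree.h, not a copy of the kernel definition: it
assumes MAPLE_RANGE64_SLOTS is 16 on 64-bit, treats pointers as 8-byte
integers, and omits the metadata union that overlays the last slot.

```python
import ctypes

# Assumption: 16 slots (and therefore 15 pivots) on 64-bit builds.
MAPLE_RANGE64_SLOTS = 16


class MapleRange64(ctypes.Structure):
    """Illustrative model of struct maple_range_64 on a 64-bit kernel."""
    _fields_ = [
        ("parent", ctypes.c_uint64),                             # parent pointer
        ("pivot", ctypes.c_uint64 * (MAPLE_RANGE64_SLOTS - 1)),  # 15 pivots
        ("slot", ctypes.c_uint64 * MAPLE_RANGE64_SLOTS),         # 16 slots
    ]


print(ctypes.sizeof(MapleRange64))                          # 256
print(MapleRange64.pivot.offset, MapleRange64.slot.offset)  # 8 128
```

Under these assumptions the node fills the 256-byte allocation exactly,
with pivots and slots each 8 bytes wide, so any store past the last slot
lands outside the object.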


Our analysis of slot 0 data from crash #1 (via physical memory
reconstruction from the DDR dump) revealed DUPLICATE PIVOT entries in
the previous maple_node -- a pattern reminiscent of CVE-2024-50200, but
occurring despite the fix being present. This suggests there may be
another code path in the maple tree that can produce similar spanning
store corruption.






Thanks



* Re: [BUG] maple_tree: maple_node slab object corruption via out-of-bounds write during VMA operations
  2026-02-07  4:53 [BUG] maple_tree: maple_node slab object corruption via out-of-bounds write during VMA operations psg
@ 2026-02-07 12:35 ` Liam R. Howlett
  0 siblings, 0 replies; 2+ messages in thread
From: Liam R. Howlett @ 2026-02-07 12:35 UTC
  To: psg; +Cc: linux-mm, linux-kernel, lorenzo.stoakes, vbabka, akpm, willy

* psg <ab9517532006@126.com> [260207 04:54]:
> 
> 
> Hi,
> 
> 
> We are hitting a reproducible maple_node slab corruption on 6.18.0-rc6
> (ARM64) during early boot. The corruption manifests as
> a left redzone overwrite detected by slub_debug, followed by a kernel
> panic due to panic_on_taint. We have captured two independent crash dumps
> showing the exact same corruption pattern.
> 
> 
> We have verified that the CVE-2024-50200
> fix (commit bea07fd63192 "maple_tree: correct tree corruption on spanning
> store") IS present in our kernel -- the r_mas.max > r_mas.last comparison
> uses 64-bit registers as expected. We have also ruled out CVE-2025-38364
> (MA_STATE_PREALLOC flag), which causes NULL pointer dereference rather
> than an out-of-bounds write.
> 
> 
> Environment
> -----------
>   Kernel: 6.18.0-rc6 (mainline, ARM64)

Did you mean 6.19?

>   Config: SMP PREEMPT, slub_debug=FZP enabled for maple_node
>   Cmdline includes: panic_on_taint=0x20 slub_debug=FZP,zs_handle,...

Can we see the full command line and config?

Have you tried 6.18 or any other released kernel?

Did you try enabling CONFIG_DEBUG_VM_MAPLE_TREE?  Please set
no_hash_pointers on the command line.  This way we can see the full tree
dump and where it is happening.

> 
> 
> Crash #1 (mmap path)
> ---------------------
>   BUG maple_node (Tainted: G        W  OE): Object corrupt
>   [Left Redzone overwritten] 0xffffff8828948300-0xffffff88289483ff
>     @offset=768. First byte 0x1 instead of 0xcc
> 
> 
>   Allocated in kmem_cache_prefill_sheaf+0x308/0x33c age=291 cpu=2 pid=87171
>     kmem_cache_prefill_sheaf+0x308/0x33c
>     mas_alloc_nodes+0x98/0xf0
>     mas_preallocate+0x234/0x33c
>     __split_vma+0x11c/0x364
>     vms_gather_munmap_vmas+0x118/0x310
>     mmap_region+0x2a8/0xae4
>     do_mmap+0x470/0x578
>     vm_mmap_pgoff+0x1e8/0x264
>     ksys_mmap_pgoff+0xa4/0xf0
>     __arm64_sys_mmap+0x34/0x44
> 
> 
>   Freed in mt_destroy_walk+0x16c/0x344 age=391 cpu=6 pid=83838
>     kmem_cache_free_bulk+0x3c4/0x9f8
>     mt_destroy_walk+0x16c/0x344
>     __mt_destroy+0x40/0x80
>     exit_mmap+0x2ac/0x4b0
>     __mmput+0x38/0x16c
>     mmput+0x44/0x7c
>     exec_mmap+0x208/0x2ac
>     begin_new_exec+0x188/0x46c
>     load_elf_binary+0x434/0xc68

This is on exit of the process, so it's not very early in the boot
process.  Hopefully you can reproduce it with the debug flag without
waiting too long.

> Our analysis of slot 0 data from crash #1 (via physical memory
> reconstruction from the DDR dump) revealed DUPLICATE PIVOT entries in
> the previous maple_node -- a pattern reminiscent of CVE-2024-50200, but
> occurring despite the fix being present. This suggests there may be
> another code path in the maple tree that can produce similar spanning
> store corruption.

What are the pivots?

At RCU free time, the data in the nodes may not be reliable, so it would
be good to try to use the debug validation code in the config option
mentioned above.

Thanks,
Liam



