linux-mm.kvack.org archive mirror
* [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
@ 2024-02-23  2:10 Tetsuo Handa
  2024-02-23  2:27 ` Yosry Ahmed
  2024-02-23  4:43 ` Sergey Senozhatsky
  0 siblings, 2 replies; 17+ messages in thread
From: Tetsuo Handa @ 2024-02-23  2:10 UTC (permalink / raw)
  To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Minchan Kim, Sergey Senozhatsky
  Cc: linux-mm

I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
Since I haven't observed it with 6.6.0, this bug might have been introduced in the 6.7 cycle.

----------------------------------------
[    0.000000][    T0] Linux version 6.8.0-rc5-00163-gffd2cb6b718e (root@ubuntu) (Ubuntu clang version 14.0.0-1ubuntu1.1, Ubuntu LLD 14.0.0) #1094 SMP PREEMPT_DYNAMIC Fri Feb 23 01:45:21 UTC 2024
[   50.026544][ T2974] =====================================================
[   50.030627][ T2974] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
[   50.034611][ T2974]  obj_malloc+0x6cc/0x7b0
                                                           obj_malloc at mm/zsmalloc.c:0
[   50.037250][ T2974]  zs_malloc+0xdbd/0x1400
                                                           zs_malloc at mm/zsmalloc.c:0
[   50.039852][ T2974]  zs_zpool_malloc+0xa5/0x1b0
                                                           zs_zpool_malloc at mm/zsmalloc.c:372
[   50.044707][ T2974]  zpool_malloc+0x110/0x150
                                                           zpool_malloc at mm/zpool.c:258
[   50.049607][ T2974]  zswap_store+0x2bbb/0x3d30
                                                           zswap_store at mm/zswap.c:1637
[   50.054463][ T2974]  swap_writepage+0x15b/0x4f0
                                                           swap_writepage at mm/page_io.c:198
[   50.059392][ T2974]  pageout+0x41d/0xef0
                                                           pageout at mm/vmscan.c:654
[   50.064057][ T2974]  shrink_folio_list+0x4d7a/0x7480
                                                           shrink_folio_list at mm/vmscan.c:1316
[   50.069176][ T2974]  evict_folios+0x30f1/0x5170
                                                           evict_folios at mm/vmscan.c:4521
[   50.074082][ T2974]  try_to_shrink_lruvec+0x983/0xd20
[   50.079352][ T2974]  shrink_one+0x72d/0xeb0
[   50.084061][ T2974]  shrink_many+0x70d/0x10b0
[   50.088859][ T2974]  lru_gen_shrink_node+0x577/0x850
[   50.094192][ T2974]  shrink_node+0x13d/0x1de0
[   50.099028][ T2974]  shrink_zones+0x878/0x14a0
[   50.103958][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
[   50.109138][ T2974]  try_to_free_pages+0xd9e/0x1910
[   50.114190][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
[   50.119555][ T2974]  __alloc_pages+0xb8c/0x1050
[   50.124472][ T2974]  alloc_pages_mpol+0x8e0/0xc80
[   50.129367][ T2974]  alloc_pages+0x224/0x240
[   50.134022][ T2974]  pipe_write+0xabe/0x2ba0
[   50.138632][ T2974]  vfs_write+0xfb0/0x1b80
[   50.143171][ T2974]  ksys_write+0x275/0x500
[   50.147723][ T2974]  __x64_sys_write+0xdf/0x120
[   50.152431][ T2974]  do_syscall_64+0xd1/0x1b0
[   50.157106][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[   50.162382][ T2974] 
[   50.165956][ T2974] Uninit was stored to memory at:
[   50.170819][ T2974]  obj_malloc+0x70a/0x7b0
                                                           set_freeobj at mm/zsmalloc.c:476
                                                           (inlined by) obj_malloc at mm/zsmalloc.c:1333
[   50.175341][ T2974]  zs_malloc+0xdbd/0x1400
                                                           zs_malloc at mm/zsmalloc.c:0
[   50.179923][ T2974]  zs_zpool_malloc+0xa5/0x1b0
                                                           zs_zpool_malloc at mm/zsmalloc.c:372
[   50.184636][ T2974]  zpool_malloc+0x110/0x150
                                                           zpool_malloc at mm/zpool.c:258
[   50.189257][ T2974]  zswap_store+0x2bbb/0x3d30
                                                           zswap_store at mm/zswap.c:1637
[   50.193918][ T2974]  swap_writepage+0x15b/0x4f0
                                                           swap_writepage at mm/page_io.c:198
[   50.198615][ T2974]  pageout+0x41d/0xef0
                                                           pageout at mm/vmscan.c:654
[   50.203012][ T2974]  shrink_folio_list+0x4d7a/0x7480
                                                           shrink_folio_list at mm/vmscan.c:1316
[   50.207772][ T2974]  evict_folios+0x30f1/0x5170
                                                           evict_folios at mm/vmscan.c:4521
[   50.212321][ T2974]  try_to_shrink_lruvec+0x983/0xd20
[   50.217092][ T2974]  shrink_one+0x72d/0xeb0
[   50.221441][ T2974]  shrink_many+0x70d/0x10b0
[   50.225891][ T2974]  lru_gen_shrink_node+0x577/0x850
[   50.230614][ T2974]  shrink_node+0x13d/0x1de0
[   50.235128][ T2974]  shrink_zones+0x878/0x14a0
[   50.239646][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
[   50.244461][ T2974]  try_to_free_pages+0xd9e/0x1910
[   50.249151][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
[   50.254148][ T2974]  __alloc_pages+0xb8c/0x1050
[   50.258679][ T2974]  alloc_pages_mpol+0x8e0/0xc80
[   50.263289][ T2974]  alloc_pages+0x224/0x240
[   50.267767][ T2974]  pipe_write+0xabe/0x2ba0
[   50.272190][ T2974]  vfs_write+0xfb0/0x1b80
[   50.276543][ T2974]  ksys_write+0x275/0x500
[   50.280931][ T2974]  __x64_sys_write+0xdf/0x120
[   50.289451][ T2974]  do_syscall_64+0xd1/0x1b0
[   50.303402][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[   50.318721][ T2974] 
[   50.328931][ T2974] Uninit was created at:
[   50.341845][ T2974]  free_unref_page_prepare+0x130/0xfc0
                                                           arch_static_branch_jump at arch/x86/include/asm/jump_label.h:55
                                                           (inlined by) memcg_kmem_online at include/linux/memcontrol.h:1840
                                                           (inlined by) free_pages_prepare at mm/page_alloc.c:1096
                                                           (inlined by) free_unref_page_prepare at mm/page_alloc.c:2346
[   50.356492][ T2974]  free_unref_page_list+0x139/0x1050
                                                           free_unref_page_list at mm/page_alloc.c:2532
[   50.370898][ T2974]  shrink_folio_list+0x7139/0x7480
                                                           list_empty at include/linux/list.h:373
                                                           (inlined by) list_splice at include/linux/list.h:545
                                                           (inlined by) shrink_folio_list at mm/vmscan.c:1490
[   50.385025][ T2974]  evict_folios+0x30f1/0x5170
                                                           evict_folios at mm/vmscan.c:4521
[   50.398448][ T2974]  try_to_shrink_lruvec+0x983/0xd20
[   50.412660][ T2974]  shrink_one+0x72d/0xeb0
[   50.425591][ T2974]  shrink_many+0x70d/0x10b0
[   50.438827][ T2974]  lru_gen_shrink_node+0x577/0x850
[   50.454390][ T2974]  shrink_node+0x13d/0x1de0
[   50.479401][ T2974]  shrink_zones+0x878/0x14a0
[   50.529610][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
[   50.544397][ T2974]  try_to_free_pages+0xd9e/0x1910
[   50.559556][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
[   50.574932][ T2974]  __alloc_pages+0xb8c/0x1050
[   50.589024][ T2974]  alloc_pages_mpol+0x8e0/0xc80
[   50.603421][ T2974]  alloc_pages+0x224/0x240
[   50.616483][ T2974]  pipe_write+0xabe/0x2ba0
[   50.629601][ T2974]  vfs_write+0xfb0/0x1b80
[   50.643009][ T2974]  ksys_write+0x275/0x500
[   50.656157][ T2974]  __x64_sys_write+0xdf/0x120
[   50.670080][ T2974]  do_syscall_64+0xd1/0x1b0
[   50.683405][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[   50.698626][ T2974] 
----------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  2:10 [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Tetsuo Handa
@ 2024-02-23  2:27 ` Yosry Ahmed
  2024-02-23  4:48   ` Sergey Senozhatsky
  2024-02-23  4:43 ` Sergey Senozhatsky
  1 sibling, 1 reply; 17+ messages in thread
From: Yosry Ahmed @ 2024-02-23  2:27 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Johannes Weiner, Nhat Pham, Minchan Kim, Sergey Senozhatsky, linux-mm

On Thu, Feb 22, 2024 at 6:10 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
> Since I haven't observed it with 6.6.0, this bug might have been introduced in the 6.7 cycle.

I am not familiar with KMSAN bug reports, but it seems to be
reporting a use-after-free for zspage->freeobj. The report says the
uninitialized value was created in free_unref_page_prepare() during
lruvec reclaim, and I am not sure how that's possible given that zspage
is allocated from the slab allocator. Perhaps I am misinterpreting the
report.
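
One possible reading, offered as an assumption and not as a diagnosis:
the zspage metadata is slab-allocated, but the free-list links live
inside the free objects themselves, that is, in pages that come from the
page allocator. obj_malloc() reads the next-free index from that page
memory and stores it with set_freeobj(), which would match the "Uninit
was stored to memory at: set_freeobj" section of the report. The toy
userspace model below only illustrates that data flow. ToyZspage, POISON
and page_freed_and_reused() are hypothetical names, not kernel
identifiers.

```python
# Toy userspace model (not kernel code); all names here are hypothetical.
POISON = object()  # stands in for bytes KMSAN treats as uninitialized


class ToyZspage:
    """Metadata (freeobj) is kept apart from the object memory (page)."""

    def __init__(self, nr_objs):
        # Each free slot stores the index of the next free slot, the way
        # a free-list link is stored inside the free object itself.
        self.page = [i + 1 for i in range(nr_objs)]
        self.page[-1] = None  # end of the free list
        self.freeobj = 0      # head of the free list

    def obj_malloc(self):
        obj = self.freeobj
        # The new head is read from object memory, not from metadata.
        self.freeobj = self.page[obj]
        self.page[obj] = "allocated"
        return obj


def page_freed_and_reused(zspage, slot):
    """Model the backing page being freed (and poisoned) while still in use."""
    zspage.page[slot] = POISON


zs = ToyZspage(4)
first = zs.obj_malloc()                # takes slot 0, head moves to slot 1
page_freed_and_reused(zs, zs.freeobj)  # slot 1's link is now poisoned
zs.obj_malloc()                        # copies the poisoned link into freeobj
tainted = zs.freeobj is POISON
```

In this model the tainted value ends up in the slab-side metadata even
though it originated in page memory, which is one way a report about
zspage->freeobj could point back at the page free path.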

I also don't see any recent changes in mm/zsmalloc.c that modify this
code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
Sergey; I don't think zswap is an active actor in this bug report.



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  2:10 [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Tetsuo Handa
  2024-02-23  2:27 ` Yosry Ahmed
@ 2024-02-23  4:43 ` Sergey Senozhatsky
  2024-02-23 15:22   ` Tetsuo Handa
  1 sibling, 1 reply; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  4:43 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Minchan Kim,
	Sergey Senozhatsky, linux-mm

On (24/02/23 11:10), Tetsuo Handa wrote:
> 
> I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
> Since I haven't observed it with 6.6.0, this bug might have been introduced in the 6.7 cycle.

Can we please run a bisect?

There are some zsmalloc patches for 6.8 (mm-unstable), but I don't recall
anything in 6.7.
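
For readers unfamiliar with bisection: with 6.6.0 as the last known good
kernel and 6.7.0 as the first known bad one, the search halves the
candidate range at every build-and-test step. The sketch below models
only that binary search on a linear history. The real tool is git bisect
(git bisect start / good / bad), which also handles merges. The commit
names and the culprit are made up for the demonstration.

```python
def bisect_first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    every revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid
        else:
            good = mid
    return revisions[bad]


# Hypothetical linear history between the good and bad releases.
history = ["v6.6"] + [f"commit-{i}" for i in range(1, 16)] + ["v6.7"]
culprit = "commit-11"  # made up for the demonstration
tested = []


def reproduces(rev):
    """Stand-in for 'build this revision, boot it, and look for the report'."""
    tested.append(rev)
    return history.index(rev) >= history.index(culprit)


found = bisect_first_bad(history, reproduces)
```

With 15 intermediate commits the search needs only a handful of test
boots. The 6.6..6.7 range is far larger, but the number of steps grows
only logarithmically with it.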




* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  2:27 ` Yosry Ahmed
@ 2024-02-23  4:48   ` Sergey Senozhatsky
  2024-02-23  4:50     ` Yosry Ahmed
  2024-02-23  5:23     ` Chengming Zhou
  0 siblings, 2 replies; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  4:48 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Tetsuo Handa, Johannes Weiner, Nhat Pham, Minchan Kim,
	Sergey Senozhatsky, linux-mm

On (24/02/22 18:27), Yosry Ahmed wrote:
> I also don't see any recent changes in mm/zsmalloc.c that modify this
> code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> Sergey; I don't think zswap is an active actor in this bug report.

Yeah. [1] are the only recent zsmalloc patches I can recall, and those
patches touch zsmalloc locking (zspages migration/compaction).

[1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:48   ` Sergey Senozhatsky
@ 2024-02-23  4:50     ` Yosry Ahmed
  2024-02-23  4:56       ` Sergey Senozhatsky
  2024-02-23  5:23     ` Chengming Zhou
  1 sibling, 1 reply; 17+ messages in thread
From: Yosry Ahmed @ 2024-02-23  4:50 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Johannes Weiner, Nhat Pham, Minchan Kim, linux-mm

On Thu, Feb 22, 2024 at 8:48 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/02/22 18:27), Yosry Ahmed wrote:
> > I also don't see any recent changes in mm/zsmalloc.c that modify this
> > code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> > Sergey; I don't think zswap is an active actor in this bug report.
>
> Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> patches touch zsmalloc locking (zspages migration/compaction).
>
> [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/

These are not in 6.8.0-rc5 anyway, right?



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:50     ` Yosry Ahmed
@ 2024-02-23  4:56       ` Sergey Senozhatsky
  2024-02-23  4:58         ` Sergey Senozhatsky
  0 siblings, 1 reply; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  4:56 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Sergey Senozhatsky, Tetsuo Handa, Johannes Weiner, Nhat Pham,
	Minchan Kim, linux-mm

On (24/02/22 20:50), Yosry Ahmed wrote:
> On Thu, Feb 22, 2024 at 8:48 PM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (24/02/22 18:27), Yosry Ahmed wrote:
> > > I also don't see any recent changes in mm/zsmalloc.c that modify this
> > > code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> > > Sergey; I don't think zswap is an active actor in this bug report.
> >
> > Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> > patches touch zsmalloc locking (zspages migration/compaction).
> >
> > [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/
> 
> These are not in 6.8.0-rc5 anyway, right?

I see them in next-20240223, which seems to be 6.8-rc6 (according to
Makefile)



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:56       ` Sergey Senozhatsky
@ 2024-02-23  4:58         ` Sergey Senozhatsky
  2024-02-23  5:05           ` Yosry Ahmed
  0 siblings, 1 reply; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  4:58 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Yosry Ahmed, Tetsuo Handa, Johannes Weiner, Nhat Pham,
	Minchan Kim, linux-mm

On (24/02/23 13:56), Sergey Senozhatsky wrote:
> On (24/02/22 20:50), Yosry Ahmed wrote:
> > On Thu, Feb 22, 2024 at 8:48 PM Sergey Senozhatsky
> > <senozhatsky@chromium.org> wrote:
> > >
> > > On (24/02/22 18:27), Yosry Ahmed wrote:
> > > > I also don't see any recent changes in mm/zsmalloc.c that modify this
> > > > code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> > > > Sergey; I don't think zswap is an active actor in this bug report.
> > >
> > > Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> > > patches touch zsmalloc locking (zspages migration/compaction).
> > >
> > > [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/
> > 
> > These are not in 6.8.0-rc5 anyway, right?
> 
> I see them in next-20240223, which seems to be 6.8-rc6 (according to
                                                   ^ -rc5

But they look more or less correct to me, so I'm not blaming those
patches.  We should be protected by pool->lock.  Bisection would help
us a lot, I think.



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:58         ` Sergey Senozhatsky
@ 2024-02-23  5:05           ` Yosry Ahmed
  2024-02-23  5:19             ` Sergey Senozhatsky
  0 siblings, 1 reply; 17+ messages in thread
From: Yosry Ahmed @ 2024-02-23  5:05 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Johannes Weiner, Nhat Pham, Minchan Kim, linux-mm

On Thu, Feb 22, 2024 at 8:58 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/02/23 13:56), Sergey Senozhatsky wrote:
> > On (24/02/22 20:50), Yosry Ahmed wrote:
> > > On Thu, Feb 22, 2024 at 8:48 PM Sergey Senozhatsky
> > > <senozhatsky@chromium.org> wrote:
> > > >
> > > > On (24/02/22 18:27), Yosry Ahmed wrote:
> > > > > I also don't see any recent changes in mm/zsmalloc.c that modify this
> > > > > code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> > > > > Sergey; I don't think zswap is an active actor in this bug report.
> > > >
> > > > Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> > > > patches touch zsmalloc locking (zspages migration/compaction).
> > > >
> > > > [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/
> > >
> > > These are not in 6.8.0-rc5 anyway, right?
> >
> > I see them in next-20240223, which seems to be 6.8-rc6 (according to
>                                                    ^ -rc5
>
> But they look more or less correct to me, so I'm not blaming those
> patches.  We should be protected by pool->lock.  Bisection would help
> us a lot, I think.

Andrew picked up those patches in mm-unstable, which is included in
linux-next at some point IIUC, but the patches there don't all end up
in the next rc unless I am misunderstanding something here. These
patches should be headed to v6.9 AFAICT.

Actually, if I am not mistaken, the patches were sent *after* v6.8-rc5
was out, and it's not common for non-fixes to make it into rc releases
anyway, right?



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  5:05           ` Yosry Ahmed
@ 2024-02-23  5:19             ` Sergey Senozhatsky
  0 siblings, 0 replies; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  5:19 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Sergey Senozhatsky, Tetsuo Handa, Johannes Weiner, Nhat Pham,
	Minchan Kim, linux-mm

On (24/02/22 21:05), Yosry Ahmed wrote:
> > > > These are not in 6.8.0-rc5 anyway, right?
> > >
> > > I see them in next-20240223, which seems to be 6.8-rc6 (according to
> >                                                    ^ -rc5
> >
> > But they look more or less correct to me, so I'm not blaming those
> > patches.  We should be protected by pool->lock.  Bisection would help
> > us a lot, I think.
> 
> Andrew picked up those patches in mm-unstable, which is included in
> linux-next at some point IIUC, but the patches there don't all end up
> in the next rc unless I am misunderstanding something here. These
> patches should be headed to v6.9 AFAICT.
> 
> Actually, if I am not mistaken, the patches were sent *after* v6.8-rc5
> was out, and it's not common for non-fixes to make it into rc releases
> anyway, right?

Oh, sorry, I realized that we were talking about different 6.8-rc5 trees.  I was
talking about linux-next, not Linus's tree.  You are absolutely right, those
patches are not in Linus's 6.8-rc5 and are headed to 6.9, yes.



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:48   ` Sergey Senozhatsky
  2024-02-23  4:50     ` Yosry Ahmed
@ 2024-02-23  5:23     ` Chengming Zhou
  2024-02-23  5:29       ` Sergey Senozhatsky
  2024-02-23  9:26       ` Tetsuo Handa
  1 sibling, 2 replies; 17+ messages in thread
From: Chengming Zhou @ 2024-02-23  5:23 UTC (permalink / raw)
  To: Tetsuo Handa, Sergey Senozhatsky, Yosry Ahmed
  Cc: Johannes Weiner, Nhat Pham, Minchan Kim, linux-mm

On 2024/2/23 12:48, Sergey Senozhatsky wrote:
> On (24/02/22 18:27), Yosry Ahmed wrote:
>> I also don't see any recent changes in mm/zsmalloc.c that modify this
>> code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
>> Sergey; I don't think zswap is an active actor in this bug report.
> 
> Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> patches touch zsmalloc locking (zspages migration/compaction).
> 
> [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/
> 

I think these patches can't go into 6.8.0-rc5, right? So it may be a bug
in the current zsmalloc code (or maybe zswap? I don't know).

Tetsuo, could you please check if the config has CONFIG_COMPACTION enabled?

I ask because the first patch of that series fixed a migration locking bug:
(mm/zsmalloc: fix migrate_write_lock() when !CONFIG_COMPACTION)
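
As a small aid for answering this kind of question, the sketch below
checks whether an option is enabled in the text of a kernel .config
file. It relies only on the usual .config conventions ("CONFIG_FOO=y"
for enabled, "# CONFIG_FOO is not set" for disabled). config_enabled()
is a hypothetical helper and the sample text is made up, not taken from
the reporter's configuration.

```python
def config_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skips "# CONFIG_FOO is not set" and blank lines
        name, value = line.split("=", 1)
        if name == option:
            return value in ("y", "m")
    return False


# Made-up sample; a real check would read the .config used for the build.
sample = """\
CONFIG_ZSMALLOC=y
CONFIG_COMPACTION=y
# CONFIG_ZSMALLOC_STAT is not set
"""

compaction_on = config_enabled(sample, "CONFIG_COMPACTION")
```

An option that is absent from the file is treated the same as one that
is explicitly not set.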

Thanks.



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  5:23     ` Chengming Zhou
@ 2024-02-23  5:29       ` Sergey Senozhatsky
  2024-02-23  9:26       ` Tetsuo Handa
  1 sibling, 0 replies; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-23  5:29 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: Tetsuo Handa, Sergey Senozhatsky, Yosry Ahmed, Johannes Weiner,
	Nhat Pham, Minchan Kim, linux-mm

On (24/02/23 13:23), Chengming Zhou wrote:
> On 2024/2/23 12:48, Sergey Senozhatsky wrote:
> > On (24/02/22 18:27), Yosry Ahmed wrote:
> >> I also don't see any recent changes in mm/zsmalloc.c that modify this
> >> code, so maybe it wasn't introduced in 6.7. I will defer to Minchan and
> >> Sergey; I don't think zswap is an active actor in this bug report.
> > 
> > Yeah. [1] are the only recent zsmalloc patches I can recall, and those
> > patches touch zsmalloc locking (zspages migration/compaction).
> > 
> > [1] https://lore.kernel.org/lkml/20240219-b4-szmalloc-migrate-v1-0-34cd49c6545b@bytedance.com/
> > 
> 
> I think these patches can't go into 6.8.0-rc5, right?

Only if 6.8-rc5 is linux-next.  But the report is (that was not
immediately apparent to me, somehow) for Linus's tree, so those
zsmalloc patches are not under suspicion.



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  5:23     ` Chengming Zhou
  2024-02-23  5:29       ` Sergey Senozhatsky
@ 2024-02-23  9:26       ` Tetsuo Handa
  2024-02-23 10:10         ` Chengming Zhou
  1 sibling, 1 reply; 17+ messages in thread
From: Tetsuo Handa @ 2024-02-23  9:26 UTC (permalink / raw)
  To: Chengming Zhou, Sergey Senozhatsky, Yosry Ahmed
  Cc: Johannes Weiner, Nhat Pham, Minchan Kim, linux-mm

On 2024/02/23 14:23, Chengming Zhou wrote:
> Tetsuo, could you please check if the config has CONFIG_COMPACTION enabled?

Yes, CONFIG_COMPACTION is enabled.

Also, I can observe this problem with 6.8.0-rc5-next-20240223.

----------------------------------------
[   54.589642][  T157] =====================================================
[   54.603721][  T157] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
[   54.608092][  T157]  obj_malloc+0x6cc/0x7b0
[   54.610904][  T157]  zs_malloc+0xda2/0x12d0
[   54.613688][  T157]  zs_zpool_malloc+0xa5/0x1b0
[   54.619163][  T157]  zpool_malloc+0x113/0x150
[   54.624449][  T157]  zswap_compress+0x69b/0xbd0
[   54.629904][  T157]  zswap_store+0x1f24/0x2d00
[   54.635026][  T157]  swap_writepage+0x15b/0x4f0
[   54.640023][  T157]  pageout+0x3d4/0xeb0
[   54.644699][  T157]  shrink_folio_list+0x4d7f/0x7480
[   54.649867][  T157]  evict_folios+0x2160/0x52c0
[   54.654872][  T157]  try_to_shrink_lruvec+0x1cb/0x460
[   54.660074][  T157]  shrink_one+0x72d/0xeb0
[   54.664922][  T157]  shrink_many+0x70d/0x10c0
[   54.669849][  T157]  lru_gen_shrink_node+0x832/0xd10
[   54.675110][  T157]  shrink_node+0x13a/0x1dd0
[   54.680026][  T157]  balance_pgdat+0x1556/0x2740
[   54.685032][  T157]  kswapd+0x50d/0x870
[   54.689643][  T157]  kthread+0x485/0x600
[   54.694432][  T157]  ret_from_fork+0xfa/0x140
[   54.699305][  T157]  ret_from_fork_asm+0x11/0x20
[   54.704295][  T157] 
[   54.707905][  T157] Uninit was stored to memory at:
[   54.712837][  T157]  obj_malloc+0x70a/0x7b0
[   54.717434][  T157]  zs_malloc+0xda2/0x12d0
[   54.722009][  T157]  zs_zpool_malloc+0xa5/0x1b0
[   54.726806][  T157]  zpool_malloc+0x113/0x150
[   54.731507][  T157]  zswap_compress+0x69b/0xbd0
[   54.736299][  T157]  zswap_store+0x1f24/0x2d00
[   54.741081][  T157]  swap_writepage+0x15b/0x4f0
[   54.745880][  T157]  pageout+0x3d4/0xeb0
[   54.750386][  T157]  shrink_folio_list+0x4d7f/0x7480
[   54.755378][  T157]  evict_folios+0x2160/0x52c0
[   54.760153][  T157]  try_to_shrink_lruvec+0x1cb/0x460
[   54.765223][  T157]  shrink_one+0x72d/0xeb0
[   54.769870][  T157]  shrink_many+0x70d/0x10c0
[   54.774445][  T157]  lru_gen_shrink_node+0x832/0xd10
[   54.779221][  T157]  shrink_node+0x13a/0x1dd0
[   54.783965][  T157]  balance_pgdat+0x1556/0x2740
[   54.788702][  T157]  kswapd+0x50d/0x870
[   54.793073][  T157]  kthread+0x485/0x600
[   54.798253][  T157]  ret_from_fork+0xfa/0x140
[   54.804206][  T157]  ret_from_fork_asm+0x11/0x20
[   54.809016][  T157] 
[   54.812652][  T157] Uninit was created at:
[   54.817314][  T157]  free_unref_page_prepare+0x130/0xfc0
[   54.822499][  T157]  free_unref_page_list+0x13f/0x1130
[   54.828207][  T157]  shrink_folio_list+0x713e/0x7480
[   54.834143][  T157]  evict_folios+0x2160/0x52c0
[   54.839358][  T157]  try_to_shrink_lruvec+0x1cb/0x460
[   54.844628][  T157]  shrink_one+0x72d/0xeb0
[   54.849436][  T157]  shrink_many+0x70d/0x10c0
[   54.854310][  T157]  lru_gen_shrink_node+0x832/0xd10
[   54.859337][  T157]  shrink_node+0x13a/0x1dd0
[   54.864076][  T157]  shrink_zones+0x787/0x1530
[   54.868808][  T157]  do_try_to_free_pages+0x2ac/0x16a0
[   54.873865][  T157]  try_to_free_pages+0xddb/0x19b0
[   54.878795][  T157]  __alloc_pages_slowpath+0x1a05/0x2d00
[   54.883978][  T157]  __alloc_pages+0xc6c/0x1040
[   54.888802][  T157]  alloc_pages_mpol+0x477/0xc40
[   54.893629][  T157]  alloc_pages+0x224/0x240
[   54.898092][  T157]  pipe_write+0xae5/0x2bd0
[   54.902702][  T157]  vfs_write+0xfb9/0x1b90
[   54.907117][  T157]  ksys_write+0x275/0x500
[   54.911612][  T157]  __x64_sys_write+0xdf/0x120
[   54.916287][  T157]  do_syscall_64+0xd5/0x1c0
[   54.920782][  T157]  entry_SYSCALL_64_after_hwframe+0x62/0x6a
[   54.925972][  T157] 
[   54.929436][  T157] CPU: 4 PID: 157 Comm: kswapd1 Not tainted 6.8.0-rc5-next-20240223 #1
[   54.937592][  T157] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   54.946147][  T157] =====================================================
[   54.951772][  T157] Disabling lock debugging due to kernel taint
[   54.957040][  T157] Kernel panic - not syncing: kmsan.panic set ...
[   54.962443][  T157] CPU: 4 PID: 157 Comm: kswapd1 Tainted: G    B              6.8.0-rc5-next-20240223 #1
[   54.971295][  T157] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   54.979856][  T157] Call Trace:
[   54.983760][  T157]  <TASK>
[   54.987503][  T157]  dump_stack_lvl+0x24b/0x300
[   54.992068][  T157]  dump_stack+0x29/0x30
[   54.996373][  T157]  panic+0x4ed/0xca0
[   55.000656][  T157]  kmsan_report+0x2d1/0x2e0
[   55.005155][  T157]  ? kmem_cache_alloc+0x707/0xf50
[   55.009909][  T157]  ? kmsan_internal_poison_memory+0x7d/0x90
[   55.015056][  T157]  ? kmsan_internal_poison_memory+0x49/0x90
[   55.020253][  T157]  ? kmsan_slab_alloc+0xdf/0x160
[   55.024995][  T157]  ? __msan_warning+0x91/0x120
[   55.029604][  T157]  ? obj_malloc+0x6cc/0x7b0
[   55.034166][  T157]  ? zs_malloc+0xda2/0x12d0
[   55.038692][  T157]  ? zs_zpool_malloc+0xa5/0x1b0
[   55.043342][  T157]  ? zpool_malloc+0x113/0x150
[   55.047909][  T157]  ? zswap_compress+0x69b/0xbd0
[   55.052576][  T157]  ? zswap_store+0x1f24/0x2d00
[   55.057213][  T157]  ? swap_writepage+0x15b/0x4f0
[   55.061836][  T157]  ? pageout+0x3d4/0xeb0
[   55.066216][  T157]  ? shrink_folio_list+0x4d7f/0x7480
[   55.071083][  T157]  ? evict_folios+0x2160/0x52c0
[   55.075734][  T157]  ? try_to_shrink_lruvec+0x1cb/0x460
[   55.080625][  T157]  ? shrink_one+0x72d/0xeb0
[   55.085139][  T157]  ? shrink_many+0x70d/0x10c0
[   55.089752][  T157]  ? lru_gen_shrink_node+0x832/0xd10
[   55.094614][  T157]  ? shrink_node+0x13a/0x1dd0
[   55.099188][  T157]  ? balance_pgdat+0x1556/0x2740
[   55.103891][  T157]  ? kswapd+0x50d/0x870
[   55.108212][  T157]  ? kthread+0x485/0x600
[   55.112459][  T157]  ? ret_from_fork+0xfa/0x140
[   55.116849][  T157]  ? ret_from_fork_asm+0x11/0x20
[   55.121438][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   55.126388][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   55.131446][  T157]  ? should_fail_ex+0x91/0xa20
[   55.136530][  T157]  ? kmsan_get_metadata+0x146/0x1c0
[   55.141199][  T157]  ? kmsan_get_metadata+0x146/0x1c0
[   55.145956][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   55.150928][  T157]  ? __should_failslab+0x24f/0x2e0
[   55.155750][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   55.161123][  T157]  ? __should_failslab+0x24f/0x2e0
[   55.165918][  T157]  ? kmsan_get_metadata+0x146/0x1c0
[   55.170723][  T157]  ? kmsan_get_metadata+0x146/0x1c0
[   55.175568][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   55.180684][  T157]  __msan_warning+0x91/0x120
[   55.185066][  T157]  obj_malloc+0x6cc/0x7b0
[   55.189320][  T157]  ? kmsan_get_metadata+0x146/0x1c0
[   55.194051][  T157]  zs_malloc+0xda2/0x12d0
[   55.198333][  T157]  zs_zpool_malloc+0xa5/0x1b0
[   55.202886][  T157]  ? zs_zpool_destroy+0x50/0x50
[   55.207378][  T157]  zpool_malloc+0x113/0x150
[   55.211829][  T157]  zswap_compress+0x69b/0xbd0
[   55.216298][  T157]  zswap_store+0x1f24/0x2d00
[   55.220727][  T157]  swap_writepage+0x15b/0x4f0
[   55.225186][  T157]  ? generic_swapfile_activate+0xed0/0xed0
[   55.230120][  T157]  pageout+0x3d4/0xeb0
[   55.234272][  T157]  shrink_folio_list+0x4d7f/0x7480
[   55.239002][  T157]  evict_folios+0x2160/0x52c0
[   55.243455][  T157]  try_to_shrink_lruvec+0x1cb/0x460
[   55.248119][  T157]  shrink_one+0x72d/0xeb0
[   55.252389][  T157]  shrink_many+0x70d/0x10c0
[   55.257702][  T157]  lru_gen_shrink_node+0x832/0xd10
[   55.262478][  T157]  shrink_node+0x13a/0x1dd0
[   55.266848][  T157]  ? mem_cgroup_soft_limit_reclaim+0x34/0x17b0
[   55.271983][  T157]  ? filter_irq_stacks+0xb9/0x230
[   55.276677][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   55.281724][  T157]  ? kswapd_age_node+0x63/0xb00
[   55.286322][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   55.291458][  T157]  balance_pgdat+0x1556/0x2740
[   55.295936][  T157]  ? finish_wait+0x2f1/0x4a0
[   55.300332][  T157]  kswapd+0x50d/0x870
[   55.304457][  T157]  kthread+0x485/0x600
[   55.308674][  T157]  ? shrink_all_memory+0x3a0/0x3a0
[   55.313311][  T157]  ? kthread_blkcg+0x120/0x120
[   55.317805][  T157]  ret_from_fork+0xfa/0x140
[   55.322138][  T157]  ? kthread_blkcg+0x120/0x120
[   55.326615][  T157]  ? kthread_blkcg+0x120/0x120
[   55.331198][  T157]  ret_from_fork_asm+0x11/0x20
[   55.335679][  T157]  </TASK>
[   56.470556][  T157] Shutting down cpus with NMI
[   56.474684][  T157] Kernel Offset: disabled
[   56.478285][  T157] Rebooting in 10 seconds..
----------------------------------------
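
For readers comparing the stacks in a report like the one above, here is a minimal sketch that splits one KMSAN report into its sections (the faulting access, each "Uninit was stored to memory at" stack, and the "Uninit was created at" stack). The function name and the regular expressions are illustrative assumptions based only on the line layout visible in this log. They are not part of any kernel tool.

```python
import re

# printk prefix as it appears in this log, e.g. "[   54.608092][  T157] ".
PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\[\s*T\d+\]\s?")
# A stack frame as printed here, e.g. "obj_malloc+0x6cc/0x7b0".
FRAME = re.compile(r"^(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+$")


def split_kmsan_report(text):
    """Return a list of (section title, [function names]) in report order."""
    sections = []
    frames = None  # frame list of the section being read, if any
    for raw in text.splitlines():
        line = PREFIX.sub("", raw).strip()
        if line.startswith("BUG: KMSAN:") or line.startswith("Uninit was"):
            frames = []
            sections.append((line.rstrip(":"), frames))
        elif frames is not None:
            m = FRAME.match(line)
            if m:
                frames.append(m.group(1))
            else:
                frames = None  # an empty line or trailer ends the section
    return sections


# Abbreviated excerpt of the report above (most frames omitted).
SAMPLE = """\
[   54.603721][  T157] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
[   54.608092][  T157]  obj_malloc+0x6cc/0x7b0
[   54.610904][  T157]  zs_malloc+0xda2/0x12d0
[   54.704295][  T157]
[   54.707905][  T157] Uninit was stored to memory at:
[   54.712837][  T157]  obj_malloc+0x70a/0x7b0
[   54.809016][  T157]
[   54.812652][  T157] Uninit was created at:
[   54.817314][  T157]  free_unref_page_prepare+0x130/0xfc0
[   54.822499][  T157]  free_unref_page_list+0x13f/0x1130
[   54.925972][  T157]
[   54.929436][  T157] CPU: 4 PID: 157 Comm: kswapd1 Not tainted 6.8.0-rc5-next-20240223 #1
"""

if __name__ == "__main__":
    for title, funcs in split_kmsan_report(SAMPLE):
        print(title, "->", funcs[0] if funcs else "(no frames)")
```

Listing the first frame of each section side by side makes it easier to see that the access happens in obj_malloc() while the memory was last freed via free_unref_page_prepare().
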

----------------------------------------
ubuntu login: [   42.392666][  T155] =====================================================
[   42.398208][  T155] BUG: KMSAN: use-after-free in lzo1x_decompress_safe+0x433/0x3930
[   42.408589][  T155]  lzo1x_decompress_safe+0x433/0x3930
[   42.416017][  T155]  lzo_sdecompress+0x119/0x220
[   42.427324][  T155]  scomp_acomp_comp_decomp+0x65b/0xa10
[   42.439258][  T155]  scomp_acomp_decompress+0x4e/0x60
[   42.449860][  T155]  zswap_decompress+0x618/0xa50
[   42.459372][  T155]  zswap_writeback_entry+0x6c0/0xaa0
[   42.468643][  T155]  shrink_memcg_cb+0x3e8/0x870
[   42.474589][  T155]  __list_lru_walk_one+0x4ee/0xf00
[   42.477891][  T155]  list_lru_walk_one+0x1f6/0x250
[   42.481171][  T155]  zswap_shrinker_scan+0x46b/0x760
[   42.484544][  T155]  do_shrink_slab+0x958/0x1750
[   42.487742][  T155]  shrink_slab_memcg+0x6ae/0xea0
[   42.491686][  T155]  shrink_slab+0x119/0x7c0
[   42.496077][  T155]  shrink_one+0x835/0xeb0
[   42.500477][  T155]  shrink_many+0x70d/0x10c0
[   42.504933][  T155]  lru_gen_shrink_node+0x832/0xd10
[   42.508651][  T155]  shrink_node+0x13a/0x1dd0
[   42.512056][  T155]  balance_pgdat+0x1556/0x2740
[   42.515294][  T155]  kswapd+0x50d/0x870
[   42.518245][  T155]  kthread+0x485/0x600
[   42.521178][  T155]  ret_from_fork+0xfa/0x140
[   42.524242][  T155]  ret_from_fork_asm+0x11/0x20
[   42.527444][  T155] 
[   42.529916][  T155] Uninit was stored to memory at:
[   42.533147][  T155]  scatterwalk_map_and_copy+0x8b5/0xb50
[   42.536505][  T155]  scomp_acomp_comp_decomp+0x45c/0xa10
[   42.539860][  T155]  scomp_acomp_decompress+0x4e/0x60
[   42.543099][  T155]  zswap_decompress+0x618/0xa50
[   42.546244][  T155]  zswap_writeback_entry+0x6c0/0xaa0
[   42.549525][  T155]  shrink_memcg_cb+0x3e8/0x870
[   42.552652][  T155]  __list_lru_walk_one+0x4ee/0xf00
[   42.555890][  T155]  list_lru_walk_one+0x1f6/0x250
[   42.567920][  T155]  zswap_shrinker_scan+0x46b/0x760
[   42.578533][  T155]  do_shrink_slab+0x958/0x1750
[   42.587474][  T155]  shrink_slab_memcg+0x6ae/0xea0
[   42.591733][  T155]  shrink_slab+0x119/0x7c0
[   42.595698][  T155]  shrink_one+0x835/0xeb0
[   42.599604][  T155]  shrink_many+0x70d/0x10c0
[   42.603671][  T155]  lru_gen_shrink_node+0x832/0xd10
[   42.608028][  T155]  shrink_node+0x13a/0x1dd0
[   42.612164][  T155]  balance_pgdat+0x1556/0x2740
[   42.616458][  T155]  kswapd+0x50d/0x870
[   42.620420][  T155]  kthread+0x485/0x600
[   42.624380][  T155]  ret_from_fork+0xfa/0x140
[   42.628490][  T155]  ret_from_fork_asm+0x11/0x20
[   42.632693][  T155] 
[   42.635854][  T155] Uninit was stored to memory at:
[   42.640219][  T155]  zswap_decompress+0x299/0xa50
[   42.644446][  T155]  zswap_writeback_entry+0x6c0/0xaa0
[   42.648922][  T155]  shrink_memcg_cb+0x3e8/0x870
[   42.653115][  T155]  __list_lru_walk_one+0x4ee/0xf00
[   42.657464][  T155]  list_lru_walk_one+0x1f6/0x250
[   42.661710][  T155]  zswap_shrinker_scan+0x46b/0x760
[   42.666078][  T155]  do_shrink_slab+0x958/0x1750
[   42.670389][  T155]  shrink_slab_memcg+0x6ae/0xea0
[   42.679819][  T155]  shrink_slab+0x119/0x7c0
[   42.688501][  T155]  shrink_one+0x835/0xeb0
[   42.697021][  T155]  shrink_many+0x70d/0x10c0
[   42.705719][  T155]  lru_gen_shrink_node+0x832/0xd10
[   42.715345][  T155]  shrink_node+0x13a/0x1dd0
[   42.724486][  T155]  balance_pgdat+0x1556/0x2740
[   42.733580][  T155]  kswapd+0x50d/0x870
[   42.742222][  T155]  kthread+0x485/0x600
[   42.751032][  T155]  ret_from_fork+0xfa/0x140
[   42.760274][  T155]  ret_from_fork_asm+0x11/0x20
[   42.769826][  T155] 
[   42.776910][  T155] Uninit was created at:
[   42.785419][  T155]  free_unref_page_prepare+0x130/0xfc0
[   42.795890][  T155]  free_unref_page_list+0x13f/0x1130
[   42.806224][  T155]  shrink_folio_list+0x713e/0x7480
[   42.815480][  T155]  evict_folios+0x2160/0x52c0
[   42.819727][  T155]  try_to_shrink_lruvec+0x1cb/0x460
[   42.824366][  T155]  shrink_one+0x72d/0xeb0
[   42.834077][  T155]  shrink_many+0x70d/0x10c0
[   42.844079][  T155]  lru_gen_shrink_node+0x832/0xd10
[   42.854215][  T155]  shrink_node+0x13a/0x1dd0
[   42.863639][  T155]  shrink_zones+0x787/0x1530
[   42.873152][  T155]  do_try_to_free_pages+0x2ac/0x16a0
[   42.877447][  T155]  try_to_free_pages+0xddb/0x19b0
[   42.880694][  T155]  __alloc_pages_slowpath+0x1a05/0x2d00
[   42.884052][  T155]  __alloc_pages+0xc6c/0x1040
[   42.887180][  T155]  alloc_pages_mpol+0x477/0xc40
[   42.890362][  T155]  alloc_pages+0x224/0x240
[   42.893479][  T155]  pipe_write+0xae5/0x2bd0
[   42.896519][  T155]  vfs_write+0xfb9/0x1b90
[   42.899562][  T155]  ksys_write+0x275/0x500
[   42.902616][  T155]  __x64_sys_write+0xdf/0x120
[   42.906154][  T155]  do_syscall_64+0xd5/0x1c0
[   42.909855][  T155]  entry_SYSCALL_64_after_hwframe+0x62/0x6a
[   42.913670][  T155] 
[   42.916155][  T155] CPU: 5 PID: 155 Comm: kswapd1 Not tainted 6.8.0-rc5-next-20240223 #1
[   42.921857][  T155] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   42.927627][  T155] =====================================================
[   42.931409][  T155] Disabling lock debugging due to kernel taint
[   42.934961][  T155] Kernel panic - not syncing: kmsan.panic set ...
[   42.938569][  T155] CPU: 5 PID: 155 Comm: kswapd1 Tainted: G    B              6.8.0-rc5-next-20240223 #1
[   42.944533][  T155] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   42.950295][  T155] Call Trace:
[   42.952978][  T155]  <TASK>
[   42.955503][  T155]  dump_stack_lvl+0x24b/0x300
[   42.960955][  T155]  dump_stack+0x29/0x30
[   42.969770][  T155]  panic+0x4ed/0xca0
[   42.978544][  T155]  kmsan_report+0x2d1/0x2e0
[   42.987874][  T155]  ? __msan_warning+0x91/0x120
[   42.997602][  T155]  ? lzo1x_decompress_safe+0x433/0x3930
[   43.004774][  T155]  ? lzo_sdecompress+0x119/0x220
[   43.008069][  T155]  ? scomp_acomp_comp_decomp+0x65b/0xa10
[   43.011551][  T155]  ? scomp_acomp_decompress+0x4e/0x60
[   43.014956][  T155]  ? zswap_decompress+0x618/0xa50
[   43.018249][  T155]  ? zswap_writeback_entry+0x6c0/0xaa0
[   43.021719][  T155]  ? shrink_memcg_cb+0x3e8/0x870
[   43.025111][  T155]  ? __list_lru_walk_one+0x4ee/0xf00
[   43.028534][  T155]  ? list_lru_walk_one+0x1f6/0x250
[   43.031877][  T155]  ? zswap_shrinker_scan+0x46b/0x760
[   43.035452][  T155]  ? do_shrink_slab+0x958/0x1750
[   43.038777][  T155]  ? shrink_slab_memcg+0x6ae/0xea0
[   43.042092][  T155]  ? shrink_slab+0x119/0x7c0
[   43.045255][  T155]  ? shrink_one+0x835/0xeb0
[   43.048375][  T155]  ? shrink_many+0x70d/0x10c0
[   43.051557][  T155]  ? lru_gen_shrink_node+0x832/0xd10
[   43.054932][  T155]  ? shrink_node+0x13a/0x1dd0
[   43.058106][  T155]  ? balance_pgdat+0x1556/0x2740
[   43.061387][  T155]  ? kswapd+0x50d/0x870
[   43.064386][  T155]  ? kthread+0x485/0x600
[   43.067433][  T155]  ? ret_from_fork+0xfa/0x140
[   43.075042][  T155]  ? ret_from_fork_asm+0x11/0x20
[   43.084182][  T155]  ? shrink_one+0x835/0xeb0
[   43.093150][  T155]  ? shrink_many+0x70d/0x10c0
[   43.102332][  T155]  ? lru_gen_shrink_node+0x832/0xd10
[   43.112061][  T155]  ? shrink_node+0x13a/0x1dd0
[   43.121464][  T155]  ? balance_pgdat+0x1556/0x2740
[   43.131127][  T155]  ? kswapd+0x50d/0x870
[   43.140045][  T155]  ? kthread+0x485/0x600
[   43.148993][  T155]  ? ret_from_fork+0xfa/0x140
[   43.158046][  T155]  ? ret_from_fork_asm+0x11/0x20
[   43.166985][  T155]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
[   43.177320][  T155]  ? kmsan_get_metadata+0x146/0x1c0
[   43.187138][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   43.198259][  T155]  ? scatterwalk_map_and_copy+0xaa/0xb50
[   43.209263][  T155]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   43.220485][  T155]  ? filter_irq_stacks+0xb9/0x230
[   43.230534][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   43.240946][  T155]  __msan_warning+0x91/0x120
[   43.249737][  T155]  lzo1x_decompress_safe+0x433/0x3930
[   43.259433][  T155]  ? filter_irq_stacks+0xb9/0x230
[   43.268747][  T155]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
[   43.278890][  T155]  ? kmsan_get_metadata+0x146/0x1c0
[   43.288092][  T155]  lzo_sdecompress+0x119/0x220
[   43.296809][  T155]  ? lzo_scompress+0x250/0x250
[   43.305573][  T155]  scomp_acomp_comp_decomp+0x65b/0xa10
[   43.315139][  T155]  scomp_acomp_decompress+0x4e/0x60
[   43.324453][  T155]  ? scomp_acomp_compress+0x60/0x60
[   43.334172][  T155]  zswap_decompress+0x618/0xa50
[   43.343444][  T155]  zswap_writeback_entry+0x6c0/0xaa0
[   43.353130][  T155]  shrink_memcg_cb+0x3e8/0x870
[   43.362321][  T155]  __list_lru_walk_one+0x4ee/0xf00
[   43.371873][  T155]  ? zswap_shrinker_count+0x670/0x670
[   43.381677][  T155]  ? __msan_metadata_ptr_for_load_1+0x24/0x40
[   43.392255][  T155]  list_lru_walk_one+0x1f6/0x250
[   43.401742][  T155]  ? zswap_shrinker_count+0x670/0x670
[   43.411756][  T155]  zswap_shrinker_scan+0x46b/0x760
[   43.421682][  T155]  ? zswap_debugfs_init+0x420/0x420
[   43.432130][  T155]  do_shrink_slab+0x958/0x1750
[   43.436685][  T155]  shrink_slab_memcg+0x6ae/0xea0
[   43.441009][  T155]  shrink_slab+0x119/0x7c0
[   43.446049][  T155]  ? try_to_shrink_lruvec+0x42c/0x460
[   43.451031][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   43.456008][  T155]  shrink_one+0x835/0xeb0
[   43.460383][  T155]  shrink_many+0x70d/0x10c0
[   43.464760][  T155]  lru_gen_shrink_node+0x832/0xd10
[   43.469331][  T155]  shrink_node+0x13a/0x1dd0
[   43.473657][  T155]  ? mem_cgroup_soft_limit_reclaim+0x34/0x17b0
[   43.478672][  T155]  ? filter_irq_stacks+0xb9/0x230
[   43.485985][  T155]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   43.497004][  T155]  ? kswapd_age_node+0x63/0xb00
[   43.506282][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   43.516209][  T155]  balance_pgdat+0x1556/0x2740
[   43.570854][  T155]  ? finish_wait+0x2f1/0x4a0
[   43.580532][  T155]  kswapd+0x50d/0x870
[   43.589396][  T155]  kthread+0x485/0x600
[   43.598259][  T155]  ? shrink_all_memory+0x3a0/0x3a0
[   43.608549][  T155]  ? kthread_blkcg+0x120/0x120
[   43.618041][  T155]  ret_from_fork+0xfa/0x140
[   43.627225][  T155]  ? kthread_blkcg+0x120/0x120
[   43.636721][  T155]  ? kthread_blkcg+0x120/0x120
[   43.646141][  T155]  ret_from_fork_asm+0x11/0x20
[   43.655612][  T155]  </TASK>
[   44.788328][  T155] Shutting down cpus with NMI
[   44.792527][  T155] Kernel Offset: disabled
[   44.795640][  T155] Rebooting in 10 seconds..
----------------------------------------



This may have a different cause, but I feel that I hit the "corrupted stack end detected
inside scheduler" problem more frequently with linux-next.git than with linux.git.
Is the stack usage too high?

----------------------------------------
ubuntu login: [   53.757790][  T194] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[   53.784397][  T194] CPU: 3 PID: 194 Comm: kworker/u39:3 Not tainted 6.8.0-rc5-next-20240223 #1
[   53.810595][  T194] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   53.829184][  T194] Workqueue: writeback wb_workfn (flush-8:0)
[   53.835445][  T194] Call Trace:
[   53.839997][  T194]  <TASK>
[   53.844176][  T194]  dump_stack_lvl+0x24b/0x300
[   53.849323][  T194]  dump_stack+0x29/0x30
[   53.854261][  T194]  panic+0x4ed/0xca0
[   53.858938][  T194]  ? kmsan_get_metadata+0x50/0x1c0
[   53.864326][  T194]  __schedule+0x9e4/0x2770
[   53.883521][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   53.910719][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   53.936505][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   53.964199][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   53.989716][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.015124][  T194]  __cond_resched+0x50/0xc0
[   54.038931][  T194]  rmap_walk_file+0x382/0x8d0
[   54.066110][  T194]  folio_mkclean+0x34d/0x530
[   54.089049][  T194]  ? folio_mkclean+0x530/0x530
[   54.117183][  T194]  ? page_mkclean_one+0x3f0/0x3f0
[   54.135476][  T194]  folio_clear_dirty_for_io+0x22a/0xae0
[   54.144905][  T194]  ? filemap_get_folios_tag+0x64a/0x6c0
[   54.155053][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.165436][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   54.175572][  T194]  mpage_submit_folio+0x12a/0x5d0
[   54.186797][  T194]  ext4_do_writepages+0x3401/0x63d0
[   54.193608][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.206517][  T194]  ext4_writepages+0x338/0x870
[   54.234367][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.243997][  T194]  ? ext4_read_folio+0x440/0x440
[   54.271561][  T194]  do_writepages+0x5e5/0x15c0
[   54.287149][  T194]  ? wake_up_bit+0x9c/0x490
[   54.297127][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   54.318153][  T194]  ? filter_irq_stacks+0xb9/0x230
[   54.326784][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.343015][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   54.349741][  T194]  __writeback_single_inode+0x170/0x1090
[   54.356296][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
[   54.364305][  T194]  writeback_sb_inodes+0xd74/0x1e20
[   54.371317][  T194]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
[   54.378719][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.385263][  T194]  __writeback_inodes_wb+0x1d6/0x510
[   54.391720][  T194]  wb_writeback+0x63e/0xff0
[   54.399899][  T194]  ? stack_depot_save_flags+0x2c/0x6f0
[   54.408778][  T194]  ? kmsan_internal_set_shadow_origin+0x60/0xe0
[   54.439971][  T194]  wb_do_writeback+0x120b/0x1510
[   54.467029][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.494644][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.512469][  T194]  wb_workfn+0x190/0x850
[   54.537678][  T194]  ? kmsan_get_metadata+0x146/0x1c0
[   54.565645][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
[   54.595256][  T194]  ? inode_wait_for_writeback+0x320/0x320
[   54.609201][  T194]  process_one_work+0xa0c/0x1c60
[   54.614993][  T194]  worker_thread+0x11f2/0x1ba0
[   54.620515][  T194]  kthread+0x485/0x600
[   54.625631][  T194]  ? pr_cont_work+0xee0/0xee0
[   54.630919][  T194]  ? kthread_blkcg+0x120/0x120
[   54.636291][  T194]  ret_from_fork+0xfa/0x140
[   54.641771][  T194]  ? kthread_blkcg+0x120/0x120
[   54.647279][  T194]  ? kthread_blkcg+0x120/0x120
[   54.652705][  T194]  ret_from_fork_asm+0x11/0x20
[   54.658127][  T194]  </TASK>
[   54.683905][  T194] Kernel Offset: disabled
[   54.688874][  T194] Rebooting in 10 seconds..
----------------------------------------



* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  9:26       ` Tetsuo Handa
@ 2024-02-23 10:10         ` Chengming Zhou
  0 siblings, 0 replies; 17+ messages in thread
From: Chengming Zhou @ 2024-02-23 10:10 UTC (permalink / raw)
  To: Tetsuo Handa, Sergey Senozhatsky, Yosry Ahmed
  Cc: Johannes Weiner, Nhat Pham, Minchan Kim, linux-mm

On 2024/2/23 17:26, Tetsuo Handa wrote:
> On 2024/02/23 14:23, Chengming Zhou wrote:
>> Tetsuo, could you please check if the config has CONFIG_COMPACTION enabled?
> 
> Yes, CONFIG_COMPACTION is enabled.
> 
> Also, I can observe this problem with 6.8.0-rc5-next-20240223.

OK, from the report this looks like a UAF of the zspage, which is allocated from slab?
I have no idea what causes it.

Maybe it's better to run a bisect, as suggested by Sergey.

Thanks.
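
If it helps with the bisect, below is a minimal sketch of a log classifier that could be driven by "git bisect run". It is an assumption-laden example, not an existing tool: the script name, the idea of capturing the console into a file, and the choice of marker strings are all hypothetical. Only the exit codes (0 = good, 1 = bad, 125 = skip this commit) are the ones "git bisect run" documents. The range would presumably be "git bisect start v6.7 v6.6", given that 6.6.0 did not show the problem.

```python
import sys

# Exit codes understood by "git bisect run".
GOOD, BAD, SKIP = 0, 1, 125


def classify_console_log(text):
    """Map one captured console log to a bisect verdict."""
    # No kernel banner at all: the build or boot failed, so this commit
    # cannot be judged and should be skipped.
    if "Linux version" not in text:
        return SKIP
    # The signature reported in this thread.
    if "BUG: KMSAN: use-after-free" in text:
        return BAD
    return GOOD


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 classify_log.py console.log
    with open(sys.argv[1], errors="replace") as f:
        sys.exit(classify_console_log(f.read()))
```

Note that a clean log is only weak evidence for "good" if the problem does not reproduce on every boot, so each step may need several runs under memory pressure before it is trusted.
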

> 
> ----------------------------------------
> [   54.589642][  T157] =====================================================
> [   54.603721][  T157] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
> [   54.608092][  T157]  obj_malloc+0x6cc/0x7b0
> [   54.610904][  T157]  zs_malloc+0xda2/0x12d0
> [   54.613688][  T157]  zs_zpool_malloc+0xa5/0x1b0
> [   54.619163][  T157]  zpool_malloc+0x113/0x150
> [   54.624449][  T157]  zswap_compress+0x69b/0xbd0
> [   54.629904][  T157]  zswap_store+0x1f24/0x2d00
> [   54.635026][  T157]  swap_writepage+0x15b/0x4f0
> [   54.640023][  T157]  pageout+0x3d4/0xeb0
> [   54.644699][  T157]  shrink_folio_list+0x4d7f/0x7480
> [   54.649867][  T157]  evict_folios+0x2160/0x52c0
> [   54.654872][  T157]  try_to_shrink_lruvec+0x1cb/0x460
> [   54.660074][  T157]  shrink_one+0x72d/0xeb0
> [   54.664922][  T157]  shrink_many+0x70d/0x10c0
> [   54.669849][  T157]  lru_gen_shrink_node+0x832/0xd10
> [   54.675110][  T157]  shrink_node+0x13a/0x1dd0
> [   54.680026][  T157]  balance_pgdat+0x1556/0x2740
> [   54.685032][  T157]  kswapd+0x50d/0x870
> [   54.689643][  T157]  kthread+0x485/0x600
> [   54.694432][  T157]  ret_from_fork+0xfa/0x140
> [   54.699305][  T157]  ret_from_fork_asm+0x11/0x20
> [   54.704295][  T157] 
> [   54.707905][  T157] Uninit was stored to memory at:
> [   54.712837][  T157]  obj_malloc+0x70a/0x7b0
> [   54.717434][  T157]  zs_malloc+0xda2/0x12d0
> [   54.722009][  T157]  zs_zpool_malloc+0xa5/0x1b0
> [   54.726806][  T157]  zpool_malloc+0x113/0x150
> [   54.731507][  T157]  zswap_compress+0x69b/0xbd0
> [   54.736299][  T157]  zswap_store+0x1f24/0x2d00
> [   54.741081][  T157]  swap_writepage+0x15b/0x4f0
> [   54.745880][  T157]  pageout+0x3d4/0xeb0
> [   54.750386][  T157]  shrink_folio_list+0x4d7f/0x7480
> [   54.755378][  T157]  evict_folios+0x2160/0x52c0
> [   54.760153][  T157]  try_to_shrink_lruvec+0x1cb/0x460
> [   54.765223][  T157]  shrink_one+0x72d/0xeb0
> [   54.769870][  T157]  shrink_many+0x70d/0x10c0
> [   54.774445][  T157]  lru_gen_shrink_node+0x832/0xd10
> [   54.779221][  T157]  shrink_node+0x13a/0x1dd0
> [   54.783965][  T157]  balance_pgdat+0x1556/0x2740
> [   54.788702][  T157]  kswapd+0x50d/0x870
> [   54.793073][  T157]  kthread+0x485/0x600
> [   54.798253][  T157]  ret_from_fork+0xfa/0x140
> [   54.804206][  T157]  ret_from_fork_asm+0x11/0x20
> [   54.809016][  T157] 
> [   54.812652][  T157] Uninit was created at:
> [   54.817314][  T157]  free_unref_page_prepare+0x130/0xfc0
> [   54.822499][  T157]  free_unref_page_list+0x13f/0x1130
> [   54.828207][  T157]  shrink_folio_list+0x713e/0x7480
> [   54.834143][  T157]  evict_folios+0x2160/0x52c0
> [   54.839358][  T157]  try_to_shrink_lruvec+0x1cb/0x460
> [   54.844628][  T157]  shrink_one+0x72d/0xeb0
> [   54.849436][  T157]  shrink_many+0x70d/0x10c0
> [   54.854310][  T157]  lru_gen_shrink_node+0x832/0xd10
> [   54.859337][  T157]  shrink_node+0x13a/0x1dd0
> [   54.864076][  T157]  shrink_zones+0x787/0x1530
> [   54.868808][  T157]  do_try_to_free_pages+0x2ac/0x16a0
> [   54.873865][  T157]  try_to_free_pages+0xddb/0x19b0
> [   54.878795][  T157]  __alloc_pages_slowpath+0x1a05/0x2d00
> [   54.883978][  T157]  __alloc_pages+0xc6c/0x1040
> [   54.888802][  T157]  alloc_pages_mpol+0x477/0xc40
> [   54.893629][  T157]  alloc_pages+0x224/0x240
> [   54.898092][  T157]  pipe_write+0xae5/0x2bd0
> [   54.902702][  T157]  vfs_write+0xfb9/0x1b90
> [   54.907117][  T157]  ksys_write+0x275/0x500
> [   54.911612][  T157]  __x64_sys_write+0xdf/0x120
> [   54.916287][  T157]  do_syscall_64+0xd5/0x1c0
> [   54.920782][  T157]  entry_SYSCALL_64_after_hwframe+0x62/0x6a
> [   54.925972][  T157] 
> [   54.929436][  T157] CPU: 4 PID: 157 Comm: kswapd1 Not tainted 6.8.0-rc5-next-20240223 #1
> [   54.937592][  T157] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   54.946147][  T157] =====================================================
> [   54.951772][  T157] Disabling lock debugging due to kernel taint
> [   54.957040][  T157] Kernel panic - not syncing: kmsan.panic set ...
> [   54.962443][  T157] CPU: 4 PID: 157 Comm: kswapd1 Tainted: G    B              6.8.0-rc5-next-20240223 #1
> [   54.971295][  T157] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   54.979856][  T157] Call Trace:
> [   54.983760][  T157]  <TASK>
> [   54.987503][  T157]  dump_stack_lvl+0x24b/0x300
> [   54.992068][  T157]  dump_stack+0x29/0x30
> [   54.996373][  T157]  panic+0x4ed/0xca0
> [   55.000656][  T157]  kmsan_report+0x2d1/0x2e0
> [   55.005155][  T157]  ? kmem_cache_alloc+0x707/0xf50
> [   55.009909][  T157]  ? kmsan_internal_poison_memory+0x7d/0x90
> [   55.015056][  T157]  ? kmsan_internal_poison_memory+0x49/0x90
> [   55.020253][  T157]  ? kmsan_slab_alloc+0xdf/0x160
> [   55.024995][  T157]  ? __msan_warning+0x91/0x120
> [   55.029604][  T157]  ? obj_malloc+0x6cc/0x7b0
> [   55.034166][  T157]  ? zs_malloc+0xda2/0x12d0
> [   55.038692][  T157]  ? zs_zpool_malloc+0xa5/0x1b0
> [   55.043342][  T157]  ? zpool_malloc+0x113/0x150
> [   55.047909][  T157]  ? zswap_compress+0x69b/0xbd0
> [   55.052576][  T157]  ? zswap_store+0x1f24/0x2d00
> [   55.057213][  T157]  ? swap_writepage+0x15b/0x4f0
> [   55.061836][  T157]  ? pageout+0x3d4/0xeb0
> [   55.066216][  T157]  ? shrink_folio_list+0x4d7f/0x7480
> [   55.071083][  T157]  ? evict_folios+0x2160/0x52c0
> [   55.075734][  T157]  ? try_to_shrink_lruvec+0x1cb/0x460
> [   55.080625][  T157]  ? shrink_one+0x72d/0xeb0
> [   55.085139][  T157]  ? shrink_many+0x70d/0x10c0
> [   55.089752][  T157]  ? lru_gen_shrink_node+0x832/0xd10
> [   55.094614][  T157]  ? shrink_node+0x13a/0x1dd0
> [   55.099188][  T157]  ? balance_pgdat+0x1556/0x2740
> [   55.103891][  T157]  ? kswapd+0x50d/0x870
> [   55.108212][  T157]  ? kthread+0x485/0x600
> [   55.112459][  T157]  ? ret_from_fork+0xfa/0x140
> [   55.116849][  T157]  ? ret_from_fork_asm+0x11/0x20
> [   55.121438][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   55.126388][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   55.131446][  T157]  ? should_fail_ex+0x91/0xa20
> [   55.136530][  T157]  ? kmsan_get_metadata+0x146/0x1c0
> [   55.141199][  T157]  ? kmsan_get_metadata+0x146/0x1c0
> [   55.145956][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   55.150928][  T157]  ? __should_failslab+0x24f/0x2e0
> [   55.155750][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   55.161123][  T157]  ? __should_failslab+0x24f/0x2e0
> [   55.165918][  T157]  ? kmsan_get_metadata+0x146/0x1c0
> [   55.170723][  T157]  ? kmsan_get_metadata+0x146/0x1c0
> [   55.175568][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   55.180684][  T157]  __msan_warning+0x91/0x120
> [   55.185066][  T157]  obj_malloc+0x6cc/0x7b0
> [   55.189320][  T157]  ? kmsan_get_metadata+0x146/0x1c0
> [   55.194051][  T157]  zs_malloc+0xda2/0x12d0
> [   55.198333][  T157]  zs_zpool_malloc+0xa5/0x1b0
> [   55.202886][  T157]  ? zs_zpool_destroy+0x50/0x50
> [   55.207378][  T157]  zpool_malloc+0x113/0x150
> [   55.211829][  T157]  zswap_compress+0x69b/0xbd0
> [   55.216298][  T157]  zswap_store+0x1f24/0x2d00
> [   55.220727][  T157]  swap_writepage+0x15b/0x4f0
> [   55.225186][  T157]  ? generic_swapfile_activate+0xed0/0xed0
> [   55.230120][  T157]  pageout+0x3d4/0xeb0
> [   55.234272][  T157]  shrink_folio_list+0x4d7f/0x7480
> [   55.239002][  T157]  evict_folios+0x2160/0x52c0
> [   55.243455][  T157]  try_to_shrink_lruvec+0x1cb/0x460
> [   55.248119][  T157]  shrink_one+0x72d/0xeb0
> [   55.252389][  T157]  shrink_many+0x70d/0x10c0
> [   55.257702][  T157]  lru_gen_shrink_node+0x832/0xd10
> [   55.262478][  T157]  shrink_node+0x13a/0x1dd0
> [   55.266848][  T157]  ? mem_cgroup_soft_limit_reclaim+0x34/0x17b0
> [   55.271983][  T157]  ? filter_irq_stacks+0xb9/0x230
> [   55.276677][  T157]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   55.281724][  T157]  ? kswapd_age_node+0x63/0xb00
> [   55.286322][  T157]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   55.291458][  T157]  balance_pgdat+0x1556/0x2740
> [   55.295936][  T157]  ? finish_wait+0x2f1/0x4a0
> [   55.300332][  T157]  kswapd+0x50d/0x870
> [   55.304457][  T157]  kthread+0x485/0x600
> [   55.308674][  T157]  ? shrink_all_memory+0x3a0/0x3a0
> [   55.313311][  T157]  ? kthread_blkcg+0x120/0x120
> [   55.317805][  T157]  ret_from_fork+0xfa/0x140
> [   55.322138][  T157]  ? kthread_blkcg+0x120/0x120
> [   55.326615][  T157]  ? kthread_blkcg+0x120/0x120
> [   55.331198][  T157]  ret_from_fork_asm+0x11/0x20
> [   55.335679][  T157]  </TASK>
> [   56.470556][  T157] Shutting down cpus with NMI
> [   56.474684][  T157] Kernel Offset: disabled
> [   56.478285][  T157] Rebooting in 10 seconds..
> ----------------------------------------
> 
> ----------------------------------------
> ubuntu login: [   42.392666][  T155] =====================================================
> [   42.398208][  T155] BUG: KMSAN: use-after-free in lzo1x_decompress_safe+0x433/0x3930
> [   42.408589][  T155]  lzo1x_decompress_safe+0x433/0x3930
> [   42.416017][  T155]  lzo_sdecompress+0x119/0x220
> [   42.427324][  T155]  scomp_acomp_comp_decomp+0x65b/0xa10
> [   42.439258][  T155]  scomp_acomp_decompress+0x4e/0x60
> [   42.449860][  T155]  zswap_decompress+0x618/0xa50
> [   42.459372][  T155]  zswap_writeback_entry+0x6c0/0xaa0
> [   42.468643][  T155]  shrink_memcg_cb+0x3e8/0x870
> [   42.474589][  T155]  __list_lru_walk_one+0x4ee/0xf00
> [   42.477891][  T155]  list_lru_walk_one+0x1f6/0x250
> [   42.481171][  T155]  zswap_shrinker_scan+0x46b/0x760
> [   42.484544][  T155]  do_shrink_slab+0x958/0x1750
> [   42.487742][  T155]  shrink_slab_memcg+0x6ae/0xea0
> [   42.491686][  T155]  shrink_slab+0x119/0x7c0
> [   42.496077][  T155]  shrink_one+0x835/0xeb0
> [   42.500477][  T155]  shrink_many+0x70d/0x10c0
> [   42.504933][  T155]  lru_gen_shrink_node+0x832/0xd10
> [   42.508651][  T155]  shrink_node+0x13a/0x1dd0
> [   42.512056][  T155]  balance_pgdat+0x1556/0x2740
> [   42.515294][  T155]  kswapd+0x50d/0x870
> [   42.518245][  T155]  kthread+0x485/0x600
> [   42.521178][  T155]  ret_from_fork+0xfa/0x140
> [   42.524242][  T155]  ret_from_fork_asm+0x11/0x20
> [   42.527444][  T155] 
> [   42.529916][  T155] Uninit was stored to memory at:
> [   42.533147][  T155]  scatterwalk_map_and_copy+0x8b5/0xb50
> [   42.536505][  T155]  scomp_acomp_comp_decomp+0x45c/0xa10
> [   42.539860][  T155]  scomp_acomp_decompress+0x4e/0x60
> [   42.543099][  T155]  zswap_decompress+0x618/0xa50
> [   42.546244][  T155]  zswap_writeback_entry+0x6c0/0xaa0
> [   42.549525][  T155]  shrink_memcg_cb+0x3e8/0x870
> [   42.552652][  T155]  __list_lru_walk_one+0x4ee/0xf00
> [   42.555890][  T155]  list_lru_walk_one+0x1f6/0x250
> [   42.567920][  T155]  zswap_shrinker_scan+0x46b/0x760
> [   42.578533][  T155]  do_shrink_slab+0x958/0x1750
> [   42.587474][  T155]  shrink_slab_memcg+0x6ae/0xea0
> [   42.591733][  T155]  shrink_slab+0x119/0x7c0
> [   42.595698][  T155]  shrink_one+0x835/0xeb0
> [   42.599604][  T155]  shrink_many+0x70d/0x10c0
> [   42.603671][  T155]  lru_gen_shrink_node+0x832/0xd10
> [   42.608028][  T155]  shrink_node+0x13a/0x1dd0
> [   42.612164][  T155]  balance_pgdat+0x1556/0x2740
> [   42.616458][  T155]  kswapd+0x50d/0x870
> [   42.620420][  T155]  kthread+0x485/0x600
> [   42.624380][  T155]  ret_from_fork+0xfa/0x140
> [   42.628490][  T155]  ret_from_fork_asm+0x11/0x20
> [   42.632693][  T155] 
> [   42.635854][  T155] Uninit was stored to memory at:
> [   42.640219][  T155]  zswap_decompress+0x299/0xa50
> [   42.644446][  T155]  zswap_writeback_entry+0x6c0/0xaa0
> [   42.648922][  T155]  shrink_memcg_cb+0x3e8/0x870
> [   42.653115][  T155]  __list_lru_walk_one+0x4ee/0xf00
> [   42.657464][  T155]  list_lru_walk_one+0x1f6/0x250
> [   42.661710][  T155]  zswap_shrinker_scan+0x46b/0x760
> [   42.666078][  T155]  do_shrink_slab+0x958/0x1750
> [   42.670389][  T155]  shrink_slab_memcg+0x6ae/0xea0
> [   42.679819][  T155]  shrink_slab+0x119/0x7c0
> [   42.688501][  T155]  shrink_one+0x835/0xeb0
> [   42.697021][  T155]  shrink_many+0x70d/0x10c0
> [   42.705719][  T155]  lru_gen_shrink_node+0x832/0xd10
> [   42.715345][  T155]  shrink_node+0x13a/0x1dd0
> [   42.724486][  T155]  balance_pgdat+0x1556/0x2740
> [   42.733580][  T155]  kswapd+0x50d/0x870
> [   42.742222][  T155]  kthread+0x485/0x600
> [   42.751032][  T155]  ret_from_fork+0xfa/0x140
> [   42.760274][  T155]  ret_from_fork_asm+0x11/0x20
> [   42.769826][  T155] 
> [   42.776910][  T155] Uninit was created at:
> [   42.785419][  T155]  free_unref_page_prepare+0x130/0xfc0
> [   42.795890][  T155]  free_unref_page_list+0x13f/0x1130
> [   42.806224][  T155]  shrink_folio_list+0x713e/0x7480
> [   42.815480][  T155]  evict_folios+0x2160/0x52c0
> [   42.819727][  T155]  try_to_shrink_lruvec+0x1cb/0x460
> [   42.824366][  T155]  shrink_one+0x72d/0xeb0
> [   42.834077][  T155]  shrink_many+0x70d/0x10c0
> [   42.844079][  T155]  lru_gen_shrink_node+0x832/0xd10
> [   42.854215][  T155]  shrink_node+0x13a/0x1dd0
> [   42.863639][  T155]  shrink_zones+0x787/0x1530
> [   42.873152][  T155]  do_try_to_free_pages+0x2ac/0x16a0
> [   42.877447][  T155]  try_to_free_pages+0xddb/0x19b0
> [   42.880694][  T155]  __alloc_pages_slowpath+0x1a05/0x2d00
> [   42.884052][  T155]  __alloc_pages+0xc6c/0x1040
> [   42.887180][  T155]  alloc_pages_mpol+0x477/0xc40
> [   42.890362][  T155]  alloc_pages+0x224/0x240
> [   42.893479][  T155]  pipe_write+0xae5/0x2bd0
> [   42.896519][  T155]  vfs_write+0xfb9/0x1b90
> [   42.899562][  T155]  ksys_write+0x275/0x500
> [   42.902616][  T155]  __x64_sys_write+0xdf/0x120
> [   42.906154][  T155]  do_syscall_64+0xd5/0x1c0
> [   42.909855][  T155]  entry_SYSCALL_64_after_hwframe+0x62/0x6a
> [   42.913670][  T155] 
> [   42.916155][  T155] CPU: 5 PID: 155 Comm: kswapd1 Not tainted 6.8.0-rc5-next-20240223 #1
> [   42.921857][  T155] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   42.927627][  T155] =====================================================
> [   42.931409][  T155] Disabling lock debugging due to kernel taint
> [   42.934961][  T155] Kernel panic - not syncing: kmsan.panic set ...
> [   42.938569][  T155] CPU: 5 PID: 155 Comm: kswapd1 Tainted: G    B              6.8.0-rc5-next-20240223 #1
> [   42.944533][  T155] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   42.950295][  T155] Call Trace:
> [   42.952978][  T155]  <TASK>
> [   42.955503][  T155]  dump_stack_lvl+0x24b/0x300
> [   42.960955][  T155]  dump_stack+0x29/0x30
> [   42.969770][  T155]  panic+0x4ed/0xca0
> [   42.978544][  T155]  kmsan_report+0x2d1/0x2e0
> [   42.987874][  T155]  ? __msan_warning+0x91/0x120
> [   42.997602][  T155]  ? lzo1x_decompress_safe+0x433/0x3930
> [   43.004774][  T155]  ? lzo_sdecompress+0x119/0x220
> [   43.008069][  T155]  ? scomp_acomp_comp_decomp+0x65b/0xa10
> [   43.011551][  T155]  ? scomp_acomp_decompress+0x4e/0x60
> [   43.014956][  T155]  ? zswap_decompress+0x618/0xa50
> [   43.018249][  T155]  ? zswap_writeback_entry+0x6c0/0xaa0
> [   43.021719][  T155]  ? shrink_memcg_cb+0x3e8/0x870
> [   43.025111][  T155]  ? __list_lru_walk_one+0x4ee/0xf00
> [   43.028534][  T155]  ? list_lru_walk_one+0x1f6/0x250
> [   43.031877][  T155]  ? zswap_shrinker_scan+0x46b/0x760
> [   43.035452][  T155]  ? do_shrink_slab+0x958/0x1750
> [   43.038777][  T155]  ? shrink_slab_memcg+0x6ae/0xea0
> [   43.042092][  T155]  ? shrink_slab+0x119/0x7c0
> [   43.045255][  T155]  ? shrink_one+0x835/0xeb0
> [   43.048375][  T155]  ? shrink_many+0x70d/0x10c0
> [   43.051557][  T155]  ? lru_gen_shrink_node+0x832/0xd10
> [   43.054932][  T155]  ? shrink_node+0x13a/0x1dd0
> [   43.058106][  T155]  ? balance_pgdat+0x1556/0x2740
> [   43.061387][  T155]  ? kswapd+0x50d/0x870
> [   43.064386][  T155]  ? kthread+0x485/0x600
> [   43.067433][  T155]  ? ret_from_fork+0xfa/0x140
> [   43.075042][  T155]  ? ret_from_fork_asm+0x11/0x20
> [   43.084182][  T155]  ? shrink_one+0x835/0xeb0
> [   43.093150][  T155]  ? shrink_many+0x70d/0x10c0
> [   43.102332][  T155]  ? lru_gen_shrink_node+0x832/0xd10
> [   43.112061][  T155]  ? shrink_node+0x13a/0x1dd0
> [   43.121464][  T155]  ? balance_pgdat+0x1556/0x2740
> [   43.131127][  T155]  ? kswapd+0x50d/0x870
> [   43.140045][  T155]  ? kthread+0x485/0x600
> [   43.148993][  T155]  ? ret_from_fork+0xfa/0x140
> [   43.158046][  T155]  ? ret_from_fork_asm+0x11/0x20
> [   43.166985][  T155]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
> [   43.177320][  T155]  ? kmsan_get_metadata+0x146/0x1c0
> [   43.187138][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   43.198259][  T155]  ? scatterwalk_map_and_copy+0xaa/0xb50
> [   43.209263][  T155]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   43.220485][  T155]  ? filter_irq_stacks+0xb9/0x230
> [   43.230534][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   43.240946][  T155]  __msan_warning+0x91/0x120
> [   43.249737][  T155]  lzo1x_decompress_safe+0x433/0x3930
> [   43.259433][  T155]  ? filter_irq_stacks+0xb9/0x230
> [   43.268747][  T155]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
> [   43.278890][  T155]  ? kmsan_get_metadata+0x146/0x1c0
> [   43.288092][  T155]  lzo_sdecompress+0x119/0x220
> [   43.296809][  T155]  ? lzo_scompress+0x250/0x250
> [   43.305573][  T155]  scomp_acomp_comp_decomp+0x65b/0xa10
> [   43.315139][  T155]  scomp_acomp_decompress+0x4e/0x60
> [   43.324453][  T155]  ? scomp_acomp_compress+0x60/0x60
> [   43.334172][  T155]  zswap_decompress+0x618/0xa50
> [   43.343444][  T155]  zswap_writeback_entry+0x6c0/0xaa0
> [   43.353130][  T155]  shrink_memcg_cb+0x3e8/0x870
> [   43.362321][  T155]  __list_lru_walk_one+0x4ee/0xf00
> [   43.371873][  T155]  ? zswap_shrinker_count+0x670/0x670
> [   43.381677][  T155]  ? __msan_metadata_ptr_for_load_1+0x24/0x40
> [   43.392255][  T155]  list_lru_walk_one+0x1f6/0x250
> [   43.401742][  T155]  ? zswap_shrinker_count+0x670/0x670
> [   43.411756][  T155]  zswap_shrinker_scan+0x46b/0x760
> [   43.421682][  T155]  ? zswap_debugfs_init+0x420/0x420
> [   43.432130][  T155]  do_shrink_slab+0x958/0x1750
> [   43.436685][  T155]  shrink_slab_memcg+0x6ae/0xea0
> [   43.441009][  T155]  shrink_slab+0x119/0x7c0
> [   43.446049][  T155]  ? try_to_shrink_lruvec+0x42c/0x460
> [   43.451031][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   43.456008][  T155]  shrink_one+0x835/0xeb0
> [   43.460383][  T155]  shrink_many+0x70d/0x10c0
> [   43.464760][  T155]  lru_gen_shrink_node+0x832/0xd10
> [   43.469331][  T155]  shrink_node+0x13a/0x1dd0
> [   43.473657][  T155]  ? mem_cgroup_soft_limit_reclaim+0x34/0x17b0
> [   43.478672][  T155]  ? filter_irq_stacks+0xb9/0x230
> [   43.485985][  T155]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   43.497004][  T155]  ? kswapd_age_node+0x63/0xb00
> [   43.506282][  T155]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   43.516209][  T155]  balance_pgdat+0x1556/0x2740
> [   43.570854][  T155]  ? finish_wait+0x2f1/0x4a0
> [   43.580532][  T155]  kswapd+0x50d/0x870
> [   43.589396][  T155]  kthread+0x485/0x600
> [   43.598259][  T155]  ? shrink_all_memory+0x3a0/0x3a0
> [   43.608549][  T155]  ? kthread_blkcg+0x120/0x120
> [   43.618041][  T155]  ret_from_fork+0xfa/0x140
> [   43.627225][  T155]  ? kthread_blkcg+0x120/0x120
> [   43.636721][  T155]  ? kthread_blkcg+0x120/0x120
> [   43.646141][  T155]  ret_from_fork_asm+0x11/0x20
> [   43.655612][  T155]  </TASK>
> [   44.788328][  T155] Shutting down cpus with NMI
> [   44.792527][  T155] Kernel Offset: disabled
> [   44.795640][  T155] Rebooting in 10 seconds..
> ----------------------------------------
> 
> 
> 
> Maybe a different cause, but I feel that the frequency of hitting the "corrupted stack end detected
> inside scheduler" problem has increased in linux-next.git compared to linux.git.
> Too much stack usage?
> 
> ----------------------------------------
> ubuntu login: [   53.757790][  T194] Kernel panic - not syncing: corrupted stack end detected inside scheduler
> [   53.784397][  T194] CPU: 3 PID: 194 Comm: kworker/u39:3 Not tainted 6.8.0-rc5-next-20240223 #1
> [   53.810595][  T194] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [   53.829184][  T194] Workqueue: writeback wb_workfn (flush-8:0)
> [   53.835445][  T194] Call Trace:
> [   53.839997][  T194]  <TASK>
> [   53.844176][  T194]  dump_stack_lvl+0x24b/0x300
> [   53.849323][  T194]  dump_stack+0x29/0x30
> [   53.854261][  T194]  panic+0x4ed/0xca0
> [   53.858938][  T194]  ? kmsan_get_metadata+0x50/0x1c0
> [   53.864326][  T194]  __schedule+0x9e4/0x2770
> [   53.883521][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   53.910719][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   53.936505][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   53.964199][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   53.989716][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.015124][  T194]  __cond_resched+0x50/0xc0
> [   54.038931][  T194]  rmap_walk_file+0x382/0x8d0
> [   54.066110][  T194]  folio_mkclean+0x34d/0x530
> [   54.089049][  T194]  ? folio_mkclean+0x530/0x530
> [   54.117183][  T194]  ? page_mkclean_one+0x3f0/0x3f0
> [   54.135476][  T194]  folio_clear_dirty_for_io+0x22a/0xae0
> [   54.144905][  T194]  ? filemap_get_folios_tag+0x64a/0x6c0
> [   54.155053][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.165436][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   54.175572][  T194]  mpage_submit_folio+0x12a/0x5d0
> [   54.186797][  T194]  ext4_do_writepages+0x3401/0x63d0
> [   54.193608][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.206517][  T194]  ext4_writepages+0x338/0x870
> [   54.234367][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.243997][  T194]  ? ext4_read_folio+0x440/0x440
> [   54.271561][  T194]  do_writepages+0x5e5/0x15c0
> [   54.287149][  T194]  ? wake_up_bit+0x9c/0x490
> [   54.297127][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   54.318153][  T194]  ? filter_irq_stacks+0xb9/0x230
> [   54.326784][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.343015][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   54.349741][  T194]  __writeback_single_inode+0x170/0x1090
> [   54.356296][  T194]  ? __msan_metadata_ptr_for_load_8+0x24/0x40
> [   54.364305][  T194]  writeback_sb_inodes+0xd74/0x1e20
> [   54.371317][  T194]  ? kmsan_internal_set_shadow_origin+0x66/0xe0
> [   54.378719][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.385263][  T194]  __writeback_inodes_wb+0x1d6/0x510
> [   54.391720][  T194]  wb_writeback+0x63e/0xff0
> [   54.399899][  T194]  ? stack_depot_save_flags+0x2c/0x6f0
> [   54.408778][  T194]  ? kmsan_internal_set_shadow_origin+0x60/0xe0
> [   54.439971][  T194]  wb_do_writeback+0x120b/0x1510
> [   54.467029][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.494644][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.512469][  T194]  wb_workfn+0x190/0x850
> [   54.537678][  T194]  ? kmsan_get_metadata+0x146/0x1c0
> [   54.565645][  T194]  ? kmsan_get_shadow_origin_ptr+0x4d/0xb0
> [   54.595256][  T194]  ? inode_wait_for_writeback+0x320/0x320
> [   54.609201][  T194]  process_one_work+0xa0c/0x1c60
> [   54.614993][  T194]  worker_thread+0x11f2/0x1ba0
> [   54.620515][  T194]  kthread+0x485/0x600
> [   54.625631][  T194]  ? pr_cont_work+0xee0/0xee0
> [   54.630919][  T194]  ? kthread_blkcg+0x120/0x120
> [   54.636291][  T194]  ret_from_fork+0xfa/0x140
> [   54.641771][  T194]  ? kthread_blkcg+0x120/0x120
> [   54.647279][  T194]  ? kthread_blkcg+0x120/0x120
> [   54.652705][  T194]  ret_from_fork_asm+0x11/0x20
> [   54.658127][  T194]  </TASK>
> [   54.683905][  T194] Kernel Offset: disabled
> [   54.688874][  T194] Rebooting in 10 seconds..
> ----------------------------------------
> 
> 




* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23  4:43 ` Sergey Senozhatsky
@ 2024-02-23 15:22   ` Tetsuo Handa
  2024-02-23 23:54     ` [PATCH] x86: disable non-instrumented version of copy_page when KMSAN is enabled Tetsuo Handa
  2024-02-24 14:23     ` [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Sergey Senozhatsky
  0 siblings, 2 replies; 17+ messages in thread
From: Tetsuo Handa @ 2024-02-23 15:22 UTC (permalink / raw)
  To: Sergey Senozhatsky, Alexander Potapenko
  Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Minchan Kim, linux-mm,
	kasan-dev, Mark-PK Tsai

On 2024/02/23 13:43, Sergey Senozhatsky wrote:
> On (24/02/23 11:10), Tetsuo Handa wrote:
>>
>> I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
>> Since I haven't observed with 6.6.0, this bug might be introduced in 6.7 cycle.
> 
> Can we please run a bisect?

Bisection pointed at commit afb2d666d025 ("zsmalloc: use copy_page for full page copy").
copy_page() is implemented as non-instrumented code, which KMSAN cannot track.
On x86_64, copy_page() is defined in arch/x86/lib/copy_page_64.S as below.

----------------------------------------
/*
 * Some CPUs run faster using the string copy instructions (sane microcode).
 * It is also a lot simpler. Use this when possible. But, don't use streaming
 * copy unless the CPU indicates X86_FEATURE_REP_GOOD. Could vary the
 * prefetch distance based on SMP/UP.
 */
        ALIGN
SYM_FUNC_START(copy_page)
        ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
        movl    $4096/8, %ecx
        rep     movsq
        RET
SYM_FUNC_END(copy_page)
EXPORT_SYMBOL(copy_page)
----------------------------------------

To fix this problem, we need to implement copy_page() etc. in a way
KMSAN can handle.
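
As a rough illustration of why a non-instrumented copy trips KMSAN, here is a
user-space toy model. It is only a sketch: struct toy_page, TOY_PAGE_SIZE and
all toy_*() helpers are hypothetical names, not kernel APIs, and real KMSAN
shadow handling is far more involved. The raw copy moves the data but leaves
the destination's stale "uninitialized" shadow in place, so a later read is
reported even though the bytes are valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64	/* hypothetical, kept small for illustration */

/* One "page": real bytes plus one shadow byte per data byte.
 * shadow[i] != 0 means "byte i is considered uninitialized". */
struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
	unsigned char shadow[TOY_PAGE_SIZE];
};

/* Freeing a page poisons its shadow (the "Uninit was created at" step). */
static void toy_free_page(struct toy_page *p)
{
	memset(p->shadow, 0xff, TOY_PAGE_SIZE);
}

/* A write through instrumented code clears the shadow of the written bytes. */
static void toy_init_page(struct toy_page *p, unsigned char value)
{
	memset(p->data, value, TOY_PAGE_SIZE);
	memset(p->shadow, 0, TOY_PAGE_SIZE);
}

/* Stand-in for the assembly copy_page(): moves data, never touches shadow. */
static void toy_copy_page_raw(struct toy_page *to, const struct toy_page *from)
{
	memcpy(to->data, from->data, TOY_PAGE_SIZE);
}

/* Stand-in for an instrumented memcpy(): shadow travels with the data. */
static void toy_copy_page_instrumented(struct toy_page *to,
				       const struct toy_page *from)
{
	memcpy(to->data, from->data, TOY_PAGE_SIZE);
	memcpy(to->shadow, from->shadow, TOY_PAGE_SIZE);
}

/* A read "reports" if any shadow byte of the page is still poisoned. */
static bool toy_read_would_report(const struct toy_page *p)
{
	for (size_t i = 0; i < TOY_PAGE_SIZE; i++)
		if (p->shadow[i])
			return true;
	return false;
}

/* Walk through the scenario: initialized source, previously freed destination. */
void toy_demo(void)
{
	struct toy_page src, dst;

	toy_init_page(&src, 0xab);
	toy_free_page(&dst);

	toy_copy_page_raw(&dst, &src);
	assert(dst.data[0] == 0xab);		/* data is valid ... */
	assert(toy_read_would_report(&dst));	/* ... but shadow is stale */

	toy_copy_page_instrumented(&dst, &src);
	assert(!toy_read_would_report(&dst));	/* shadow now matches src */
}
```

Calling toy_demo() from a main() runs both cases; the asserts pass only if the
raw copy leaves the stale shadow behind and the instrumented copy clears it.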

Question to KASAN people:
Is it possible to add annotation for KMSAN into assembly code?
Do we need to disable assembly version and force use of C version
when KMSAN is enabled?
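
To make the two options in the question concrete, here is another user-space
sketch (hypothetical ann_*() names and ANN_PAGE_SIZE, not kernel code). It
keeps the raw copy and fixes up the metadata afterwards in C, either by
declaring the whole destination initialized or by carrying the source's
shadow across. Whether either form fits copy_page() in the kernel is an open
question here; the sketch only shows the trade-off: a blanket unpoison hides
a source byte that really was uninitialized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ANN_PAGE_SIZE 64	/* hypothetical toy size */

struct ann_page {
	unsigned char data[ANN_PAGE_SIZE];
	unsigned char shadow[ANN_PAGE_SIZE];	/* nonzero = uninitialized */
};

/* Stand-in for uninstrumented assembly: copies bytes only. */
static void ann_raw_copy(unsigned char *to, const unsigned char *from)
{
	memcpy(to, from, ANN_PAGE_SIZE);
}

/* Option A: after the raw copy, mark the whole destination initialized. */
static void ann_copy_then_unpoison(struct ann_page *to,
				   const struct ann_page *from)
{
	ann_raw_copy(to->data, from->data);
	memset(to->shadow, 0, ANN_PAGE_SIZE);
}

/* Option B: after the raw copy, carry the source's shadow across. */
static void ann_copy_with_meta(struct ann_page *to,
			       const struct ann_page *from)
{
	ann_raw_copy(to->data, from->data);
	memcpy(to->shadow, from->shadow, ANN_PAGE_SIZE);
}

/* Count bytes that are still considered uninitialized. */
static size_t ann_count_poisoned(const struct ann_page *p)
{
	size_t n = 0;

	for (size_t i = 0; i < ANN_PAGE_SIZE; i++)
		if (p->shadow[i])
			n++;
	return n;
}

void ann_demo(void)
{
	struct ann_page src, dst;

	/* Source is initialized except for one genuinely uninit byte. */
	memset(src.data, 0x5a, ANN_PAGE_SIZE);
	memset(src.shadow, 0, ANN_PAGE_SIZE);
	src.shadow[3] = 0xff;

	/* Destination was freed earlier, so its shadow is fully poisoned. */
	memset(dst.shadow, 0xff, ANN_PAGE_SIZE);
	ann_copy_then_unpoison(&dst, &src);
	assert(ann_count_poisoned(&dst) == 0);	/* byte 3 is hidden */

	memset(dst.shadow, 0xff, ANN_PAGE_SIZE);
	ann_copy_with_meta(&dst, &src);
	assert(ann_count_poisoned(&dst) == 1);	/* byte 3 is preserved */
}
```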

> 
> There are some zsmalloc patches for 6.8 (mm-unstable), I don't recall
> anything in 6.7.
> 
>> ----------------------------------------
>> [    0.000000][    T0] Linux version 6.8.0-rc5-00163-gffd2cb6b718e (root@ubuntu) (Ubuntu clang version 14.0.0-1ubuntu1.1, Ubuntu LLD 14.0.0) #1094 SMP PREEMPT_DYNAMIC Fri Feb 23 01:45:21 UTC 2024
>> [   50.026544][ T2974] =====================================================
>> [   50.030627][ T2974] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0
>> [   50.034611][ T2974]  obj_malloc+0x6cc/0x7b0
>>                                                            obj_malloc at mm/zsmalloc.c:0
>> [   50.037250][ T2974]  zs_malloc+0xdbd/0x1400
>>                                                            zs_malloc at mm/zsmalloc.c:0
>> [   50.039852][ T2974]  zs_zpool_malloc+0xa5/0x1b0
>>                                                            zs_zpool_malloc at mm/zsmalloc.c:372
>> [   50.044707][ T2974]  zpool_malloc+0x110/0x150
>>                                                            zpool_malloc at mm/zpool.c:258
>> [   50.049607][ T2974]  zswap_store+0x2bbb/0x3d30
>>                                                            zswap_store at mm/zswap.c:1637
>> [   50.054463][ T2974]  swap_writepage+0x15b/0x4f0
>>                                                            swap_writepage at mm/page_io.c:198
>> [   50.059392][ T2974]  pageout+0x41d/0xef0
>>                                                            pageout at mm/vmscan.c:654
>> [   50.064057][ T2974]  shrink_folio_list+0x4d7a/0x7480
>>                                                            shrink_folio_list at mm/vmscan.c:1316
>> [   50.069176][ T2974]  evict_folios+0x30f1/0x5170
>>                                                            evict_folios at mm/vmscan.c:4521
>> [   50.074082][ T2974]  try_to_shrink_lruvec+0x983/0xd20
>> [   50.079352][ T2974]  shrink_one+0x72d/0xeb0
>> [   50.084061][ T2974]  shrink_many+0x70d/0x10b0
>> [   50.088859][ T2974]  lru_gen_shrink_node+0x577/0x850
>> [   50.094192][ T2974]  shrink_node+0x13d/0x1de0
>> [   50.099028][ T2974]  shrink_zones+0x878/0x14a0
>> [   50.103958][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
>> [   50.109138][ T2974]  try_to_free_pages+0xd9e/0x1910
>> [   50.114190][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
>> [   50.119555][ T2974]  __alloc_pages+0xb8c/0x1050
>> [   50.124472][ T2974]  alloc_pages_mpol+0x8e0/0xc80
>> [   50.129367][ T2974]  alloc_pages+0x224/0x240
>> [   50.134022][ T2974]  pipe_write+0xabe/0x2ba0
>> [   50.138632][ T2974]  vfs_write+0xfb0/0x1b80
>> [   50.143171][ T2974]  ksys_write+0x275/0x500
>> [   50.147723][ T2974]  __x64_sys_write+0xdf/0x120
>> [   50.152431][ T2974]  do_syscall_64+0xd1/0x1b0
>> [   50.157106][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [   50.162382][ T2974] 
>> [   50.165956][ T2974] Uninit was stored to memory at:
>> [   50.170819][ T2974]  obj_malloc+0x70a/0x7b0
>>                                                            set_freeobj at mm/zsmalloc.c:476
>>                                                            (inlined by) obj_malloc at mm/zsmalloc.c:1333
>> [   50.175341][ T2974]  zs_malloc+0xdbd/0x1400
>>                                                            zs_malloc at mm/zsmalloc.c:0
>> [   50.179923][ T2974]  zs_zpool_malloc+0xa5/0x1b0
>>                                                            zs_zpool_malloc at mm/zsmalloc.c:372
>> [   50.184636][ T2974]  zpool_malloc+0x110/0x150
>>                                                            zpool_malloc at mm/zpool.c:258
>> [   50.189257][ T2974]  zswap_store+0x2bbb/0x3d30
>>                                                            zswap_store at mm/zswap.c:1637
>> [   50.193918][ T2974]  swap_writepage+0x15b/0x4f0
>>                                                            swap_writepage at mm/page_io.c:198
>> [   50.198615][ T2974]  pageout+0x41d/0xef0
>>                                                            pageout at mm/vmscan.c:654
>> [   50.203012][ T2974]  shrink_folio_list+0x4d7a/0x7480
>>                                                            shrink_folio_list at mm/vmscan.c:1316
>> [   50.207772][ T2974]  evict_folios+0x30f1/0x5170
>>                                                            evict_folios at mm/vmscan.c:4521
>> [   50.212321][ T2974]  try_to_shrink_lruvec+0x983/0xd20
>> [   50.217092][ T2974]  shrink_one+0x72d/0xeb0
>> [   50.221441][ T2974]  shrink_many+0x70d/0x10b0
>> [   50.225891][ T2974]  lru_gen_shrink_node+0x577/0x850
>> [   50.230614][ T2974]  shrink_node+0x13d/0x1de0
>> [   50.235128][ T2974]  shrink_zones+0x878/0x14a0
>> [   50.239646][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
>> [   50.244461][ T2974]  try_to_free_pages+0xd9e/0x1910
>> [   50.249151][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
>> [   50.254148][ T2974]  __alloc_pages+0xb8c/0x1050
>> [   50.258679][ T2974]  alloc_pages_mpol+0x8e0/0xc80
>> [   50.263289][ T2974]  alloc_pages+0x224/0x240
>> [   50.267767][ T2974]  pipe_write+0xabe/0x2ba0
>> [   50.272190][ T2974]  vfs_write+0xfb0/0x1b80
>> [   50.276543][ T2974]  ksys_write+0x275/0x500
>> [   50.280931][ T2974]  __x64_sys_write+0xdf/0x120
>> [   50.289451][ T2974]  do_syscall_64+0xd1/0x1b0
>> [   50.303402][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [   50.318721][ T2974] 
>> [   50.328931][ T2974] Uninit was created at:
>> [   50.341845][ T2974]  free_unref_page_prepare+0x130/0xfc0
>>                                                            arch_static_branch_jump at arch/x86/include/asm/jump_label.h:55
>>                                                            (inlined by) memcg_kmem_online at include/linux/memcontrol.h:1840
>>                                                            (inlined by) free_pages_prepare at mm/page_alloc.c:1096
>>                                                            (inlined by) free_unref_page_prepare at mm/page_alloc.c:2346
>> [   50.356492][ T2974]  free_unref_page_list+0x139/0x1050
>>                                                            free_unref_page_list at mm/page_alloc.c:2532
>> [   50.370898][ T2974]  shrink_folio_list+0x7139/0x7480
>>                                                            list_empty at include/linux/list.h:373
>>                                                            (inlined by) list_splice at include/linux/list.h:545
>>                                                            (inlined by) shrink_folio_list at mm/vmscan.c:1490
>> [   50.385025][ T2974]  evict_folios+0x30f1/0x5170
>>                                                            evict_folios at mm/vmscan.c:4521
>> [   50.398448][ T2974]  try_to_shrink_lruvec+0x983/0xd20
>> [   50.412660][ T2974]  shrink_one+0x72d/0xeb0
>> [   50.425591][ T2974]  shrink_many+0x70d/0x10b0
>> [   50.438827][ T2974]  lru_gen_shrink_node+0x577/0x850
>> [   50.454390][ T2974]  shrink_node+0x13d/0x1de0
>> [   50.479401][ T2974]  shrink_zones+0x878/0x14a0
>> [   50.529610][ T2974]  do_try_to_free_pages+0x2ac/0x16a0
>> [   50.544397][ T2974]  try_to_free_pages+0xd9e/0x1910
>> [   50.559556][ T2974]  __alloc_pages_slowpath+0x147a/0x2bd0
>> [   50.574932][ T2974]  __alloc_pages+0xb8c/0x1050
>> [   50.589024][ T2974]  alloc_pages_mpol+0x8e0/0xc80
>> [   50.603421][ T2974]  alloc_pages+0x224/0x240
>> [   50.616483][ T2974]  pipe_write+0xabe/0x2ba0
>> [   50.629601][ T2974]  vfs_write+0xfb0/0x1b80
>> [   50.643009][ T2974]  ksys_write+0x275/0x500
>> [   50.656157][ T2974]  __x64_sys_write+0xdf/0x120
>> [   50.670080][ T2974]  do_syscall_64+0xd1/0x1b0
>> [   50.683405][ T2974]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>> [   50.698626][ T2974] 
>> ----------------------------------------




* [PATCH] x86: disable non-instrumented version of copy_page when KMSAN is enabled
  2024-02-23 15:22   ` Tetsuo Handa
@ 2024-02-23 23:54     ` Tetsuo Handa
  2024-02-24  6:27       ` [PATCH v2] " Tetsuo Handa
  2024-02-24 14:23     ` [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Sergey Senozhatsky
  1 sibling, 1 reply; 17+ messages in thread
From: Tetsuo Handa @ 2024-02-23 23:54 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin
  Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Minchan Kim, linux-mm,
	kasan-dev, Mark-PK Tsai, Sergey Senozhatsky, Alexander Potapenko

I found that commit afb2d666d025 ("zsmalloc: use copy_page for full page
copy") caused a KMSAN warning. We need to fall back to an instrumented version
when KMSAN is enabled.

  [   50.030627][ T2974] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0

  [   50.165956][ T2974] Uninit was stored to memory at:
  [   50.170819][ T2974]  obj_malloc+0x70a/0x7b0

  [   50.328931][ T2974] Uninit was created at:
  [   50.341845][ T2974]  free_unref_page_prepare+0x130/0xfc0

Since the destination page likely already holds a previously written value
(i.e. KMSAN considers that the page was already initialized), whether to
globally enforce an instrumented version when KMSAN is enabled might be
questionable.

But since finding out why KMSAN considers a value uninitialized is
difficult (developers tend to choose the optimized version without being
aware of KMSAN), let's choose the human-friendly version. That is, since
arch/x86/include/asm/page_32.h implements copy_page() using memcpy(), let
arch/x86/include/asm/page_64.h also implement copy_page() using memcpy() when
KMSAN is enabled.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 arch/x86/include/asm/page_64.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cc6b8e087192..f13bba3a9dab 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -58,7 +58,16 @@ static inline void clear_page(void *page)
 			   : "cc", "memory", "rax", "rcx");
 }
 
+#ifdef CONFIG_KMSAN
+/* Use of non-instrumented assembly version confuses KMSAN. */
+void *memcpy(void *to, const void *from, __kernel_size_t len);
+static inline void copy_page(void *to, void *from)
+{
+	memcpy(to, from, PAGE_SIZE);
+}
+#else
 void copy_page(void *to, void *from);
+#endif
 
 #ifdef CONFIG_X86_5LEVEL
 /*
-- 
2.34.1




* [PATCH v2] x86: disable non-instrumented version of copy_page when KMSAN is enabled
  2024-02-23 23:54     ` [PATCH] x86: disable non-instrumented version of copy_page when KMSAN is enabled Tetsuo Handa
@ 2024-02-24  6:27       ` Tetsuo Handa
  0 siblings, 0 replies; 17+ messages in thread
From: Tetsuo Handa @ 2024-02-24  6:27 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin
  Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Minchan Kim, linux-mm,
	kasan-dev, Mark-PK Tsai, Sergey Senozhatsky, Alexander Potapenko

I found that commit afb2d666d025 ("zsmalloc: use copy_page for full page
copy") caused a false-positive KMSAN warning.

  [   50.030627][ T2974] BUG: KMSAN: use-after-free in obj_malloc+0x6cc/0x7b0

  [   50.165956][ T2974] Uninit was stored to memory at:
  [   50.170819][ T2974]  obj_malloc+0x70a/0x7b0

  [   50.328931][ T2974] Uninit was created at:
  [   50.341845][ T2974]  free_unref_page_prepare+0x130/0xfc0

We need to use an instrumented version when KMSAN is enabled.
Let arch/x86/include/asm/page_64.h implement copy_page() using memcpy()
like arch/x86/include/asm/page_32.h does.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 arch/x86/include/asm/page_64.h | 9 +++++++++
 1 file changed, 9 insertions(+)

Changes in v2:

  Update the explanation, because I misinterpreted the source/destination direction.

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cc6b8e087192..f13bba3a9dab 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -58,7 +58,16 @@ static inline void clear_page(void *page)
 			   : "cc", "memory", "rax", "rcx");
 }
 
+#ifdef CONFIG_KMSAN
+/* Use of non-instrumented assembly version confuses KMSAN. */
+void *memcpy(void *to, const void *from, __kernel_size_t len);
+static inline void copy_page(void *to, void *from)
+{
+	memcpy(to, from, PAGE_SIZE);
+}
+#else
 void copy_page(void *to, void *from);
+#endif
 
 #ifdef CONFIG_X86_5LEVEL
 /*
-- 
2.34.1




* Re: [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc()
  2024-02-23 15:22   ` Tetsuo Handa
  2024-02-23 23:54     ` [PATCH] x86: disable non-instrumented version of copy_page when KMSAN is enabled Tetsuo Handa
@ 2024-02-24 14:23     ` Sergey Senozhatsky
  1 sibling, 0 replies; 17+ messages in thread
From: Sergey Senozhatsky @ 2024-02-24 14:23 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sergey Senozhatsky, Alexander Potapenko, Johannes Weiner,
	Yosry Ahmed, Nhat Pham, Minchan Kim, linux-mm, kasan-dev,
	Mark-PK Tsai

On (24/02/24 00:22), Tetsuo Handa wrote:
> On 2024/02/23 13:43, Sergey Senozhatsky wrote:
> > On (24/02/23 11:10), Tetsuo Handa wrote:
> >>
> >> I can observe this bug during evict_folios() from 6.7.0 to 6.8.0-rc5-00163-gffd2cb6b718e.
> >> Since I haven't observed with 6.6.0, this bug might be introduced in 6.7 cycle.
> > 
> > Can we please run a bisect?
> 
> Bisection pointed at commit afb2d666d025 ("zsmalloc: use copy_page for full page copy").
> copy_page() is implemented as non-instrumented code, which KMSAN cannot track.
> On x86_64, copy_page() is defined in arch/x86/lib/copy_page_64.S as below.

Thank you so much.



end of thread, other threads:[~2024-02-24 14:23 UTC | newest]

Thread overview: 17+ messages
2024-02-23  2:10 [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Tetsuo Handa
2024-02-23  2:27 ` Yosry Ahmed
2024-02-23  4:48   ` Sergey Senozhatsky
2024-02-23  4:50     ` Yosry Ahmed
2024-02-23  4:56       ` Sergey Senozhatsky
2024-02-23  4:58         ` Sergey Senozhatsky
2024-02-23  5:05           ` Yosry Ahmed
2024-02-23  5:19             ` Sergey Senozhatsky
2024-02-23  5:23     ` Chengming Zhou
2024-02-23  5:29       ` Sergey Senozhatsky
2024-02-23  9:26       ` Tetsuo Handa
2024-02-23 10:10         ` Chengming Zhou
2024-02-23  4:43 ` Sergey Senozhatsky
2024-02-23 15:22   ` Tetsuo Handa
2024-02-23 23:54     ` [PATCH] x86: disable non-instrumented version of copy_page when KMSAN is enabled Tetsuo Handa
2024-02-24  6:27       ` [PATCH v2] " Tetsuo Handa
2024-02-24 14:23     ` [mm/page_alloc or mm/vmscan or mm/zswap] use-after-free in obj_malloc() Sergey Senozhatsky
