linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Sasha Levin <sashal@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	"Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	RCU <rcu@vger.kernel.org>
Subject: Re: [GIT PULL] slab updates for 6.14
Date: Mon, 20 Jan 2025 21:35:38 +0100
Message-ID: <0e2d61be-39a4-4837-81ac-bd8b2fb7a37f@suse.cz>
In-Reply-To: <Z46N0lAqUwQ3Z5S6@lappy>

On 1/20/25 18:54, Sasha Levin wrote:
> On Fri, Jan 17, 2025 at 03:13:18PM +0100, Vlastimil Babka wrote:
>>Hi Linus,
>>
>>please pull the latest slab updates from:
>>
>>  git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git tags/slab-for-6.14
> 
> Hi Vlastimil,

Hi,

> I've ended up pulling quite a few of the 6.14 PRs into linus-next, and
> LKFT started hitting the following issue:
> 
> <1>[  526.258666] Unable to handle kernel paging request at virtual address 00000007f5b55088
> <1>[  526.260217] Mem abort info:
> <1>[  526.260902]   ESR = 0x0000000096000005
> <1>[  526.261422]   EC = 0x25: DABT (current EL), IL = 32 bits
> <1>[  526.262197]   SET = 0, FnV = 0
> <1>[  526.262684]   EA = 0, S1PTW = 0
> <1>[  526.263370]   FSC = 0x05: level 1 translation fault
> <1>[  526.267546] Data abort info:
> <1>[  526.268047]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> <1>[  526.268688]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> <1>[  526.269601]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> <1>[  526.270143] user pgtable: 64k pages, 52-bit VAs, pgdp=0000000103f42000
> <1>[  526.279321] [00000007f5b55088] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
> <0>[  526.284271] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> <4>[  526.285819] Modules linked in: tun sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
> <4>[  526.288412] CPU: 0 UID: 0 PID: 5334 Comm: read_all Not tainted 6.13.0 #1
> <4>[  526.290169] Hardware name: linux,dummy-virt (DT)
> <4>[  526.291025] pstate: a3402009 (NzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> <4>[  526.291607] pc : kfree+0x60/0x350
> <4>[  526.292404] lr : show_slab_objects+0x31c/0x438
> <4>[  526.292796] sp : ffff8000882efb40
> <4>[  526.293761] x29: ffff8000882efb50 x28: 0000000000000000 x27: ffffa0d6d542a8f0
> <4>[  526.295127] x26: fff00000c0000b40 x25: ffffa0d6d5379000 x24: 0000000000000001
> <4>[  526.296745] x23: ffffa0d6d5379d40 x22: ffffa0d6d4aba898 x21: 6cefa0d6d2dab76c
> <4>[  526.297465] x20: ffffa0d6d542a8f0 x19: 00000007f5b55080 x18: ffffffffffffffff
> <4>[  526.299121] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000882ef9e0
> <4>[  526.300133] x14: fff00000c0540000 x13: fff00000c0530000 x12: 0000000000000000
> <4>[  526.301642] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa0d6d2dab76c
> <4>[  526.302368] x8 : fff00000ff060000 x7 : ffff8000882efba0 x6 : ffff8000882efba0
> <4>[  526.303575] x5 : 00000000ffffffd8 x4 : fff00000ff060608 x3 : 0000000000000000
> <4>[  526.304506] x2 : 0000000000000000 x1 : fff00000ff060000 x0 : fffffc1fc0000000
> <4>[  526.305547] Call trace:
> <4>[  526.306310]  kfree+0x60/0x350 (P)
> <4>[  526.307446]  show_slab_objects+0x31c/0x438

This typically indicates a corrupted slab freelist caused by a double free or
use-after-free. Here it only indirectly hits the slab code that handles sysfs
cache stats reporting (probably for some other cache that's fine), because
that function does a kmalloc/kfree and happens to use the corrupted slab.
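
To illustrate why an innocent kmalloc/kfree user ends up being the one that
crashes, here is a userspace toy sketch, not the actual SLUB code. It assumes
a LIFO freelist whose "next" pointer is stored inside the free object itself;
all toy_* names and sizes are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OBJS 4

/* Hypothetical toy object: "next" is only meaningful while the object is free. */
union toy_obj {
	union toy_obj *next;
	unsigned char payload[32];
};

struct toy_cache {
	union toy_obj objs[TOY_OBJS];
	union toy_obj *freelist;
};

static void toy_init(struct toy_cache *c)
{
	c->freelist = NULL;
	for (int i = TOY_OBJS - 1; i >= 0; i--) {
		c->objs[i].next = c->freelist;
		c->freelist = &c->objs[i];
	}
}

static void *toy_alloc(struct toy_cache *c)
{
	union toy_obj *o = c->freelist;

	if (o)
		c->freelist = o->next;
	return o;
}

static void toy_free(struct toy_cache *c, void *p)
{
	union toy_obj *o = p;

	assert(p != NULL);
	/* No sanity check here: a second free of the same object links it to itself. */
	o->next = c->freelist;
	c->freelist = o;
}

/*
 * Free one object "frees" times, then let two unrelated callers allocate.
 * Returns 1 if both callers were handed the same memory.
 */
static int toy_victims_alias(int frees)
{
	struct toy_cache c;
	void *buggy, *victim1, *victim2;

	toy_init(&c);
	buggy = toy_alloc(&c);
	for (int i = 0; i < frees; i++)
		toy_free(&c, buggy);

	victim1 = toy_alloc(&c);
	victim2 = toy_alloc(&c);
	return victim1 == victim2;
}
```

In this toy, toy_victims_alias(1) leaves the two later callers with distinct
objects, while toy_victims_alias(2) hands both the same one. The code that
misbehaves afterwards is then the victim, not the culprit.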

> <4>[  526.307948]  total_objects_show+0x1c/0x30
> <4>[  526.308514]  slab_attr_show+0x28/0x48
> <4>[  526.308812]  sysfs_kf_seq_show+0x9c/0x148
> <4>[  526.309901]  kernfs_seq_show+0x34/0x48
> <4>[  526.310922]  seq_read_iter+0xe4/0x460
> <4>[  526.311704]  kernfs_fop_read_iter+0x148/0x1c0
> <4>[  526.312903]  vfs_read+0x280/0x330
> <4>[  526.314276]  ksys_read+0x78/0x118
> <4>[  526.316078]  __arm64_sys_read+0x24/0x38
> <4>[  526.316651]  invoke_syscall.constprop.0+0x58/0xf8
> <4>[  526.317315]  do_el0_svc+0x48/0xd8
> <4>[  526.317811]  el0_svc+0x40/0x160
> <4>[  526.319521]  el0t_64_sync_handler+0x10c/0x138
> <4>[  526.320220]  el0t_64_sync+0x198/0x1a0
> <0>[  526.321602] Code: b26287e0 d350fe73 f2df83e0 8b131813 (f9400660)
> <4>[  526.322715] ---[ end trace 0000000000000000 ]---
> <4>[  536.656232] ------------[ cut here ]------------
> <4>[  536.656871] Trying to vfree() bad address (00000000a5fbfd52)
> <4>[  536.658605] WARNING: CPU: 1 PID: 31 at mm/vmalloc.c:3231 remove_vm_area+0x68/0x90

Perhaps this implicates some vmalloc changes, or it's also a victim of
somebody else causing the corruption. It uses kmalloc/kfree too, so it could
even be due to corruption of the same slab as above.

I think the slab PR itself has nothing that could affect this; it's mostly
just a code move from RCU to SLAB.

Bisecting might indeed work best, or you could try KASAN or at least booting
with slub_debug to catch whoever is misbehaving.
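
As a rough idea of how poisoning-style debugging can catch the offender
earlier, here is a userspace toy sketch. It is not the kernel implementation:
only the byte values 0x6b and 0xa5 match the kernel's POISON_FREE and
POISON_END, and the toy_* helpers are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_POISON_FREE 0x6b	/* same value as the kernel's POISON_FREE */
#define TOY_POISON_END  0xa5	/* same value as the kernel's POISON_END */

/* Fill a freed object with the poison pattern. */
static void toy_poison(unsigned char *obj, size_t size)
{
	assert(size > 0);
	memset(obj, TOY_POISON_FREE, size - 1);
	obj[size - 1] = TOY_POISON_END;
}

/* Return the offset of the first overwritten byte, or -1 if the poison is intact. */
static long toy_check_poison(const unsigned char *obj, size_t size)
{
	for (size_t i = 0; i + 1 < size; i++)
		if (obj[i] != TOY_POISON_FREE)
			return (long)i;
	return obj[size - 1] == TOY_POISON_END ? -1 : (long)(size - 1);
}

/*
 * "Free" an object, optionally write through a stale pointer, then check
 * the poison the way an allocator could before handing the object out again.
 */
static long toy_uaf_offset(int write_after_free)
{
	unsigned char obj[32];

	toy_poison(obj, sizeof(obj));
	if (write_after_free)
		obj[8] = 0;	/* use-after-free write */
	return toy_check_poison(obj, sizeof(obj));
}
```

With the stale write the check reports offset 8, and without it -1. The point
is that the corruption gets flagged on the damaged object itself, rather than
showing up later as an oops in an unrelated kfree() caller.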

> <4>[  536.660181] Modules linked in: tun sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
> <4>[  536.662159] CPU: 1 UID: 0 PID: 31 Comm: kworker/1:1 Tainted: G      D            6.13.0 #1
> <4>[  536.663261] Tainted: [D]=DIE
> <4>[  536.664160] Hardware name: linux,dummy-virt (DT)
> <4>[  536.665493] Workqueue: events delayed_vfree_work
> <4>[  536.666020] pstate: 62402009 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> <4>[  536.667232] pc : remove_vm_area+0x68/0x90
> <4>[  536.667917] lr : remove_vm_area+0x68/0x90
> <4>[  536.668448] sp : ffff80008092fc90
> <4>[  536.668759] x29: ffff80008092fc90 x28: 0000000000000000 x27: 0000000000000000
> <4>[  536.670302] x26: 0000000000000000 x25: 0000000000000000 x24: fff00000c02f0205
> <4>[  536.671148] x23: fff00000fdaf3180 x22: 0000000000000000 x21: 0000000000000000
> <4>[  536.672118] x20: ffffa0d6d542a8f0 x19: ffffa0d6d542a8f0 x18: 0000000000000006
> <4>[  536.673098] x17: fff05f2a288b0000 x16: ffff800080020000 x15: ffff80008092f6c0
> <4>[  536.673718] x14: ffff80010092f87a x13: ffff80008092f882 x12: 0000000000000000
> <4>[  536.674718] x11: fffffffffffe0000 x10: ffffa0d6d53f82b0 x9 : ffffa0d6d2b4c7c4
> <4>[  536.675772] x8 : 00000000ffffefff x7 : ffffa0d6d53f82b0 x6 : 80000000fffff000
> <4>[  536.676801] x5 : 0000000000000181 x4 : 0000000000000000 x3 : 0000000000000000
> <4>[  536.677609] x2 : 0000000000000000 x1 : 0000000000000000 x0 : fff00000c0842600
> <4>[  536.679075] Call trace:
> <4>[  536.679415]  remove_vm_area+0x68/0x90 (P)
> <4>[  536.679795]  vfree+0x44/0x338
> <4>[  536.680241]  kvfree+0x2c/0x60
> <4>[  536.681397]  vfree+0x134/0x338
> <4>[  536.681989]  delayed_vfree_work+0x44/0x60
> <4>[  536.682344]  process_one_work+0x158/0x3c0
> <4>[  536.683428]  worker_thread+0x2d8/0x3e8
> <4>[  536.684162]  kthread+0x120/0x208
> <4>[  536.684778]  ret_from_fork+0x10/0x20
> <4>[  536.685509] ---[ end trace 0000000000000000 ]---
> 
> I'm working on bisecting, but sending this mail out in hopes that we can
> figure it out from the logs. The full logs are at: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-1168-g45696205640c/testrun/26824158/suite/log-parser-test/test/bug-bug-bad-rss-counter-state-mmeadba-typemm_anonpages-val/log
> 



