From: Kairui Song
Date: Tue, 18 Feb 2025 01:12:04 +0800
Subject: Re: [syzbot] [mm?] [bcachefs?] WARNING in lock_list_lru_of_memcg
To: Andrew Morton, kent.overstreet@linux.dev
Cc: syzbot, chengming.zhou@linux.dev, hannes@cmpxchg.org,
 linux-bcachefs@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, mhocko@suse.com, muchun.song@linux.dev,
 roman.gushchin@linux.dev, sashal@kernel.org, shakeel.butt@linux.dev,
 syzkaller-bugs@googlegroups.com, willy@infradead.org, yuzhao@google.com,
 zhengqi.arch@bytedance.com
References: <675d01e9.050a0220.37aaf.00be.GAE@google.com>
 <67af8747.050a0220.21dd3.004c.GAE@google.com>
 <20250214152358.7ba29d10229e2155c0899774@linux-foundation.org>

On Mon, Feb 17, 2025 at 12:13 AM Kairui Song wrote:
>
> On Sat, Feb 15, 2025 at 7:24 AM Andrew Morton wrote:
> >
> > On Fri, 14 Feb 2025 10:11:19 -0800 syzbot wrote:
> >
> > > syzbot has found a reproducer for the following issue on:
> >
> > Thanks. I doubt if bcachefs is implicated in this?
> >
> > > HEAD commit:    128c8f96eb86 Merge tag 'drm-fixes-2025-02-14' of https://g..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=148019a4580000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=c776e555cfbdb82d
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
> > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12328bf8580000
> > >
> > > Downloadable assets:
> > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-128c8f96.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/a97f78ac821e/vmlinux-128c8f96.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/f451cf16fc9f/bzImage-128c8f96.xz
> > > mounted in repro: https://storage.googleapis.com/syzbot-assets/a7da783f97cf/mount_3.gz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com
> > >
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 0 PID: 5459 at mm/list_lru.c:96 lock_list_lru_of_memcg+0x39e/0x4d0 mm/list_lru.c:96
> >
> > VM_WARN_ON(!css_is_dying(&memcg->css));
>
> I'm checking this, when last time this was triggered, it was caused by
> a list_lru user did not initialize the memcg list_lru properly before
> list_lru reclaim started, and fixed by:
> https://lore.kernel.org/all/20241222122936.67501-1-ryncsn@gmail.com/T/
>
> This shouldn't be a big issue, maybe there are leaks that will be
> fixed upon reparenting, and this new added sanity check might be too
> lenient, I'm not 100% sure though.
>
> Unfortunately I couldn't reproduce the issue locally with the
> reproducer yet. will keep the test running and see if it can hit this
> WARN_ON.

So far I am still unable to trigger this VM_WARN_ON using the
reproducer, and I'm seeing many other random crashes.
But after I changed the .config a bit to add more debug options
(SLAB_FREELIST_HARDENED, DEBUG_PAGEALLOC), the following crash showed
up, and it triggers immediately after I start the test:

[ T1242] BUG: unable to handle page fault for address: ffff888054c60000
[ T1242] #PF: supervisor read access in kernel mode
[ T1242] #PF: error_code(0x0000) - not-present page
[ T1242] PGD 19e01067 P4D 19e01067 PUD 19e04067 PMD 7fc5c067 PTE 800fffffab39f060
[ T1242] Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI
[ T1242] CPU: 1 UID: 0 PID: 1242 Comm: kworker/1:1H Not tainted 6.14.0-rc2-00185-g128c8f96eb86 #2
[ T1242] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module+el8.8.0+664+0a3d6c83 04/01/2014
[ T1242] Workqueue: bcachefs_btree_read_complete btree_node_read_work
[ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
[ T6058] bcachefs (loop2): empty btree root xattrs
[ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89 ff 48 a5 48 8b bc 24 c8 00 00 08
[ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
[ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
[ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
[ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
[ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
[ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
[ T1242] FS:  0000000000000000(0000) GS:ffff88807ea00000(0000) knlGS:0000000000000000
[ T1242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
[ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ T1242] Call Trace:
[ T1242]
[ T1242]  bch2_btree_node_read_done+0x1d20/0x53a0
[ T1242]  btree_node_read_work+0x54d/0xdc0
[ T1242]  process_scheduled_works+0xaf8/0x17f0
[ T1242]  worker_thread+0x89d/0xd60
[ T1242]  kthread+0x722/0x890
[ T1242]  ret_from_fork+0x4e/0x80
[ T1242]  ret_from_fork_asm+0x1a/0x30
[ T1242]
[ T1242] Modules linked in:
[ T1242] ---[ end trace 0000000000000000 ]---
[ T1242] RIP: 0010:validate_bset_keys+0xae3/0x14f0
[ T1242] Code: 49 39 df 0f 87 fc 09 00 00 e8 79 54 a8 fd 41 0f b7 c6 48 8b 4c 24 68 48 8d 04 c1 4c 29 f8 48 c1 e8 03 89 c1 48 89 de 4c 89 ff 48 a5 48 8b bc 24 c8 00 00 08
[ T1242] RSP: 0018:ffffc900070a72c0 EFLAGS: 00010206
[ T1242] RAX: 000000000000ec0f RBX: ffff888054c20110 RCX: 0000000000006c31
[ T1242] RDX: 0000000000000000 RSI: ffff888054c60000 RDI: ffff888054c5ff90
[ T1242] RBP: ffffc900070a7570 R08: ffff888065e001af R09: 1ffff1100cbc0035
[ T1242] R10: dffffc0000000000 R11: ffffed100cbc0036 R12: ffff888054c2009e
[ T1242] R13: dffffc0000000000 R14: 000000000000ec0f R15: ffff888054c200a0
[ T1242] FS:  0000000000000000(0000) GS:ffff88807ea00000(0000) knlGS:0000000000000000
[ T1242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ T1242] CR2: ffff888054c60000 CR3: 000000006cea6000 CR4: 00000000000006f0
[ T1242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ T1242] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ T1242] Kernel panic - not syncing: Fatal exception
[ T1242] Kernel Offset: disabled
[ T1242] Rebooting in 86400 seconds..

It's caused by the memmove_u64s_down in validate_bset_keys of
fs/bcachefs/btree_io.c:
-> memmove_u64s_down(k, bkey_p_next(k), (u64 *) vstruct_end(i) - (u64 *) k);

The source, bkey_p_next(k), is RSI: ffff888054c60000, which matches the
faulting address in CR2, so the read is out of bounds. The length,
(u64 *) vstruct_end(i) - (u64 *) k, shows up as RCX: 0000000000006c31
(in u64 units); added to RDI, this would run the write out of bounds
as well.

This seems to indicate there is an out-of-bounds memory modification?
And maybe it corrupted other subsystems?
The slight change to the .config changed the memory layout, so the
access now faults; previously it may have just gone on silently.

I don't know much about bcachefs, so I'd be grateful if the bcachefs
people could help have a look.
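
A minimal sketch of the address arithmetic above, in plain userspace C
rather than kernel code. It assumes the count passed to
memmove_u64s_down is in 8-byte (u64) units and that pages are 4 KiB;
the helper names (u64s_to_bytes, copy_end, pages_spanned) are
hypothetical and not part of bcachefs. The bytes at RIP look like a
string move, so RCX is probably the count still remaining at the time
of the fault rather than the original length, which would make these
results lower bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on a default x86-64 kernel config. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Bytes covered by a copy of `u64s` 64-bit words. */
static uint64_t u64s_to_bytes(uint64_t u64s)
{
	return u64s * sizeof(uint64_t);
}

/* One-past-the-end address of a copy of `u64s` words starting at `start`. */
static uint64_t copy_end(uint64_t start, uint64_t u64s)
{
	return start + u64s_to_bytes(u64s);
}

/* Number of pages the byte range [start, copy_end) overlaps. */
static uint64_t pages_spanned(uint64_t start, uint64_t u64s)
{
	assert(u64s > 0);
	uint64_t first = start / SKETCH_PAGE_SIZE;
	uint64_t last = (copy_end(start, u64s) - 1) / SKETCH_PAGE_SIZE;

	return last - first + 1;
}
```

With the values from the dump, u64s_to_bytes(0x6c31) is 0x36188 bytes
(about 216 KiB), RSI sits 0x70 bytes above RDI, and
copy_end(0xffff888054c5ff90, 0x6c31) is 0xffff888054c96118, well past
the page boundary at ffff888054c60000 where the read faulted.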