From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH] bcachefs: Use alloc_percpu_gfp to avoid deadlock
From: Alan Huang <mmpgouride@gmail.com>
Date: Sat, 22 Feb 2025 03:44:33 +0800
To: Dennis Zhou
Cc: Kent Overstreet, Vlastimil Babka, linux-bcachefs@vger.kernel.org,
 syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com, linux-mm@kvack.org,
 Tejun Heo, Christoph Lameter, Michal Hocko
Message-Id: <92074FA7-F37E-49E7-810B-55ACD187EC0F@gmail.com>
References: <20250212100625.55860-1-mmpgouride@gmail.com>
 <25FBAAE5-8BC6-41F3-9A6D-65911BA5A5D7@gmail.com>
 <78d954b5-e33f-4bbc-855b-e91e96278bef@suse.cz>

On Feb 21, 2025, at 10:46, Dennis Zhou wrote:
> 
> Hello,
> 
> On Thu, Feb 20, 2025 at 03:37:26PM -0500, Kent Overstreet wrote:
>> On Thu, Feb 20, 2025 at 06:16:43PM +0100, Vlastimil Babka wrote:
>>> On 2/20/25 11:57, Alan Huang wrote:
>>>> Ping
>>>> 
>>>>> On Feb 12, 2025, at 22:27, Kent Overstreet wrote:
>>>>> 
>>>>> Adding pcpu people to the CC
>>>>> 
>>>>> On Wed, Feb 12, 2025 at 06:06:25PM +0800, Alan Huang wrote:
>>>>>> The cycle:
>>>>>> 
>>>>>> CPU0:               CPU1:
>>>>>> bc->lock            pcpu_alloc_mutex
>>>>>> pcpu_alloc_mutex    bc->lock
>>>>>> 
>>>>>> Reported-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>>>>> Tested-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>>>>> Signed-off-by: Alan Huang
>>>>> 
>>>>> So pcpu_alloc_mutex -> fs_reclaim?
>>>>> 
>>>>> That's really awkward; seems like something that might invite more
>>>>> issues. We can apply your fix if we need to, but I want to hear what the
>>>>> percpu people have to say first.
>>>>> 
>>>>> ======================================================
>>>>> WARNING: possible circular locking dependency detected
>>>>> 6.14.0-rc2-syzkaller-00039-g09fbf3d50205 #0 Not tainted
>>>>> ------------------------------------------------------
>>>>> syz.0.21/5625 is trying to acquire lock:
>>>>> ffffffff8ea19608 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x293/0x1760 mm/percpu.c:1782
>>>>> 
>>>>> but task is already holding lock:
>>>>> ffff888051401c68 (&bc->lock){+.+.}-{4:4}, at: bch2_btree_node_mem_alloc+0x559/0x16f0 fs/bcachefs/btree_cache.c:804
>>>>> 
>>>>> which lock already depends on the new lock.
>>>>> 
>>>>> 
>>>>> the existing dependency chain (in reverse order) is:
>>>>> 
>>>>> -> #2 (&bc->lock){+.+.}-{4:4}:
>>>>>        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>>>>>        __mutex_lock_common kernel/locking/mutex.c:585 [inline]
>>>>>        __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
>>>>>        bch2_btree_cache_scan+0x184/0xec0 fs/bcachefs/btree_cache.c:482
>>>>>        do_shrink_slab+0x72d/0x1160 mm/shrinker.c:437
>>>>>        shrink_slab+0x1093/0x14d0 mm/shrinker.c:664
>>>>>        shrink_one+0x43b/0x850 mm/vmscan.c:4868
>>>>>        shrink_many mm/vmscan.c:4929 [inline]
>>>>>        lru_gen_shrink_node mm/vmscan.c:5007 [inline]
>>>>>        shrink_node+0x37c5/0x3e50 mm/vmscan.c:5978
>>>>>        kswapd_shrink_node mm/vmscan.c:6807 [inline]
>>>>>        balance_pgdat mm/vmscan.c:6999 [inline]
>>>>>        kswapd+0x20f3/0x3b10 mm/vmscan.c:7264
>>>>>        kthread+0x7a9/0x920 kernel/kthread.c:464
>>>>>        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>>>>>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>>>> 
>>>>> -> #1 (fs_reclaim){+.+.}-{0:0}:
>>>>>        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>>>>>        __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
>>>>>        fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3867
>>>>>        might_alloc include/linux/sched/mm.h:318 [inline]
>>>>>        slab_pre_alloc_hook mm/slub.c:4066 [inline]
>>>>>        slab_alloc_node mm/slub.c:4144 [inline]
>>>>>        __do_kmalloc_node mm/slub.c:4293 [inline]
>>>>>        __kmalloc_noprof+0xae/0x4c0 mm/slub.c:4306
>>>>>        kmalloc_noprof include/linux/slab.h:905 [inline]
>>>>>        kzalloc_noprof include/linux/slab.h:1037 [inline]
>>>>>        pcpu_mem_zalloc mm/percpu.c:510 [inline]
>>>>>        pcpu_alloc_chunk mm/percpu.c:1430 [inline]
>>>>>        pcpu_create_chunk+0x57/0xbc0 mm/percpu-vm.c:338
>>>>>        pcpu_balance_populated mm/percpu.c:2063 [inline]
>>>>>        pcpu_balance_workfn+0xc4d/0xd40 mm/percpu.c:2200
>>>>>        process_one_work kernel/workqueue.c:3236 [inline]
>>>>>        process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
>>>>>        worker_thread+0x870/0xd30 kernel/workqueue.c:3398
>>>>>        kthread+0x7a9/0x920 kernel/kthread.c:464
>>>>>        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>>>>>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>> 
>>> Seeing this as part of the chain (fs reclaim from a worker doing
>>> pcpu_balance_workfn) makes me think Michal's patch could be a fix to this:
>>> 
>>> https://lore.kernel.org/all/20250206122633.167896-1-mhocko@kernel.org/
>> 
>> Thanks for the link - that does look like just the thing.
> 
> Sorry I missed the first email asking to weigh in.
> 
> Michal's problem is a little bit different than what's happening here.
> He's having an issue where an alloc_percpu_gfp(NOFS/NOIO) is considered
> atomic and failing during probing.
> This is because we don't have enough
> percpu memory backed to fulfill the "atomic" requests.
> 
> Historically we've considered any allocation that's not GFP_KERNEL to be
> atomic. Here it seems like the alloc_percpu() behind the bc->lock
> should have been an "atomic" allocation to prevent the lock cycle?
> 
> Thanks,
> Dennis

I think so. If I understand it correctly, NOFS/NOIO allocations can still
invoke the shrinker, so we can end up taking bc->lock again. And I think we
should not rely on the implementation of alloc_percpu_gfp(), but on the GFP
flags instead. Correct me if I'm wrong.
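
To make that concrete, here is a minimal sketch of the shape such a fix
could take. All names below (btree_node_alloc_pcpu, struct my_stats,
bc_lock, statsp) are invented for illustration; this is not the actual
bcachefs patch. The only point is that a percpu allocation made under
bc->lock must carry no __GFP_DIRECT_RECLAIM, so it can never recurse into
reclaim and the btree-cache shrinker:

#include <linux/gfp.h>
#include <linux/mutex.h>
#include <linux/percpu.h>
#include <linux/types.h>

struct my_stats { u64 hits, misses; };

static int btree_node_alloc_pcpu(struct mutex *bc_lock,
				 struct my_stats __percpu **statsp)
{
	mutex_lock(bc_lock);

	/*
	 * GFP_NOWAIT carries no __GFP_DIRECT_RECLAIM, so percpu treats
	 * the request as "atomic": it is served from already-populated
	 * chunks and cannot enter fs reclaim -> shrinker -> bc_lock.
	 * The cost is that it can fail when no backed percpu memory is
	 * left, so the caller must handle -ENOMEM.
	 */
	*statsp = alloc_percpu_gfp(struct my_stats, GFP_NOWAIT);
	if (!*statsp) {
		mutex_unlock(bc_lock);
		return -ENOMEM;
	}

	mutex_unlock(bc_lock);
	return 0;
}

With a GFP_KERNEL allocation in the same spot, the lockdep report above is
exactly what you'd expect: bc->lock -> pcpu_alloc_mutex on one side,
pcpu_alloc_mutex -> fs_reclaim -> bc->lock on the other.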