From: Alan Huang <mmpgouride@gmail.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
Vlastimil Babka <vbabka@suse.cz>,
linux-bcachefs@vger.kernel.org,
syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com,
linux-mm@kvack.org, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux.com>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH] bcachefs: Use alloc_percpu_gfp to avoid deadlock
Date: Sat, 22 Feb 2025 03:44:33 +0800
Message-ID: <92074FA7-F37E-49E7-810B-55ACD187EC0F@gmail.com>
In-Reply-To: <Z7fpEy1fdkEtvIw2@snowbird>
On Feb 21, 2025, at 10:46, Dennis Zhou <dennis@kernel.org> wrote:
>
> Hello,
>
> On Thu, Feb 20, 2025 at 03:37:26PM -0500, Kent Overstreet wrote:
>> On Thu, Feb 20, 2025 at 06:16:43PM +0100, Vlastimil Babka wrote:
>>> On 2/20/25 11:57, Alan Huang wrote:
>>>> Ping
>>>>
>>>>> On Feb 12, 2025, at 22:27, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>>>>>
>>>>> Adding pcpu people to the CC
>>>>>
>>>>> On Wed, Feb 12, 2025 at 06:06:25PM +0800, Alan Huang wrote:
>>>>>> The cycle:
>>>>>>
>>>>>> CPU0:                 CPU1:
>>>>>>   bc->lock              pcpu_alloc_mutex
>>>>>>   pcpu_alloc_mutex      bc->lock
>>>>>>
>>>>>> Reported-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>>>>> Tested-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>>>>> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>>>>>
>>>>> So pcpu_alloc_mutex -> fs_reclaim?
>>>>>
>>>>> That's really awkward; seems like something that might invite more
>>>>> issues. We can apply your fix if we need to, but I want to hear what the
>>>>> percpu people have to say first.
>>>>>
>>>>> ======================================================
>>>>> WARNING: possible circular locking dependency detected
>>>>> 6.14.0-rc2-syzkaller-00039-g09fbf3d50205 #0 Not tainted
>>>>> ------------------------------------------------------
>>>>> syz.0.21/5625 is trying to acquire lock:
>>>>> ffffffff8ea19608 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x293/0x1760 mm/percpu.c:1782
>>>>>
>>>>> but task is already holding lock:
>>>>> ffff888051401c68 (&bc->lock){+.+.}-{4:4}, at: bch2_btree_node_mem_alloc+0x559/0x16f0 fs/bcachefs/btree_cache.c:804
>>>>>
>>>>> which lock already depends on the new lock.
>>>>>
>>>>>
>>>>> the existing dependency chain (in reverse order) is:
>>>>>
>>>>> -> #2 (&bc->lock){+.+.}-{4:4}:
>>>>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>>>>> __mutex_lock_common kernel/locking/mutex.c:585 [inline]
>>>>> __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
>>>>> bch2_btree_cache_scan+0x184/0xec0 fs/bcachefs/btree_cache.c:482
>>>>> do_shrink_slab+0x72d/0x1160 mm/shrinker.c:437
>>>>> shrink_slab+0x1093/0x14d0 mm/shrinker.c:664
>>>>> shrink_one+0x43b/0x850 mm/vmscan.c:4868
>>>>> shrink_many mm/vmscan.c:4929 [inline]
>>>>> lru_gen_shrink_node mm/vmscan.c:5007 [inline]
>>>>> shrink_node+0x37c5/0x3e50 mm/vmscan.c:5978
>>>>> kswapd_shrink_node mm/vmscan.c:6807 [inline]
>>>>> balance_pgdat mm/vmscan.c:6999 [inline]
>>>>> kswapd+0x20f3/0x3b10 mm/vmscan.c:7264
>>>>> kthread+0x7a9/0x920 kernel/kthread.c:464
>>>>> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>>>>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>>>>
>>>>> -> #1 (fs_reclaim){+.+.}-{0:0}:
>>>>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>>>>> __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
>>>>> fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3867
>>>>> might_alloc include/linux/sched/mm.h:318 [inline]
>>>>> slab_pre_alloc_hook mm/slub.c:4066 [inline]
>>>>> slab_alloc_node mm/slub.c:4144 [inline]
>>>>> __do_kmalloc_node mm/slub.c:4293 [inline]
>>>>> __kmalloc_noprof+0xae/0x4c0 mm/slub.c:4306
>>>>> kmalloc_noprof include/linux/slab.h:905 [inline]
>>>>> kzalloc_noprof include/linux/slab.h:1037 [inline]
>>>>> pcpu_mem_zalloc mm/percpu.c:510 [inline]
>>>>> pcpu_alloc_chunk mm/percpu.c:1430 [inline]
>>>>> pcpu_create_chunk+0x57/0xbc0 mm/percpu-vm.c:338
>>>>> pcpu_balance_populated mm/percpu.c:2063 [inline]
>>>>> pcpu_balance_workfn+0xc4d/0xd40 mm/percpu.c:2200
>>>>> process_one_work kernel/workqueue.c:3236 [inline]
>>>>> process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
>>>>> worker_thread+0x870/0xd30 kernel/workqueue.c:3398
>>>>> kthread+0x7a9/0x920 kernel/kthread.c:464
>>>>> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>>>>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>>
>>> Seeing this as part of the chain (fs reclaim from a worker doing
>>> pcpu_balance_workfn) makes me think Michal's patch could be a fix to this:
>>>
>>> https://lore.kernel.org/all/20250206122633.167896-1-mhocko@kernel.org/
>>
>> Thanks for the link - that does look like just the thing.
>
> Sorry, I missed the first email asking me to weigh in.
>
> Michal's problem is a little different from what's happening here.
> He's having an issue where an alloc_percpu_gfp(NOFS/NOIO) is considered
> atomic and is failing during probing. This is because we don't have enough
> percpu memory backed to fulfill the "atomic" requests.
>
> Historically we've considered any allocation that's not GFP_KERNEL to be
> atomic. Here it seems like the alloc_percpu() behind bc->lock
> should have been an "atomic" allocation to prevent the lock cycle?
I think so. If I understand it correctly, NOFS/NOIO could still invoke the shrinker, so we
could end up taking bc->lock again. And I think we should not rely on the implementation of
alloc_percpu_gfp, but on the GFP flags instead.
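
(For reference, what I mean by "the implementation of alloc_percpu_gfp" is roughly this
check on the pcpu_alloc path in mm/percpu.c -- paraphrased from memory, so the exact code
may differ slightly:

	is_atomic = (gfp & GFP_KERNEL) != GFP_KERNEL;
	...
	if (!is_atomic)
		mutex_lock(&pcpu_alloc_mutex);	/* only non-atomic requests take the mutex */

i.e. anything that is not plain GFP_KERNEL skips the mutex today.)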
Correct me if I'm wrong.
> Thanks,
> Dennis
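
To make the cycle and the proposed change concrete, here is a rough sketch of the pattern
as I understand it. This is not the actual bch2_btree_node_mem_alloc() code; the helper
name, the u64 counters, and the GFP_NOWAIT mask are only illustrative:

static u64 __percpu *alloc_counters_under_bc_lock(struct btree_cache *bc)
{
	u64 __percpu *counters;

	mutex_lock(&bc->lock);

	/*
	 * counters = alloc_percpu(u64);
	 *
	 * alloc_percpu() implies GFP_KERNEL, so pcpu_alloc() takes
	 * pcpu_alloc_mutex.  pcpu_balance_workfn() holds that mutex while
	 * doing GFP_KERNEL allocations, which can enter fs reclaim ->
	 * bch2_btree_cache_scan() -> bc->lock: the cycle lockdep reports
	 * above.
	 */

	/*
	 * A mask that is not GFP_KERNEL is treated as "atomic" and served
	 * from pre-populated chunks without taking pcpu_alloc_mutex, which
	 * breaks the cycle -- but that is the implementation detail; the
	 * GFP flags themselves should be what guarantees we cannot recurse
	 * into the shrinker.
	 */
	counters = alloc_percpu_gfp(u64, GFP_NOWAIT);

	mutex_unlock(&bc->lock);
	return counters;
}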
Thread overview: 9+ messages
[not found] <20250212100625.55860-1-mmpgouride@gmail.com>
2025-02-12 14:27 ` Kent Overstreet
2025-02-20 10:57 ` Alan Huang
2025-02-20 12:40 ` Kent Overstreet
2025-02-20 12:44 ` Alan Huang
2025-02-20 17:16 ` Vlastimil Babka
2025-02-20 20:37 ` Kent Overstreet
2025-02-21 2:46 ` Dennis Zhou
2025-02-21 7:21 ` Vlastimil Babka
2025-02-21 19:44 ` Alan Huang [this message]