linux-mm.kvack.org archive mirror
* 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
@ 2026-02-23 22:36 Ben Greear
  2026-02-27 16:31 ` Ben Greear
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-02-23 22:36 UTC
  To: linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm

Hello,

I hit a deadlock in which a CMA memory allocation tries to flush work while the
caller holds a wifi-related mutex, and a workqueue is trying to process a wifi
regdomain work item at the same time.  I really don't see any good way to fix this:
it would seem that any code holding a mutex that could block a workqueue
cannot safely allocate CMA memory?  Hopefully someone else has a better idea.

For whatever reason, my hacked-up kernel prints the sysrq process stack traces I need
to understand this, and my stable 6.18.13 does not.  But the locks-held output matches in
both cases, so this is almost certainly the same problem.  I can reproduce it on both
unmodified stable and my own kernel.  The details below are from my modified 6.18.9+ kernel.

I only hit this (reliably?) with a KASAN-enabled kernel, likely because KASAN slows things
down enough to hit the problem and/or changes how CMA allocations happen.

The general way to reproduce is to have a large number of Intel BE200 radios in a system
and bring them admin up and down.


## From 6.18.13 (un-modified)

Feb 23 14:13:31 ct523c-de7c kernel: 5 locks held by kworker/u32:11/34989:
Feb 23 14:13:31 ct523c-de7c kernel:  #0: ffff888120161148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0
Feb 23 14:13:31 ct523c-de7c kernel:  #1: ffff8881a561fd20 ((work_completion)(&rdev->wiphy_work)){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0
Feb 23 14:13:31 ct523c-de7c kernel:  #2: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: cfg80211_wiphy_work+0x5c/0x570 [cfg80211]
Feb 23 14:13:31 ct523c-de7c kernel:  #3: ffffffff87232e60 (&cma->alloc_mutex){+.+.}-{4:4}, at: __cma_alloc+0x3c5/0xd20
Feb 23 14:13:31 ct523c-de7c kernel:  #4: ffffffff8534f668 (lock#5){+.+.}-{4:4}, at: __lru_add_drain_all+0x5f/0x530

Feb 23 14:13:31 ct523c-de7c kernel: 4 locks held by kworker/1:0/39480:
Feb 23 14:13:31 ct523c-de7c kernel:  #0: ffff88812006b148 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0
Feb 23 14:13:31 ct523c-de7c kernel:  #1: ffff88814087fd20 (reg_work){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0
Feb 23 14:13:31 ct523c-de7c kernel:  #2: ffffffff85970028 (rtnl_mutex){+.+.}-{4:4}, at: reg_todo+0x18/0x770 [cfg80211]
Feb 23 14:13:31 ct523c-de7c kernel:  #3: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: reg_process_self_managed_hints+0x70/0x190 [cfg80211]
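
One way to compare the two dumps above is to look for a lock instance address that
shows up in both.  The following is a minimal user-space sketch in C, not kernel code;
the names held_lock, parse_held_lock and shared_lock_name are made up for illustration,
and the parser only assumes the "#N: <address> (<name>){...}" shape visible in the
lines above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry of a "locks held by" dump (hypothetical helper type). */
struct held_lock {
    char addr[32];   /* lock instance address, as printed */
    char name[96];   /* lock class name, without the usage flags */
};

/* Parse one dump line; returns 1 on success, 0 if the line has another shape. */
static int parse_held_lock(const char *line, struct held_lock *out)
{
    const char *hash, *open, *close;
    size_t len;
    int idx;

    assert(line != NULL && out != NULL);
    hash = strchr(line, '#');
    if (!hash || sscanf(hash, "#%d: %31s", &idx, out->addr) != 2)
        return 0;

    /* The name sits between the first '(' and the ")" that precedes "{flags}". */
    open = strchr(hash, '(');
    close = open ? strstr(open, "){") : NULL;
    if (!close)
        return 0;

    len = (size_t)(close - open - 1);
    if (len >= sizeof(out->name))
        len = sizeof(out->name) - 1;
    memcpy(out->name, open + 1, len);
    out->name[len] = '\0';
    return 1;
}

/* Returns the class name of the first lock address found in both dumps, or NULL. */
const char *shared_lock_name(const char *const *a, size_t na,
                             const char *const *b, size_t nb)
{
    static struct held_lock la, lb;  /* static so the returned name stays valid */
    size_t i, j;

    for (i = 0; i < na; i++) {
        if (!parse_held_lock(a[i], &la))
            continue;
        for (j = 0; j < nb; j++)
            if (parse_held_lock(b[j], &lb) && strcmp(la.addr, lb.addr) == 0)
                return la.name;
    }
    return NULL;
}

/* The two dumps from the 6.18.13 log above, with the syslog prefix dropped. */
static const char *const dump_wiphy_work[] = {
    "#0: ffff888120161148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0",
    "#1: ffff8881a561fd20 ((work_completion)(&rdev->wiphy_work)){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0",
    "#2: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: cfg80211_wiphy_work+0x5c/0x570 [cfg80211]",
    "#3: ffffffff87232e60 (&cma->alloc_mutex){+.+.}-{4:4}, at: __cma_alloc+0x3c5/0xd20",
    "#4: ffffffff8534f668 (lock#5){+.+.}-{4:4}, at: __lru_add_drain_all+0x5f/0x530",
};

static const char *const dump_reg_work[] = {
    "#0: ffff88812006b148 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xf7a/0x17b0",
    "#1: ffff88814087fd20 (reg_work){+.+.}-{0:0}, at: process_one_work+0x7ca/0x17b0",
    "#2: ffffffff85970028 (rtnl_mutex){+.+.}-{4:4}, at: reg_todo+0x18/0x770 [cfg80211]",
    "#3: ffff88815e618788 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: reg_process_self_managed_hints+0x70/0x190 [cfg80211]",
};
```

Calling shared_lock_name(dump_wiphy_work, 5, dump_reg_work, 4) on these inputs picks out
ffff88815e618788, the &rdev->wiphy.mtx instance, which is the only address listed for
both workers.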


## Rest of this is from my 6.18.9+ hacks kernel.

### thread trying to allocate cma is blocked here, trying to flush work.

Type "apropos word" to search for commands related to "word"...
Reading symbols from vmlinux...
(gdb) l *(alloc_contig_range_noprof+0x1de)
0xffffffff8162453e is in alloc_contig_range_noprof (/home2/greearb/git/linux-6.18.dev.y/mm/page_alloc.c:6798).
6793			.reason = MR_CONTIG_RANGE,
6794		};
6795	
6796		lru_cache_disable();
6797	
6798		while (pfn < end || !list_empty(&cc->migratepages)) {
6799			if (fatal_signal_pending(current)) {
6800				ret = -EINTR;
6801				break;
6802			}
(gdb) l *(__lru_add_drain_all+0x19b)
0xffffffff815ae44b is in __lru_add_drain_all (/home2/greearb/git/linux-6.18.dev.y/mm/swap.c:884).
879				queue_work_on(cpu, mm_percpu_wq, work);
880				__cpumask_set_cpu(cpu, &has_work);
881			}
882		}
883	
884		for_each_cpu(cpu, &has_work)
885			flush_work(&per_cpu(lru_add_drain_work, cpu));
886	
887	done:
888		mutex_unlock(&lock);
(gdb)
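
The "l *(symbol+offset)" lookups above take their arguments from stack-trace frames of
the form "symbol+0xOFFSET/0xSIZE [module]".  As a minimal sketch (user-space C with the
hypothetical names trace_frame and parse_trace_frame, not part of any kernel tool), such
a frame can be split as follows; a leading "?" marks an entry the unwinder could not
confirm as part of the call chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One decoded stack-trace frame (hypothetical helper type). */
struct trace_frame {
    char symbol[64];       /* function name, e.g. "__flush_work" */
    unsigned long offset;  /* offset of the return address inside the function */
    unsigned long size;    /* total size of the function */
    char module[32];       /* module name, or "" for built-in code */
    bool reliable;         /* false when the frame was printed with a leading '?' */
};

/* Returns 1 if the line looks like "sym+0xOFF/0xSIZE [mod]", 0 otherwise. */
static int parse_trace_frame(const char *line, struct trace_frame *out)
{
    const char *p = line;
    int n;

    assert(line != NULL && out != NULL);
    while (*p == ' ' || *p == '\t')
        p++;

    out->reliable = true;
    if (*p == '?') {
        out->reliable = false;
        p++;
        while (*p == ' ')
            p++;
    }

    out->module[0] = '\0';
    n = sscanf(p, "%63[^+ ]+0x%lx/0x%lx [%31[^]]]",
               out->symbol, &out->offset, &out->size, out->module);
    return n >= 3;  /* the module field is optional */
}
```

For example, the frame "alloc_contig_range_noprof+0x1de/0x8a0" yields the symbol and the
offset 0x1de that were passed to gdb above.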


### and the other thread is trying to process a regdom request, and trying to use rcu and rtnl???

Type "apropos word" to search for commands related to "word"...
Reading symbols from net/wireless/cfg80211.ko...
(gdb) l *(reg_todo+0x18)
0xe238 is in reg_todo (/home2/greearb/git/linux-6.18.dev.y/net/wireless/reg.c:3107).
3102	 */
3103	static void reg_process_pending_hints(void)
3104	{
3105		struct regulatory_request *reg_request, *lr;
3106	
3107		lr = get_last_request();
3108	
3109		/* When last_request->processed becomes true this will be rescheduled */
3110		if (lr && !lr->processed) {
3111			pr_debug("Pending regulatory request, waiting for it to be processed...\n");
(gdb)

static struct regulatory_request *get_last_request(void)
{
	return rcu_dereference_rtnl(last_request);
}


task:kworker/6:0     state:D stack:0     pid:56    tgid:56    ppid:2      task_flags:0x4208060 flags:0x00080000
Workqueue: events reg_todo [cfg80211]
Call Trace:
  <TASK>
  __schedule+0x526/0x1290
  preempt_schedule_notrace+0x35/0x50
  preempt_schedule_notrace_thunk+0x16/0x30
  rcu_is_watching+0x2a/0x30
  lock_acquire+0x26d/0x2c0
  schedule+0xac/0x120
  ? schedule+0x8d/0x120
  schedule_preempt_disabled+0x11/0x20
  __mutex_lock+0x726/0x1070
  ? reg_todo+0x18/0x2b0 [cfg80211]
  ? reg_todo+0x18/0x2b0 [cfg80211]
  reg_todo+0x18/0x2b0 [cfg80211]
  process_one_work+0x221/0x6d0
  worker_thread+0x1e5/0x3b0
  ? rescuer_thread+0x450/0x450
  kthread+0x108/0x220
  ? kthreads_online_cpu+0x110/0x110
  ret_from_fork+0x1c6/0x220
  ? kthreads_online_cpu+0x110/0x110
  ret_from_fork_asm+0x11/0x20
  </TASK>

task:ip              state:D stack:0     pid:72857 tgid:72857 ppid:72843  task_flags:0x400100 flags:0x00080001
Call Trace:
  <TASK>
  __schedule+0x526/0x1290
  ? schedule+0x8d/0x120
  ? schedule+0xe2/0x120
  schedule+0x36/0x120
  schedule_timeout+0xf9/0x110
  ? mark_held_locks+0x40/0x70
  __wait_for_common+0xbe/0x1e0
  ? hrtimer_nanosleep_restart+0x120/0x120
  ? __flush_work+0x20b/0x530
  __flush_work+0x34e/0x530
  ? flush_workqueue_prep_pwqs+0x160/0x160
  ? bpf_prog_test_run_tracing+0x160/0x2d0
  __lru_add_drain_all+0x19b/0x220
  alloc_contig_range_noprof+0x1de/0x8a0
  __cma_alloc+0x1f1/0x6a0
  __dma_direct_alloc_pages.isra.0+0xcb/0x2f0
  dma_direct_alloc+0x7b/0x250
  dma_alloc_attrs+0xa1/0x2a0
  _iwl_pcie_ctxt_info_dma_alloc_coherent+0x31/0xb0 [iwlwifi]
  iwl_pcie_ctxt_info_alloc_dma+0x20/0x50 [iwlwifi]
  iwl_pcie_init_fw_sec+0x2fc/0x380 [iwlwifi]
  iwl_pcie_ctxt_info_v2_alloc+0x19e/0x530 [iwlwifi]
  iwl_trans_pcie_gen2_start_fw+0x2e2/0x820 [iwlwifi]
  ? lock_is_held_type+0x92/0x100
  iwl_trans_start_fw+0x77/0x90 [iwlwifi]
  iwl_mld_load_fw_wait_alive+0x97/0x2c0 [iwlmld]
  ? iwl_mld_mac80211_sta_state+0x780/0x780 [iwlmld]
  ? lock_is_held_type+0x92/0x100
  iwl_mld_load_fw+0x91/0x240 [iwlmld]
  ? ieee80211_open+0x3d/0xe0 [mac80211]
  ? lock_is_held_type+0x92/0x100
  iwl_mld_start_fw+0x44/0x470 [iwlmld]
  iwl_mld_mac80211_start+0x3d/0x1b0 [iwlmld]
  drv_start+0x6f/0x1d0 [mac80211]
  ieee80211_do_open+0x2d6/0x960 [mac80211]
  ieee80211_open+0x62/0xe0 [mac80211]
  __dev_open+0x11a/0x2e0
  __dev_change_flags+0x1f8/0x280
  netif_change_flags+0x22/0x60
  do_setlink.isra.0+0xe57/0x11a0
  ? __mutex_lock+0xb0/0x1070
  ? __mutex_lock+0x99e/0x1070
  ? __nla_validate_parse+0x5e/0xcd0
  ? rtnl_newlink+0x355/0xb50
  ? cap_capable+0x90/0x100
  ? security_capable+0x72/0x80
  rtnl_newlink+0x7e8/0xb50
  ? __lock_acquire+0x436/0x2190
  ? lock_acquire+0xc2/0x2c0
  ? rtnetlink_rcv_msg+0x97/0x660
  ? find_held_lock+0x2b/0x80
  ? do_setlink.isra.0+0x11a0/0x11a0
  ? rtnetlink_rcv_msg+0x3ea/0x660
  ? lock_release+0xcc/0x290
  ? do_setlink.isra.0+0x11a0/0x11a0
  rtnetlink_rcv_msg+0x409/0x660
  ? rtnl_fdb_dump+0x240/0x240
  netlink_rcv_skb+0x56/0x100
  netlink_unicast+0x1e1/0x2d0
  netlink_sendmsg+0x219/0x460
  __sock_sendmsg+0x38/0x70
  ____sys_sendmsg+0x214/0x280
  ? import_iovec+0x2c/0x30
  ? copy_msghdr_from_user+0x6c/0xa0
  ___sys_sendmsg+0x85/0xd0
  ? __lock_acquire+0x436/0x2190
  ? find_held_lock+0x2b/0x80
  ? lock_acquire+0xc2/0x2c0
  ? mntput_no_expire+0x43/0x460
  ? find_held_lock+0x2b/0x80
  ? mntput_no_expire+0x8c/0x460
  __sys_sendmsg+0x6b/0xc0
  do_syscall_64+0x6b/0x11b0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com




* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-02-23 22:36 6.18.13 iwlwifi deadlock allocating cma while work-item is active Ben Greear
@ 2026-02-27 16:31 ` Ben Greear
  2026-03-01 15:38   ` Ben Greear
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-02-27 16:31 UTC
  To: linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm

On 2/23/26 14:36, Ben Greear wrote:
> Hello,
> 
> I hit a deadlock in which a CMA memory allocation tries to flush work while the
> caller holds a wifi-related mutex, and a workqueue is trying to process a wifi
> regdomain work item at the same time.  I really don't see any good way to fix this:
> it would seem that any code holding a mutex that could block a workqueue
> cannot safely allocate CMA memory?  Hopefully someone else has a better idea.

I tried using a kthread to do the regulatory domain processing instead of a work item,
and that seems to have solved the problem.  If that seems a reasonable approach to
the wifi stack folks, I can post a patch.

Thanks,
Ben





* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-02-27 16:31 ` Ben Greear
@ 2026-03-01 15:38   ` Ben Greear
  2026-03-02  8:07     ` Johannes Berg
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-03-01 15:38 UTC
  To: linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm

On 2/27/26 08:31, Ben Greear wrote:
> On 2/23/26 14:36, Ben Greear wrote:
>> Hello,
>>
>> I hit a deadlock in which a CMA memory allocation tries to flush work while the
>> caller holds a wifi-related mutex, and a workqueue is trying to process a wifi
>> regdomain work item at the same time.  I really don't see any good way to fix this:
>> it would seem that any code holding a mutex that could block a workqueue
>> cannot safely allocate CMA memory?  Hopefully someone else has a better idea.
> 
> I tried using a kthread to do the regulatory domain processing instead of a work item,
> and that seems to have solved the problem.  If that seems a reasonable approach to
> the wifi stack folks, I can post a patch.

The other net/wireless work-item 'disconnect_work' also needs to be moved to the kthread
for the same reason....

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com




* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-01 15:38   ` Ben Greear
@ 2026-03-02  8:07     ` Johannes Berg
  2026-03-02 15:26       ` Ben Greear
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2026-03-02  8:07 UTC
  To: Ben Greear, linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm

On Sun, 2026-03-01 at 07:38 -0800, Ben Greear wrote:
> On 2/27/26 08:31, Ben Greear wrote:
> > On 2/23/26 14:36, Ben Greear wrote:
> > > Hello,
> > > 
> > > I hit a deadlock in which a CMA memory allocation tries to flush work while the
> > > caller holds a wifi-related mutex, and a workqueue is trying to process a wifi
> > > regdomain work item at the same time.  I really don't see any good way to fix this:
> > > it would seem that any code holding a mutex that could block a workqueue
> > > cannot safely allocate CMA memory?  Hopefully someone else has a better idea.
> > 
> > I tried using a kthread to do the regulatory domain processing instead of a work item,
> > and that seems to have solved the problem.  If that seems a reasonable approach to
> > the wifi stack folks, I can post a patch.
> 
> The other net/wireless work-item 'disconnect_work' also needs to be moved to the kthread
> for the same reason....

I don't think we want to use a kthread for this, it doesn't really make
sense.

Was this with lockdep? If so, did it complain about anything?

I'm having a hard time seeing why it would deadlock at all when wifi
uses schedule_work() and therefore the system_percpu_wq, and
__lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
to RTNL etc.?

I think we need a real explanation here rather than "if I randomly
change this, it no longer appears".

johannes



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-02  8:07     ` Johannes Berg
@ 2026-03-02 15:26       ` Ben Greear
  2026-03-02 15:38         ` Johannes Berg
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-03-02 15:26 UTC
  To: Johannes Berg, linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm

On 3/2/26 00:07, Johannes Berg wrote:
> On Sun, 2026-03-01 at 07:38 -0800, Ben Greear wrote:
>> On 2/27/26 08:31, Ben Greear wrote:
>>> On 2/23/26 14:36, Ben Greear wrote:
>>>> Hello,
>>>>
>>>> I hit a deadlock in which a CMA memory allocation tries to flush work while the
>>>> caller holds a wifi-related mutex, and a workqueue is trying to process a wifi
>>>> regdomain work item at the same time.  I really don't see any good way to fix this:
>>>> it would seem that any code holding a mutex that could block a workqueue
>>>> cannot safely allocate CMA memory?  Hopefully someone else has a better idea.
>>>
>>> I tried using a kthread to do the regulatory domain processing instead of a work item,
>>> and that seems to have solved the problem.  If that seems a reasonable approach to
>>> the wifi stack folks, I can post a patch.
>>
>> The other net/wireless work-item 'disconnect_work' also needs to be moved to the kthread
>> for the same reason....
> 
> I don't think we want to use a kthread for this, it doesn't really make
> sense.
> 
> Was this with lockdep? If so, did it complain about anything?
> 
> I'm having a hard time seeing why it would deadlock at all when wifi
> uses schedule_work() and therefore the system_percpu_wq, and
> __lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
> lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
> to RTNL etc.?
> 
> I think we need a real explanation here rather than "if I randomly
> change this, it no longer appears".

The path where iwlwifi acquires CMA holds rtnl and/or wiphy locks before
allocating CMA memory, as expected.

And the CMA allocation path attempts to flush the work queues in
at least some cases.

If there is a work item queued that is trying to grab rtnl and/or wiphy lock
when CMA attempts to flush, then the flush work cannot complete, so it deadlocks.
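
The dependency described above can be written down as a wait-for graph and checked for
a cycle.  The sketch below is a user-space C model, not kernel code; the node names and
the drain_waits_on_reg_work switch are illustrative assumptions.  In particular, the
edge from the LRU drain work item to the regulatory work item is the step this thread
has not yet established: with it the graph has a cycle, without it there is none.

```c
#include <assert.h>
#include <stdbool.h>

enum node {
    IP_TASK,         /* "ip" process: holds rtnl and/or wiphy lock, waits in flush_work() */
    LRU_DRAIN_WORK,  /* the lru_add_drain_work item being flushed */
    REG_WORK,        /* reg_todo work item, blocked in __mutex_lock() */
    NUM_NODES
};

struct wait_graph {
    /* waits_for[a][b] is true when a cannot make progress until b does. */
    bool waits_for[NUM_NODES][NUM_NODES];
};

/* Depth-first search: can `target` be reached from `from` along wait edges? */
static bool reaches(const struct wait_graph *g, int from, int target, bool *seen)
{
    int next;

    assert(from >= 0 && from < NUM_NODES);
    for (next = 0; next < NUM_NODES; next++) {
        if (!g->waits_for[from][next])
            continue;
        if (next == target)
            return true;
        if (!seen[next]) {
            seen[next] = true;
            if (reaches(g, next, target, seen))
                return true;
        }
    }
    return false;
}

/* A deadlock in this model is a node that (transitively) waits for itself. */
bool has_cycle(const struct wait_graph *g)
{
    int n;

    for (n = 0; n < NUM_NODES; n++) {
        bool seen[NUM_NODES] = { false };

        if (reaches(g, n, n, seen))
            return true;
    }
    return false;
}

/* Build the scenario; the parameter toggles the unproven edge. */
struct wait_graph build_scenario(bool drain_waits_on_reg_work)
{
    struct wait_graph g = { { { false } } };

    g.waits_for[IP_TASK][LRU_DRAIN_WORK] = true;  /* flush_work() in the ip trace */
    g.waits_for[REG_WORK][IP_TASK] = true;        /* mutex held by the ip task */
    g.waits_for[LRU_DRAIN_WORK][REG_WORK] = drain_waits_on_reg_work;
    return g;
}
```

has_cycle() returns true for build_scenario(true) and false for build_scenario(false),
so whether this is a real deadlock comes down to whether the drain work item can
actually be held up by the blocked regulatory work item.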

Lockdep doesn't warn about this.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com




* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-02 15:26       ` Ben Greear
@ 2026-03-02 15:38         ` Johannes Berg
  2026-03-02 15:50           ` Ben Greear
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2026-03-02 15:38 UTC
  To: Ben Greear, linux-wireless; +Cc: Korenblit, Miriam Rachel, linux-mm, Tejun Heo

On Mon, 2026-03-02 at 07:26 -0800, Ben Greear wrote:
> 
> > 
> > Was this with lockdep? If so, did it complain about anything?
> > 
> > I'm having a hard time seeing why it would deadlock at all when wifi
> > uses schedule_work() and therefore the system_percpu_wq, and
> > __lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
> > lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
> > to RTNL etc.?
> > 
> > I think we need a real explanation here rather than "if I randomly
> > change this, it no longer appears".
> 
> The path where iwlwifi acquires CMA holds rtnl and/or wiphy locks before
> allocating CMA memory, as expected.
> 
> And the CMA allocation path attempts to flush the work queues in
> at least some cases.
> 
> If there is a work item queued that is trying to grab rtnl and/or wiphy lock
> when CMA attempts to flush, then the flush work cannot complete, so it deadlocks.
> 
> Lockdep doesn't warn about this.

It really should, in cases where it can actually happen, I wrote the
code myself for that... Though things have changed since, and the checks
were lost at least once (and re-added), so I suppose it's possible that
they were lost _again_, but the flushing system is far more flexible now
and it's not flushing the same workqueue anyway, so it shouldn't happen.

I stand by what I said before, need to show more precisely what depends
on what, and I'm not going to accept a random kthread into this.

johannes



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-02 15:38         ` Johannes Berg
@ 2026-03-02 15:50           ` Ben Greear
  2026-03-03 11:49             ` Johannes Berg
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-03-02 15:50 UTC (permalink / raw)
  To: Johannes Berg, linux-wireless
  Cc: Korenblit, Miriam Rachel, linux-mm, Tejun Heo

On 3/2/26 07:38, Johannes Berg wrote:
> On Mon, 2026-03-02 at 07:26 -0800, Ben Greear wrote:
>>
>>>
>>> Was this with lockdep? If so, it complain about anything?
>>>
>>> I'm having a hard time seeing why it would deadlock at all when wifi
>>> uses  schedule_work() and therefore the system_percpu_wq, and
>>> __lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
>>> lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
>>> to RTNL etc.?
>>>
>>> I think we need a real explanation here rather than "if I randomly
>>> change this, it no longer appears".
>>
>> The path where iwlwifi acquires CMA holds rtnl and/or wiphy locks before
>> allocating CMA memory, as expected.
>>
>> And the CMA allocation path attempts to flush the work queues in
>> at least some cases.
>>
>> If there is a work item queued that is trying to grab rtnl and/or wiphy lock
>> when CMA attempts to flush, then the flush work cannot complete, so it deadlocks.
>>
>> Lockdep doesn't warn about this.
> 
> It really should, in cases where it can actually happen, I wrote the
> code myself for that... Though things have changed since, and the checks
> were lost at least once (and re-added), so I suppose it's possible that
> they were lost _again_, but the flushing system is far more flexible now
> and it's not flushing the same workqueue anyway, so it shouldn't happen.
> 
> I stand by what I said before, need to show more precisely what depends
> on what, and I'm not going to accept a random kthread into this.

My first email on the topic has process stack traces as well as lockdep
locks-held printout that points to the deadlock.  I'm not sure what else to offer...please let me know
what you'd like to see.

Thanks,
Ben


> 
> johannes
> 

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com




* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-02 15:50           ` Ben Greear
@ 2026-03-03 11:49             ` Johannes Berg
  2026-03-03 20:52               ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2026-03-03 11:49 UTC (permalink / raw)
  To: Ben Greear, linux-wireless
  Cc: Korenblit, Miriam Rachel, linux-mm, Tejun Heo, linux-kernel

On Mon, 2026-03-02 at 07:50 -0800, Ben Greear wrote:
> On 3/2/26 07:38, Johannes Berg wrote:
> > On Mon, 2026-03-02 at 07:26 -0800, Ben Greear wrote:
> > > 
> > > > 
> > > > Was this with lockdep? If so, it complain about anything?
> > > > 
> > > > I'm having a hard time seeing why it would deadlock at all when wifi
> > > > uses  schedule_work() and therefore the system_percpu_wq, and
> > > > __lru_add_drain_all() flushes lru_add_drain_work on mm_percpu_wq, and
> > > > lru_add_and_bh_lrus_drain() doesn't really _seem_ to do anything related
> > > > to RTNL etc.?
> > > > 
> > > > I think we need a real explanation here rather than "if I randomly
> > > > change this, it no longer appears".
> > > 
> > > The path where iwlwifi acquires CMA holds rtnl and/or wiphy locks before
> > > allocating CMA memory, as expected.
> > > 
> > > And the CMA allocation path attempts to flush the work queues in
> > > at least some cases.
> > > 
> > > If there is a work item queued that is trying to grab rtnl and/or wiphy lock
> > > when CMA attempts to flush, then the flush work cannot complete, so it deadlocks.
> > > 
> > > Lockdep doesn't warn about this.
> > 
> > It really should, in cases where it can actually happen, I wrote the
> > code myself for that... Though things have changed since, and the checks
> > were lost at least once (and re-added), so I suppose it's possible that
> > they were lost _again_, but the flushing system is far more flexible now
> > and it's not flushing the same workqueue anyway, so it shouldn't happen.
> > 
> > I stand by what I said before, need to show more precisely what depends
> > on what, and I'm not going to accept a random kthread into this.
> 
> My first email on the topic has process stack traces as well as lockdep
> locks-held printout that points to the deadlock.  I'm not sure what else to offer...please let me know
> what you'd like to see.

Fair. I don't know, I don't think there's anything that even shows that
there's a dependency between the two workqueues and the
"((wq_completion)events_unbound)" and "((wq_completion)events)", and
there would have to be for it to deadlock this way because of that?

But one is mm_percpu_wq and the other is system_percpu_wq.

Tejun, does the workqueue code somehow introduce a dependency between
different per-CPU workqueues that's not modelled in lockdep?

johannes



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-03 11:49             ` Johannes Berg
@ 2026-03-03 20:52               ` Tejun Heo
  2026-03-03 21:03                 ` Johannes Berg
  2026-03-03 21:12                 ` Johannes Berg
  0 siblings, 2 replies; 13+ messages in thread
From: Tejun Heo @ 2026-03-03 20:52 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Ben Greear, linux-wireless, Korenblit, Miriam Rachel, linux-mm,
	linux-kernel

Hello,

On Tue, Mar 03, 2026 at 12:49:24PM +0100, Johannes Berg wrote:
> Fair. I don't know, I don't think there's anything that even shows that
> there's a dependency between the two workqueues and the
> "((wq_completion)events_unbound)" and "((wq_completion)events)", and
> there would have to be for it to deadlock this way because of that?
> 
> But one is mm_percpu_wq and the other is system_percpu_wq.
> 
> Tejun, does the workqueue code somehow introduce a dependency between
> different per-CPU workqueues that's not modelled in lockdep?

Hopefully not. Kinda late to the party. Why isn't mm_percpu_wq making
forward progress? It should in all circumstances. What are the work item and
kworker doing?

Thanks.

-- 
tejun



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-03 20:52               ` Tejun Heo
@ 2026-03-03 21:03                 ` Johannes Berg
  2026-03-03 21:12                 ` Johannes Berg
  1 sibling, 0 replies; 13+ messages in thread
From: Johannes Berg @ 2026-03-03 21:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ben Greear, linux-wireless, Korenblit, Miriam Rachel, linux-mm,
	linux-kernel

On Tue, 2026-03-03 at 10:52 -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Mar 03, 2026 at 12:49:24PM +0100, Johannes Berg wrote:
> > Fair. I don't know, I don't think there's anything that even shows that
> > there's a dependency between the two workqueues and the
> > "((wq_completion)events_unbound)" and "((wq_completion)events)", and
> > there would have to be for it to deadlock this way because of that?
> > 
> > But one is mm_percpu_wq and the other is system_percpu_wq.
> > 
> > Tejun, does the workqueue code somehow introduce a dependency between
> > different per-CPU workqueues that's not modelled in lockdep?
> 
> Hopefully not. Kinda late to the party.

Yeah, sorry, should've included a link:
https://lore.kernel.org/linux-wireless/fa4e82ee-eb14-3930-c76c-f3bd59c5f258@candelatech.com/

> Why isn't mm_percpu_wq making
> forward progress? That should in all circumstances. What's the work item and
> kworker doing?

So it seems that first iwlwifi is holding the RTNL:

  ieee80211_open+0x62/0xe0 [mac80211]
  __dev_open+0x11a/0x2e0
  __dev_change_flags+0x1f8/0x280
  netif_change_flags+0x22/0x60
  do_setlink.isra.0+0xe57/0x11a0
  rtnl_newlink+0x7e8/0xb50

(last stack trace at the above link)
This stuff definitely happens with the RTNL held, although I didn't
check now which function actually acquires it in this stack.

Simultaneously the kworker/6:0 is stuck in reg_todo(), trying to acquire
the RTNL.

So far that seems pretty much normal. The reg_todo() that kworker/6:0 is
running belongs to reg_work in net/wireless/reg.c, which is scheduled on
system_percpu_wq (simply via schedule_work()).

Now iwlwifi is also trying to allocate coherent DMA memory (continuing
the stack trace), potentially a significant chunk for firmware loading:

  dma_direct_alloc+0x7b/0x250
  dma_alloc_attrs+0xa1/0x2a0
  _iwl_pcie_ctxt_info_dma_alloc_coherent+0x31/0xb0 [iwlwifi]
  iwl_pcie_ctxt_info_alloc_dma+0x20/0x50 [iwlwifi]
  iwl_pcie_init_fw_sec+0x2fc/0x380 [iwlwifi]
  iwl_pcie_ctxt_info_v2_alloc+0x19e/0x530 [iwlwifi]
  iwl_trans_pcie_gen2_start_fw+0x2e2/0x820 [iwlwifi]
  iwl_trans_start_fw+0x77/0x90 [iwlwifi]
  iwl_mld_load_fw_wait_alive+0x97/0x2c0 [iwlmld]
  iwl_mld_load_fw+0x91/0x240 [iwlmld]
  iwl_mld_start_fw+0x44/0x470 [iwlmld]
  iwl_mld_mac80211_start+0x3d/0x1b0 [iwlmld]
  drv_start+0x6f/0x1d0 [mac80211]
  ieee80211_do_open+0x2d6/0x960 [mac80211]
  ieee80211_open+0x62/0xe0 [mac80211]

This is fine, but then it gets into __flush_work() in
__lru_add_drain_all():

  __flush_work+0x34e/0x530
  __lru_add_drain_all+0x19b/0x220
  alloc_contig_range_noprof+0x1de/0x8a0
  __cma_alloc+0x1f1/0x6a0
  __dma_direct_alloc_pages.isra.0+0xcb/0x2f0
  dma_direct_alloc+0x7b/0x250

which is because __lru_add_drain_all() schedules a bunch of workers, one
for each CPU, onto the mm_percpu_wq and then waits for them.

Conceptually, I see nothing wrong with this, hence my question; Ben says
that the system stops making progress at this point.
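
The dependencies described above can be written down as a small wait-for
graph. This is only a sketch in Python, not kernel code: the node names are
labels chosen for illustration, and the extra edge in `hypothetical` is an
assumption added to show what a deadlock would require. Nothing in this
thread establishes that such an edge exists.

```python
# Wait-for graph: each key waits for every node in its list.
OPEN = "ieee80211_open path (holds RTNL)"
DRAIN = "lru_add_drain work on mm_percpu_wq"
REG = "reg_todo on system_percpu_wq"

# Edges taken from the stack traces discussed in the thread.
observed = {
    OPEN: [DRAIN],   # __lru_add_drain_all() flushes the drain work
    REG: [OPEN],     # reg_todo() wants the RTNL held by the open path
    DRAIN: [],       # the drain work itself takes no sleeping locks
}

# ASSUMPTION (hypothetical): the drain work cannot run until reg_todo returns.
hypothetical = {**observed, DRAIN: [REG]}


def find_cycle(waits_for):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in waits_for}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in waits_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge: everything from nxt to here forms a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(waits_for):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print("observed:", find_cycle(observed))
print("hypothetical:", find_cycle(hypothetical))
```

With only the observed edges there is no cycle, which matches the "nothing
wrong conceptually" reading; a cycle appears only once some dependency from
the drain work back to reg_todo is added.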

johannes



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-03 20:52               ` Tejun Heo
  2026-03-03 21:03                 ` Johannes Berg
@ 2026-03-03 21:12                 ` Johannes Berg
  2026-03-03 21:40                   ` Ben Greear
  1 sibling, 1 reply; 13+ messages in thread
From: Johannes Berg @ 2026-03-03 21:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ben Greear, linux-wireless, Korenblit, Miriam Rachel, linux-mm,
	linux-kernel

On Tue, 2026-03-03 at 10:52 -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Mar 03, 2026 at 12:49:24PM +0100, Johannes Berg wrote:
> > Fair. I don't know, I don't think there's anything that even shows that
> > there's a dependency between the two workqueues and the
> > "((wq_completion)events_unbound)" and "((wq_completion)events)", and
> > there would have to be for it to deadlock this way because of that?
> > 
> > But one is mm_percpu_wq and the other is system_percpu_wq.
> > 
> > Tejun, does the workqueue code somehow introduce a dependency between
> > different per-CPU workqueues that's not modelled in lockdep?
> 
> Hopefully not. Kinda late to the party. Why isn't mm_percpu_wq making
> forward progress? That should in all circumstances. What's the work item and
> kworker doing?

Oh and in addition: the worker that's kicked off by
__lru_add_drain_all() doesn't really seem to do anything long-running?
It's lru_add_drain_per_cpu(), which is lru_add_and_bh_lrus_drain(),
which would appear to be entirely non-sleepable code (holding either
local locks or having irqs disabled.) It also doesn't show up in the
log, apparently, hence my question about strange dependencies.

johannes



* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-03 21:12                 ` Johannes Berg
@ 2026-03-03 21:40                   ` Ben Greear
  2026-03-03 21:54                     ` Tejun Heo
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Greear @ 2026-03-03 21:40 UTC (permalink / raw)
  To: Johannes Berg, Tejun Heo
  Cc: linux-wireless, Korenblit, Miriam Rachel, linux-mm, linux-kernel

On 3/3/26 13:12, Johannes Berg wrote:
> On Tue, 2026-03-03 at 10:52 -1000, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Mar 03, 2026 at 12:49:24PM +0100, Johannes Berg wrote:
>>> Fair. I don't know, I don't think there's anything that even shows that
>>> there's a dependency between the two workqueues and the
>>> "((wq_completion)events_unbound)" and "((wq_completion)events)", and
>>> there would have to be for it to deadlock this way because of that?
>>>
>>> But one is mm_percpu_wq and the other is system_percpu_wq.
>>>
>>> Tejun, does the workqueue code somehow introduce a dependency between
>>> different per-CPU workqueues that's not modelled in lockdep?
>>
>> Hopefully not. Kinda late to the party. Why isn't mm_percpu_wq making
>> forward progress? That should in all circumstances. What's the work item and
>> kworker doing?
> 
> Oh and in addition: the worker that's kicked off by
> __lru_add_drain_all() doesn't really seem to do anything long-running?
> It's lru_add_drain_per_cpu(), which is lru_add_and_bh_lrus_drain(),
> which would appear to be entirely non-sleepable code (holding either
> local locks or having irqs disabled.) It also doesn't show up in the
> log, apparently, hence my question about strange dependencies.

Hello Tejun,

If I use a kthread to do the blocking reg_todo work, then the problem
goes away, so it does appear that the work flush logic down in swap.c
is being blocked by the reg_todo work item, rather than the swap.c
logic just blocking against itself.

My kthread hack left the reg_todo work item logic in place, but instead of
the work item doing any blocking work, it instead just wakes the kthread
I added and has that kthread do the work under mutex.
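
The shape of that handoff, as a rough user-space analogy in Python rather
than the actual kernel patch: `rtnl`, `wake`, `work_item` and
`dedicated_thread` are hypothetical names for illustration, and a
`threading.Lock` merely stands in for the contended mutex.

```python
import threading

rtnl = threading.Lock()      # stand-in for the contended mutex
wake = threading.Event()     # signalled by the work item
done = threading.Event()     # set once the dedicated thread has finished
log = []


def dedicated_thread():
    # Plays the role of the added kthread: it is the only place that blocks.
    wake.wait()
    with rtnl:
        log.append("reg work done under lock")
    done.set()


def work_item():
    # Plays the role of the reduced work item: wake the thread and return,
    # never sleeping on the mutex while occupying a workqueue worker.
    wake.set()
    log.append("work item returned")


threading.Thread(target=dedicated_thread, daemon=True).start()

rtnl.acquire()               # some other path currently holds the lock
work_item()                  # returns immediately despite the held lock
returned_while_locked = (log == ["work item returned"])
rtnl.release()               # now the dedicated thread can proceed
done.wait(timeout=5)

print(returned_while_locked, log)
```

The point of the analogy is only that the worker is never parked on the
mutex; it says nothing about why a parked worker would stall the flush.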

The second regulatory related work item in net/wireless/ causes the same
lockup, though it was harder to reproduce.  Putting that work in the kthread
also seems to have fixed it.

I could only ever reproduce this with KASAN (and lockdep and other debugging options
enabled). My guess is that this is because the system then runs slower and/or there
is more memory pressure.

I should still be able to reproduce this if I switch to upstream kernel, so
if there is any debugging code you'd like me to execute, I will attempt to
do so.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com





* Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
  2026-03-03 21:40                   ` Ben Greear
@ 2026-03-03 21:54                     ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2026-03-03 21:54 UTC (permalink / raw)
  To: Ben Greear
  Cc: Johannes Berg, linux-wireless, Korenblit, Miriam Rachel,
	linux-mm, linux-kernel

Hello,

On Tue, Mar 03, 2026 at 01:40:54PM -0800, Ben Greear wrote:
> If I use a kthread to do the blocking reg_todo work, then the problem
> goes away, so it somehow does appear that the work flush logic down in swap.c
> is somehow being blocked by the reg_todo work item, not just the swap.c
> logic somehow blocking against itself.
> 
> My kthread hack left the reg_todo work item logic in place, but instead of
> the work item doing any blocking work, it instead just wakes the kthread
> I added and has that kthread do the work under mutex.
> 
> The second regulatory related work item in net/wireless/ causes the same
> lockup, though it was harder to reproduce.  Putting that work in the kthread
> also seems to have fixed it.
> 
> I could only ever reproduce this with KASAN (and lockdep and other debugging options
> enabled), my guess is that this is because then the system runs slower and/or there
> is more memory pressure.
> 
> I should still be able to reproduce this if I switch to upstream kernel, so
> if there is any debugging code you'd like me to execute, I will attempt to
> do so.

I think the main thing is finding out what state the work item is in. Is it
pending, running, or finished? You can enable wq tracepoints to figure that
out or, if you can take a crashdump when it's stalled, nowadays it's really
easy to tell the state w/ something like Claude Code and drgn. Just tell
Claude to use drgn to look at the crashdump and ask it to locate the work
item and what it's doing. It works surprisingly well.
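
For the tracepoint route, a minimal sketch of how the captured output could
be reduced to a per-work-item state. It assumes the trace was collected with
the workqueue events enabled under tracefs (events/workqueue) and that the
lines look roughly like the sample below; the sample lines are synthetic,
not taken from the reported system, and the exact field layout may differ
between kernel versions.

```python
import re

# SYNTHETIC sample lines for illustration only (assumed format).
SAMPLE = """\
ip-4242 [002] ..... 100.000001: workqueue_queue_work: work struct=00000000aaaa0001 function=reg_todo workqueue=events req_cpu=6 cpu=6
ip-4242 [002] ..... 100.000002: workqueue_activate_work: work struct 00000000aaaa0001 function=reg_todo
kworker/6:0-57 [006] ..... 100.000003: workqueue_execute_start: work struct 00000000aaaa0001: function reg_todo
ip-4242 [002] ..... 100.500000: workqueue_queue_work: work struct=00000000bbbb0002 function=lru_add_drain_per_cpu workqueue=mm_percpu_wq req_cpu=6 cpu=6
"""

# Event name, work_struct address, and callback function name.
LINE_RE = re.compile(
    r"(workqueue_\w+): work struct[= ]([0-9a-fA-Fx]+):? function[= ](\S+)"
)

EVENT_TO_STATE = {
    "workqueue_queue_work": "pending",
    "workqueue_activate_work": "pending",
    "workqueue_execute_start": "running",
    "workqueue_execute_end": "finished",
}


def work_states(trace_text):
    """Map (function, work_struct address) to the last state seen."""
    states = {}
    for line in trace_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group(1) in EVENT_TO_STATE:
            event, addr, func = match.groups()
            states[(func, addr)] = EVENT_TO_STATE[event]
    return states


for (func, addr), state in work_states(SAMPLE).items():
    print(f"{func} @ {addr}: {state}")
```

An item that was queued but never reached workqueue_execute_start shows up
as pending; one that started but never logged workqueue_execute_end shows up
as running.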

Thanks.

-- 
tejun



end of thread, other threads:[~2026-03-03 21:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
2026-02-23 22:36 6.18.13 iwlwifi deadlock allocating cma while work-item is active Ben Greear
2026-02-27 16:31 ` Ben Greear
2026-03-01 15:38   ` Ben Greear
2026-03-02  8:07     ` Johannes Berg
2026-03-02 15:26       ` Ben Greear
2026-03-02 15:38         ` Johannes Berg
2026-03-02 15:50           ` Ben Greear
2026-03-03 11:49             ` Johannes Berg
2026-03-03 20:52               ` Tejun Heo
2026-03-03 21:03                 ` Johannes Berg
2026-03-03 21:12                 ` Johannes Berg
2026-03-03 21:40                   ` Ben Greear
2026-03-03 21:54                     ` Tejun Heo
