From: Johannes Berg <johannes@sipsolutions.net>
To: Tejun Heo <tj@kernel.org>
Cc: Ben Greear <greearb@candelatech.com>,
linux-wireless <linux-wireless@vger.kernel.org>,
"Korenblit, Miriam Rachel" <miriam.rachel.korenblit@intel.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: 6.18.13 iwlwifi deadlock allocating cma while work-item is active.
Date: Tue, 03 Mar 2026 22:03:43 +0100
Message-ID: <76682f4db2c378774fa8eefaff497570ec904cc1.camel@sipsolutions.net>
In-Reply-To: <aadKDCKGHk1Ua-7_@slm.duckdns.org>
On Tue, 2026-03-03 at 10:52 -1000, Tejun Heo wrote:
> Hello,
>
> On Tue, Mar 03, 2026 at 12:49:24PM +0100, Johannes Berg wrote:
> > Fair. I don't know, I don't think there's anything that even shows that
> > there's a dependency between the two workqueues and the
> > "((wq_completion)events_unbound)" and "((wq_completion)events)", and
> > there would have to be for it to deadlock this way because of that?
> >
> > But one is mm_percpu_wq and the other is system_percpu_wq.
> >
> > Tejun, does the workqueue code somehow introduce a dependency between
> > different per-CPU workqueues that's not modelled in lockdep?
>
> Hopefully not. Kinda late to the party.
Yeah, sorry, should've included a link:
https://lore.kernel.org/linux-wireless/fa4e82ee-eb14-3930-c76c-f3bd59c5f258@candelatech.com/
> Why isn't mm_percpu_wq making
> forward progress? That should in all circumstances. What's the work item and
> kworker doing?
So it seems that first iwlwifi is holding the RTNL:
    ieee80211_open+0x62/0xe0 [mac80211]
    __dev_open+0x11a/0x2e0
    __dev_change_flags+0x1f8/0x280
    netif_change_flags+0x22/0x60
    do_setlink.isra.0+0xe57/0x11a0
    rtnl_newlink+0x7e8/0xb50
(last stack trace at the above link)
This stuff definitely happens with the RTNL held, although I didn't
check now which function actually acquires it in this stack.
Simultaneously the kworker/6:0 is stuck in reg_todo(), trying to acquire
the RTNL.
So far that seems pretty much normal. The reg_todo() running on
kworker/6:0 is the handler for reg_work in net/wireless/reg.c, which is
queued on system_percpu_wq (simply via schedule_work()).
Now iwlwifi is also trying to allocate coherent DMA memory (continuing
the stack trace), potentially a significant chunk for firmware loading:
    dma_direct_alloc+0x7b/0x250
    dma_alloc_attrs+0xa1/0x2a0
    _iwl_pcie_ctxt_info_dma_alloc_coherent+0x31/0xb0 [iwlwifi]
    iwl_pcie_ctxt_info_alloc_dma+0x20/0x50 [iwlwifi]
    iwl_pcie_init_fw_sec+0x2fc/0x380 [iwlwifi]
    iwl_pcie_ctxt_info_v2_alloc+0x19e/0x530 [iwlwifi]
    iwl_trans_pcie_gen2_start_fw+0x2e2/0x820 [iwlwifi]
    iwl_trans_start_fw+0x77/0x90 [iwlwifi]
    iwl_mld_load_fw_wait_alive+0x97/0x2c0 [iwlmld]
    iwl_mld_load_fw+0x91/0x240 [iwlmld]
    iwl_mld_start_fw+0x44/0x470 [iwlmld]
    iwl_mld_mac80211_start+0x3d/0x1b0 [iwlmld]
    drv_start+0x6f/0x1d0 [mac80211]
    ieee80211_do_open+0x2d6/0x960 [mac80211]
    ieee80211_open+0x62/0xe0 [mac80211]
This is fine, but then it gets into __flush_work() in
__lru_add_drain_all():
    __flush_work+0x34e/0x530
    __lru_add_drain_all+0x19b/0x220
    alloc_contig_range_noprof+0x1de/0x8a0
    __cma_alloc+0x1f1/0x6a0
    __dma_direct_alloc_pages.isra.0+0xcb/0x2f0
    dma_direct_alloc+0x7b/0x250
which is because __lru_add_drain_all() queues a bunch of work items, one
for each CPU, onto the mm_percpu_wq and then waits for them.
Conceptually, I see nothing wrong with this, hence my question; Ben says
that the system stops making progress at this point.
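To make the "nothing wrong conceptually" point concrete, here is a rough
sketch of the wait-for relationships described above as a small graph, with a
cycle check. This is only a model, not kernel code: the node names are labels
made up for the sketch, and the extra edge in `hypothetical` is purely an
assumption standing in for "some dependency between the two per-CPU
workqueues that isn't modelled", which is exactly what nobody has shown to
exist.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    for targets in graph.values():
        for target in targets:
            colour.setdefault(target, WHITE)
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, []):
            if colour[nxt] == GREY:
                # Back edge: nxt is already on the current DFS path.
                return stack[stack.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(colour):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# An edge A -> B means "A cannot proceed until B does".
observed = {
    # ieee80211_open path: holds the RTNL, sits in __flush_work() waiting
    # for the per-CPU drain work on mm_percpu_wq.
    "open_holds_rtnl": ["lru_drain_work"],
    # reg_todo() on kworker/6:0 (system_percpu_wq): waits for the RTNL.
    "reg_todo": ["open_holds_rtnl"],
    # The drain work itself is not known to wait on anything here.
    "lru_drain_work": [],
}

# ASSUMPTION for illustration only: the drain work somehow cannot run
# until reg_todo() finishes. Nothing in the traces demonstrates this.
hypothetical = {node: list(deps) for node, deps in observed.items()}
hypothetical["lru_drain_work"].append("reg_todo")

print(find_cycle(observed))
print(find_cycle(hypothetical))
```

With only the observed edges there is no cycle, so the drain work should be
able to run and everything should unwind. A deadlock needs something like the
assumed extra edge, which is why the question is about where that would come
from.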
johannes