From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <willy@infradead.org>
Cc: Bernd Schubert <bernd@bsbernd.com>, Jan Kara <jack@suse.cz>,
Joanne Koong <joannelkoong@gmail.com>,
Miklos Szeredi <miklos@szeredi.hu>,
Horst Birthelmer <hbirthelmer@ddn.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"David Hildenbrand (Red Hat)" <david@kernel.org>
Subject: Re: __folio_end_writeback() lockdep issue
Date: Tue, 20 Jan 2026 15:35:33 +0100 [thread overview]
Message-ID: <4sn6k56c7g3jwvzze4imc4pilmomekvcelo7zo2awjqxsifaqe@djs5p2l55q64> (raw)
In-Reply-To: <wu7mu22kgr7pmzrneq6rkivhwvpbximyrkouciktl7wf5w7wur@rsyqyczopba2>
On Mon 12-01-26 14:06:26, Jan Kara wrote:
> On Sat 10-01-26 16:30:28, Matthew Wilcox wrote:
> > On Sat, Jan 10, 2026 at 04:31:28PM +0100, Bernd Schubert wrote:
> > > [ 872.499480] Possible interrupt unsafe locking scenario:
> > > [ 872.499480]
> > > [ 872.500326] CPU0 CPU1
> > > [ 872.500906] ---- ----
> > > [ 872.501464] lock(&p->sequence);
> > > [ 872.501923] local_irq_disable();
> > > [ 872.502615] lock(&xa->xa_lock#4);
> > > [ 872.503327] lock(&p->sequence);
> > > [ 872.504116] <Interrupt>
> > > [ 872.504513] lock(&xa->xa_lock#4);
> > >
> > >
> > > Which is introduced by commit 2841808f35ee for all file systems.
> > > The should be rather generic - I shouldn't be the only one seeing
> > > it?
> >
> > Oh wow, 2841808f35ee has a very confusing commit message. It implies
> > that _no_ filesystem uses BDI_CAP_WRITEBACK_ACCT, but what it really
> > means is that no filesystem now _clears_ BDI_CAP_WRITEBACK_ACCT, so
> > all filesystems do use this code path and therefore the flag can be
> > removed. And that matches the code change.
> >
> > So you should be able to reproduce this problem with commit 494d2f508883
> > as well?
> >
> > That tells me that this is something fuse-specific. Other filesystems
> > aren't seeing this. Wonder why ...
> >
> > __wb_writeout_add() or its predecessor __wb_writeout_inc() have been in
> > that spot since 2015 or earlier.
> >
> > The sequence lock itself is taken inside fprop_new_period() called from
> > writeout_period() which has been there since 2012, so that's not it.
> >
> > Looking at fprop_new_period() is more interesting. Commit a91befde3503
> > removed an earlier call to local_irq_save(). It was then replaced with
> > preempt_disable() in 9458e0a78c45 but maybe removing it was just
> > erroneous?
> >
> > Anyway, that was 2022, so it doesn't answer "why is this only showing up
> > now and only for fuse?" But maybe replacing the preempt-disable with
> > irq-disable in fprop_new_period() is the right solution, regardless.
>
> So I don't have a great explanation why it is showing up only now and only
> for FUSE. It seems the fprop code is unsafe wrt interrupts because
> fprop_new_period() grabs
>
> write_seqcount_begin(&p->sequence);
>
> and if IO completion interrupt on this CPU comes while p->sequence is odd,
> the call to
>
> read_seqcount_begin(&p->sequence);
>
> in __folio_end_writeback() -> __wb_writeout_add() -> wb_domain_writeout_add()
> -> __fprop_add_percpu_max() -> fprop_fraction_percpu() will loop
> indefinitely. *However* this isn't in fact possible because
> fprop_new_period() is only called from a timer code and thus in softirq
> context and thus IO completion softirq cannot really preempt it.
>
> But for the same reason I don't think what lockdep complains about is
> really possible because xa_lock gets only used from IO completion softirq as
> well. Or can we really acquire it from some hard irq context? Based on
> lockdep report at least lockdep things IO completion runs in hardirq
> context but then I don't see why we're not seeing complaints like this all
> the time and even deadlocks I've described above. I guess I'll have to do
> some experimentation to refresh how these things behave these days...
So I've got to experiment with this some more. It appears that my
recollection of block layer IO completion code is about decade old and
these days bio completion routines (and thus __folio_end_writeback()) are
getting called in hardirq context in most cases (verified this with ftrace).
Hence the flexible proportions code is indeed prone to deadlocks with
hardirqs. I'm just wondering why aren't we seeing much more lockdep reports
about this and possibly also real deadlocks... Anyway, I'll send a patch to
make fprop_new_period() irq safe.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2026-01-20 14:35 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <9b845a47-9aee-43dd-99bc-1a82bea00442@bsbernd.com>
2026-01-10 15:31 ` Bernd Schubert
2026-01-10 16:30 ` Matthew Wilcox
2026-01-10 20:24 ` Bernd Schubert
2026-01-12 13:06 ` Jan Kara
2026-01-12 13:13 ` Bernd Schubert
2026-01-12 17:20 ` Jan Kara
2026-01-12 20:43 ` Joanne Koong
2026-01-20 14:35 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4sn6k56c7g3jwvzze4imc4pilmomekvcelo7zo2awjqxsifaqe@djs5p2l55q64 \
--to=jack@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=bernd@bsbernd.com \
--cc=david@kernel.org \
--cc=hbirthelmer@ddn.com \
--cc=joannelkoong@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=miklos@szeredi.hu \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox