linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Bernd Schubert <bernd@bsbernd.com>
Cc: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>,
	 Joanne Koong <joannelkoong@gmail.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	 Horst Birthelmer <hbirthelmer@ddn.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	 "David Hildenbrand (Red Hat)" <david@kernel.org>
Subject: Re: __folio_end_writeback() lockdep issue
Date: Mon, 12 Jan 2026 18:20:35 +0100	[thread overview]
Message-ID: <yovlpjhao5gymqcdmg25w5tqxq4gduq3ccbrnsx4nssp4v2xs5@bflx5zxxb5l3> (raw)
In-Reply-To: <fc67afac-2345-43bf-a67e-c36f2d9f04c3@bsbernd.com>

On Mon 12-01-26 14:13:26, Bernd Schubert wrote:
> 
> 
> On 1/12/26 14:06, Jan Kara wrote:
> > On Sat 10-01-26 16:30:28, Matthew Wilcox wrote:
> >> On Sat, Jan 10, 2026 at 04:31:28PM +0100, Bernd Schubert wrote:
> >>> [  872.499480]  Possible interrupt unsafe locking scenario:
> >>> [  872.499480] 
> >>> [  872.500326]        CPU0                    CPU1
> >>> [  872.500906]        ----                    ----
> >>> [  872.501464]   lock(&p->sequence);
> >>> [  872.501923]                                local_irq_disable();
> >>> [  872.502615]                                lock(&xa->xa_lock#4);
> >>> [  872.503327]                                lock(&p->sequence);
> >>> [  872.504116]   <Interrupt>
> >>> [  872.504513]     lock(&xa->xa_lock#4);
> >>>
> >>>
> >>> Which is introduced by commit 2841808f35ee for all file systems. 
> >>> The should be rather generic - I shouldn't be the only one seeing
> >>> it?
> >>
> >> Oh wow, 2841808f35ee has a very confusing commit message.  It implies
> >> that _no_ filesystem uses BDI_CAP_WRITEBACK_ACCT, but what it really
> >> means is that no filesystem now _clears_ BDI_CAP_WRITEBACK_ACCT, so
> >> all filesystems do use this code path and therefore the flag can be
> >> removed.  And that matches the code change.
> >>
> >> So you should be able to reproduce this problem with commit 494d2f508883
> >> as well?
> >>
> >> That tells me that this is something fuse-specific.  Other filesystems
> >> aren't seeing this.  Wonder why ...
> >>
> >> __wb_writeout_add() or its predecessor __wb_writeout_inc() have been in
> >> that spot since 2015 or earlier.  
> >>
> >> The sequence lock itself is taken inside fprop_new_period() called from
> >> writeout_period() which has been there since 2012, so that's not it.
> >>
> >> Looking at fprop_new_period() is more interesting.  Commit a91befde3503
> >> removed an earlier call to local_irq_save().  It was then replaced with
> >> preempt_disable() in 9458e0a78c45 but maybe removing it was just
> >> erroneous?
> >>
> >> Anyway, that was 2022, so it doesn't answer "why is this only showing up
> >> now and only for fuse?"  But maybe replacing the preempt-disable with
> >> irq-disable in fprop_new_period() is the right solution, regardless.
> > 
> > So I don't have a great explanation why it is showing up only now and only
> > for FUSE. It seems the fprop code is unsafe wrt interrupts because
> > fprop_new_period() grabs
> > 
> >         write_seqcount_begin(&p->sequence);
> > 
> > and if IO completion interrupt on this CPU comes while p->sequence is odd,
> > the call to
> > 
> > 	read_seqcount_begin(&p->sequence);
> > 
> > in __folio_end_writeback() -> __wb_writeout_add() -> wb_domain_writeout_add()
> > -> __fprop_add_percpu_max() -> fprop_fraction_percpu() will loop
> > indefinitely. *However* this isn't in fact possible because
> > fprop_new_period() is only called from a timer code and thus in softirq
> > context and thus IO completion softirq cannot really preempt it.
> > 
> > But for the same reason I don't think what lockdep complains about is
> > really possible because xa_lock gets only used from IO completion softirq as
> > well. Or can we really acquire it from some hard irq context? Based on
> > lockdep report at least lockdep things IO completion runs in hardirq
> > context but then I don't see why we're not seeing complaints like this all
> > the time and even deadlocks I've described above. I guess I'll have to do
> > some experimentation to refresh how these things behave these days...
> 
> Is there anything that speaks about the patch I had posted?
> __wb_writeout_add() doesn't need the xa lock?

That's a bit hairy question :) because currently xa lock is what makes sure
the wb we've got from the inode is still valid. Now that you've forced me
to think about it: If we move __wb_writeout_add() from under xa lock, inode
could be switched to different wb after we release xa lock and before we
call __wb_writeout_add(). I don't think there is anything that would
protect the wb from being freed in that case before __wb_writeout_add() and
thus we could create UAF issue if I'm not mistaken.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


  reply	other threads:[~2026-01-12 17:20 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <9b845a47-9aee-43dd-99bc-1a82bea00442@bsbernd.com>
2026-01-10 15:31 ` Bernd Schubert
2026-01-10 16:30   ` Matthew Wilcox
2026-01-10 20:24     ` Bernd Schubert
2026-01-12 13:06     ` Jan Kara
2026-01-12 13:13       ` Bernd Schubert
2026-01-12 17:20         ` Jan Kara [this message]
2026-01-12 20:43       ` Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yovlpjhao5gymqcdmg25w5tqxq4gduq3ccbrnsx4nssp4v2xs5@bflx5zxxb5l3 \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=bernd@bsbernd.com \
    --cc=david@kernel.org \
    --cc=hbirthelmer@ddn.com \
    --cc=joannelkoong@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=miklos@szeredi.hu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox