Re: [PATCH v2 1/1] fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()


From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Jan Kara <jack@suse.cz>, Joanne Koong <joannelkoong@gmail.com>
Cc: akpm@linux-foundation.org, miklos@szeredi.hu, linux-mm@kvack.org,
	athul.krishna.kr@protonmail.com, j.neuschaefer@gmx.net,
	carnil@debian.org, linux-fsdevel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v2 1/1] fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()
Date: Tue, 6 Jan 2026 11:05:43 +0100
Message-ID: <616c2e51-ff69-4ef9-9637-41f3ff8691dd@kernel.org>
In-Reply-To: <ypyumqgv5p7dnxmq34q33keb6kzqnp66r33gtbm4pglgdmhma6@3oleltql2qgp>

On 1/6/26 10:33, Jan Kara wrote:
> [Thanks to Andrew for CCing me on patch commit]
> 
> On Sun 14-12-25 19:00:43, Joanne Koong wrote:
>> Skip waiting on writeback for inodes that belong to mappings that do not
>> have data integrity guarantees (denoted by the AS_NO_DATA_INTEGRITY
>> mapping flag).
>>
>> This restores fuse back to prior behavior where syncs are no-ops. This
>> is needed because otherwise, if a system is running a faulty fuse
>> server that does not reply to issued write requests, this will cause
>> wait_sb_inodes() to wait forever.
>>
>> Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
>> Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
>> Reported-by: J. Neuschäfer <j.neuschaefer@gmx.net>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> 
> OK, but the difference 0c58a97f919c introduced goes much further than just
> wait_sb_inodes(). Before 0c58a97f919c also filemap_fdatawait() (and all the
> other variants waiting for folio_writeback() to clear) returned immediately
> because folio writeback was done as soon as we've copied the content into
> the temporary page. Now they will block waiting for the server to finish
> the IO. So e.g. fsync() will block waiting for the server in
> file_write_and_wait_range() now, instead of blocking in fuse_fsync_common()
> -> fuse_simple_request(). Similarly e.g. truncate(2) will now block waiting
> for the server so that folio_writeback can be cleared.
> 
> So I understand your patch fixes the regression with suspend blocking but I
> don't have a high confidence we are not just starting a whack-a-mole game

Yes, I think so, and I think it is [1] not even limited to
writeback [2].

> catching all the places that previously hiddenly depended on
> folio_writeback getting cleared without any involvement of untrusted fuse
> server and now this changed. 

Even worse, it's not only untrusted fuse servers, but also 
trusted-but-buggy fuse servers, unfortunately. As Joanne wrote in v1:

"
As reported by Athul upstream in [1], there is a userspace regression 
caused by commit 0c58a97f919c ("fuse: remove tmp folio for writebacks 
and internal rb tree") where if there is a bug in a fuse server that 
causes the server to never complete writeback, it will make 
wait_sb_inodes() wait forever, causing sync paths to hang.
"

> So do we have some higher-level idea what is /
> is not guaranteed with stuck fuse server?

Joanne first proposed AS_WRITEBACK_MAY_HANG, which I disliked [2] for
various reasons, not least because the semantics are weird. I am
strongly against using such a flag to arbitrarily skip waiting for
writeback on folios in the tree.

The patch here is at least logically the right thing to do when only
looking at the wait_sb_inodes() writeback situation [3] and at why it is
even ok to skip waiting for writeback there, and it is the fix Joanne
originally proposed.
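
To make the wait_sb_inodes() case concrete, here is a minimal userspace
sketch of the skip logic, not kernel code. Everything in it is a
hypothetical model: the model_* names, the struct layout and the
server_responds field are assumptions made for illustration only, and
the real patch operates on the kernel's inode and address_space
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these are NOT the kernel's types or flag values. */
enum { MODEL_AS_NO_DATA_INTEGRITY = 1u << 0 };

struct model_mapping {
	unsigned int flags;     /* models the mapping (address_space) flags */
	bool under_writeback;   /* models "has folios under writeback" */
	bool server_responds;   /* hypothetical: does writeback ever complete? */
};

/*
 * Models a sync-time walk over all mappings of a superblock.
 * Returns how many mappings were waited on and sets *would_hang if any
 * of those waits could never finish (server never replies).
 */
static size_t model_wait_sb_inodes(const struct model_mapping *maps,
				   size_t n, bool *would_hang)
{
	size_t waited = 0;

	assert(would_hang != NULL);
	*would_hang = false;

	for (size_t i = 0; i < n; i++) {
		/* Nothing in flight: nothing to wait for. */
		if (!maps[i].under_writeback)
			continue;
		/* No data integrity guarantees: sync does not wait here. */
		if (maps[i].flags & MODEL_AS_NO_DATA_INTEGRITY)
			continue;
		waited++;
		if (!maps[i].server_responds)
			*would_hang = true;
	}
	return waited;
}
```

In this model, a mapping with the flag set is never waited on, so a
server that never replies cannot stall the walk. Any other caller that
waits on the same writeback is unaffected by the check, which is the
whack-a-mole concern.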

To handle the bigger picture (I raised another problematic instance in 
[4]): I don't know how to handle that without properly fixing fuse. Fuse 
folks should really invest some time to solve this problem for good.

As a big temporary kernel hack, we could add an
AS_ANY_WAITING_UTTERLY_BROKEN flag and simply refuse to wait for
writeback directly inside folio_wait_writeback() -- not arbitrarily
skipping it in callers -- and possibly in other places (readahead, not
sure). That would restore the old behavior.

Well, not quite, because the semantics that folio_wait_writeback() 
promises -- writeback flag at least cleared once, like required here for 
data integrity -- are just not true anymore.
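
A minimal userspace sketch of that hack and of the caveat, again not
kernel code: the model_* names, the flag value and the server_completes
parameter are hypothetical, and the real folio_wait_writeback() has a
different signature and sleeps instead of returning a status.

```c
#include <stdbool.h>

/* Userspace model only: hypothetical flag, not an existing kernel flag. */
enum { MODEL_AS_ANY_WAITING_UTTERLY_BROKEN = 1u << 1 };

enum model_wait_result {
	MODEL_WAIT_CLEAN,       /* returned with the writeback flag clear */
	MODEL_WAIT_SKIPPED,     /* refused to wait, writeback still set */
	MODEL_WAIT_WOULD_HANG,  /* a real waiter would block forever */
};

struct model_folio {
	unsigned int mapping_flags; /* models the owning mapping's flags */
	bool writeback;             /* models the folio writeback flag */
};

/*
 * Models waiting for writeback on one folio. server_completes is a
 * hypothetical stand-in for "the fuse server eventually replies".
 */
static enum model_wait_result
model_folio_wait_writeback(struct model_folio *folio, bool server_completes)
{
	if (!folio->writeback)
		return MODEL_WAIT_CLEAN;

	/* The hack: refuse to wait inside the helper, not in its callers. */
	if (folio->mapping_flags & MODEL_AS_ANY_WAITING_UTTERLY_BROKEN)
		return MODEL_WAIT_SKIPPED;

	if (!server_completes)
		return MODEL_WAIT_WOULD_HANG;

	/* Normal case: the server replied and writeback was cleared. */
	folio->writeback = false;
	return MODEL_WAIT_CLEAN;
}
```

The MODEL_WAIT_SKIPPED case is the problem: the caller gets control
back while the writeback flag is still set, so code that relies on the
flag having been cleared at least once can no longer trust the helper.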

And it would still break migration of folios that are under writeback,
even though waiting for writeback, even for migration, will do the right
thing in 99.9999% of all cases with a trusted fuse server. Just nasty.

Of course, we could set AS_ANY_WAITING_UTTERLY_BROKEN in fuse only
conditionally, but given that buggy trusted fuse servers are now a
thing, it all stops making any sense because we would have to set that
flag always.

There is no easy way to get back the old behavior without reverting to
the old way of using buffer pages, I guess.

[1] https://lore.kernel.org/linux-mm/504d100d-b8f3-475b-b575-3adfd17627b5@kernel.org/
[2] https://lore.kernel.org/linux-mm/f8da9ee0-f136-4366-b63a-1812fda11304@kernel.org/
[3] https://lore.kernel.org/linux-mm/6d0948f5-e739-49f3-8e23-359ddbf3da8f@kernel.org/
[4] https://lore.kernel.org/linux-mm/504d100d-b8f3-475b-b575-3adfd17627b5@kernel.org/

-- 
Cheers

David

