[PATCH v1 0/2] mm: skip wait in wait_sb_inodes() for hangable-writeback mappings

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v1 0/2] mm: skip wait in wait_sb_inodes() for hangable-writeback mappings
@ 2025-11-20 18:42 Joanne Koong
  2025-11-20 18:42 ` [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG Joanne Koong
  2025-11-20 18:42 ` [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes() Joanne Koong
  0 siblings, 2 replies; 15+ messages in thread
From: Joanne Koong @ 2025-11-20 18:42 UTC (permalink / raw)
  To: akpm; +Cc: david, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

As reported by Athul upstream in [1], there is a userspace regression caused
by commit 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb
tree") where if there is a bug in a fuse server that causes the server to
never complete writeback, it will make wait_sb_inodes() wait forever, causing
sync paths to hang.

This is a resubmission of this patch [2] that was dropped from the original
series due to a buggy/malicious server still being able to hold up sync() /
the system in other ways if they wanted to, but the wait_sb_inodes() path is
particularly common and easier to hit for malfunctioning servers.

Thanks,
Joanne

[1] https://lore.kernel.org/regressions/CAJnrk1ZjQ8W8NzojsvJPRXiv9TuYPNdj8Ye7=Cgkj=iV_i8EaA@mail.gmail.com/T/#t
[2] https://lore.kernel.org/linux-fsdevel/20241122232359.429647-4-joannelkoong@gmail.com/

Joanne Koong (2):
  mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to
    AS_WRITEBACK_MAY_HANG
  fs/writeback: skip inodes with potential writeback hang in
    wait_sb_inodes()

 fs/fs-writeback.c       |  3 +++
 fs/fuse/file.c          |  2 +-
 include/linux/pagemap.h | 10 +++++-----
 mm/vmscan.c             |  3 +--
 4 files changed, 10 insertions(+), 8 deletions(-)

-- 
2.47.3

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG
  2025-11-20 18:42 [PATCH v1 0/2] mm: skip wait in wait_sb_inodes() for hangable-writeback mappings Joanne Koong
@ 2025-11-20 18:42 ` Joanne Koong
  2025-11-20 20:08   ` David Hildenbrand (Red Hat)
  2025-11-20 18:42 ` [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes() Joanne Koong
  1 sibling, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-20 18:42 UTC (permalink / raw)
  To: akpm; +Cc: david, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM was added to avoid waiting on
writeback during reclaim for inodes belonging to filesystems where
a) waiting on writeback in reclaim may lead to a deadlock or
b) a writeback request may never complete due to the nature of the
filesystem (unrelated to reclaim)

Rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to the more generic
AS_WRITEBACK_MAY_HANG to reflect mappings where writeback may hang where
the cause could be unrelated to reclaim.

This allows us to later use AS_WRITEBACK_MAY_HANG to mitigate other
scenarios such as possible hangs when sync waits on writeback.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/file.c          |  2 +-
 include/linux/pagemap.h | 10 +++++-----
 mm/vmscan.c             |  3 +--
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f1ef77a0be05..0804c832bcb7 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3126,7 +3126,7 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
 	inode->i_fop = &fuse_file_operations;
 	inode->i_data.a_ops = &fuse_file_aops;
 	if (fc->writeback_cache)
-		mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
+		mapping_set_writeback_may_hang(&inode->i_data);
 
 	INIT_LIST_HEAD(&fi->write_files);
 	INIT_LIST_HEAD(&fi->queued_writes);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 09b581c1d878..a895d6b6aabb 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -210,7 +210,7 @@ enum mapping_flags {
 	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
 	AS_INACCESSIBLE = 8,	/* Do not attempt direct R/W access to the mapping */
-	AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9,
+	AS_WRITEBACK_MAY_HANG = 9,
 	AS_KERNEL_FILE = 10,	/* mapping for a fake kernel file that shouldn't
 				   account usage to user cgroups */
 	/* Bits 16-25 are used for FOLIO_ORDER */
@@ -338,14 +338,14 @@ static inline bool mapping_inaccessible(const struct address_space *mapping)
 	return test_bit(AS_INACCESSIBLE, &mapping->flags);
 }
 
-static inline void mapping_set_writeback_may_deadlock_on_reclaim(struct address_space *mapping)
+static inline void mapping_set_writeback_may_hang(struct address_space *mapping)
 {
-	set_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
+	set_bit(AS_WRITEBACK_MAY_HANG, &mapping->flags);
 }
 
-static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct address_space *mapping)
+static inline bool mapping_writeback_may_hang(const struct address_space *mapping)
 {
-	return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags);
+	return test_bit(AS_WRITEBACK_MAY_HANG, &mapping->flags);
 }
 
 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 92980b072121..636c18ee2b2c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1216,8 +1216,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			} else if (writeback_throttling_sane(sc) ||
 			    !folio_test_reclaim(folio) ||
 			    !may_enter_fs(folio, sc->gfp_mask) ||
-			    (mapping &&
-			     mapping_writeback_may_deadlock_on_reclaim(mapping))) {
+			    (mapping && mapping_writeback_may_hang(mapping))) {
 				/*
 				 * This is slightly racy -
 				 * folio_end_writeback() might have
-- 
2.47.3



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG
  2025-11-20 18:42 ` [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG Joanne Koong
@ 2025-11-20 20:08   ` David Hildenbrand (Red Hat)
  2025-11-20 21:28     ` Joanne Koong
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-20 20:08 UTC (permalink / raw)
  To: Joanne Koong, akpm
  Cc: linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On 11/20/25 19:42, Joanne Koong wrote:
> AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM was added to avoid waiting on
> writeback during reclaim for inodes belonging to filesystems where
> a) waiting on writeback in reclaim may lead to a deadlock or
> b) a writeback request may never complete due to the nature of the
> filesystem (unrelated to reclaim)
> 
> Rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to the more generic
> AS_WRITEBACK_MAY_HANG to reflect mappings where writeback may hang where
> the cause could be unrelated to reclaim.
> 
> This allows us to later use AS_WRITEBACK_MAY_HANG to mitigate other
> scenarios such as possible hangs when sync waits on writeback.

Hmm, there is a difference whether writeback may hang or whether 
writeback may deadlock.

In particular, isn't it the case that writeback on any filesystem might 
effectively hang forever on I/O errors etc?

Is this going back to the previous flag semantics before we decided on 
AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM? (I'd have to look at the previous 
discussions, but "writeback may take an indefinite amount" in patch #2 
pretty much looks like what I remember there)

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG
  2025-11-20 20:08   ` David Hildenbrand (Red Hat)
@ 2025-11-20 21:28     ` Joanne Koong
  0 siblings, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-11-20 21:28 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On Thu, Nov 20, 2025 at 12:08 PM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
>
> On 11/20/25 19:42, Joanne Koong wrote:
> > AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM was added to avoid waiting on
> > writeback during reclaim for inodes belonging to filesystems where
> > a) waiting on writeback in reclaim may lead to a deadlock or
> > b) a writeback request may never complete due to the nature of the
> > filesystem (unrelated to reclaim)
> >
> > Rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to the more generic
> > AS_WRITEBACK_MAY_HANG to reflect mappings where writeback may hang where
> > the cause could be unrelated to reclaim.
> >
> > This allows us to later use AS_WRITEBACK_MAY_HANG to mitigate other
> > scenarios such as possible hangs when sync waits on writeback.
>
> Hmm, there is a difference whether writeback may hang or whether
> writeback may deadlock.
>
> In particular, isn't it the case that writeback on any filesystem might
> effectively hang forever on I/O errors etc?
>
> Is this going back to the previous flag semantics before we decided on
> AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM? (I'd have to look at the previous
> discussions, but "writeback may take an indefinite amount" in patch #2
> pretty much looks like what I remember there)

Yes, I think if we keep AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, then we
would need another flag to denote the inode should be skipped in
wait_sb_inodes(), which seems unideal. I was considering renaming this
to AS_WRITEBACK_INDETERMINATE but I remember everyone hated that name.

Thanks,
Joanne

>
> --
> Cheers
>
> David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-20 18:42 [PATCH v1 0/2] mm: skip wait in wait_sb_inodes() for hangable-writeback mappings Joanne Koong
  2025-11-20 18:42 ` [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG Joanne Koong
@ 2025-11-20 18:42 ` Joanne Koong
  2025-11-20 20:23   ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-20 18:42 UTC (permalink / raw)
  To: akpm; +Cc: david, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

During superblock writeback waiting, skip inodes where writeback may
take an indefinite amount of time or hang, as denoted by the
AS_WRITEBACK_MAY_HANG mapping flag.

Currently, fuse is the only filesystem with this flag set. For a
properly functioning fuse server, writeback requests are completed and
there is no issue. However, if there is a bug in the fuse server and it
hangs on writeback, then without this change, wait_sb_inodes() will wait
forever.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
---
 fs/fs-writeback.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 2b35e80037fe..eb246e9fbf3d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2733,6 +2733,9 @@ static void wait_sb_inodes(struct super_block *sb)
 		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
 			continue;
 
+		if (mapping_writeback_may_hang(mapping))
+			continue;
+
 		spin_unlock_irq(&sb->s_inode_wblist_lock);
 
 		spin_lock(&inode->i_lock);
-- 
2.47.3



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-20 18:42 ` [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes() Joanne Koong
@ 2025-11-20 20:23   ` David Hildenbrand (Red Hat)
  2025-11-20 21:20     ` Joanne Koong
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-20 20:23 UTC (permalink / raw)
  To: Joanne Koong, akpm
  Cc: linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On 11/20/25 19:42, Joanne Koong wrote:
> During superblock writeback waiting, skip inodes where writeback may
> take an indefinite amount of time or hang, as denoted by the
> AS_WRITEBACK_MAY_HANG mapping flag.
> 
> Currently, fuse is the only filesystem with this flag set. For a
> properly functioning fuse server, writeback requests are completed and
> there is no issue. However, if there is a bug in the fuse server and it
> hangs on writeback, then without this change, wait_sb_inodes() will wait
> forever.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
> Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
> ---
>   fs/fs-writeback.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 2b35e80037fe..eb246e9fbf3d 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -2733,6 +2733,9 @@ static void wait_sb_inodes(struct super_block *sb)
>   		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
>   			continue;
>   
> +		if (mapping_writeback_may_hang(mapping))
> +			continue;

I think I raised it in the past, but simply because it could happen, why 
would we unconditionally want to do that for all fuse mounts? That just 
seems wrong :(

To phrase it in a different way, if any writeback could theoretically 
hang, why are we even waiting on writeback in the first place?

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-20 20:23   ` David Hildenbrand (Red Hat)
@ 2025-11-20 21:20     ` Joanne Koong
  2025-11-24 13:58       ` David Hildenbrand (Red Hat)
  0 siblings, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-20 21:20 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On Thu, Nov 20, 2025 at 12:23 PM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
>
> On 11/20/25 19:42, Joanne Koong wrote:
> > During superblock writeback waiting, skip inodes where writeback may
> > take an indefinite amount of time or hang, as denoted by the
> > AS_WRITEBACK_MAY_HANG mapping flag.
> >
> > Currently, fuse is the only filesystem with this flag set. For a
> > properly functioning fuse server, writeback requests are completed and
> > there is no issue. However, if there is a bug in the fuse server and it
> > hangs on writeback, then without this change, wait_sb_inodes() will wait
> > forever.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
> > Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
> > ---
> >   fs/fs-writeback.c | 3 +++
> >   1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 2b35e80037fe..eb246e9fbf3d 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -2733,6 +2733,9 @@ static void wait_sb_inodes(struct super_block *sb)
> >               if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> >                       continue;
> >
> > +             if (mapping_writeback_may_hang(mapping))
> > +                     continue;
>
> I think I raised it in the past, but simply because it could happen, why
> would we unconditionally want to do that for all fuse mounts? That just
> seems wrong :(

I think it's considered a userspace regression if we don't revert the
program behavior back to its previous version, even if it is from the
program being incorrectly written, as per the conversation in [1].

[1] https://lore.kernel.org/regressions/CAJnrk1Yh4GtF-wxWo_2ffbr90R44u0WDmMAEn9vr9pFgU0Nc6w@mail.gmail.com/T/#m73cf4b4828d51553caad3209a5ac92bca78e15d2

>
> To phrase it in a different way, if any writeback could theoretically
> hang, why are we even waiting on writeback in the first place?
>

I think it's because on other filesystems, something has to go
seriously wrong for writeback to hang, but on fuse a server can easily
make writeback hang and as it turns out, there are already existing
userspace programs that do this accidentally.

Thanks,
Joanne

> --
> Cheers
>
> David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-20 21:20     ` Joanne Koong
@ 2025-11-24 13:58       ` David Hildenbrand (Red Hat)
  2025-11-25  1:10         ` Joanne Koong
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-24 13:58 UTC (permalink / raw)
  To: Joanne Koong
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On 11/20/25 22:20, Joanne Koong wrote:
> On Thu, Nov 20, 2025 at 12:23 PM David Hildenbrand (Red Hat)
> <david@kernel.org> wrote:
>>
>> On 11/20/25 19:42, Joanne Koong wrote:
>>> During superblock writeback waiting, skip inodes where writeback may
>>> take an indefinite amount of time or hang, as denoted by the
>>> AS_WRITEBACK_MAY_HANG mapping flag.
>>>
>>> Currently, fuse is the only filesystem with this flag set. For a
>>> properly functioning fuse server, writeback requests are completed and
>>> there is no issue. However, if there is a bug in the fuse server and it
>>> hangs on writeback, then without this change, wait_sb_inodes() will wait
>>> forever.
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>> Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
>>> Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
>>> ---
>>>    fs/fs-writeback.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>>> index 2b35e80037fe..eb246e9fbf3d 100644
>>> --- a/fs/fs-writeback.c
>>> +++ b/fs/fs-writeback.c
>>> @@ -2733,6 +2733,9 @@ static void wait_sb_inodes(struct super_block *sb)
>>>                if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
>>>                        continue;
>>>
>>> +             if (mapping_writeback_may_hang(mapping))
>>> +                     continue;
>>
>> I think I raised it in the past, but simply because it could happen, why
>> would we unconditionally want to do that for all fuse mounts? That just
>> seems wrong :(
> 
> I think it's considered a userspace regression if we don't revert the
> program behavior back to its previous version, even if it is from the
> program being incorrectly written, as per the conversation in [1].
> 
> [1] https://lore.kernel.org/regressions/CAJnrk1Yh4GtF-wxWo_2ffbr90R44u0WDmMAEn9vr9pFgU0Nc6w@mail.gmail.com/T/#m73cf4b4828d51553caad3209a5ac92bca78e15d2
> 
>>
>> To phrase it in a different way, if any writeback could theoretically
>> hang, why are we even waiting on writeback in the first place?
>>
> 
> I think it's because on other filesystems, something has to go
> seriously wrong for writeback to hang, but on fuse a server can easily
> make writeback hang and as it turns out, there are already existing
> userspace programs that do this accidentally.

Sorry, I only found the time to reply now. I wanted to reply in more 
detail why what you propose here does not make sense to me.

I understand that it might make one of the weird fuse scenarios (buggy 
fuse server) work again, but it sounds like we are adding more hacks on 
top of broken semantics. If we want to tackle the writeback problem, we 
should find a proper way to deal with that for good.

(1) AS_WRITEBACK_MAY_HANG semantics

As discussed in the past, writeeback of pretty much any filesystem might 
hang forever on I/O errors.

On network filesystems apparently as well fairly easily.

It's completely unclear when to set AS_WRITEBACK_MAY_HANG.

So as writeback on any filesystem may hang, AS_WRITEBACK_MAY_HANG would 
theoretically have to be set on any mapping out there.

The semantics don't make sense to me, unfortuantely.

(2) AS_WRITEBACK_MAY_HANG usage

It's unclear in which scenarios we would not want to wait for writeback, 
and what the effects of that are.

For example, wait_sb_inodes() documents "Data integrity sync. Must wait 
for all pages under writeback, because there may have been pages dirtied 
before our sync call ...".

It's completely unclear why it might be okay to skip that simply because 
a mapping indicated that waiting for writeback is maybe more sketchy 
than on other filesystems.

But what concerns me more is what we do about other 
folio_wait_writeback() callers. Throwing in AS_WRITEBACK_MAY_HANG 
wherever somebody reproduced a hang is not a good approach.

We need something more robust where we can just not break the kernel in 
weird ways because user space is buggy or malicious.

(3) Other operations

If my memory serves me right, there are similar issues on readahead. It 
wouldn't surprise me if there are yet other operations where fuse Et al 
can trick the kernel into hanging forever.

So I'm wondering if there is more to this than just "writeback may hang".

Obviously, getting the kernel to hang, controlled by user space that 
easily, is extremely unpleasant and probably the thing that I really 
dislike about fuse. Amir mentioned that maybe the iomap changes from 
Darrick might improve the situation in the long run, I would hope it 
would allow for de-nastifying fuse in that sense, at least in some 
scenarios.

I cannot really say what would be better here (maybe aborting writeback 
after a short timeout), but AS_WRITEBACK_MAY_HANG to then just skip 
selected waits for writeback is certainly something that does not make 
sense to me.

Regarding the patch here, is there a good reason why fuse does not have 
to wait for the "Data integrity sync. Must wait for all pages under 
writeback ..."?

IOW, is the documented "must" not a "must" for fuse? In that case, 
having a flag that states something like that that 
"AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want 
to add to avoid waiting for writeback with clear semantics why it is ok 
in that specific scenario.

Hope that helps, and happy to be convinced why AS_WRITEBACK_MAY_HANG is 
the right thing to do in this way proposed here.

-- 
Cheers

David

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-24 13:58       ` David Hildenbrand (Red Hat)
@ 2025-11-25  1:10         ` Joanne Koong
  2025-11-26 10:19           ` Miklos Szeredi
  2025-11-26 10:55           ` David Hildenbrand (Red Hat)
  0 siblings, 2 replies; 15+ messages in thread
From: Joanne Koong @ 2025-11-25  1:10 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On Mon, Nov 24, 2025 at 5:58 AM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
>
> On 11/20/25 22:20, Joanne Koong wrote:
> > On Thu, Nov 20, 2025 at 12:23 PM David Hildenbrand (Red Hat)
> > <david@kernel.org> wrote:
> >>
> >> On 11/20/25 19:42, Joanne Koong wrote:
> >>> During superblock writeback waiting, skip inodes where writeback may
> >>> take an indefinite amount of time or hang, as denoted by the
> >>> AS_WRITEBACK_MAY_HANG mapping flag.
> >>>
> >>> Currently, fuse is the only filesystem with this flag set. For a
> >>> properly functioning fuse server, writeback requests are completed and
> >>> there is no issue. However, if there is a bug in the fuse server and it
> >>> hangs on writeback, then without this change, wait_sb_inodes() will wait
> >>> forever.
> >>>
> >>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> >>> Fixes: 0c58a97f919c ("fuse: remove tmp folio for writebacks and internal rb tree")
> >>> Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
> >>> ---
> >>>    fs/fs-writeback.c | 3 +++
> >>>    1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> >>> index 2b35e80037fe..eb246e9fbf3d 100644
> >>> --- a/fs/fs-writeback.c
> >>> +++ b/fs/fs-writeback.c
> >>> @@ -2733,6 +2733,9 @@ static void wait_sb_inodes(struct super_block *sb)
> >>>                if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> >>>                        continue;
> >>>
> >>> +             if (mapping_writeback_may_hang(mapping))
> >>> +                     continue;
> >>
> >> I think I raised it in the past, but simply because it could happen, why
> >> would we unconditionally want to do that for all fuse mounts? That just
> >> seems wrong :(
> >
> > I think it's considered a userspace regression if we don't revert the
> > program behavior back to its previous version, even if it is from the
> > program being incorrectly written, as per the conversation in [1].
> >
> > [1] https://lore.kernel.org/regressions/CAJnrk1Yh4GtF-wxWo_2ffbr90R44u0WDmMAEn9vr9pFgU0Nc6w@mail.gmail.com/T/#m73cf4b4828d51553caad3209a5ac92bca78e15d2
> >
> >>
> >> To phrase it in a different way, if any writeback could theoretically
> >> hang, why are we even waiting on writeback in the first place?
> >>
> >
> > I think it's because on other filesystems, something has to go
> > seriously wrong for writeback to hang, but on fuse a server can easily
> > make writeback hang and as it turns out, there are already existing
> > userspace programs that do this accidentally.
>
> Sorry, I only found the time to reply now. I wanted to reply in more
> detail why what you propose here does not make sense to me.

No worries at all, thank you for taking the time to write out your thoughts.

>
> I understand that it might make one of the weird fuse scenarios (buggy
> fuse server) work again, but it sounds like we are adding more hacks on
> top of broken semantics. If we want to tackle the writeback problem, we
> should find a proper way to deal with that for good.

I agree that this doesn't solve the underlying problem that folios
belonging to a malfunctioning fuse server may be stuck in writeback
state forever. To properly and comprehensively address that and the
other issues (which you alluded to a bit in section 3 below) would I
think be a much larger effort, but as I understand it, a userspace
regression needs to be resolved more immediately. I wasn't aware that
if the regression is caused by a faulty userspace program, that rule
still holds, but I was made aware of that. Even though there are other
ways that sync could be held up by a faulty/malicious userspace
program prior to the changes that were added in commit 0c58a97f919c
("fuse: remove tmp folio for writebacks and internal rb tree"), I
think the issue is that that commit gives malfunctioning servers
another way, which may be a way that some well-intended but buggy
servers could trigger, which is considered a regression. If it's
acceptable to delay addressing this until the actual solution that
addresses the entire problem, then I agree that this patchset is
unnecessary and we should just wait for the more comprehensive
solution.

>
>
> (1) AS_WRITEBACK_MAY_HANG semantics
>
> As discussed in the past, writeeback of pretty much any filesystem might
> hang forever on I/O errors.
>
> On network filesystems apparently as well fairly easily.
>
> It's completely unclear when to set AS_WRITEBACK_MAY_HANG.
>
> So as writeback on any filesystem may hang, AS_WRITEBACK_MAY_HANG would
> theoretically have to be set on any mapping out there.
>
> The semantics don't make sense to me, unfortuantely.

I'm not sure what a better name here would be unfortunately. I
considered AS_WRITEBACK_UNRELIABLE and AS_WRITEBACK_UNSTABLE but I
think those run into the same issue where that could technically be
true of any filesystem (eg the block layer may fail the writeback, so
it's not completely reliable/stable).

>
>
> (2) AS_WRITEBACK_MAY_HANG usage
>
> It's unclear in which scenarios we would not want to wait for writeback,
> and what the effects of that are.
>
> For example, wait_sb_inodes() documents "Data integrity sync. Must wait
> for all pages under writeback, because there may have been pages dirtied
> before our sync call ...".
>
> It's completely unclear why it might be okay to skip that simply because
> a mapping indicated that waiting for writeback is maybe more sketchy
> than on other filesystems.
>
> But what concerns me more is what we do about other
> folio_wait_writeback() callers. Throwing in AS_WRITEBACK_MAY_HANG
> wherever somebody reproduced a hang is not a good approach.

If I'm recalling this correctly (I'm looking back at this patchset [1]
to trigger my memory), there were 3 cases where folio_wait_writeback()
callers run into issues: reclaim, sync, and migration. The reclaim
issue is addressed. For the migration case, I don't think this results
in any user-visible regressions. Not that the migration case is not a
big issue, I think we should find a proper fix for it, but the
migration stall is already easily caused by a server indefinitely
holding a folio lock, so the writeback case didn't add this stall as a
new side effect.

[1] https://lore.kernel.org/linux-fsdevel/20241122232359.429647-1-joannelkoong@gmail.com/
>
> We need something more robust where we can just not break the kernel in
> weird ways because user space is buggy or malicious.
>
>
> (3) Other operations
>
> If my memory serves me right, there are similar issues on readahead. It
> wouldn't surprise me if there are yet other operations where fuse Et al
> can trick the kernel into hanging forever.
>
> So I'm wondering if there is more to this than just "writeback may hang".
>
>
>
> Obviously, getting the kernel to hang, controlled by user space that
> easily, is extremely unpleasant and probably the thing that I really
> dislike about fuse. Amir mentioned that maybe the iomap changes from
> Darrick might improve the situation in the long run, I would hope it
> would allow for de-nastifying fuse in that sense, at least in some
> scenarios.
>
>
> I cannot really say what would be better here (maybe aborting writeback
> after a short timeout), but AS_WRITEBACK_MAY_HANG to then just skip
> selected waits for writeback is certainly something that does not make
> sense to me.
>
>
> Regarding the patch here, is there a good reason why fuse does not have
> to wait for the "Data integrity sync. Must wait for all pages under
> writeback ..."?
>
> IOW, is the documented "must" not a "must" for fuse? In that case,

Prior to the changes added in commit 0c58a97f919c ("fuse: remove tmp
folio for writebacks and internal rb tree"), fuse didn't ensure that
data was written back for sync. The folio was marked as not under
writeback anymore, even if it was still under writeback.

> having a flag that states something like that that
> "AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want
> to add to avoid waiting for writeback with clear semantics why it is ok
> in that specific scenario.

Having a separate AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC mapping flag
sounds reasonable to me and I agree is more clearer semantically.

>
> Hope that helps, and happy to be convinced why AS_WRITEBACK_MAY_HANG is
> the right thing to do in this way proposed here.

This was helpful, thanks for your thoughts!

>
> --
> Cheers
>
> David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-25  1:10         ` Joanne Koong
@ 2025-11-26 10:19           ` Miklos Szeredi
  2025-11-26 10:41             ` David Hildenbrand (Red Hat)
  2025-11-26 10:55           ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2025-11-26 10:19 UTC (permalink / raw)
  To: Joanne Koong
  Cc: David Hildenbrand (Red Hat),
	akpm, linux-mm, shakeel.butt, athul.krishna.kr, stable

On Tue, 25 Nov 2025 at 02:10, Joanne Koong <joannelkoong@gmail.com> wrote:

> Prior to the changes added in commit 0c58a97f919c ("fuse: remove tmp
> folio for writebacks and internal rb tree"), fuse didn't ensure that
> data was written back for sync. The folio was marked as not under
> writeback anymore, even if it was still under writeback.

This is the main point.  Fuse has existed for 20 years without data
integrity guarantees.  Reverting to that behavior is *not* a
regression, it's simply a decades old bug.   And solving that bug is
darn hard, which is why it's not an option at this point.

One idea to limit the scope of this revert is to constrain it to
suspend.  I'm not sure how easy or ugly that might be.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-26 10:19           ` Miklos Szeredi
@ 2025-11-26 10:41             ` David Hildenbrand (Red Hat)
  0 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 10:41 UTC (permalink / raw)
  To: Miklos Szeredi, Joanne Koong
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, stable

On 11/26/25 11:19, Miklos Szeredi wrote:
> On Tue, 25 Nov 2025 at 02:10, Joanne Koong <joannelkoong@gmail.com> wrote:
> 
>> Prior to the changes added in commit 0c58a97f919c ("fuse: remove tmp
>> folio for writebacks and internal rb tree"), fuse didn't ensure that
>> data was written back for sync. The folio was marked as not under
>> writeback anymore, even if it was still under writeback.
> 
> This is the main point.  Fuse has existed for 20 years without data
> integrity guarantees.  Reverting to that behavior is *not* a
> regression, it's simply a decades old bug.   And solving that bug is
> darn hard, which is why it's not an option at this point.

Yes, and it should be clearly spelled out in the patch descriptions that 
this is a fuse-only thing, and why it is a fuse-only thing. It's not a 
"we don't have data integrity because writeback may hang", it's a "we 
don't have data integrity because fuse never supported it, so waiting 
here is simply not required".

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-25  1:10         ` Joanne Koong
  2025-11-26 10:19           ` Miklos Szeredi
@ 2025-11-26 10:55           ` David Hildenbrand (Red Hat)
  2025-11-26 17:58             ` Joanne Koong
  1 sibling, 1 reply; 15+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 10:55 UTC (permalink / raw)
  To: Joanne Koong
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

>>
>> I understand that it might make one of the weird fuse scenarios (buggy
>> fuse server) work again, but it sounds like we are adding more hacks on
>> top of broken semantics. If we want to tackle the writeback problem, we
>> should find a proper way to deal with that for good.
> 
> I agree that this doesn't solve the underlying problem that folios
> belonging to a malfunctioning fuse server may be stuck in writeback
> state forever. To properly and comprehensively address that and the
> other issues (which you alluded to a bit in section 3 below) would I
> think be a much larger effort, but as I understand it, a userspace
> regression needs to be resolved more immediately. 

Right, and that should have be spelled out more clearly in the patch 
description.

The "Currently, fuse is the only filesystem with this flag set. For a
properly functioning fuse server, writeback requests are completed and
there is no issue." part doesn't emphasis that there is no need to wait 
for a different reason: there are no data integrity guarantees with 
fuse, and trying to provide them is currently shaky.

> I wasn't aware that
> if the regression is caused by a faulty userspace program, that rule
> still holds, but I was made aware of that.

Yeah, that is weird though. But I don't understand the details there. 
Hanging the kernel in any case is beyond nasty.

> Even though there are other
> ways that sync could be held up by a faulty/malicious userspace
> program prior to the changes that were added in commit 0c58a97f919c
> ("fuse: remove tmp folio for writebacks and internal rb tree"), I
> think the issue is that that commit gives malfunctioning servers
> another way, which may be a way that some well-intended but buggy
> servers could trigger, which is considered a regression. If it's
> acceptable to delay addressing this until the actual solution that
> addresses the entire problem, then I agree that this patchset is
> unnecessary and we should just wait for the more comprehensive
> solution.
> 
>>
>>
>> (1) AS_WRITEBACK_MAY_HANG semantics
>>
>> As discussed in the past, writeeback of pretty much any filesystem might
>> hang forever on I/O errors.
>>
>> On network filesystems apparently as well fairly easily.
>>
>> It's completely unclear when to set AS_WRITEBACK_MAY_HANG.
>>
>> So as writeback on any filesystem may hang, AS_WRITEBACK_MAY_HANG would
>> theoretically have to be set on any mapping out there.
>>
>> The semantics don't make sense to me, unfortuantely.
> 
> I'm not sure what a better name here would be unfortunately. I
> considered AS_WRITEBACK_UNRELIABLE and AS_WRITEBACK_UNSTABLE but I
> think those run into the same issue where that could technically be
> true of any filesystem (eg the block layer may fail the writeback, so
> it's not completely reliable/stable).

See below, I think this here is really about "no data integrity 
guarantees, so no need to wait even if writeback would be working 
perfectly".

The reproduced hang is rather a symptom of us now trying to achieve data 
integrity when it was never guaranteed.

At least that's my understanding after reading Miklos reply.

> 
>>
>>
>> (2) AS_WRITEBACK_MAY_HANG usage
>>
>> It's unclear in which scenarios we would not want to wait for writeback,
>> and what the effects of that are.
>>
>> For example, wait_sb_inodes() documents "Data integrity sync. Must wait
>> for all pages under writeback, because there may have been pages dirtied
>> before our sync call ...".
>>
>> It's completely unclear why it might be okay to skip that simply because
>> a mapping indicated that waiting for writeback is maybe more sketchy
>> than on other filesystems.
>>
>> But what concerns me more is what we do about other
>> folio_wait_writeback() callers. Throwing in AS_WRITEBACK_MAY_HANG
>> wherever somebody reproduced a hang is not a good approach.
> 
> If I'm recalling this correctly (I'm looking back at this patchset [1]
> to trigger my memory), there were 3 cases where folio_wait_writeback()
> callers run into issues: reclaim, sync, and migration. 

I suspect there are others where we could hang forever, but maybe 
limited to operations where an operation would be stuck forever. Or new 
ones would simply be added in the future.

For example, memory_failure() calls folio_wait_writeback() and I don't 
immediately see why that one would not be able to hit fuse.

So my concern remains. We try to fix the fallout while we really need a 
long term plan of how to deal with that mess.

Sorry that you are the poor soul that opened this can of worms,

[...]

>>
>> Regarding the patch here, is there a good reason why fuse does not have
>> to wait for the "Data integrity sync. Must wait for all pages under
>> writeback ..."?
>>
>> IOW, is the documented "must" not a "must" for fuse? In that case,
> 
> Prior to the changes added in commit 0c58a97f919c ("fuse: remove tmp
> folio for writebacks and internal rb tree"), fuse didn't ensure that
> data was written back for sync. The folio was marked as not under
> writeback anymore, even if it was still under writeback.

Okay, so this really is a fuse special thing.

> 
>> having a flag that states something like that that
>> "AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want
>> to add to avoid waiting for writeback with clear semantics why it is ok
>> in that specific scenario.
> 
> Having a separate AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC mapping flag
> sounds reasonable to me and I agree is more clearer semantically.

Good. Then it's clear that we are not waiting because writeback is 
shaky, but because even if it would be working, because we don't have to 
because there are no such guarantees.

Maybe

AS_NO_DATA_INTEGRITY

or similar would be cleaner, I'll have to leave that to you and Miklos 
to decide what exactly the semantics are that fuse currently doesn't 
provide.


-- 
Cheers

David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-26 10:55           ` David Hildenbrand (Red Hat)
@ 2025-11-26 17:58             ` Joanne Koong
  2025-12-03  9:28               ` Miklos Szeredi
  0 siblings, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-11-26 17:58 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: akpm, linux-mm, shakeel.butt, athul.krishna.kr, miklos, stable

On Wed, Nov 26, 2025 at 2:55 AM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
> >
> >> having a flag that states something like that that
> >> "AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want
> >> to add to avoid waiting for writeback with clear semantics why it is ok
> >> in that specific scenario.
> >
> > Having a separate AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC mapping flag
> > sounds reasonable to me and I agree is more clearer semantically.
>
> Good. Then it's clear that we are not waiting because writeback is
> shaky, but because even if it would be working, because we don't have to
> because there are no such guarantees.
>
> Maybe
>
> AS_NO_DATA_INTEGRITY
>
> or similar would be cleaner, I'll have to leave that to you and Miklos
> to decide what exactly the semantics are that fuse currently doesn't
> provide.

After reading Miklos's reply, I must have misunderstood this then - my
understanding was that the reason we couldn't guarantee data integrity
in fuse was because of the temp pages design where checking the
writeback flag on the real folio doesn't reflect writeback state, but
that removing the temp pages and using the real folio now does
guarantee this. But it seems like it's not as simple as that and
there's no data integrity guarantees for other reasons.

Changing this to AS_NO_DATA_INTEGRITY sounds good to me, if that
sounds good to Miklos as well. Or do you have another preference,
Miklos?

Thanks,
Joanne
>
>
> --
> Cheers
>
> David


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-11-26 17:58             ` Joanne Koong
@ 2025-12-03  9:28               ` Miklos Szeredi
  2025-12-04 18:06                 ` Joanne Koong
  0 siblings, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2025-12-03  9:28 UTC (permalink / raw)
  To: Joanne Koong
  Cc: David Hildenbrand (Red Hat),
	akpm, linux-mm, shakeel.butt, athul.krishna.kr, stable

On Wed, 26 Nov 2025 at 18:58, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Wed, Nov 26, 2025 at 2:55 AM David Hildenbrand (Red Hat)
> <david@kernel.org> wrote:
> > >
> > >> having a flag that states something like that that
> > >> "AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want
> > >> to add to avoid waiting for writeback with clear semantics why it is ok
> > >> in that specific scenario.
> > >
> > > Having a separate AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC mapping flag
> > > sounds reasonable to me and I agree is more clearer semantically.
> >
> > Good. Then it's clear that we are not waiting because writeback is
> > shaky, but because even if it would be working, because we don't have to
> > because there are no such guarantees.
> >
> > Maybe
> >
> > AS_NO_DATA_INTEGRITY
> >
> > or similar would be cleaner, I'll have to leave that to you and Miklos
> > to decide what exactly the semantics are that fuse currently doesn't
> > provide.
>
> After reading Miklos's reply, I must have misunderstood this then - my
> understanding was that the reason we couldn't guarantee data integrity
> in fuse was because of the temp pages design where checking the
> writeback flag on the real folio doesn't reflect writeback state, but
> that removing the temp pages and using the real folio now does
> guarantee this. But it seems like it's not as simple as that and
> there's no data integrity guarantees for other reasons.
>
> Changing this to AS_NO_DATA_INTEGRITY sounds good to me, if that
> sounds good to Miklos as well. Or do you have another preference,
> Miklos?

Sure, sounds good.

(Sorry about the delay, missed this.)

Thanks,
Miklos


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes()
  2025-12-03  9:28               ` Miklos Szeredi
@ 2025-12-04 18:06                 ` Joanne Koong
  0 siblings, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-12-04 18:06 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: David Hildenbrand (Red Hat),
	akpm, linux-mm, shakeel.butt, athul.krishna.kr, stable

On Wed, Dec 3, 2025 at 1:28 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, 26 Nov 2025 at 18:58, Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Wed, Nov 26, 2025 at 2:55 AM David Hildenbrand (Red Hat)
> > <david@kernel.org> wrote:
> > > >
> > > >> having a flag that states something like that that
> > > >> "AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC" would probable be what we would want
> > > >> to add to avoid waiting for writeback with clear semantics why it is ok
> > > >> in that specific scenario.
> > > >
> > > > Having a separate AS_NO_WRITEBACK_WAIT_ON_DATA_SYNC mapping flag
> > > > sounds reasonable to me and I agree is more clearer semantically.
> > >
> > > Good. Then it's clear that we are not waiting because writeback is
> > > shaky, but because even if it would be working, because we don't have to
> > > because there are no such guarantees.
> > >
> > > Maybe
> > >
> > > AS_NO_DATA_INTEGRITY
> > >
> > > or similar would be cleaner, I'll have to leave that to you and Miklos
> > > to decide what exactly the semantics are that fuse currently doesn't
> > > provide.
> >
> > After reading Miklos's reply, I must have misunderstood this then - my
> > understanding was that the reason we couldn't guarantee data integrity
> > in fuse was because of the temp pages design where checking the
> > writeback flag on the real folio doesn't reflect writeback state, but
> > that removing the temp pages and using the real folio now does
> > guarantee this. But it seems like it's not as simple as that and
> > there's no data integrity guarantees for other reasons.
> >
> > Changing this to AS_NO_DATA_INTEGRITY sounds good to me, if that
> > sounds good to Miklos as well. Or do you have another preference,
> > Miklos?
>
> Sure, sounds good.
>
> (Sorry about the delay, missed this.)

Is the reason we can't guarantee data integrity because the server
ultimately controls disk persistence whereby it can claim to have
written data even if it didn't? If so, it seems like NFS / the other
network-based filesystems would also have to set that
AS_NO_DATA_INTEGRITY mapping flag too to be consistent? Or is there
some other reason?

Thanks,
Joanne


>
> Thanks,
> Miklos


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-12-04 18:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-20 18:42 [PATCH v1 0/2] mm: skip wait in wait_sb_inodes() for hangable-writeback mappings Joanne Koong
2025-11-20 18:42 ` [PATCH v1 1/2] mm: rename AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM to AS_WRITEBACK_MAY_HANG Joanne Koong
2025-11-20 20:08   ` David Hildenbrand (Red Hat)
2025-11-20 21:28     ` Joanne Koong
2025-11-20 18:42 ` [PATCH v1 2/2] fs/writeback: skip inodes with potential writeback hang in wait_sb_inodes() Joanne Koong
2025-11-20 20:23   ` David Hildenbrand (Red Hat)
2025-11-20 21:20     ` Joanne Koong
2025-11-24 13:58       ` David Hildenbrand (Red Hat)
2025-11-25  1:10         ` Joanne Koong
2025-11-26 10:19           ` Miklos Szeredi
2025-11-26 10:41             ` David Hildenbrand (Red Hat)
2025-11-26 10:55           ` David Hildenbrand (Red Hat)
2025-11-26 17:58             ` Joanne Koong
2025-12-03  9:28               ` Miklos Szeredi
2025-12-04 18:06                 ` Joanne Koong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox