From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>,
Matthew Wilcox <willy@infradead.org>,
Christian Theune <ct@flyingcircus.io>,
linux-mm@kvack.org,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Daniel Dao <dqminh@cloudflare.com>,
clm@meta.com, regressions@lists.linux.dev,
regressions@leemhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
Date: Mon, 16 Sep 2024 10:00:18 +1000 [thread overview]
Message-ID: <Zud1EhTnoWIRFPa/@dread.disaster.area> (raw)
In-Reply-To: <CAHk-=wh5LRp6Tb2oLKv1LrJWuXKOvxcucMfRMmYcT-npbo0=_A@mail.gmail.com>
On Thu, Sep 12, 2024 at 03:25:50PM -0700, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe <axboe@kernel.dk> wrote:
> Honestly, the fact that it hasn't been reverted after apparently
> people knowing about it for months is a bit shocking to me. Filesystem
> people tend to take unknown corruption issues as a big deal. What
> makes this so special? Is it because the XFS people don't consider it
> an XFS issue, so...
I don't think this is a data corruption/loss problem - it certainly
hasn't ever appeared that way to me. The "data loss" appeared to be
in incomplete postgres dump files after the system was rebooted and
this is exactly what would happen when you randomly crash the
system. i.e. dirty data in memory is lost, and application data
being written at the time is in an inconsistent state after the
system recovers. IOWs, there was no clear evidence of actual data
corruption occuring, and data loss is definitely expected when the
page cache iteration hangs and the system is forcibly rebooted
without being able to sync or unmount the filesystems...
All the hangs seem to be caused by folio lookup getting stuck
on a rogue xarray entry in truncate or readahead. If we find an
invalid entry or a folio from a different mapping or with a
unexpected index, we skip it and try again. Hence this does not
appear to be a data corruption vector, either - it results in a
livelock from endless retry because of the bad entry in the xarray.
This endless retry livelock appears to be what is being reported.
IOWs, there is no evidence of real runtime data corruption or loss
from this pagecache livelock bug. We also haven't heard of any
random file data corruption events since we've enabled large folios
on XFS. Hence there really is no evidence to indicate that there is
a large folio xarray lookup bug that results in data corruption in
the existing code, and therefore there is no obvious reason for
turning off the functionality we are already building significant
new functionality on top of.
It's been 10 months since I asked Christain to help isolate a
reproducer so we can track this down. Nothing came from that, so
we're still at exactly where we were at back in november 2023 -
waiting for information on a way to reproduce this issue more
reliably.
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-09-16 0:00 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-12 21:18 Christian Theune
2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11 ` Christian Theune
2024-09-12 22:12 ` Jens Axboe
2024-09-12 22:25 ` Linus Torvalds
2024-09-12 22:30 ` Jens Axboe
2024-09-12 22:56 ` Linus Torvalds
2024-09-13 3:44 ` Matthew Wilcox
2024-09-13 13:23 ` Christian Theune
2024-09-13 12:11 ` Christian Brauner
2024-09-16 13:29 ` Matthew Wilcox
2024-09-18 9:51 ` Christian Brauner
2024-09-13 15:30 ` Chris Mason
2024-09-13 15:51 ` Matthew Wilcox
2024-09-13 16:33 ` Chris Mason
2024-09-13 18:15 ` Matthew Wilcox
2024-09-13 21:24 ` Linus Torvalds
2024-09-13 21:30 ` Matthew Wilcox
2024-09-13 16:04 ` David Howells
2024-09-13 16:37 ` Chris Mason
2024-09-16 0:00 ` Dave Chinner [this message]
2024-09-16 4:20 ` Linus Torvalds
2024-09-16 8:47 ` Chris Mason
2024-09-17 9:32 ` Matthew Wilcox
2024-09-17 9:36 ` Chris Mason
2024-09-17 10:11 ` Christian Theune
2024-09-17 11:13 ` Chris Mason
2024-09-17 13:25 ` Matthew Wilcox
2024-09-18 6:37 ` Jens Axboe
2024-09-18 9:28 ` Chris Mason
2024-09-18 12:23 ` Chris Mason
2024-09-18 13:34 ` Matthew Wilcox
2024-09-18 13:51 ` Linus Torvalds
2024-09-18 14:12 ` Matthew Wilcox
2024-09-18 14:39 ` Linus Torvalds
2024-09-18 17:12 ` Matthew Wilcox
2024-09-18 16:37 ` Chris Mason
2024-09-19 1:43 ` Dave Chinner
2024-09-19 3:03 ` Linus Torvalds
2024-09-19 3:12 ` Linus Torvalds
2024-09-19 3:38 ` Jens Axboe
2024-09-19 4:32 ` Linus Torvalds
2024-09-19 4:42 ` Jens Axboe
2024-09-19 4:36 ` Matthew Wilcox
2024-09-19 4:46 ` Jens Axboe
2024-09-19 5:20 ` Jens Axboe
2024-09-19 4:46 ` Linus Torvalds
2024-09-20 13:54 ` Chris Mason
2024-09-24 15:58 ` Matthew Wilcox
2024-09-24 17:16 ` Sam James
2024-09-25 16:06 ` Kairui Song
2024-09-25 16:42 ` Christian Theune
2024-09-27 14:51 ` Sam James
2024-09-27 14:58 ` Jens Axboe
2024-10-01 21:10 ` Kairui Song
2024-09-24 19:17 ` Chris Mason
2024-09-24 19:24 ` Linus Torvalds
2024-09-19 6:34 ` Christian Theune
2024-09-19 6:57 ` Linus Torvalds
2024-09-19 10:19 ` Christian Theune
2024-09-30 17:34 ` Christian Theune
2024-09-30 18:46 ` Linus Torvalds
2024-09-30 19:25 ` Christian Theune
2024-09-30 20:12 ` Linus Torvalds
2024-09-30 20:56 ` Matthew Wilcox
2024-09-30 22:42 ` Davidlohr Bueso
2024-09-30 23:00 ` Davidlohr Bueso
2024-09-30 23:53 ` Linus Torvalds
2024-10-01 0:56 ` Chris Mason
2024-10-01 7:54 ` Christian Theune
2024-10-10 6:29 ` Christian Theune
2024-10-11 7:27 ` Christian Theune
2024-10-11 9:08 ` Christian Theune
2024-10-11 13:06 ` Chris Mason
2024-10-11 13:50 ` Christian Theune
2024-10-12 17:01 ` Linus Torvalds
2024-12-02 10:44 ` Christian Theune
2024-10-01 2:22 ` Dave Chinner
2024-09-16 7:14 ` Christian Theune
2024-09-16 12:16 ` Matthew Wilcox
2024-09-18 8:31 ` Christian Theune
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zud1EhTnoWIRFPa/@dread.disaster.area \
--to=david@fromorbit.com \
--cc=axboe@kernel.dk \
--cc=clm@meta.com \
--cc=ct@flyingcircus.io \
--cc=dqminh@cloudflare.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=regressions@leemhuis.info \
--cc=regressions@lists.linux.dev \
--cc=torvalds@linux-foundation.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox