linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Matthew Wilcox <willy@infradead.org>,
	Christian Theune <ct@flyingcircus.io>
Cc: linux-mm@kvack.org,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, Daniel Dao <dqminh@cloudflare.com>,
	Dave Chinner <david@fromorbit.com>,
	clm@meta.com, regressions@lists.linux.dev,
	regressions@leemhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
Date: Thu, 12 Sep 2024 16:12:36 -0600	[thread overview]
Message-ID: <0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk> (raw)
In-Reply-To: <ZuNjNNmrDPVsVK03@casper.infradead.org>

On 9/12/24 3:55 PM, Matthew Wilcox wrote:
> On Thu, Sep 12, 2024 at 11:18:34PM +0200, Christian Theune wrote:
>> This bug is very hard to reproduce but has been known to exist as a
>> ?fluke? for a while already. I have invested a number of days trying
>> to come up with workloads to trigger it quicker than that stochastic
>> ?once every few weeks in a fleet of 1.5k machines", but it eludes
>> me so far. I know that this also affects Facebook/Meta as well as
>> Cloudflare who are both running newer kernels (at least 6.1, 6.6,
>> and 6.9) with the above mentioned patch reverted. I?m from a much
>> smaller company and seeing that those guys are running with this patch
>> reverted (that now makes their kernel basically an untested/unsupported
>> deviation from the mainline) smells like desparation. I?m with a
>> much smaller team and company and I?m wondering why this isn?t
>> tackled more urgently from more hands to make it shallow (hopefully).
> 
> This passive-aggressive nonsense is deeply aggravating.  I've known
> about this bug for much longer, but like you I am utterly unable to
> reproduce it.  I've spent months looking for the bug, and I cannot.

What passive aggressiveness?! There's a data corruption bug where we
know what causes it, yet we continue to ship it. That's aggravating.

People are aware of the bug, and since there's no good reproducer, it's
hard to fix. That part is fine and understandable. What seems amiss here
is the fact that large folio support for xfs hasn't just been reverted
until the issue is understood and resolved.

When I saw Christian's report, I seemed to recall that we ran into this
at Meta too. And we did, and hence have been reverting it since our 5.19
release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
things that are known broken.

-- 
Jens Axboe


  parent reply	other threads:[~2024-09-12 22:12 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-12 21:18 Christian Theune
2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11   ` Christian Theune
2024-09-12 22:12   ` Jens Axboe [this message]
2024-09-12 22:25     ` Linus Torvalds
2024-09-12 22:30       ` Jens Axboe
2024-09-12 22:56         ` Linus Torvalds
2024-09-13  3:44           ` Matthew Wilcox
2024-09-13 13:23             ` Christian Theune
2024-09-13 12:11       ` Christian Brauner
2024-09-16 13:29         ` Matthew Wilcox
2024-09-18  9:51           ` Christian Brauner
2024-09-13 15:30       ` Chris Mason
2024-09-13 15:51         ` Matthew Wilcox
2024-09-13 16:33           ` Chris Mason
2024-09-13 18:15             ` Matthew Wilcox
2024-09-13 21:24               ` Linus Torvalds
2024-09-13 21:30                 ` Matthew Wilcox
2024-09-13 16:04       ` David Howells
2024-09-13 16:37         ` Chris Mason
2024-09-16  0:00       ` Dave Chinner
2024-09-16  4:20         ` Linus Torvalds
2024-09-16  8:47           ` Chris Mason
2024-09-17  9:32             ` Matthew Wilcox
2024-09-17  9:36               ` Chris Mason
2024-09-17 10:11               ` Christian Theune
2024-09-17 11:13               ` Chris Mason
2024-09-17 13:25                 ` Matthew Wilcox
2024-09-18  6:37                   ` Jens Axboe
2024-09-18  9:28                     ` Chris Mason
2024-09-18 12:23                       ` Chris Mason
2024-09-18 13:34                       ` Matthew Wilcox
2024-09-18 13:51                         ` Linus Torvalds
2024-09-18 14:12                           ` Matthew Wilcox
2024-09-18 14:39                             ` Linus Torvalds
2024-09-18 17:12                               ` Matthew Wilcox
2024-09-18 16:37                             ` Chris Mason
2024-09-19  1:43                         ` Dave Chinner
2024-09-19  3:03                           ` Linus Torvalds
2024-09-19  3:12                             ` Linus Torvalds
2024-09-19  3:38                               ` Jens Axboe
2024-09-19  4:32                                 ` Linus Torvalds
2024-09-19  4:42                                   ` Jens Axboe
2024-09-19  4:36                                 ` Matthew Wilcox
2024-09-19  4:46                                   ` Jens Axboe
2024-09-19  5:20                                     ` Jens Axboe
2024-09-19  4:46                                   ` Linus Torvalds
2024-09-20 13:54                                   ` Chris Mason
2024-09-24 15:58                                     ` Matthew Wilcox
2024-09-24 17:16                                     ` Sam James
2024-09-25 16:06                                       ` Kairui Song
2024-09-25 16:42                                         ` Christian Theune
2024-09-27 14:51                                         ` Sam James
2024-09-27 14:58                                           ` Jens Axboe
2024-10-01 21:10                                             ` Kairui Song
2024-09-24 19:17                                     ` Chris Mason
2024-09-24 19:24                                       ` Linus Torvalds
2024-09-19  6:34                               ` Christian Theune
2024-09-19  6:57                                 ` Linus Torvalds
2024-09-19 10:19                                   ` Christian Theune
2024-09-30 17:34                                     ` Christian Theune
2024-09-30 18:46                                       ` Linus Torvalds
2024-09-30 19:25                                         ` Christian Theune
2024-09-30 20:12                                           ` Linus Torvalds
2024-09-30 20:56                                             ` Matthew Wilcox
2024-09-30 22:42                                               ` Davidlohr Bueso
2024-09-30 23:00                                                 ` Davidlohr Bueso
2024-09-30 23:53                                               ` Linus Torvalds
2024-10-01  0:56                                       ` Chris Mason
2024-10-01  7:54                                         ` Christian Theune
2024-10-10  6:29                                         ` Christian Theune
2024-10-11  7:27                                           ` Christian Theune
2024-10-11  9:08                                             ` Christian Theune
2024-10-11 13:06                                               ` Chris Mason
2024-10-11 13:50                                                 ` Christian Theune
2024-10-12 17:01                                                 ` Linus Torvalds
2024-12-02 10:44                                                   ` Christian Theune
2024-10-01  2:22                                       ` Dave Chinner
2024-09-16  7:14         ` Christian Theune
2024-09-16 12:16           ` Matthew Wilcox
2024-09-18  8:31           ` Christian Theune

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=clm@meta.com \
    --cc=ct@flyingcircus.io \
    --cc=david@fromorbit.com \
    --cc=dqminh@cloudflare.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox