linux-mm.kvack.org archive mirror
* Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
@ 2024-09-12 21:18 Christian Theune
  2024-09-12 21:55 ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: Christian Theune @ 2024-09-12 21:18 UTC
  To: linux-mm, linux-xfs, linux-fsdevel, linux-kernel
  Cc: torvalds, axboe, Daniel Dao, Dave Chinner, willy, clm,
	regressions, regressions

Hello everyone,

I’d like to raise awareness about a bug causing data loss somewhere in MM interacting with XFS that seems to have been around since Dec 2021 (https://github.com/torvalds/linux/commit/6795801366da0cd3d99e27c37f020a8f16714886).

We started encountering this bug when upgrading to 6.1 around June 2023 and we have had at least 16 instances with data loss in a fleet of 1.5k VMs.

This bug is very hard to reproduce but has been known to exist as a “fluke” for a while already. I have invested a number of days trying to come up with workloads to trigger it quicker than that stochastic “once every few weeks in a fleet of 1.5k machines”, but it eludes me so far. I know that this also affects Facebook/Meta as well as Cloudflare, who are both running newer kernels (at least 6.1, 6.6, and 6.9) with the above-mentioned patch reverted. Seeing that those guys are running with this patch reverted (which makes their kernel basically an untested/unsupported deviation from mainline) smells like desperation. I’m with a much smaller team and company, and I’m wondering why this isn’t being tackled more urgently by more hands to make it shallow (hopefully).

The issue appears to happen mostly on nodes that are running some kind of database or specifically storage-oriented load. In our case we see this happening with PostgreSQL and MySQL. Cloudflare IIRC saw this with RocksDB load and Meta is talking about nfsd load.

I suspect that low memory (but not OOM-low) or memory pressure, and maybe swap conditions, increase the chance of triggering it - but I might be completely wrong on that suspicion.

There is a bug report I started here back then: https://bugzilla.kernel.org/show_bug.cgi?id=217572 and there have been discussions on the XFS list: https://lore.kernel.org/lkml/CA+wXwBS7YTHUmxGP3JrhcKMnYQJcd6=7HE+E1v-guk01L2K3Zw@mail.gmail.com/T/ but ultimately this didn’t receive sufficient interest to keep it moving forward and I ran out of steam. Unfortunately we can’t be stuck on 5.15 forever, and other kernel developers correctly keep pointing out that we should be updating, but that isn’t an option as long as this time bomb still exists.

Jens pointed out that Meta's findings and their notes on the revert included "When testing nfsd on top of v5.19, we hit lockups in filemap_read(). These ended up being because the xarray for the files being read had pages from other files mixed in."

I know and admire XFS for its very high standards regarding testing and avoiding data loss, but ultimately that doesn’t matter if we’re going to be stuck with this bug forever.

I’m able to help fund efforts, help create a reproducer, generally donate my time (I’m not a kernel developer myself) and even provide access to machines that did see the crash (but don’t carry customer data), but I’m not making any progress or getting any traction here.

Jens encouraged me to raise the visibility in this way - so that’s what I’m trying here.

Please help.

In appreciation of all the hard work everyone is putting in and with hugs and love,
Christian

-- 
Christian Theune · ct@flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick


end of thread, other threads:[~2024-12-02 10:44 UTC | newest]

Thread overview: 81+ messages
2024-09-12 21:18 Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards) Christian Theune
2024-09-12 21:55 ` Matthew Wilcox
2024-09-12 22:11   ` Christian Theune
2024-09-12 22:12   ` Jens Axboe
2024-09-12 22:25     ` Linus Torvalds
2024-09-12 22:30       ` Jens Axboe
2024-09-12 22:56         ` Linus Torvalds
2024-09-13  3:44           ` Matthew Wilcox
2024-09-13 13:23             ` Christian Theune
2024-09-13 12:11       ` Christian Brauner
2024-09-16 13:29         ` Matthew Wilcox
2024-09-18  9:51           ` Christian Brauner
2024-09-13 15:30       ` Chris Mason
2024-09-13 15:51         ` Matthew Wilcox
2024-09-13 16:33           ` Chris Mason
2024-09-13 18:15             ` Matthew Wilcox
2024-09-13 21:24               ` Linus Torvalds
2024-09-13 21:30                 ` Matthew Wilcox
2024-09-13 16:04       ` David Howells
2024-09-13 16:37         ` Chris Mason
2024-09-16  0:00       ` Dave Chinner
2024-09-16  4:20         ` Linus Torvalds
2024-09-16  8:47           ` Chris Mason
2024-09-17  9:32             ` Matthew Wilcox
2024-09-17  9:36               ` Chris Mason
2024-09-17 10:11               ` Christian Theune
2024-09-17 11:13               ` Chris Mason
2024-09-17 13:25                 ` Matthew Wilcox
2024-09-18  6:37                   ` Jens Axboe
2024-09-18  9:28                     ` Chris Mason
2024-09-18 12:23                       ` Chris Mason
2024-09-18 13:34                       ` Matthew Wilcox
2024-09-18 13:51                         ` Linus Torvalds
2024-09-18 14:12                           ` Matthew Wilcox
2024-09-18 14:39                             ` Linus Torvalds
2024-09-18 17:12                               ` Matthew Wilcox
2024-09-18 16:37                             ` Chris Mason
2024-09-19  1:43                         ` Dave Chinner
2024-09-19  3:03                           ` Linus Torvalds
2024-09-19  3:12                             ` Linus Torvalds
2024-09-19  3:38                               ` Jens Axboe
2024-09-19  4:32                                 ` Linus Torvalds
2024-09-19  4:42                                   ` Jens Axboe
2024-09-19  4:36                                 ` Matthew Wilcox
2024-09-19  4:46                                   ` Jens Axboe
2024-09-19  5:20                                     ` Jens Axboe
2024-09-19  4:46                                   ` Linus Torvalds
2024-09-20 13:54                                   ` Chris Mason
2024-09-24 15:58                                     ` Matthew Wilcox
2024-09-24 17:16                                     ` Sam James
2024-09-25 16:06                                       ` Kairui Song
2024-09-25 16:42                                         ` Christian Theune
2024-09-27 14:51                                         ` Sam James
2024-09-27 14:58                                           ` Jens Axboe
2024-10-01 21:10                                             ` Kairui Song
2024-09-24 19:17                                     ` Chris Mason
2024-09-24 19:24                                       ` Linus Torvalds
2024-09-19  6:34                               ` Christian Theune
2024-09-19  6:57                                 ` Linus Torvalds
2024-09-19 10:19                                   ` Christian Theune
2024-09-30 17:34                                     ` Christian Theune
2024-09-30 18:46                                       ` Linus Torvalds
2024-09-30 19:25                                         ` Christian Theune
2024-09-30 20:12                                           ` Linus Torvalds
2024-09-30 20:56                                             ` Matthew Wilcox
2024-09-30 22:42                                               ` Davidlohr Bueso
2024-09-30 23:00                                                 ` Davidlohr Bueso
2024-09-30 23:53                                               ` Linus Torvalds
2024-10-01  0:56                                       ` Chris Mason
2024-10-01  7:54                                         ` Christian Theune
2024-10-10  6:29                                         ` Christian Theune
2024-10-11  7:27                                           ` Christian Theune
2024-10-11  9:08                                             ` Christian Theune
2024-10-11 13:06                                               ` Chris Mason
2024-10-11 13:50                                                 ` Christian Theune
2024-10-12 17:01                                                 ` Linus Torvalds
2024-12-02 10:44                                                   ` Christian Theune
2024-10-01  2:22                                       ` Dave Chinner
2024-09-16  7:14         ` Christian Theune
2024-09-16 12:16           ` Matthew Wilcox
2024-09-18  8:31           ` Christian Theune