From: Keith Busch <kbusch@kernel.org>
To: Robert Beckett <bob.beckett@collabora.com>
Cc: linux-nvme <linux-nvme@lists.infradead.org>,
	Jens Axboe <axboe@fb.com>, Christoph Hellwig <hch@lst.de>,
	Sagi Grimberg <sagi@grimberg.me>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: possible regression fs corruption on 64GB nvme
Date: Mon, 9 Sep 2024 13:29:09 -0600	[thread overview]
Message-ID: <Zt9MheFctnqW260Z@kbusch-mbp> (raw)
In-Reply-To: <191d810a4e3.fcc6066c765804.973611676137075390@collabora.com>

On Mon, Sep 09, 2024 at 07:34:15PM +0100, Robert Beckett wrote:
> After a lot of testing, we managed to get a repro case that would trigger within 2-3 tests using the desync tool [2], reducing the repro time from a day or more to minutes. For repro steps see [3].
> We bisected the issue to 
> 
> da9619a30e73b dmapool: link blocks across pages
> https://lore.kernel.org/all/20230126215125.4069751-12-kbusch@meta.com/T/#u

That's not the patch that was ultimately committed. Still, that's the
one I tested extensively with nvme, and the updated version shouldn't
make a difference as far as the protocol is concerned.
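
For context, the gist of that change (a rough sketch from memory, with
approximate names, not the exact committed code) is that each page no
longer keeps its own chain of free-block offsets; instead every free
block stores a pointer to the next free block, so a single free list
threads through all pages in the pool:

    /* rough sketch of the "link blocks across pages" scheme */
    struct dma_block {
            struct dma_block *next_block;   /* next free block, possibly in another page */
            dma_addr_t dma;                 /* bus address of this block */
    };

    /* pop the head of the pool-wide free list; caller holds pool->lock */
    static struct dma_block *pool_block_pop(struct dma_pool *pool)
    {
            struct dma_block *block = pool->next_block;

            if (block)
                    pool->next_block = block->next_block;
            return block;
    }

Allocation is just a pop from that list and free is a push, regardless
of which page the block lives in.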
 
> Some other thoughts about the issue:
> 
> - we have received reports of occasional filesystem corruption on both btrfs and ext4 on the same disk, so this doesn't appear to be fs-related
> - it only seems to affect these 64GB single-queue simple disks. Other devices with more capable disks have not shown this issue.
> - simple dd or md5sum testing does not show the issue. desync seems to be very parallel in its access patterns.
> - I was investigating a previous potential regression that was deemed not an issue (https://lkml.org/lkml/2023/2/21/762). I assume nvme doesn't need its addresses to be ordered. I'm not familiar with the spec.

nvme should not care about address ordering. The dma buffers are all
pulled from the same pool for all threads and can be dispatched in a
different order than they were allocated, so any order should be fine.
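
For reference, the driver side boils down to the stock dmapool API,
roughly like this (a simplified sketch with illustrative names, not the
exact nvme code):

    #include <linux/dmapool.h>

    /* one pool per device, shared by every queue/thread */
    static struct dma_pool *prp_pool;

    static int setup_pool(struct device *dev)
    {
            prp_pool = dma_pool_create("prp list", dev, PAGE_SIZE, PAGE_SIZE, 0);
            return prp_pool ? 0 : -ENOMEM;
    }

    static void submit_one(void)
    {
            dma_addr_t prp_dma;
            __le64 *prp_list;

            /* per command: grab a block for the PRP list and hand it to hardware */
            prp_list = dma_pool_alloc(prp_pool, GFP_ATOMIC, &prp_dma);
            if (!prp_list)
                    return;

            /* ... fill in PRP entries, queue the command ... */

            /* on completion, possibly on another cpu and in any order */
            dma_pool_free(prp_pool, prp_list, prp_dma);
    }

As long as each alloc hands back a unique, correctly mapped block, the
order the blocks come back in doesn't matter to the device.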
 
> I'd appreciate any advice you may have on why this dmapool patch could potentially cause or expose an issue with these nvme devices.
> If any more info would be useful to help diagnose, I'll happily provide it.

Did you try with CONFIG_SLUB_DEBUG_ON enabled?
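
If not, that's probably worth a shot. Assuming the kernel already has
CONFIG_SLUB_DEBUG=y, the boot parameter gets you the same checks
without a rebuild:

    # build-time, in .config:
    CONFIG_SLUB_DEBUG=y
    CONFIG_SLUB_DEBUG_ON=y

    # or at boot, on an existing CONFIG_SLUB_DEBUG=y kernel:
    slub_debug

With that on, slab objects get poisoning, redzoning, and sanity checks,
which tends to flag this kind of corruption closer to its source.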


