From: Robert Beckett <bob.beckett@collabora.com>
To: "Keith Busch" <kbusch@kernel.org>
Cc: "linux-nvme" <linux-nvme@lists.infradead.org>,
"Jens Axboe" <axboe@fb.com>, "Christoph Hellwig" <hch@lst.de>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Andrew Morton" <akpm@linux-foundation.org>,
"linux-mm" <linux-mm@kvack.org>
Subject: Re: possible regression fs corruption on 64GB nvme
Date: Tue, 10 Sep 2024 18:27:55 +0100
Message-ID: <191dcfa4846.bb18f3291189856.1624418308692137124@collabora.com>
In-Reply-To: <191db450152.e0b28690987786.6989198174827147639@collabora.com>
---- On Tue, 10 Sep 2024 10:30:18 +0100 Robert Beckett wrote ---
> ---- On Mon, 09 Sep 2024 21:31:41 +0100 Keith Busch wrote ---
> > On Mon, Sep 09, 2024 at 02:29:14PM -0600, Keith Busch wrote:
> > > As a test, could you try kernel parameter "nvme.io_queue_depth_set=2"?
> >
> > Err, I mean "nvme.io_queue_depth=2".
> >
>
> Thanks, I'll give it a try along with your other questions and report back.
>
> For clarity, my original repro steps were missing a step; they should have included the make command:
>
> $ dd if=/dev/urandom of=test_file bs=1M count=10240
> $ desync make test_file.caibx test_file
> $ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"
> $ desync verify-index test_file.caibx test_file
>
CONFIG_SLUB_DEBUG_ON showed no debug output.
nvme.io_queue_depth=2 appears to fix it. Could you explain the implications of this?
I assume it limits the device to two outstanding requests at a time.
Does it suggest an issue with the specific device's FW?
I assume this would suggest that nothing is actually wrong with the dmapool itself, and that it was just exposing an issue in the device/FW?
Any advice for handling this and/or investigating further?
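For reference, in case anyone else wants to reproduce with the workaround in place, this is how the parameter can be applied (the modprobe.d file name below is an arbitrary choice of mine; the sysfs path is the standard module-parameter location):

# driver built in: add to the kernel command line
#     nvme.io_queue_depth=2
# driver built as a module: use a modprobe options file instead
$ echo "options nvme io_queue_depth=2" | sudo tee /etc/modprobe.d/nvme-qd.conf
# after reboot, confirm the value the driver picked up
$ cat /sys/module/nvme/parameters/io_queue_depth
2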
My initial speculation was that the disk FW might be signalling completion of an access before the data has actually finished making its way to RAM. However, I checked the code and saw that the dmapool appears to be used for storing the buffer page addresses, so I imagine the disk never writes to that memory at all, which would rule out my assumption.
I'd appreciate any insight you could give into how the driver uses the dmapools and whether you would expect them to be significant in this issue, or whether they are just making a device/FW bug more observable.
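For context on what I was reading, here is a minimal C sketch of the pattern, using the generic dma_pool API from include/linux/dmapool.h as I understand the driver applies it to PRP lists. This is my paraphrase rather than the actual driver code, and prp_pool/first_data_page_dma are placeholder names:

/* one pool of page-sized, page-aligned, DMA-mapped chunks */
struct dma_pool *prp_pool =
        dma_pool_create("prp list", dev, PAGE_SIZE, PAGE_SIZE, 0);

/* per request: the CPU fills in the bus addresses of the data pages.
 * The device only reads this list to locate the buffers; it never
 * writes it back, which is what rules out my completion theory. */
dma_addr_t prp_dma;
__le64 *prp_list = dma_pool_alloc(prp_pool, GFP_ATOMIC, &prp_dma);
prp_list[0] = cpu_to_le64(first_data_page_dma);
/* ... the command's PRP2 field then points at prp_dma ... */

/* on completion the chunk goes back to the pool for reuse */
dma_pool_free(prp_pool, prp_list, prp_dma);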
Thanks
Bob
p.s. Here is a transcript of the issue seen in testing. To my knowledge, if everything were working as it should, nothing should be able to produce this output, where dropping caches and re-priming the page cache via a linear read fixes things.
$ dd if=/dev/urandom of=test_file bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 111.609 s, 96.2 MB/s
$ desync make test_file.caibx test_file
Chunking [=======================================================================================================================================] 100.00% 18s
$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"
$ desync verify-index test_file.caibx test_file
[=============>-----------------------------------------------------------------------------------------------------------------------------------] 9.00% 4s
Error: seed index for test_file doesn't match its data
$ md5sum test_file
ce4f1cca0b3dfd63ea2adfd745e4bfc1 test_file
$ sudo bash -c "echo 3 > /proc/sys/vm/drop_caches"
$ md5sum test_file
1edb3eaf5ae57b6187cc0be843ed2e5c test_file
$ desync verify-index test_file.caibx test_file
[=================================================================================================================================================] 100.00% 5s