Re: [RFC 00/23] Enable block size > page size in XFS

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Pankaj Raghav <p.raghav@samsung.com>,
	Pankaj Raghav <kernel@pankajraghav.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	da.gomez@samsung.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, willy@infradead.org,
	djwong@kernel.org, linux-mm@kvack.org, chandan.babu@oracle.com,
	gost.dev@samsung.com, riteshh@linux.ibm.com
Subject: Re: [RFC 00/23] Enable block size > page size in XFS
Date: Fri, 22 Sep 2023 15:03:25 +1000	[thread overview]
Message-ID: <ZQ0gHRvrbZUjO/rA@dread.disaster.area> (raw)
In-Reply-To: <ZQvuNaYIukAnlEDM@bombadil.infradead.org>

On Thu, Sep 21, 2023 at 12:18:13AM -0700, Luis Chamberlain wrote:
> On Thu, Sep 21, 2023 at 04:03:56PM +1000, Dave Chinner wrote:
> > On Wed, Sep 20, 2023 at 09:57:56PM -0700, Luis Chamberlain wrote:
> > > On Wed, Sep 20, 2023 at 08:00:12PM -0700, Luis Chamberlain wrote:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=large-block-linus
> > > > 
> > > > I haven't tested yet the second branch I pushed though but it applied without any changes
> > > > so it should be good (usual famous last words).
> > > 
> > > I have run some preliminary tests on that branch as well above using fsx
> > > with larger LBA formats running them all on the *same* system at the
> > > same time. Kernel is happy.
> 
> <-- snip -->
> 
> > So I just pulled this, built it and run generic/091 as the very
> > first test on this:
> > 
> > # ./run_check.sh --mkfs-opts "-m rmapbt=1 -b size=64k" --run-opts "-s xfs_64k generic/091"
> 
> The cover letter for this patch series acknowledged failures in fstests.

But this is a new update, which you said fixed various issues, and
you posted this in direct response to the bug report I gave you.

> For kdevops now, we borrow the same last linux-next baseline:
> 
> git grep "generic/091" workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_1024.txt:generic/091 # possible regression
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_16k.txt:generic/091
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_32k.txt:generic/091
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_64k_4ks.txt:generic/091
> 
> So well, we already know this fails.

*cough*

-You- know it already fails.

And you are expecting people who try the code to somehow know
that you've explicitly ignored this fsx failure, especially after
all your words to tell us how much fsx testing it has passed?

And that's kinda my point - you're effusing about how much fsx
testing this has passed, yet it istill fails after just a handful of
ops in generic/091. The dissonance could break windows...

----

Fundamentally, when it comes to data integrity, it important to
exercise as much of the operational application space as quickly as
possible as it is that breadth of variation in operations that
flushes out more bugs and helps stabilises the code faster.

Why do you think we talk about the massive test matrix most
filesytsems have and how long it takes to iterate so much? It's
because iterating that complex test matrix is how we find all the
whacky, weird bugs in the code.

Concentrating on a single test configuration and running it over and
over again won't find bugs in code it doesn't exercise no matter how
long it is run for. Running such a setup in an automated environment
doesn't mean you get better code coverage, it just means you cover
the same narrow set of corner cases faster and more times. If it
works once, it should work a million times. Iterating it a billion
more times doesn't tell us anything additional, either.

Put simply: performing deep, homogenous testing on code that has known
data corruption bugs outside the narrow scope of the test case is
not telling us anything useful about the overall state of the code.
Indeed, turning off failing tests that are critical to validating the
correct operation of the code you are modifying is bad practice.

For code changes like this, all fsx testing in fstests should pass
before you post anything for review - even for an RFC. There is no
point reviewing code that doesn't work properly, nor wasting
people's time by encouraging them to test it when it's clear to you
that it's going to fail in various important ways.

Hence I think your testing is focussing on the wrong things and I
suspect that you've misunderstood the statements of "we'll need
billions of fsx ops to test this code" that various people have made
really meant.  You've elevated running billions of fsx ops to your
primary "it works" gating condition, at the expense of making sure
all the other parts of the filesystem still work correctly.

The reality is that the returns from fsx diminish as the number of
ops go up. Once you've run the first hundred million fsx ops for a
given operations set, the chance that the next 100M ops will find a
new problem is -greatly- reduced. The vast majority of problems will
be found in the first 10M ops that are run in any given fsx
operation, and few bugs are found beyond the 100M mark. Yes, we
occasionally find one up in the billions, but that's rare and most
definitely not somethign to focus on when still developing RFC level
code.

Different fsx configurations change the operation set that is run -
mixing DIO reads with buffered writes, turning mmap on and off,
using AIO or io_uring rather than synchronous IO, etc. These all
exercise different code paths and corner cases and have vastly
different code interactions, and that is what we need to cover when
developing new code.

IOWs, we need coverage of the *entire operation space*, not just the
same narrow set of operations run billions of time.  A wide focus
requires billions of ops to cover because it requires lots of
different application configurations to be run. In constrast, there
are only three fs configurations that matter: bs < PS, bs == PS and
bs > PS.

For example, 16kB, 32kB and 64kB filesystem configs exercise exactly
the same code paths in exactly the same way (e.g. both have non-zero
miniumum folio orders but only differ by what that order is). Hence
running the same test application configs on these different
filessytem configurations does actually not improve code coverage of
the testing at all. Testing all of them only increases the resources
required to the test a change, it does not improve the quality of
coverage of the testing being performed at all....

Hence I'd strongly suggest that, for the next posting of these
cahnge, you focus on making fstests pass without turning off any
failing tests, and that fsx is run with a wide variety of
configurations (e.g. modify all the fstests cases to run for a
configurable number of ops (e.g. via SOAK_DURATION)). We just don't
care at this point about finding that 1 in 10^15 ops bug because
it's code in development; what we actually care about is that
-everything- works correctly for the vast majority of use cases....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

next prev parent reply	other threads:[~2023-09-22  5:03 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-15 18:38 Pankaj Raghav
2023-09-15 18:38 ` [RFC 01/23] fs: Allow fine-grained control of folio sizes Pankaj Raghav
2023-09-15 19:03   ` Matthew Wilcox
2023-09-15 18:38 ` [RFC 02/23] pagemap: use mapping_min_order in fgf_set_order() Pankaj Raghav
2023-09-15 18:55   ` Matthew Wilcox
2023-09-20  7:46     ` Pankaj Raghav
2023-09-15 18:38 ` [RFC 03/23] filemap: add folio with at least mapping_min_order in __filemap_get_folio Pankaj Raghav
2023-09-15 19:00   ` Matthew Wilcox
2023-09-20  8:06     ` Pankaj Raghav
2023-09-15 18:38 ` [RFC 04/23] filemap: set the order of the index in page_cache_delete_batch() Pankaj Raghav
2023-09-15 19:43   ` Matthew Wilcox
2023-09-18 18:20     ` Luis Chamberlain
2023-09-15 18:38 ` [RFC 05/23] filemap: align index to mapping_min_order in filemap_range_has_page() Pankaj Raghav
2023-09-15 19:45   ` Matthew Wilcox
2023-09-18 18:25     ` Luis Chamberlain
2023-09-15 18:38 ` [RFC 06/23] mm: call xas_set_order() in replace_page_cache_folio() Pankaj Raghav
2023-09-15 19:46   ` Matthew Wilcox
2023-09-18 18:27     ` Luis Chamberlain
2023-09-15 18:38 ` [RFC 07/23] filemap: align the index to mapping_min_order in __filemap_add_folio() Pankaj Raghav
2023-09-15 19:48   ` Matthew Wilcox
2023-09-18 18:32     ` Luis Chamberlain
2023-09-15 18:38 ` [RFC 08/23] filemap: align the index to mapping_min_order in filemap_get_folios_tag() Pankaj Raghav
2023-09-15 19:50   ` Matthew Wilcox
2023-09-18 18:36     ` Luis Chamberlain
2023-09-15 18:38 ` [RFC 09/23] filemap: use mapping_min_order while allocating folios Pankaj Raghav
2023-09-15 19:54   ` Matthew Wilcox
2023-09-15 18:38 ` [RFC 10/23] filemap: align the index to mapping_min_order in filemap_get_pages() Pankaj Raghav
2023-09-15 18:38 ` [RFC 11/23] filemap: align the index to mapping_min_order in do_[a]sync_mmap_readahead Pankaj Raghav
2023-09-15 18:38 ` [RFC 12/23] filemap: align index to mapping_min_order in filemap_fault() Pankaj Raghav
2023-09-15 18:38 ` [RFC 13/23] readahead: set file_ra_state->ra_pages to be at least mapping_min_order Pankaj Raghav
2023-09-15 18:38 ` [RFC 14/23] readahead: allocate folios with mapping_min_order in ra_unbounded() Pankaj Raghav
2023-09-15 18:38 ` [RFC 15/23] readahead: align with mapping_min_order in force_page_cache_ra() Pankaj Raghav
2023-09-15 18:38 ` [RFC 16/23] readahead: add folio with at least mapping_min_order in page_cache_ra_order Pankaj Raghav
2023-09-15 18:38 ` [RFC 17/23] readahead: set the minimum ra size in get_(init|next)_ra Pankaj Raghav
2023-09-15 18:38 ` [RFC 18/23] readahead: align ra start and size to mapping_min_order in ondemand_ra() Pankaj Raghav
2023-09-15 18:38 ` [RFC 19/23] truncate: align index to mapping_min_order Pankaj Raghav
2023-09-15 18:38 ` [RFC 20/23] mm: round down folio split requirements Pankaj Raghav
2023-09-15 18:38 ` [RFC 21/23] xfs: expose block size in stat Pankaj Raghav
2023-09-15 18:38 ` [RFC 22/23] xfs: enable block size larger than page size support Pankaj Raghav
2023-09-15 18:38 ` [RFC 23/23] xfs: set minimum order folio for page cache based on blocksize Pankaj Raghav
2023-09-15 18:50 ` [RFC 00/23] Enable block size > page size in XFS Matthew Wilcox
2023-09-18 12:35   ` Pankaj Raghav
2023-09-17 22:05 ` Dave Chinner
2023-09-18  2:04   ` Luis Chamberlain
2023-09-18  5:07     ` Dave Chinner
2023-09-18 12:29       ` Pankaj Raghav
2023-09-19 11:56         ` Ritesh Harjani
2023-09-19 21:15           ` Luis Chamberlain
2023-09-21  3:00         ` Luis Chamberlain
     [not found]           ` <ZQvNVAfZMjE3hgmN@bombadil.infradead.org>
2023-09-21  6:03             ` Dave Chinner
2023-09-21  7:18               ` Luis Chamberlain
2023-09-21  7:20                 ` Luis Chamberlain
2023-09-22  5:03                 ` Dave Chinner [this message]
2023-09-22 19:38               ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQ0gHRvrbZUjO/rA@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=chandan.babu@oracle.com \
    --cc=da.gomez@samsung.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=kernel@pankajraghav.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=p.raghav@samsung.com \
    --cc=riteshh@linux.ibm.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox