From: Matthew Wilcox <willy@infradead.org>
To: Ritesh Harjani <ritesh.list@gmail.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
Zhang Yi <yi.zhang@huaweicloud.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
adilger.kernel@dilger.ca, jack@suse.cz, hch@infradead.org,
zokeefe@google.com, yi.zhang@huawei.com, chengzhihao1@huawei.com,
yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: Re: [RFC PATCH v3 00/26] ext4: use iomap for regular file's buffered IO path and enable large foilo
Date: Mon, 12 Feb 2024 10:24:22 +0000
Message-ID: <Zcnx1pP_iZBf6Y-t@casper.infradead.org>
In-Reply-To: <87ttmef3fp.fsf@doe.com>
On Mon, Feb 12, 2024 at 02:46:10PM +0530, Ritesh Harjani wrote:
> "Darrick J. Wong" <djwong@kernel.org> writes:
> > though iirc willy never got the performance to match because iomap
>
> Ohh, can you help me provide details on what performance benchmark was
> run? I can try and run them when I rebase.

I didn't run a benchmark; we just knew what would happen (on rotating
storage anyway).

> > didn't have a mechanism for the caller to tell it "run the IO now even
> > though you don't have a complete page, because the indirect block is the
> > next block after the 11th block".
>
> Do you mean this for a large folio? I still didn't get the problem you
> are referring here. Can you please help me explain why could that be a
> problem?

A classic ext2 filesystem lays out a 16KiB file like this (with 512
byte blocks):

file offset     disk block
0-6KiB          1000-1011
6KiB-16KiB      1013-1032

What's in block 1012? The indirect block! The block which tells ext2
that blocks 12-31 of the file are in disk blocks 1013-1032. So we can't
issue the read for them until we've finished the read for block 1012.
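
To make the arithmetic concrete, here is a small sketch of that layout in C. It assumes the numbers from the example (data starting at disk block 1000, 512 byte blocks, 4 byte block pointers); the macro and function names are made up for illustration and are not ext2 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed from the example: the file's first data block is disk block 1000. */
#define FIRST_DISK_BLOCK 1000u
/* An ext2 inode holds 12 direct block pointers. */
#define NDIR_BLOCKS 12u
/* 512 byte blocks with 4 byte pointers: one indirect block maps 128 blocks. */
#define PTRS_PER_BLOCK (512u / 4u)

/* In this layout the indirect block sits right after the 12 direct blocks. */
static unsigned indirect_disk_block(void)
{
	return FIRST_DISK_BLOCK + NDIR_BLOCKS;
}

/* File blocks 12 and up can only be located by reading the indirect block. */
static bool needs_indirect(unsigned file_block)
{
	return file_block >= NDIR_BLOCKS;
}

/* Map a file block to its disk block for this one contiguous layout. */
static unsigned disk_block_for(unsigned file_block)
{
	/* Only direct and single-indirect blocks are modelled here. */
	assert(file_block < NDIR_BLOCKS + PTRS_PER_BLOCK);
	if (!needs_indirect(file_block))
		return FIRST_DISK_BLOCK + file_block;
	/* Skip over the indirect block that sits in the middle. */
	return FIRST_DISK_BLOCK + file_block + 1;
}
```

File block 11 lands on disk block 1011, the indirect block is 1012, and file blocks 12-31 land on 1013-1032, matching the table.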
Buffer heads have a solution for this, BH_Boundary. ext2 sets it for
block 11 which prompts mpage.c to submit the read immediately (see
the various calls to buffer_boundary()). Then ext2 will submit the read
for block 1012 and the two reads will be coalesced by the IO scheduler.
So we still end up doing two reads instead of one, but that's
unavoidable because fragmentation might have meant that 6KiB-16KiB were
not stored at 1013-1032.
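
A toy model of the ordering may help here. This is not mpage.c; it is a sketch that assumes the layout above, treats the indirect block read as happening when file block 12 is first mapped, and records the order in which IOs are issued. All names (struct io, struct trace, read_file) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NDIR     12u              /* direct blocks in the inode */
#define FIRST    1000u            /* assumed first data block on disk */
#define INDIRECT (FIRST + NDIR)   /* disk block 1012 in the example */
#define MAX_IOS  8

struct io { unsigned start, end; bool meta; };
struct trace { struct io ios[MAX_IOS]; size_t n; };

static void issue(struct trace *t, unsigned start, unsigned end, bool meta)
{
	assert(t->n < MAX_IOS);
	t->ios[t->n++] = (struct io){ start, end, meta };
}

/* Read nblocks file blocks, batching contiguous disk blocks into one IO. */
static void read_file(struct trace *t, unsigned nblocks, bool honour_boundary)
{
	bool pending = false, indirect_read = false;
	unsigned start = 0, end = 0;

	t->n = 0;
	for (unsigned fb = 0; fb < nblocks; fb++) {
		/* Mapping a block past the direct ones needs the indirect block. */
		if (fb >= NDIR && !indirect_read) {
			issue(t, INDIRECT, INDIRECT, true);
			indirect_read = true;
		}
		unsigned disk = fb < NDIR ? FIRST + fb : FIRST + fb + 1;

		/* A discontiguous block forces out the batch built so far. */
		if (pending && disk != end + 1) {
			issue(t, start, end, false);
			pending = false;
		}
		if (!pending) {
			start = disk;
			pending = true;
		}
		end = disk;

		/* The boundary hint: submit now, before the metadata read. */
		if (honour_boundary && fb == NDIR - 1) {
			issue(t, start, end, false);
			pending = false;
		}
	}
	if (pending)
		issue(t, start, end, false);
}
```

With the boundary hint the issue order is 1000-1011, 1012, 1013-1032, which is ascending on disk. Without it, the model issues 1012 first and only then 1000-1011, which is the backwards seek you don't want on rotating storage.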

There's no equivalent iomap solution.  What needs to happen is:

 - iomap_folio_state->read_bytes_pending needs to be initialised to
   folio_size(), not 0.
 - Remove "ifs->read_bytes_pending += plen" from iomap_readpage_iter()
 - Subtract plen in the iomap_block_needs_zeroing() case
 - Submit a bio at the end of each iomap_readpage_iter() call

Now iomap will behave the same way as mpage, only without needing a
flag to do it (instead it will assume that the filesystem coalesces
adjacent ranges, which it should do anyway for good performance).
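
The accounting change can be sketched as a userspace toy, under the assumption that the folio counts as finished once the counter reaches zero. The struct and helpers below are hypothetical stand-ins, not the iomap code, and locking is left out entirely. The point is that a counter which starts at the folio size cannot reach zero while part of the folio is still unaccounted for, even if an early bio completes before the next range is looked at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-folio read state. */
struct folio_state {
	size_t read_bytes_pending;
};

/* Proposed scheme: start with the whole folio outstanding. */
static void begin_read(struct folio_state *fs, size_t folio_size)
{
	fs->read_bytes_pending = folio_size;
}

/*
 * Account for len bytes, either because a read of them completed or
 * because the range needed zeroing instead of IO. Returns true once
 * every byte of the folio has been accounted for.
 */
static bool account_bytes(struct folio_state *fs, size_t len)
{
	assert(len <= fs->read_bytes_pending);
	fs->read_bytes_pending -= len;
	return fs->read_bytes_pending == 0;
}

/*
 * Contrast: a counter that starts at 0 and grows as ranges are queued.
 * If the first bio is submitted and completes before the second range
 * is queued, the counter touches zero while data is still unread.
 */
static bool incremental_finishes_early(size_t first, size_t second)
{
	size_t pending = 0;
	bool early;

	pending += first;        /* queue and submit the first range */
	pending -= first;        /* its completion arrives immediately */
	early = (pending == 0);  /* looks finished, but is not */
	pending += second;       /* the second range is only queued now */
	return early && second > 0;
}
```

For the 16KiB example, completing the 6KiB read leaves 10KiB pending, and only the completion of the second read brings the counter to zero.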