linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Zhang Yi <yi.zhang@huaweicloud.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com,
	hch@infradead.org, djwong@kernel.org, willy@infradead.org,
	zokeefe@google.com, yi.zhang@huawei.com, chengzhihao1@huawei.com,
	yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: Re: [RFC PATCH v4 24/34] ext4: implement buffered write iomap path
Date: Mon, 6 May 2024 19:44:44 +0800	[thread overview]
Message-ID: <96bbdb25-b420-67b1-d4c4-b838a5c70f9f@huaweicloud.com> (raw)
In-Reply-To: <ZjH+QFVXLlcDkSdh@dread.disaster.area>

On 2024/5/1 16:33, Dave Chinner wrote:
> On Wed, May 01, 2024 at 06:11:13PM +1000, Dave Chinner wrote:
>> On Wed, Apr 10, 2024 at 10:29:38PM +0800, Zhang Yi wrote:
>>> From: Zhang Yi <yi.zhang@huawei.com>
>>>
>>> Implement buffered write iomap path, use ext4_da_map_blocks() to map
>>> delalloc extents and add ext4_iomap_get_blocks() to allocate blocks if
>>> delalloc is disabled or free space is about to run out.
>>>
>>> Note that we always allocate unwritten extents for new blocks in the
>>> iomap write path, this means that the allocation type is no longer
>>> controlled by the dioread_nolock mount option. After that, we could
>>> postpone the i_disksize updating to the writeback path, and drop journal
>>> handle in the buffered dealloc write path completely.
> .....
>>> +/*
>>> + * Drop the staled delayed allocation range from the write failure,
>>> + * including both start and end blocks. If not, we could leave a range
>>> + * of delayed extents covered by a clean folio, it could lead to
>>> + * inaccurate space reservation.
>>> + */
>>> +static int ext4_iomap_punch_delalloc(struct inode *inode, loff_t offset,
>>> +				     loff_t length)
>>> +{
>>> +	ext4_es_remove_extent(inode, offset >> inode->i_blkbits,
>>> +			DIV_ROUND_UP_ULL(length, EXT4_BLOCK_SIZE(inode->i_sb)));
>>>  	return 0;
>>>  }
>>>  
>>> +static int ext4_iomap_buffered_write_end(struct inode *inode, loff_t offset,
>>> +					 loff_t length, ssize_t written,
>>> +					 unsigned int flags,
>>> +					 struct iomap *iomap)
>>> +{
>>> +	handle_t *handle;
>>> +	loff_t end;
>>> +	int ret = 0, ret2;
>>> +
>>> +	/* delalloc */
>>> +	if (iomap->flags & IOMAP_F_EXT4_DELALLOC) {
>>> +		ret = iomap_file_buffered_write_punch_delalloc(inode, iomap,
>>> +			offset, length, written, ext4_iomap_punch_delalloc);
>>> +		if (ret)
>>> +			ext4_warning(inode->i_sb,
>>> +			     "Failed to clean up delalloc for inode %lu, %d",
>>> +			     inode->i_ino, ret);
>>> +		return ret;
>>> +	}
>>
>> Why are you creating a delalloc extent for the write operation and
>> then immediately deleting it from the extent tree once the write
>> operation is done?
> 
> Ignore this, I mixed up the ext4_iomap_punch_delalloc() code
> directly above with iomap_file_buffered_write_punch_delalloc().
> 
> In hindsight, iomap_file_buffered_write_punch_delalloc() is poorly
> named, as it is handling a short write situation which requires
> newly allocated delalloc blocks to be punched.
> iomap_file_buffered_write_finish() would probably be a better name
> for it....
> 
>> Also, why do you need IOMAP_F_EXT4_DELALLOC? Isn't a delalloc iomap
>> set up with iomap->type = IOMAP_DELALLOC? Why can't that be used?
> 
> But this still stands - the first thing
> iomap_file_buffered_write_punch_delalloc() is:
> 
> 	if (iomap->type != IOMAP_DELALLOC)
>                 return 0;
> 

Thanks for the suggestion, the delalloc and non-delalloc write paths
share the same ->iomap_end() now (i.e. ext4_iomap_buffered_write_end()),
I use the IOMAP_F_EXT4_DELALLOC to identify the write path. For
non-delalloc path, If we have allocated more blocks and copied less, we
should truncate extra blocks that newly allocated by ->iomap_begin().
If we use IOMAP_DELALLOC, we can't tell if the blocks are pre-existing
or newly allocated, we can't truncate the pre-existing blocks, so I have
to introduce IOMAP_F_EXT4_DELALLOC. But if we split the delalloc and
non-delalloc handler, we could drop IOMAP_F_EXT4_DELALLOC.

I also checked xfs, IIUC, xfs doesn't free the extra blocks beyond EOF
in xfs_buffered_write_iomap_end() for non-delalloc case since they will
be freed by xfs_free_eofblocks in some other inactive paths, like
xfs_release()/xfs_inactive()/..., is that right?

Thanks,
Yi.



  reply	other threads:[~2024-05-06 11:44 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-10 14:29 [RESEND RFC PATCH v4 00/34] ext4: use iomap for regular file's buffered IO path and enable large folio Zhang Yi
2024-04-10 14:29 ` [PATCH v4 01/34] ext4: factor out a common helper to query extent map Zhang Yi
2024-04-10 14:29 ` [PATCH v4 02/34] ext4: check the extent status again before inserting delalloc block Zhang Yi
2024-05-01  6:51   ` Dave Chinner
2024-05-01 12:19     ` Ritesh Harjani
2024-05-01 22:49       ` Dave Chinner
2024-05-02  4:11         ` Ritesh Harjani
2024-05-06  3:49           ` Zhang Yi
2024-04-10 14:29 ` [PATCH v4 03/34] ext4: trim delalloc extent Zhang Yi
2024-05-01 14:31   ` Ritesh Harjani
2024-05-06  6:15     ` Zhang Yi
2024-04-10 14:29 ` [PATCH v4 04/34] ext4: drop iblock parameter Zhang Yi
2024-05-01 14:41   ` Ritesh Harjani
2024-04-10 14:29 ` [PATCH v4 05/34] ext4: make ext4_es_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-10 14:29 ` [PATCH v4 06/34] ext4: make ext4_da_reserve_space() reserve multi-clusters Zhang Yi
2024-04-10 14:29 ` [PATCH v4 07/34] ext4: factor out check for whether a cluster is allocated Zhang Yi
2024-04-10 14:29 ` [PATCH v4 08/34] ext4: make ext4_insert_delayed_block() insert multi-blocks Zhang Yi
2024-04-10 14:29 ` [PATCH v4 09/34] ext4: make ext4_da_map_blocks() buffer_head unaware Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 10/34] ext4: factor out ext4_map_create_blocks() to allocate new blocks Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 11/34] ext4: optimize the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag set Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 12/34] ext4: don't set EXTENT_STATUS_DELAYED on allocated blocks Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 13/34] ext4: let __revise_pending() return newly inserted pendings Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 14/34] ext4: count removed reserved blocks for delalloc only extent entry Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 15/34] ext4: update delalloc data reserve spcae in ext4_es_insert_extent() Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 16/34] ext4: drop ext4_es_delayed_clu() Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 17/34] ext4: use ext4_map_query_blocks() in ext4_map_blocks() Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 18/34] ext4: drop ext4_es_is_delonly() Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 19/34] ext4: drop all delonly descriptions Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 20/34] ext4: use reserved metadata blocks when splitting extent on endio Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 21/34] ext4: introduce seq counter for the extent status entry Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 22/34] ext4: add a new iomap aops for regular file's buffered IO path Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 23/34] ext4: implement buffered read iomap path Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 24/34] ext4: implement buffered write " Zhang Yi
2024-05-01  8:11   ` Dave Chinner
2024-05-01  8:33     ` Dave Chinner
2024-05-06 11:44       ` Zhang Yi [this message]
2024-05-06 23:19         ` Dave Chinner
2024-05-07  5:10           ` Zhang Yi
2024-05-06 11:21     ` Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 25/34] ext4: implement writeback " Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 26/34] ext4: implement mmap " Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 27/34] ext4: implement zero_range " Zhang Yi
2024-05-01  9:40   ` Dave Chinner
2024-05-06 12:33     ` Zhang Yi
2024-04-10 14:29 ` [RFC PATCH v4 28/34] ext4: writeback partial blocks before zeroing out range Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 29/34] ext4: fall back to buffer_head path for defrag Zhang Yi
2024-05-01  9:32   ` Dave Chinner
2024-05-06 13:05     ` Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 30/34] ext4: partial enable iomap for regular file's buffered IO path Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 31/34] filemap: support disable large folios on active inode Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 32/34] ext4: enable large folio for regular file with iomap buffered IO path Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 33/34] ext4: don't mark IOMAP_F_DIRTY for buffer write Zhang Yi
2024-05-01  9:27   ` Dave Chinner
2024-05-06 14:02     ` Zhang Yi
2024-04-10 15:03 ` [RFC PATCH v4 34/34] ext4: add mount option for buffered IO iomap path Zhang Yi
2024-04-11  1:12 ` [RESEND RFC PATCH v4 00/34] ext4: use iomap for regular file's buffered IO path and enable large folio Zhang Yi
2024-04-24  8:12 ` Zhang Yi
  -- strict thread matches above, loose matches on Subject: below --
2024-04-10 13:27 [RFC " Zhang Yi
2024-04-10 13:28 ` [RFC PATCH v4 24/34] ext4: implement buffered write iomap path Zhang Yi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=96bbdb25-b420-67b1-d4c4-b838a5c70f9f@huaweicloud.com \
    --to=yi.zhang@huaweicloud.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=chengzhihao1@huawei.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=wangkefeng.wang@huawei.com \
    --cc=willy@infradead.org \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai3@huawei.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox