From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 964DDC10F16 for ; Mon, 6 May 2024 11:44:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F40C26B007B; Mon, 6 May 2024 07:44:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EF1346B0082; Mon, 6 May 2024 07:44:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB8AF6B0083; Mon, 6 May 2024 07:44:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id BC6F16B007B for ; Mon, 6 May 2024 07:44:58 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 32151801E6 for ; Mon, 6 May 2024 11:44:58 +0000 (UTC) X-FDA: 82087789476.05.534FD96 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by imf13.hostedemail.com (Postfix) with ESMTP id 658E420002 for ; Mon, 6 May 2024 11:44:54 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf13.hostedemail.com: domain of yi.zhang@huaweicloud.com designates 45.249.212.56 as permitted sender) smtp.mailfrom=yi.zhang@huaweicloud.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1714995896; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JqAqpz7lkVIR8Hx5IvRjWXDdbz8vAHlNRwjktYA90ic=; b=7BhCNySoDRyk2Us34B2PsKdrfopWIiU2maD41Raq2wQ/LCb7PYP7kuM6m8f8wRleYd7UCr 8q65fHlxFSWRpLV7YMX1b7V3wdXQFqUMHM1NwbDuYsyoujGIhPQ8y/VdNHYd0X3DAADrIs YFBYZTraC3UNFhqLUeFTj0eYbASKxJM= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf13.hostedemail.com: domain of yi.zhang@huaweicloud.com designates 45.249.212.56 as permitted sender) smtp.mailfrom=yi.zhang@huaweicloud.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1714995896; a=rsa-sha256; cv=none; b=ZPDU0mH4z4pTOhw1b6VAcTm+wgYgoYtVKPvZeSR4JxjeLOQr3umIhjSODdcaMFfvpolW/5 oUgaCwLRKmqtYkPABXlIcvwWn0A6iRUXjmT8vTiMma+4UzYePkc14bZY4+xnd6ElkKCn6f W0wr1f6VKORB9pWPzgRl7h3ddWGCGiQ= Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4VY01y5s53z4f3jHn for ; Mon, 6 May 2024 19:44:38 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id 069D81A0572 for ; Mon, 6 May 2024 19:44:47 +0800 (CST) Received: from [10.174.179.80] (unknown [10.174.179.80]) by APP2 (Coremail) with SMTP id Syh0CgDH6w6swjhmHsDLMA--.57384S3; Mon, 06 May 2024 19:44:46 +0800 (CST) Subject: Re: [RFC PATCH v4 24/34] ext4: implement buffered write iomap path To: Dave Chinner Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com References: <20240410142948.2817554-1-yi.zhang@huaweicloud.com> <20240410142948.2817554-25-yi.zhang@huaweicloud.com> From: Zhang Yi Message-ID: <96bbdb25-b420-67b1-d4c4-b838a5c70f9f@huaweicloud.com> Date: Mon, 6 May 2024 19:44:44 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.12.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-CM-TRANSID:Syh0CgDH6w6swjhmHsDLMA--.57384S3 X-Coremail-Antispam: 1UD129KBjvJXoWxZFWxCryDXFyDtw17Cry3Jwb_yoWrJw1kpr Z8KFWUKFsrXr18ur1vvF4UWF1Fk3WxGr17Wr45WryqvFZ8ZFySga48GF1Y9FW7Ars2kF10 qFWUuFyxZa4Yy37anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUvIb4IE77IF4wAFF20E14v26ryj6rWUM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28lY4IEw2IIxxk0rwA2F7IY1VAKz4 vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7Cj xVAFwI0_Cr0_Gr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I 0E14v26rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40E x7xfMcIj6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x 0Yz7v_Jr0_Gr1lF7xvr2IY64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7I2V7IY0VAS 07AlzVAYIcxG8wCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c 02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_ WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7 CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6rWUJVWrZr1UMIIF0xvEx4A2jsIE 14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf 9x07UZ18PUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ X-Rspam-User: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 658E420002 X-Stat-Signature: nqpuira9xnzz1mm8k61zgpiy9eank51u X-HE-Tag: 1714995894-174666 X-HE-Meta: U2FsdGVkX1+7a42yYTnFIVMRWul6fk1+pJqkd3lPIQtMBAODlOhmjRSI9kwJU1nDM2yaYDv8hGBDOMtGgrA/KBPiCc7y6aZzA0W/M/3CzEbc9y2VtPeoFBkMqKyp5buYMd3zG6s9+2af3Pm3RKhW1ar/r+noCplBZf5OExeZf+7p+GxTCKX10vsku83XytnENZ9HVFgrUNiXkUzH5ZZLBQ4zv0M/5GYxKZrIB7Fz0n3is2xV4Lyl54E+ayBHBtk1E1jPXITpVjqrarDhnhzZM18sgpDiJrIr5ChaQSPwCr96l9lGK+PafRdO7dr5Ser6K+ZXi3CKvCTzc/Kkpe+Wb4j+RhrI7ZRBu6VdZD3vHeMQF07L1cX718jOqUpJV5lAP10yiID7bkahjaywuGNd4KRT/PeJkgrpoX85uZYpKdrDahmjdujO0mvy9YMQrzR5eJw4VEnM75SzmaN0w072ilKMLHYXVwVOT7NM5ScCqJY53n1KSe8BooI658kWP9tnYNgiHt2KRUPAI0UrCVtASG0/qZ2GRd+LPMt2K2QBNWaRumxHYib+SxAk9y+mUfRHMml2rWPYvR6rpngU0bfAxIRlyIbEQZTUAzFxOfhKAcW4nNNX7pfZ0nemDT0fqzgtqj76MTHINUkG3+0HzKVhYybTI5TdQKINZaR4k2+dKnSfSfT3y4h0y0zpZETx5y3FZVFw55R2wAXnSLBQwh/6LBISYBomcbHOYzYFR3ka1m/q7uHwDbssDQjA5QJslNm2TAqx23HY3FzI2shm5j9KwQxKphySR+VfQSw0Olk8qNwEKABRdeiTlc4KKxUVFqVarf17pwFFiezN9lxG2w2eOePjXenDa4O1GlzUppA+qcuH3uScT+CTFOvhLkCLh2lsmqGCuNviuFHNDlmT1bnD630OS38KT+KoHeCjD3WGmDpJ7xh/3Y1Jsy3f5u2/pXa2tXQmlxxyf51AD2dAeQT Yv8fJ3Pt 4ZxSTOsnFC4XXuxJ8cA8FLnbub1biibaiat8Tg6Fp/VAj5cR/bXD77dKc4P81pJu9JBvXmDCF22a6Tyxp0pX2zC+sgdWxZdn0N0D1RVsShJrPNcBwijDG73YCk9fMl9JwhM5X0bGfDAzryfiowNw91wWzHvEwLFsSiEPT/0FtF/V6r/ZlKRzN45VfzslfNmPjj4HJv1s7jbgZ9vdly2pcsCa5t/FbUXIh0CcQsYXEjAgb5C/SVBrpNGIMu7RIp0VEgBPlBsTdjU8G9VY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024/5/1 16:33, Dave Chinner wrote: > On Wed, May 01, 2024 at 06:11:13PM +1000, Dave Chinner wrote: >> On Wed, Apr 10, 2024 at 10:29:38PM +0800, Zhang Yi wrote: >>> From: Zhang Yi >>> >>> Implement buffered write iomap path, use ext4_da_map_blocks() to map >>> delalloc extents and add ext4_iomap_get_blocks() to allocate blocks if >>> delalloc is disabled or free space is about to run out. >>> >>> Note that we always allocate unwritten extents for new blocks in the >>> iomap write path, this means that the allocation type is no longer >>> controlled by the dioread_nolock mount option. After that, we could >>> postpone the i_disksize updating to the writeback path, and drop journal >>> handle in the buffered dealloc write path completely. > ..... >>> +/* >>> + * Drop the staled delayed allocation range from the write failure, >>> + * including both start and end blocks. If not, we could leave a range >>> + * of delayed extents covered by a clean folio, it could lead to >>> + * inaccurate space reservation. >>> + */ >>> +static int ext4_iomap_punch_delalloc(struct inode *inode, loff_t offset, >>> + loff_t length) >>> +{ >>> + ext4_es_remove_extent(inode, offset >> inode->i_blkbits, >>> + DIV_ROUND_UP_ULL(length, EXT4_BLOCK_SIZE(inode->i_sb))); >>> return 0; >>> } >>> >>> +static int ext4_iomap_buffered_write_end(struct inode *inode, loff_t offset, >>> + loff_t length, ssize_t written, >>> + unsigned int flags, >>> + struct iomap *iomap) >>> +{ >>> + handle_t *handle; >>> + loff_t end; >>> + int ret = 0, ret2; >>> + >>> + /* delalloc */ >>> + if (iomap->flags & IOMAP_F_EXT4_DELALLOC) { >>> + ret = iomap_file_buffered_write_punch_delalloc(inode, iomap, >>> + offset, length, written, ext4_iomap_punch_delalloc); >>> + if (ret) >>> + ext4_warning(inode->i_sb, >>> + "Failed to clean up delalloc for inode %lu, %d", >>> + inode->i_ino, ret); >>> + return ret; >>> + } >> >> Why are you creating a delalloc extent for the write operation and >> then immediately deleting it from the extent tree once the write >> operation is done? > > Ignore this, I mixed up the ext4_iomap_punch_delalloc() code > directly above with iomap_file_buffered_write_punch_delalloc(). > > In hindsight, iomap_file_buffered_write_punch_delalloc() is poorly > named, as it is handling a short write situation which requires > newly allocated delalloc blocks to be punched. > iomap_file_buffered_write_finish() would probably be a better name > for it.... > >> Also, why do you need IOMAP_F_EXT4_DELALLOC? Isn't a delalloc iomap >> set up with iomap->type = IOMAP_DELALLOC? Why can't that be used? > > But this still stands - the first thing > iomap_file_buffered_write_punch_delalloc() is: > > if (iomap->type != IOMAP_DELALLOC) > return 0; > Thanks for the suggestion, the delalloc and non-delalloc write paths share the same ->iomap_end() now (i.e. ext4_iomap_buffered_write_end()), I use the IOMAP_F_EXT4_DELALLOC to identify the write path. For non-delalloc path, If we have allocated more blocks and copied less, we should truncate extra blocks that newly allocated by ->iomap_begin(). If we use IOMAP_DELALLOC, we can't tell if the blocks are pre-existing or newly allocated, we can't truncate the pre-existing blocks, so I have to introduce IOMAP_F_EXT4_DELALLOC. But if we split the delalloc and non-delalloc handler, we could drop IOMAP_F_EXT4_DELALLOC. I also checked xfs, IIUC, xfs doesn't free the extra blocks beyond EOF in xfs_buffered_write_iomap_end() for non-delalloc case since they will be freed by xfs_free_eofblocks in some other inactive paths, like xfs_release()/xfs_inactive()/..., is that right? Thanks, Yi.