From: Luis Henriques
To: Zhang Yi
Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, adilger.kernel@dilger.ca,
    jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org,
    willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com,
    chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()
In-Reply-To: (Zhang Yi's message of "Fri, 10 May 2024 11:39:48 +0800")
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
    <20240127015825.1608160-4-yi.zhang@huaweicloud.com>
    <87zfszuib1.fsf@brahms.olymp> <20240509163953.GI3620298@mit.edu>
    <87h6f6vqzj.fsf@brahms.olymp>
Date: Fri, 10 May 2024 10:41:45 +0100
Message-ID: <87seyquhpi.fsf@brahms.olymp>

On Fri 10 May 2024 11:39:48 AM +08, Zhang Yi wrote;
> On 2024/5/10 1:23, Luis Henriques wrote:
>> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>>
>>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>>
>>>> It looks like it's easy to trigger an infinite loop here using fstest
>>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>>> and should be fixed with the patch below.
>>>
>>> Thanks for the report.  However, I can't reproduce the failure, and
>>> looking at generic/039, I don't see how it could be relevant to the
>>> code path in question.  Generic/039 creates a test symlink with two
>>> hard links in the same directory, syncs the file system, and then
>>> removes one of the hard links, and then drops access to the block
>>> device using dmflakey.  So I don't see how the extent code would be
>>> involved at all.  Are you sure that you have the correct test listed?
>>
>> Yep, I just retested and it's definitely generic/039.  I'm using a simple
>> test environment, with virtme-ng.
>>
>>> Looking at the code in question in fs/ext4/extents.c:
>>>
>>> again:
>>>         ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>>>                                   hole_start + len - 1, &es);
>>>         if (!es.es_len)
>>>                 goto insert_hole;
>>>
>>>          * There's a delalloc extent in the hole, handle it if the delalloc
>>>          * extent is in front of, behind and straddle the queried range.
>>>          */
>>> -       if (lblk >= es.es_lblk + es.es_len) {
>>> +       if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>>                 /*
>>>                  * The delalloc extent is in front of the queried range,
>>>                  * find again from the queried start block.
>>>                 len -= lblk - hole_start;
>>>                 hole_start = lblk;
>>>                 goto again;
>>>
>>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>>> never happen(tm), and in fact we have a test for this case which
>>
>> If I instrument the code, I can see that es.es_len is definitely set to
>> EXT_MAX_BLOCKS, which will overflow.
>>
>
> Thanks for the report.
> After looking at the code, I think the root
> cause of this issue is that the variable es was not initialized when
> replaying a fast commit.  ext4_es_find_extent_range() will return directly
> when the EXT4_FC_REPLAY flag is set, and then es.es_len becomes stale.
>
> I can always reproduce this issue on generic/039 with
> MKFS_OPTIONS="-O fast_commit".
>
> This uninitialization problem originally existed in the old
> ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
> since we never check and use the extent cache when replaying a fast commit.
> So I suppose the correct fix would be to unconditionally initialize
> the es variable.

Oh, you're absolutely right -- the extent_status 'es' struct isn't being
initialized in that case.  I totally failed to see that.  And yes, I also
failed to mention I had the 'fast_commit' feature enabled, sorry!

Thanks a lot for figuring this out, Yi.  I'm looking at this code and
trying to understand if it would be safe to call __es_find_extent_range()
when EXT4_FC_REPLAY is in progress.  Probably not, and probably better to
simply do:

        es->es_lblk = es->es_len = es->es_pblk = 0;

in that case.  I'll send out a patch later today.

Cheers,
-- 
Luis
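
For anyone following along, here is a rough userspace sketch of the two
effects discussed in this thread: the 32-bit wraparound in the 'if'
condition, and a lookup helper that returns early without touching its
output.  Everything in it is made up for illustration (struct es_sketch,
find_extent_sketch(), find_extent_fixed(), in_front_u32(), in_front_u64());
it is not the kernel code, and the value used for EXT_MAX_BLOCKS is my
assumption about the kernel definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the kernel's definition; treat it as illustrative. */
#define EXT_MAX_BLOCKS 0xffffffffu

/* Hypothetical stand-in for struct extent_status (only the fields used here). */
struct es_sketch {
	uint32_t es_lblk;	/* first logical block of the extent */
	uint32_t es_len;	/* length of the extent in blocks */
	uint64_t es_pblk;	/* first physical block */
};

/* Models a lookup that bails out during replay and leaves *es untouched. */
void find_extent_sketch(int replaying, struct es_sketch *es)
{
	if (replaying)
		return;		/* caller keeps whatever was in *es before */
	memset(es, 0, sizeof(*es));	/* "nothing found" in this toy model */
}

/* Same lookup, but the output is always initialized first. */
void find_extent_fixed(int replaying, struct es_sketch *es)
{
	memset(es, 0, sizeof(*es));
	if (replaying)
		return;
}

/* The comparison done in 32 bits: the sum wraps around. */
int in_front_u32(uint32_t lblk, const struct es_sketch *es)
{
	return lblk >= (uint32_t)(es->es_lblk + es->es_len);
}

/* The comparison widened to 64 bits before adding: no wraparound. */
int in_front_u64(uint32_t lblk, const struct es_sketch *es)
{
	return lblk >= (uint64_t)es->es_lblk + es->es_len;
}

/* Walks through the scenario: stale contents survive the replay path. */
void demo(void)
{
	struct es_sketch es = { .es_lblk = 10, .es_len = EXT_MAX_BLOCKS };

	find_extent_sketch(1, &es);		/* es_len is still stale */
	assert(es.es_len == EXT_MAX_BLOCKS);
	assert(in_front_u32(100, &es));		/* 10 + 0xffffffff wraps to 9 */
	assert(!in_front_u64(100, &es));	/* the widened sum is huge */

	find_extent_fixed(1, &es);		/* always zeroed */
	assert(es.es_len == 0);
}
```

With a stale es_len of EXT_MAX_BLOCKS the 32-bit test claims the extent is
in front of block 100, which is the condition that sends the real code back
to 'again'.  The 64-bit cast hides that symptom, while zeroing the struct
on every path removes the stale value itself, which is why initializing
'es' looks like the better fix.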