Date: Fri, 25 Apr 2025 15:51:53 -0700
From: Luis Chamberlain
To: Jan Kara
Cc: dave@stgolabs.net, brauner@kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, riel@surriel.com,
	willy@infradead.org, hannes@cmpxchg.org, oliver.sang@intel.com,
	david@redhat.com, axboe@kernel.dk, hare@suse.de, david@fromorbit.com,
	djwong@kernel.org, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-mm@kvack.org, gost.dev@samsung.com,
	p.raghav@samsung.com, da.gomez@samsung.com,
	syzbot+f3c6fda1297c748a7076@syzkaller.appspotmail.com
Subject: Re: [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on migration
References: <20250410014945.2140781-1-mcgrof@kernel.org>
	<20250410014945.2140781-2-mcgrof@kernel.org>

On Wed, Apr 23, 2025 at 01:30:29PM -0700, Luis Chamberlain wrote:
> On Wed, Apr 23, 2025 at 07:09:28PM +0200, Jan Kara wrote:
> > On Wed 16-04-25 09:58:30, Luis Chamberlain wrote:
> > > On Tue, Apr 15, 2025 at 06:28:55PM +0200, Jan Kara wrote:
> > > > > So I tried:
> > > > >
> > > > > root@e1-ext4-2k /var/lib/xfstests # fsck /dev/loop5 -y 2>&1 > log
> > > > > e2fsck 1.47.2 (1-Jan-2025)
> > > > > root@e1-ext4-2k /var/lib/xfstests # wc -l log
> > > > > 16411 log
> > > >
> > > > Can you share the log please?
> > >
> > > Sure, here you go:
> > >
> > > https://github.com/linux-kdevops/20250416-ext4-jbd2-bh-migrate-corruption
> > >
> > > The last trace-0004.txt is a fresh one with Davidlohr's patches
> > > applied. It has trace-0004-fsck.txt.
> >
> > Thanks for the data! I was staring at them for some time and at this point
> > I'm leaning towards a conclusion that this is actually not a case of
> > metadata corruption but rather a bug in ext4 transaction credit computation
> > that is completely independent of page migration.
> >
> > Based on the e2fsck log you've provided the only damage in the filesystem
> > is from the aborted transaction handle in the middle of extent tree growth.
> > So nothing points to a lost metadata write or anything like that. And the
> > credit reservation for page writeback is indeed somewhat racy - we reserve
> > number of transaction credits based on current tree depth. However by the
> > time we get to ext4_ext_map_blocks() another process could have modified
> > the extent tree so we may need to modify more blocks than we originally
> > expected and reserved credits for.
> >
> > Can you give attached patch a try please?
> >
> > 								Honza
> > -- 
> > Jan Kara
> > SUSE Labs, CR
>
> > From 4c53fb9f4b9b3eb4a579f69b7adcb6524d55629c Mon Sep 17 00:00:00 2001
> > From: Jan Kara
> > Date: Wed, 23 Apr 2025 18:10:54 +0200
> > Subject: [PATCH] ext4: Fix calculation of credits for extent tree modification
> >
> > Luis and David are reporting that after running generic/750 test for 90+
> > hours on 2k ext4 filesystem, they are able to trigger a warning in
> > jbd2_journal_dirty_metadata() complaining that there are not enough
> > credits in the running transaction started in ext4_do_writepages().
> >
> > Indeed the code in ext4_do_writepages() is racy and the extent tree can
> > change between the time we compute credits necessary for extent tree
> > computation and the time we actually modify the extent tree. Thus it may
> > happen that the number of credits actually needed is higher. Modify
> > ext4_ext_index_trans_blocks() to count with the worst case of maximum
> > tree depth.
> >
> > Link: https://lore.kernel.org/all/20250415013641.f2ppw6wov4kn4wq2@offworld
> > Reported-by: Davidlohr Bueso
> > Reported-by: Luis Chamberlain
> > Signed-off-by: Jan Kara
>
> I kicked off tests! Let's see after ~ 90 hours!

Tested-by: kdevops@lists.linux.dev

I have run the test on 3 separate guests, each for over 48 hours. No
ext4 fs corruption was reported, all is good, so I do believe this
fixes the issue. One of the guests was on Linus's tree, which didn't
yet have Davidlohr's fixes for folio migration. So I believe this patch
should carry a stable tag so that stable gets it.

  Luis