Date: Fri, 13 Sep 2024 14:11:22 +0200
From: Christian Brauner
To: Linus Torvalds, Pankaj Raghav, Luis Chamberlain
Cc: Jens Axboe, Matthew Wilcox, Christian Theune, linux-mm@kvack.org,
 "linux-xfs@vger.kernel.org", linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, Daniel Dao, Dave Chinner, clm@meta.com,
 regressions@lists.linux.dev, regressions@leemhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large
 folios since Dec 2021 (any kernel from 6.1 upwards)
Message-ID: <20240913-ortsausgang-baustart-1dae9a18254d@brauner>
References: <0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk>

On Thu, Sep 12, 2024 at 03:25:50PM GMT, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:12, Jens Axboe wrote:
> >
> > When I saw Christian's report, I seemed to recall that we ran into this
> > at Meta too. And we did, and hence have been reverting it since our 5.19
> > release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
> > things that are known broken.
>
> I do think that if we have big sites just reverting it as known broken
> and can't figure out why, we should do so upstream too.
>
> Yes, it's going to make it even harder to figure out what's wrong.
> Not great. But if this causes filesystem corruption, that sure isn't
> great either. And people end up going "I'll use ext4 which doesn't
> have the problem", that's not exactly helpful either.
>
> And yeah, the reason ext4 doesn't have the problem is simply because
> ext4 doesn't enable large folios. So that doesn't pin anything down
> either (ie it does *not* say "this is an xfs bug" - it obviously might
> be, but it's probably more likely some large-folio issue).
>
> Other filesystems do enable large folios (afs, bcachefs, erofs, nfs,
> smb), but maybe just not be used under the kind of load to show it.
>
> Honestly, the fact that it hasn't been reverted after apparently
> people knowing about it for months is a bit shocking to me. Filesystem
> people tend to take unknown corruption issues as a big deal. What
> makes this so special? Is it because the XFS people don't consider it
> an XFS issue, so...

So this issue is new to me as well. One of the items this cycle is the
work to enable support for block sizes that are larger than page sizes
via the large block size (LBS) series that's been sitting in -next for a
long time. That work specifically targets xfs and builds on top of the
large folio support.

If the support for large folios is going to be reverted in xfs then I
see no point to merge the LBS work now.
So I'm holding off on sending that pull request until a decision is made (for xfs). As far as I understand, supporting larger block sizes will not be meaningful without large folio support.