Date: Wed, 18 Sep 2024 18:12:56 +0100
From: Matthew Wilcox
To: Linus Torvalds
Cc: Chris Mason, Jens Axboe, Dave Chinner, Christian Theune, linux-mm@kvack.org, "linux-xfs@vger.kernel.org", linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Dao, regressions@lists.linux.dev, regressions@leemhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)

On Wed, Sep 18, 2024 at 04:39:56PM +0200, Linus Torvalds wrote:
> The fact that
> this bug was fixed basically entirely by mistake does
> say "this is much too subtle".

Yup.

> Of course, the fact that an xas_reset() not only resets the walk, but
> also clears any pending errors (because it's all the same "xa_node"
> thing), doesn't make things more obvious. Because right now you
> *could* treat errors as "cumulative", but if a xas_split_alloc() does
> an xas_reset() on success, that means that it's actually a big
> conceptual change and you can't do the "cumulative" thing any more.

So ... the way xas was intended to work is that the first thing we did
that set an error meant that everything after it was a no-op. You
can see that in functions like xas_start() which do:

	if (xas_error(xas))
		return NULL;

obviously something like xas_unlock() isn't a noop because you still
want to unlock even if you had an error.

The xas_split_alloc() was done in too much of a hurry. I had thought
that I wouldn't need it, and then found out that it was a prerequisite
for something I needed to do, and so I wasn't in the right frame of mind
when I wrote it. It's actually a giant pain and I wanted to redo it
even before this, as well as clear up some pieces from xas_nomem() /
__xas_nomem(). The restriction on "we can only split to one additional
level" is awful, and has caused some contortions elsewhere.

> End result: it would probably make sense to change "xas_split_alloc()"
> to explicitly *not* have that "check xas_error() afterwards as if it
> could be cumulative", and instead make it very clearly have no history
> and change the semantics to

What it really should do is just return if it's already in an error state.
That makes it consistent with the rest of the API, and we don't have to
worry about it losing an already-found error. But also all the other
infelicities with it need to be fixed.
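
To illustrate the convention described above, here is a minimal userspace
sketch in C. It is not the kernel's XArray code: struct cursor and the
cursor_*() helpers are hypothetical stand-ins for struct xa_state and the
xas_*() functions, and the simulate_oom flag is invented for the demo. It
only shows the shape of the idea: the first error makes later operations
no-ops, a preparation step returns early instead of overwriting that error,
and unlock still runs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's xa_state; NOT the real structure. */
struct cursor {
	int err;	/* 0 while healthy, negative errno after a failure */
	int steps;	/* number of operations that actually did work */
	int locked;	/* 1 while the (pretend) lock is held */
};

int cursor_error(const struct cursor *c)
{
	return c->err;
}

void cursor_set_err(struct cursor *c, int err)
{
	c->err = err;
}

/* Ordinary operation: a no-op once the cursor carries an error. */
void *cursor_load(struct cursor *c, void **slots, size_t n, size_t i)
{
	if (cursor_error(c))
		return NULL;
	if (i >= n) {
		cursor_set_err(c, -EINVAL);
		return NULL;
	}
	c->steps++;
	return slots[i];
}

/*
 * Preparation step (assumed analogue of the proposed xas_split_alloc()
 * behaviour): return immediately if already in an error state, so an
 * earlier error is never replaced or cleared.
 */
void cursor_prepare(struct cursor *c, int simulate_oom)
{
	if (cursor_error(c))
		return;
	if (simulate_oom) {
		cursor_set_err(c, -ENOMEM);
		return;
	}
	c->steps++;
}

/* Unlock is deliberately not a no-op: it must run even after an error. */
void cursor_unlock(struct cursor *c)
{
	c->locked = 0;
}

/* Runs a failing sequence and returns the error the caller would see. */
int demo_sequence(void)
{
	int value = 7;
	void *slots[1] = { &value };
	struct cursor c = { .err = 0, .steps = 0, .locked = 1 };

	cursor_load(&c, slots, 1, 5);	/* out of range: records -EINVAL */
	cursor_prepare(&c, 1);		/* skipped, so -ENOMEM is never set */
	cursor_load(&c, slots, 1, 0);	/* valid index, but still a no-op */
	cursor_unlock(&c);		/* runs regardless of the error */

	assert(c.locked == 0);
	return cursor_error(&c);
}
```

With this shape the caller can check the error once, at the end of the
sequence, and still see the first failure rather than whatever a later
step happened to leave behind.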