From: Linus Torvalds
Date: Thu, 19 Sep 2024 06:32:19 +0200
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
To: Jens Axboe
Cc: Dave Chinner, Matthew Wilcox, Chris Mason, Christian Theune, linux-mm@kvack.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Dao, regressions@lists.linux.dev, regressions@leemhuis.info
In-Reply-To: <8697e349-d22f-43a0-8469-beb857eb44a1@kernel.dk>

On Thu, 19 Sept 2024 at 05:38, Jens Axboe wrote:
>
> I kicked off a quick run with this on 6.9 with my debug patch as well,
> and it still fails for me... I'll double check everything is sane. For
> reference, below is the 6.9 filemap patch.

Ok, that's interesting. So it's *not* just about "that code didn't do
xas_reset() after xas_split_alloc()".

Now, another thing that commit 6758c1128ceb ("mm/filemap: optimize
filemap folio adding") does is that it now *only* calls xa_get_order()
under the xa lock, and then it verifies it against the
xas_split_alloc() that it did earlier.

The old code did "xas_split_alloc()" with one order (all outside the
lock), and then re-did the xas_get_order() lookup inside the lock. But
if it changed in between, it ended up doing the "xas_split()" with the
new order, even though "xas_split_alloc()" was done with the *old*
order.

That seems dangerous, and maybe the lack of xas_reset() was never the
*major* issue?

Willy? You know this code much better than I do. Maybe we should just
back-port 6758c1128ceb in its entirety.

Regardless, I'd want to make sure that we really understand the root
cause. Because it certainly looks like *just* the lack of xas_reset()
wasn't it.

                Linus
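The ordering concern raised in the message above (preallocating for one
order outside the lock, then splitting with whatever order is found
inside the lock) can be sketched as a small userspace model. This is
only an illustration under stated assumptions: every toy_* name is
hypothetical, nothing here is kernel code or the XArray API, and the
"race" is simulated deterministically by changing the slot's order
between the allocation step and the locked step.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */

struct toy_slot {
    int order;   /* order of the entry currently stored in the slot */
};

struct toy_prealloc {
    int order;   /* order the preallocation was sized for, 0 if none */
};

/* Models allocating split nodes outside the lock for a given order. */
static void toy_split_alloc(struct toy_prealloc *pre, int order)
{
    pre->order = order;
}

/* Models throwing away a preallocation that has gone stale. */
static void toy_destroy(struct toy_prealloc *pre)
{
    pre->order = 0;
}

/*
 * Models the split done under the lock. It reports failure when the
 * preallocation was sized for a different order than the one being split.
 */
static bool toy_split(struct toy_slot *slot, const struct toy_prealloc *pre,
                      int order, int new_order)
{
    if (pre->order != order)
        return false;
    slot->order = new_order;
    return true;
}

/* Simulates a concurrent change of the slot's order, at most once. */
static void toy_race(struct toy_slot *slot, int *race_order)
{
    if (*race_order >= 0) {
        slot->order = *race_order;
        *race_order = -1;
    }
}

/* Pattern 1: allocate for the old order, split with the re-read order. */
bool toy_add_unchecked(struct toy_slot *slot, int target, int race_order)
{
    struct toy_prealloc pre = { 0 };
    int order = slot->order;            /* read outside the "lock" */

    if (order > target)
        toy_split_alloc(&pre, order);
    toy_race(slot, &race_order);        /* window before taking the "lock" */
    order = slot->order;                /* re-read inside the "lock" */
    if (order > target)
        return toy_split(slot, &pre, order, target);
    return true;
}

/* Pattern 2: read the order only under the "lock", verify, else retry. */
bool toy_add_checked(struct toy_slot *slot, int target, int race_order)
{
    struct toy_prealloc pre = { 0 };

    for (;;) {
        int order = slot->order;        /* read inside the "lock" */

        if (pre.order && pre.order != order)
            toy_destroy(&pre);          /* stale preallocation, drop it */
        if (order <= target)
            return true;                /* no split needed */
        if (!pre.order) {
            toy_split_alloc(&pre, order);   /* "unlock", allocate, retry */
            toy_race(slot, &race_order);
            continue;
        }
        return toy_split(slot, &pre, order, target);
    }
}
```

In this model, starting from order 9 with a simulated change to order 4,
toy_add_unchecked() attempts the split with a mismatched preallocation
and reports failure, while toy_add_checked() discards the stale
preallocation and retries. Whether that mismatch is the actual root
cause of the reported bug is exactly the open question in the message.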