From: Linus Torvalds
Date: Thu, 12 Sep 2024 15:25:50 -0700
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
To: Jens Axboe
Cc: Matthew Wilcox, Christian Theune, linux-mm@kvack.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Dao, Dave Chinner, clm@meta.com, regressions@lists.linux.dev, regressions@leemhuis.info
In-Reply-To: <0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk>

On Thu, 12 Sept 2024 at 15:12, Jens Axboe wrote:
>
> When I saw Christian's report, I seemed to recall that we ran into this
> at Meta too.
> And we did, and hence have been reverting it since our 5.19
> release (and hence 6.4, 6.9, and 6.11 next). We should not be shipping
> things that are known broken.

I do think that if we have big sites just reverting it as known broken
and can't figure out why, we should do so upstream too.

Yes, it's going to make it even harder to figure out what's wrong. Not
great. But if this causes filesystem corruption, that sure isn't great
either. And if people end up going "I'll use ext4 which doesn't have the
problem", that's not exactly helpful either.

And yeah, the reason ext4 doesn't have the problem is simply because
ext4 doesn't enable large folios. So that doesn't pin anything down
either (i.e. it does *not* say "this is an xfs bug" - it obviously might
be, but it's probably more likely some large-folio issue).

Other filesystems do enable large folios (afs, bcachefs, erofs, nfs,
smb), but maybe they're just not used under the kind of load that would
show it.

Honestly, the fact that it hasn't been reverted after people have
apparently known about it for months is a bit shocking to me.
Filesystem people tend to take unknown corruption issues as a big deal.
What makes this so special? Is it because the XFS people don't consider
it an XFS issue, so...

               Linus