From: Linus Torvalds
Date: Fri, 13 Sep 2024 14:24:02 -0700
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
To: Matthew Wilcox
Cc: Chris Mason, Jens Axboe, Christian Theune, linux-mm@kvack.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Dao, Dave Chinner, regressions@lists.linux.dev, regressions@leemhuis.info
References: <0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk>

On Fri, 13 Sept 2024 at 11:15, Matthew Wilcox wrote:
>
> Oh! I think split is the key. Let's say we have an order-6 (or
> larger) folio. And we call split_huge_page() (whatever it's called
> in your kernel version). That calls xas_split_alloc() followed
> by xas_split(). xas_split_alloc() puts entry in node->slots[0] and
> initialises node->slots[1..XA_CHUNK_SIZE] to a sibling entry.

Hmm. The splitting does seem to be not just indicated by the debug
logs, but it ends up being a fairly complicated case. *The* most
complicated case of adding a new folio by far, I'd say.

And I wonder if it's even necessary?

Because I think the *common* case is through filemap_add_folio(),
isn't it? And that code path really doesn't care what the size of the
folio is.

So instead of splitting, that code path would seem to be perfectly
happy with instead erroring out, and simply re-doing the new folio
allocation using the same size that the old conflicting folio had (at
which point it won't be conflicting any more).

No?

It's possible that I'm entirely missing something, but at least the
filemap_add_folio() case looks like it really would actually be
happier with a "oh, that size conflicts with an existing entry, let's
just allocate a smaller size then"

            Linus
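
For readers following along, the "error out and redo the allocation"
idea can be illustrated with a toy userspace model in C. This is only a
sketch under stated assumptions: the flat slot_order[] array stands in
for the XArray, and cache_init(), cache_store(), cache_try_add() and
cache_add_retry() are hypothetical names invented for this example, not
kernel functions. It shows the control flow only (fail on a size
conflict, then retry at the conflicting entry's order) and says nothing
about locking, shadow entries, or how a real folio would be reallocated.

```c
#include <assert.h>

#define TOY_MAX_ORDER 6
#define NSLOTS        (1u << TOY_MAX_ORDER)
#define EMPTY         (-1)

/*
 * Toy page cache (hypothetical, not the kernel XArray): slot_order[i]
 * holds the order of the entry covering index i, or EMPTY.
 */
static int slot_order[NSLOTS];

static void cache_init(void)
{
	for (unsigned int i = 0; i < NSLOTS; i++)
		slot_order[i] = EMPTY;
}

/* First index of the naturally aligned 2^order range containing index. */
static unsigned int range_start(unsigned int index, int order)
{
	return index & ~((1u << order) - 1);
}

/* Unconditionally record an entry of the given order around index. */
static void cache_store(unsigned int index, int order)
{
	assert(order >= 0 && order <= TOY_MAX_ORDER && index < NSLOTS);

	unsigned int start = range_start(index, order);

	for (unsigned int i = start; i < start + (1u << order); i++)
		slot_order[i] = order;
}

/*
 * Try to add an entry without ever splitting. If the range overlaps an
 * existing entry of a different order, report that order and fail.
 * Returns 0 on success, -1 on a size conflict.
 */
static int cache_try_add(unsigned int index, int order, int *conflict_order)
{
	unsigned int start = range_start(index, order);

	for (unsigned int i = start; i < start + (1u << order); i++) {
		if (slot_order[i] != EMPTY && slot_order[i] != order) {
			*conflict_order = slot_order[i];
			return -1;
		}
	}
	cache_store(index, order);
	return 0;
}

/*
 * Assumed caller policy: on a conflict, redo the add using the order of
 * the conflicting entry. A real caller would also have to allocate a
 * new folio of that size here. Returns the order finally used.
 */
static int cache_add_retry(unsigned int index, int order)
{
	int conflict;

	while (cache_try_add(index, order, &conflict) != 0)
		order = conflict;
	return order;
}
```

In this model, if an order-6 entry already covers the range, a request
to add an order-2 entry at index 8 fails once and is then redone at
order 6, so no split of the existing entry is ever needed.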