From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 312FDCE8D49 for ; Thu, 19 Sep 2024 04:46:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 971456B008C; Thu, 19 Sep 2024 00:46:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8FA456B0092; Thu, 19 Sep 2024 00:46:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 774266B0093; Thu, 19 Sep 2024 00:46:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 50F0D6B008C for ; Thu, 19 Sep 2024 00:46:13 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id BCBB61C50A6 for ; Thu, 19 Sep 2024 04:46:12 +0000 (UTC) X-FDA: 82580250984.04.B89A235 Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) by imf29.hostedemail.com (Postfix) with ESMTP id BF4B9120007 for ; Thu, 19 Sep 2024 04:46:10 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=kernel-dk.20230601.gappssmtp.com header.s=20230601 header.b=isWiow+G; dmarc=none; spf=pass (imf29.hostedemail.com: domain of axboe@kernel.dk designates 209.85.128.54 as permitted sender) smtp.mailfrom=axboe@kernel.dk ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1726721079; a=rsa-sha256; cv=none; b=b3R1e/iUN7aMF2E/wrZZxHWezqafBep5VnIgReJZG0g8FnJmecHvaNo/vq27gqhiQRjlaP Urp8Z3ammqmshOKd4LVEyDuCBfKiyAgR7KixG/+KfLA4JV9djixV0VfbumYpVqiayGqxxX C71fsksujUdUMzqSoqaSLpXMX+znWS0= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=kernel-dk.20230601.gappssmtp.com header.s=20230601 header.b=isWiow+G; dmarc=none; spf=pass (imf29.hostedemail.com: domain of axboe@kernel.dk designates 209.85.128.54 as permitted sender) smtp.mailfrom=axboe@kernel.dk ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1726721079; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=VKK0K7oF92qoXvqUn1Ads4k++5P61ZhpT/0tIxiVX/Q=; b=2NrS/tNYuBRvXzSGZ+iNE0NB9sOZejTPlSBhpDfmV2zsbF9Mx9aUs2I1IjHkEmJZuvkv9q 49gmCT0Csfsb/4UMSHWa7Ucr0DWz71b/WFngDBar7tTiE7JPPufGOfuQEHAtPkuH713ToM 0FnpS7kb42guhSnCRCxSeVhAjo7o7QM= Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-42ca6ba750eso2420655e9.0 for ; Wed, 18 Sep 2024 21:46:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1726721169; x=1727325969; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=VKK0K7oF92qoXvqUn1Ads4k++5P61ZhpT/0tIxiVX/Q=; b=isWiow+GQ2ViYdwbsazh8Xp1oWFK3B15DzOJP/3ORgiToob+N2nyMW4iDc0tp1XV4b q8TyPzPrlAKI8YWPQMlvd6RIvbK8SAqv3G984Q1P3qU+YG6eOcR3hFZxZsOsGvP+Rc3c gn56N8clHB/VONUGkjETYGlC+bVN6fcitjOt2Oggb7yiemNObA/2H8JJXpR8PRMCGAUc MX36wzhJpSBK2vHFcBew9DseFxmXq3qCck+nV8X4uugX7CuSTZQKA1LflX/WfU+mMOjv Z8TaKx+5yrK9Ah00cl7GHzjKbGdFXzY1On3MyjIk84iVkDB+36b72j7zkxTmWZOuUdgJ CyQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1726721169; x=1727325969; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VKK0K7oF92qoXvqUn1Ads4k++5P61ZhpT/0tIxiVX/Q=; b=DTJdWzihSjdoYBohQdqUQ59YC0VMyZGgZ4k5hl7KRs8MaPlFRBsTsUBJ3e64mBl6Mg g2QevpLuIEm3NEQ3gn2EDlyVhWBoiqInJ38B/QqHbgT4ySZMofGppgTFWM5894iIlzbd Smc+mH06XcL7kP6jZ7WKRmQ3r7mfAErbnmUUDHKqsQO/Fk5/31ewa/cSfmKWCuPlPX72 O7Hw9aWEH4pnsjXDEPivL7d5JyCcCm+DM2UavDm+K1RjzczRTGMW1HmW6lkkJphbJMEf IS4tw+/hqcS3sRYEsdbhtvJbkKwejI6ptHKupa/kZVKK+a0da+Ld+a5OpzA3gzDlF6Fk qzpw== X-Forwarded-Encrypted: i=1; AJvYcCW7u5hZhsSXoXih/FEkz6Ww0pr89HKwROvEaL8YFaVAEkR4Lw5bPi8qRC0+MIXwBmSThhjj00XL2Q==@kvack.org X-Gm-Message-State: AOJu0YwTccC68fxQ5biIZJedv87k+ZW5Jx41sHA4TmTb4JWfZSSve5cP lpMlzPpTM41IXiQvO1TCsctlsrLQPDwqErc09U05g11xsIJdIlFqSPOSsd+CBqk= X-Google-Smtp-Source: AGHT+IGHHVaBDfu7IHVrU1p7r2GdHKFQUmAh5YFVoOfSzfC5RVAziHBMEMJBUvybKuiAcnw/DiY4EA== X-Received: by 2002:a05:600c:1d02:b0:42c:b68f:38fb with SMTP id 5b1f17b1804b1-42e74417444mr9041015e9.7.1726721169038; Wed, 18 Sep 2024 21:46:09 -0700 (PDT) Received: from [192.168.0.216] ([185.44.53.103]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42e75445a98sm10823905e9.28.2024.09.18.21.46.06 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 18 Sep 2024 21:46:07 -0700 (PDT) Message-ID: <9e62898f-907e-439f-96f3-de2e29f57e37@kernel.dk> Date: Wed, 18 Sep 2024 22:46:06 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards) To: Matthew Wilcox Cc: Linus Torvalds , Dave Chinner , Chris Mason , Christian Theune , linux-mm@kvack.org, "linux-xfs@vger.kernel.org" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Dao , regressions@lists.linux.dev, regressions@leemhuis.info References: <52d45d22-e108-400e-a63f-f50ef1a0ae1a@meta.com> <5bee194c-9cd3-47e7-919b-9f352441f855@kernel.dk> <459beb1c-defd-4836-952c-589203b7005c@meta.com> <8697e349-d22f-43a0-8469-beb857eb44a1@kernel.dk> Content-Language: en-US From: Jens Axboe In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: BF4B9120007 X-Stat-Signature: thzk3of8n9gtn7x1w7wq7yjmury14mpo X-Rspam-User: X-HE-Tag: 1726721170-343134 X-HE-Meta: U2FsdGVkX19xmkkcXW1tAyWeNp51zczoqZOBI/B+HkyddQfEEHuL98XY3B0EG2ucMX/JnKks8Yx9vWQyqKl/t+7WK0SYlLY7/05p6+xGgG7yTnEijNE1Adr4JbKgN4wFnrj5fgsyX3MtpirMUMAO13UWyWwRcpU/n30zXGmNgP9JRJFeY52UnKrqAP2maY0uB/s0DcbSM7GMqy1cAVasP7FuQst+XDKAQw6cLxqUtIFBBdqRI/VSpoBgIBQj/dOlq6sW/1NWxKhwuM9Ddsi9pEVvutAQpqaNzEBHrgWvo7tp4m9pfAtp1DSAP5yNSS5WF6ltyQiEL15QvGKmBZJl2umfcM8hWvazmxoXrEuc4bHuzQ4mm8wPOu2m+kK9frgBErWP4AIOY2OcO+X8dr+vAHb2HCauE7e7viClRHRsWmUY9BEIzfZF6Kw+/W6aLAmZBtOlbcGGDmwOA1P1ccTw/BGdISy+/PNW19Gg9Xraq35NBhTH8eJDIbDo35ApbTuoaT/9PQnYSfnguOXOVQ79H4kQ3aeu1YqEakx3Y1m2zhHAndfalRwBnrEhmrCTeaNHGGXuxRhYEa+7iJVuSWMFxk184XVn5eHvl4ZGqc29Uvq89UEKqnz4kZN9ZUkL8puAsGJW7x8fFtl5CZZ1dOK5U6SwKJy5sJNuJiveNoFfYl+vFRBBP+KXIknEoVIBh6XhwUwAzzzjCZBeyLHaIl2dBa5kPgWZJsAahQVxxKQGsXVWpYzF8nOoKHaS2y7ypshwRQIlcJSisBFhk1PPh51j/sgP+XBngEOjkcM43WLvsqivKnS9TB3jf1IqMSXnXRjcFXJubnrHdgXBR2u2UgCqQ+c1i4OgPvn5OHtO/gTAk5QVv/v6fedJgisx/2tqoOBmHi+XPNsHIIuBl0bmHtU/QSJXws5Tkhd9wG43ysvLGxFzJJbvNJfYVKgWv0PuPT5pQMy/d3f/MZOMV5hoZ18 pvU9XLn9 DwmUDXr/TcXfrM9M2gCGDUkJF0zqjkPc7wGrw0whjEg+RRR5Q0Z+eXTfYGhDj2xNue/bOQBXkB0lJA8S8kW44JI/KY0sgukXN1jPHQge8dUSf07tk/p5r4SJ9waRJ4F00i0cm/EMwS3HYYDFZAkM4ru2ObugoRIBBccukisQ5F3e7qfZ55Nk+Gi+MxULCZKsecp2/dBM3KUe0pxe2rRnhwunKsPjGMMT58t6hEEXguGdmpkxgkmMZEaOZwytpCwZglWfO00xM0hyQuAdIijOPVIhn/KVwdzw8piApBxJBnvKSrCZ4OJ24EVhvxSWLDq5qK3CrClI2H97xBjIZuKg4k73AB/PYtTAJr4mPftX3JS9E+60h8e7FG7lEYnU1APjZoo7NycLNTaR6TCPJfl1cLJih4+9cTUGtHuXTttmeMj31LIHhAGYn2j2S1MzBwSubwQZJYeadrylsU9M= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 9/18/24 10:36 PM, Matthew Wilcox wrote: > On Wed, Sep 18, 2024 at 09:38:41PM -0600, Jens Axboe wrote: >> On 9/18/24 9:12 PM, Linus Torvalds wrote: >>> On Thu, 19 Sept 2024 at 05:03, Linus Torvalds >>> wrote: >>>> >>>> I think we should just do the simple one-liner of adding a >>>> "xas_reset()" to after doing xas_split_alloc() (or do it inside the >>>> xas_split_alloc()). >>> >>> .. and obviously that should be actually *verified* to fix the issue >>> not just with the test-case that Chris and Jens have been using, but >>> on Christian's real PostgreSQL load. >>> >>> Christian? >>> >>> Note that the xas_reset() needs to be done after the check for errors >>> - or like Willy suggested, xas_split_alloc() needs to be re-organized. >>> >>> So the simplest fix is probably to just add a >>> >>> if (xas_error(&xas)) >>> goto error; >>> } >>> + xas_reset(&xas); >>> xas_lock_irq(&xas); >>> xas_for_each_conflict(&xas, entry) { >>> old = entry; >>> >>> in __filemap_add_folio() in mm/filemap.c >>> >>> (The above is obviously a whitespace-damaged pseudo-patch for the >>> pre-6758c1128ceb state. I don't actually carry a stable tree around on >>> my laptop, but I hope it's clear enough what I'm rambling about) >> >> I kicked off a quick run with this on 6.9 with my debug patch as well, >> and it still fails for me... I'll double check everything is sane. For >> reference, below is the 6.9 filemap patch. >> >> diff --git a/mm/filemap.c b/mm/filemap.c >> index 30de18c4fd28..88093e2b7256 100644 >> --- a/mm/filemap.c >> +++ b/mm/filemap.c >> @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping, >> if (order > folio_order(folio)) >> xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index), >> order, gfp); >> + xas_reset(&xas); >> xas_lock_irq(&xas); >> xas_for_each_conflict(&xas, entry) { >> old = entry; > > My brain is still mushy, but I think there is still a problem (both with > the simple fix for 6.9 and indeed with 6.10). > > For splitting a folio, we have the folio locked, so we know it's not > going anywhere. The tree may get rearranged around it while we don't > have the xa_lock, but we're somewhat protected. > > In this case we're splitting something that was, at one point, a shadow > entry. There's no struct there to lock. So I think we can have a > situation where we replicate 'old' (in 6.10) or xa_load() (in 6.9) > into the nodes we allocate in xas_split_alloc(). In 6.10, that's at > least guaranteed to be a shadow entry, but in 6.9, it might already be a > folio by this point because we've raced with something else also doing a > split. > > Probably xas_split_alloc() needs to just do the alloc, like the name > says, and drop the 'entry' argument. ICBW, but I think it explains > what you're seeing? Maybe it doesn't? Since I can hit it pretty reliably and quickly, I'm happy to test whatever you want on top of 6.9. From the other email, I backported: a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order") 6758c1128ceb ("mm/filemap: optimize filemap folio adding") to 6.9 and kicked off a test with that 5 min ago, and it's still going. I'd say with 90% confidence that it should've hit already, but let's leave it churning for an hour and see what pops out the other end. -- Jens Axboe