Date: Tue, 17 Sep 2024 10:32:04 +0100
From: Matthew Wilcox
To: Chris Mason
Cc: Linus Torvalds, Dave Chinner, Jens Axboe, Christian Theune,
	linux-mm@kvack.org, "linux-xfs@vger.kernel.org",
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Daniel Dao, regressions@lists.linux.dev, regressions@leemhuis.info
Subject: Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)
References: <0fc8c3e7-e5d2-40db-8661-8c7199f84e43@kernel.dk> <74cceb67-2e71-455f-a4d4-6c5185ef775b@meta.com>
In-Reply-To: <74cceb67-2e71-455f-a4d4-6c5185ef775b@meta.com>

On Mon, Sep 16, 2024 at 10:47:10AM +0200,
Chris Mason wrote:
> I've got a bunch of assertions around incorrect folio->mapping and I'm
> trying to bash on the ENOMEM for readahead case.  There's a GFP_NOWARN
> on those, and our systems do run pretty short on ram, so it feels right
> at least.  We'll see.

I've been running with some variant of this patch the whole way across
the Atlantic, and not hit any problems.  But maybe with the right
workload ...?

There are two things being tested here.  One is whether we have a
cross-linked node (ie a node that's in two trees at the same time).
The other is whether the slab allocator is giving us a node that already
contains non-NULL entries.

If you could throw this on top of your kernel, we might stand a chance
of catching the problem sooner.  If it is one of these problems and not
something weirder.

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 0b618ec04115..006556605eb3 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1179,6 +1179,8 @@ struct xa_node {
 
 void xa_dump(const struct xarray *);
 void xa_dump_node(const struct xa_node *);
+void xa_dump_index(unsigned long index, unsigned int shift);
+void xa_dump_entry(const void *entry, unsigned long index, unsigned long shift);
 
 #ifdef XA_DEBUG
 #define XA_BUG_ON(xa, x) do {					\
diff --git a/lib/xarray.c b/lib/xarray.c
index 32d4bac8c94c..6bb35bdca30e 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -6,6 +6,8 @@
  * Author: Matthew Wilcox
  */
 
+#define XA_DEBUG
+
 #include
 #include
 #include
@@ -206,6 +208,7 @@ static __always_inline void *xas_descend(struct xa_state *xas,
 	unsigned int offset = get_offset(xas->xa_index, node);
 	void *entry = xa_entry(xas->xa, node, offset);
 
+	XA_NODE_BUG_ON(node, node->array != xas->xa);
 	xas->xa_node = node;
 	while (xa_is_sibling(entry)) {
 		offset = xa_to_sibling(entry);
@@ -309,6 +312,7 @@ bool xas_nomem(struct xa_state *xas, gfp_t gfp)
 		return false;
 	xas->xa_alloc->parent = NULL;
 	XA_NODE_BUG_ON(xas->xa_alloc, !list_empty(&xas->xa_alloc->private_list));
+	XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
 	xas->xa_node = XAS_RESTART;
 	return true;
 }
@@ -345,6 +349,7 @@ static bool __xas_nomem(struct xa_state *xas, gfp_t gfp)
 		return false;
 	xas->xa_alloc->parent = NULL;
 	XA_NODE_BUG_ON(xas->xa_alloc, !list_empty(&xas->xa_alloc->private_list));
+	XA_NODE_BUG_ON(xas->xa_alloc, memchr_inv(&xas->xa_alloc->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
 	xas->xa_node = XAS_RESTART;
 	return true;
 }
@@ -388,6 +393,7 @@ static void *xas_alloc(struct xa_state *xas, unsigned int shift)
 	}
 	XA_NODE_BUG_ON(node, shift > BITS_PER_LONG);
 	XA_NODE_BUG_ON(node, !list_empty(&node->private_list));
+	XA_NODE_BUG_ON(node, memchr_inv(&node->slots, 0, sizeof(void *) * XA_CHUNK_SIZE));
 	node->shift = shift;
 	node->count = 0;
 	node->nr_values = 0;