From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9f722126-3212-4e3c-9e08-bad44fe9b590@arm.com>
Date: Tue, 6 Aug 2024 16:29:34 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: shmem folio changes have broken linux-next
Content-Language: en-GB
To: Matthew Wilcox
Cc: Mark Brown, Linux-MM
References: <234b0b85-cbf1-4af4-85c9-ac0d00ceeed2@arm.com>
From: Ryan Roberts
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 06/08/2024 16:16, Matthew Wilcox wrote:
> On Tue, Aug 06, 2024 at 09:47:19AM +0100, Ryan Roberts wrote:
>> Our CI is reporting an oops during boot on linux-next (next-20240806) on arm64. Bisect tells me that it is due to your commit cdc4ad36a871b ("fs: Convert aops->write_begin to take a folio"), but there is no link to a mail thread on the patch and I can't find it in lore.
>
> You're looking in the wrong place ;-)
>
> https://lore.kernel.org/linux-fsdevel/20240717154716.237943-22-willy@infradead.org/#Z31mm:shmem.c

Ahha, cheers!

>
>> Anyway, I believe the issue is that you are doing this in shmem_write_begin():
>>
>>     if (folio_test_has_hwpoisoned(folio)) {
>>
>> But folio could be small and I think that function is only safe for large folios? (AFAICT it is unconditionally looking at the flags in the second page?).
>>
>> Elsewhere in the file, this pattern is used:
>>
>>     if (folio_test_hwpoison(folio) ||
>>         (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
>
> Ugh. The hwpoison stuff is too complicated. Because that's wrong too.
> It should be ...
>
>     if (folio_test_large(folio) && folio_test_has_hwpoisoned(folio) ||
>         !folio_test_large(folio) && folio_test_hwpoison(folio))

Err... I clearly don't understand it properly. I guess you sometimes want to know if any page in the folio is poisoned, and sometimes if a specific page is poisoned? So:

// returns true if any page in the folio is hwpoisoned.
// works for any folio (large or small).
folio_test_hwpoison(folio);

// returns true if the page at index within folio is hwpoisoned.
// works for any folio (large or small).
// BUGs if index out of range.
folio_test_hwpoison_page(folio, index);

Why isn't this the right interface? Why do we have a function that takes a folio but is only correct to call if the folio is large?

>
> right? But that's a mouthful to write. I'm tempted to rip it all out
> and start again ...
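
For what it's worth, the mouthful could hide behind a pair of helpers. Below is a rough userspace model in C, not kernel code: struct model_folio and the model_* functions are hypothetical names invented for illustration, and the bool fields merely stand in for PG_hwpoison and PG_has_hwpoisoned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these types or helpers exist in the kernel. */
struct model_folio {
	size_t nr_pages;           /* 1 for a small folio, >1 for a large one */
	bool has_hwpoisoned;       /* stands in for PG_has_hwpoisoned; only meaningful when large */
	const bool *page_hwpoison; /* stands in for per-page PG_hwpoison, nr_pages entries */
};

static bool model_folio_is_large(const struct model_folio *f)
{
	return f->nr_pages > 1;
}

/*
 * True if any page in the folio is poisoned. Mirrors the expression
 * "large && has_hwpoisoned || !large && hwpoison" from the thread, so it
 * is safe to call on both small and large folios.
 */
static bool model_folio_any_hwpoison(const struct model_folio *f)
{
	if (model_folio_is_large(f))
		return f->has_hwpoisoned;
	return f->page_hwpoison[0];
}

/* True if the page at index is poisoned; asserts if index is out of range. */
static bool model_folio_page_hwpoison(const struct model_folio *f, size_t index)
{
	assert(index < f->nr_pages);
	return f->page_hwpoison[index];
}
```

With something of this shape, a caller such as shmem_write_begin() would make one call and would not need to know whether the folio is large.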
>
>> Here is the oops (pretty much as soon as we get into user space):
>>
>> [ 0.623253] page: refcount:3 mapcount:0 mapping:00000000eebcb8cf index:0x0 pfn:0x18cc07
>> [ 0.624212] memcg:ffff000142023000
>> [ 0.624617] aops:shmem_aops ino:800 dentry name:"memfd:snapd-env-generator"
>> [ 0.625444] flags: 0xbfffe0000040005(locked|referenced|swapbacked|node=0|zone=2|lastcpupid=0x1ffff)
>> [ 0.626532] raw: 0bfffe0000040005 0000000000000000 dead000000000122 ffff000181dd0ac0
>> [ 0.627442] raw: 0000000000000000 0000000000000000 00000003ffffffff ffff000142023000
>> [ 0.628331] page dumped because: VM_BUG_ON_PAGE(n > 0 && !((__builtin_constant_p(PG_head) && __builtin_constant_p((uintptr_t)(&page->flags) != (uintptr_t)((void *)0)) && (uintptr_t)(&page->flags) != (uintptr_t)((void *)0) && __builtin_constant_p(*(const unsigned long *)(&page->flags))) ? const_test_bit(PG_head, &page->flags) : generic_test_bit(PG_head, &page->flags)))
>> [ 0.632106] ------------[ cut here ]------------
>> [ 0.632630] kernel BUG at include/linux/page-flags.h:308!
>
> I'm glad I made it so noisy instead of silently checking something
> that's not the flag we thought it was ...
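
To spell out my reading of why this fires for a small folio: the check named in the dump refuses to look at the flags of a page other than the first (n > 0) unless the first page has PG_head set, and the flags line above shows no head bit. A simplified userspace model of that guard follows. The names (model_page, MODEL_PG_*, model_* functions) are hypothetical and do not exist in the kernel, and assert() stands in for VM_BUG_ON_PAGE().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
enum {
	MODEL_PG_HEAD           = 1 << 0,
	MODEL_PG_HAS_HWPOISONED = 1 << 1,
};

struct model_page {
	unsigned int flags;
};

/* Flags of page n may only be read if n == 0 or the first page is a head page. */
static bool model_flags_slot_valid(const struct model_page *pages, unsigned int n)
{
	return n == 0 || (pages[0].flags & MODEL_PG_HEAD) != 0;
}

/* Unconditional test: trips the assertion when called on a small folio. */
static bool model_test_has_hwpoisoned(const struct model_page *pages)
{
	assert(model_flags_slot_valid(pages, 1)); /* models the VM_BUG_ON_PAGE */
	return (pages[1].flags & MODEL_PG_HAS_HWPOISONED) != 0;
}

/* Guarded test: only consults the second page when the folio is large. */
static bool model_safe_has_hwpoisoned(const struct model_page *pages)
{
	return (pages[0].flags & MODEL_PG_HEAD) != 0 &&
	       model_test_has_hwpoisoned(pages);
}
```

In this model, calling model_test_has_hwpoisoned() on a single-page folio aborts, which is the noisy failure described above, while the guarded variant simply returns false.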