Date: Thu, 27 Mar 2025 08:48:49 +1100
From: Dave Chinner
To: Matthew Wilcox
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, "Theodore Y. Ts'o", Chris Mason,
    Josef Bacik, Luis Chamberlain
Subject: Re: [LSF/MM/BPF Topic] Filesystem reclaim & memory allocation BOF

On Wed, Mar 26, 2025 at 03:25:07PM +0000, Matthew Wilcox wrote:
>
> We've got three reports now (two are syzkaller kiddie stuff, but one's a
> real workload) of a warning in the page allocator from filesystems
> doing reclaim.  Essentially they're using GFP_NOFAIL from reclaim
> context.  This got me thinking about bs>PS and I realised that if we fix
> this, then we're going to end up trying to do high order GFP_NOFAIL allocations
> in the memory reclaim path, and that is really no bueno.
>
> https://lore.kernel.org/linux-mm/20250326105914.3803197-1-matt@readmodwrite.com/

Anything that does IO or blocking memory allocation from evict()
context is a deadlock vector. They will also cause unpredictable
memory allocation latency as direct reclaim can get stuck on them.

The case that was brought up here is overlay dropping the last
reference to an inode from dentry cache reclaim, and that inode
having evict() run on it.

The filesystems then make journal reservations (which can block
waiting on IO), memory allocation (which can block waiting on IO
and/or direct memory reclaim stalling), do IO directly from that
context, etc.

Memory reclaim is supposed to be a non-blocking operation, so inode
reclaim really needs to avoid blocking or doing complex stuff that
requires memory allocation or IO in the direct evict() path.

Indeed, people spent -years- complaining that XFS did IO from
evict() context from direct memory reclaim because this caused
unacceptable memory allocation latency variations. It required
significant architectural changes to XFS inode journalling and
writeback to avoid blocking RMW IO during inode reclaim. It's also
one of the driving reasons for XFS aggressively pushing *any*
XFS-specific inode reclaim work that could block to background
inodegc workers that run after ->destroy_inode has removed the inode
from VFS visibility.
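
To make the shape of that deferral concrete, here is a minimal
userspace sketch in Python. It is a toy model only, not kernel or XFS
code: ToyInode, ToyInodeGC and every method on them are hypothetical
names invented for illustration, and the sleep merely stands in for
work that could block (journal reservations, IO). The point it shows
is the split: the reclaim-side call only hides the inode and queues
it, while a background worker does the part that may block.

```python
import queue
import threading
import time


class ToyInode:
    """Hypothetical stand-in for an in-memory inode (not a kernel struct)."""

    def __init__(self, ino):
        self.ino = ino
        self.visible = True    # still reachable through the toy "VFS"
        self.cleaned = False   # fs-specific teardown has finished


class ToyInodeGC:
    """Toy model of a background inode GC worker (assumed design)."""

    def __init__(self):
        self._pending = queue.Queue()  # unbounded, so put() never waits for space
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def destroy_inode(self, inode):
        """Reclaim-side path: hide the inode and hand it off, nothing else."""
        inode.visible = False
        self._pending.put(inode)

    def _run(self):
        # Background context: blocking here does not stall the reclaim path.
        while True:
            inode = self._pending.get()
            self._blocking_cleanup(inode)
            self._pending.task_done()

    def _blocking_cleanup(self, inode):
        time.sleep(0.01)       # stands in for journal reservation / IO waits
        inode.cleaned = True

    def flush(self):
        """Wait until every queued inode has been cleaned up."""
        self._pending.join()


gc = ToyInodeGC()
inodes = [ToyInode(i) for i in range(4)]
for inode in inodes:
    gc.destroy_inode(inode)    # returns without waiting for cleanup
gc.flush()
```

A real implementation has constraints this sketch ignores. In
particular, the hand-off itself must not allocate memory or sleep in
reclaim context, which a Python queue makes no attempt to guarantee.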

As I understand it, Josef's recent inode reference counting changes
will help with this, allowing the filesystem to hold a passive
reference to the inode whilst it gets pushed to a background
context where the fs-specific cleanup code is allowed to block. This
is probably the direction we need to head to solve this problem in a
generic manner....

-Dave.
-- 
Dave Chinner
david@fromorbit.com