Date: Wed, 6 Dec 2023 19:16:04 +1100
From: Dave Chinner
To: Roman Gushchin
Cc: Kent Overstreet, Qi Zheng, Michal Hocko, Muchun Song, Linux-MM,
    linux-kernel@vger.kernel.org, Andrew Morton
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
References: <20231128035345.5c7yc7jnautjpfoc@moria.home.lan>
    <20231129231147.7msiocerq7phxnyu@moria.home.lan>
    <04f63966-af72-43ef-a65c-ff927064a3e4@bytedance.com>
    <20231130032149.ynap4ai47dj62fy3@moria.home.lan>

On Fri, Dec 01, 2023 at 12:01:33PM -0800, Roman Gushchin wrote:
> On Fri, Dec 01, 2023 at 12:18:44PM +1100, Dave Chinner wrote:
> > On Thu, Nov 30, 2023 at 11:01:23AM -0800, Roman Gushchin wrote:
> > > On Wed, Nov 29, 2023 at 10:21:49PM -0500, Kent Overstreet
wrote:
> > > > On Thu, Nov 30, 2023 at 11:09:42AM +0800, Qi Zheng wrote:
> > > > > For non-bcachefs developers, who knows what those statistics mean?
> > >
> > > Ok, a simple question then:
> > > why can't you dump /proc/slabinfo after the OOM?
> >
> > Taken to its logical conclusion, we arrive at:
> >
> > 	OOM-kill doesn't need to output anything at all except for
> > 	what it killed because we can dump
> > 	/proc/{mem,zone,vmalloc,buddy,slab}info after the OOM....
> >
> > As it is, even asking such a question shows that you haven't looked
> > at the OOM kill output for a long time - it already reports the slab
> > cache usage information for caches that are reclaimable.
> >
> > That is, if too much accounted slab cache based memory consumption
> > is detected at OOM-kill, it will call dump_unreclaimable_slab() to
> > dump all the SLAB_RECLAIM_ACCOUNT caches (i.e. those with shrinkers)
> > to the console as part of the OOM-kill output.
>
> You are right, I missed that, partially because most of the OOMs I had
> to deal with recently were memcg OOMs.
>
> This changes my perspective on Kent's patches: if we dump this information
> already, it might not be a bad idea to do it nicer. So I take my words back
> here.
>
> >
> > The problem Kent is trying to address is that this output *isn't
> > sufficient to debug shrinker based memory reclaim issues*. It hasn't
> > been for a long time, and so we've all got our own special debug
> > patches and methods for checking that shrinkers are doing what they
> > are supposed to. Kent is trying to formalise one of the more useful
> > general methods for exposing that internal information when OOM
> > occurs...
> >
> > Indeed, I can think of several uses for a shrinker->to_text() output
> > that we simply cannot do right now.
> >
> > Any shrinker that does garbage collection on something that is not a
> > pure slab cache (e.g.
xfs buffer cache, xfs inode gc subsystem,
> > graphics memory allocators, binder, etc) has no visibility of the
> > actual memory being used by the subsystem in the OOM-kill output.
> > This information isn't in /proc/slabinfo, it's not accounted by a
> > SLAB_RECLAIM_ACCOUNT cache, and it's not accounted by anything in
> > the core mm statistics.
> >
> > e.g. How does anyone other than an XFS expert know that the 500k of
> > active xfs_buf handles in the slab cache actually pins 15GB of
> > cached metadata allocated directly from the page allocator, not just
> > the 150MB of slab cache the handles take up?
> >
> > Another example is that an inode can pin lots of heap memory (e.g.
> > for in-memory extent lists) and that may not be freeable until the
> > inode is reclaimed. So while the slab cache might not be excessively
> > large, we might have a million inodes with a billion cumulative
> > extents cached in memory and it is the heap memory consumed by the
> > cached extents that is consuming the 30GB of "missing" kernel memory
> > that is causing OOM-kills to occur.
> >
> > How is a user or developer supposed to know when one of these
> > situations has occurred given the current lack of memory usage
> > introspection into subsystems?
>
> What would be the proper solution to this problem from your point of view?
> What functionality/API can mm provide to make the life of fs developers
> better here?

What can we do better?

The first thing we can do better that comes to mind is to merge
Kent's patches that allow the shrinker owner to output debug
information when requested by the infrastructure.

Then we - the shrinker implementers - have some control of our own
destiny. We can add whatever we need to solve shrinker and OOM
problems related to our shrinkers not doing the right thing.

But without that callout from the infrastructure, and the
infrastructure to drive it at appropriate times, we will make zero
progress improving the situation.
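
To make the idea of such a callout concrete, here is a minimal
userspace sketch of what a per-shrinker text report could look like.
It is not Kent's patch and it does not use the kernel's struct
shrinker: every name in it (demo_shrinker, demo_buf_cache,
demo_buf_cache_to_text, demo_report_shrinkers) is hypothetical, and
the callback signature is an assumption made only for illustration.
The point it shows is that the subsystem owning the shrinker can
report memory it pins outside the slab cache, which a slab-only dump
cannot see.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical userspace model of a shrinker with a text callout.
 * This is NOT the kernel's struct shrinker. */
struct demo_shrinker {
	const char *name;
	/* Formats a report into buf; returns snprintf()'s result. */
	int (*to_text)(const struct demo_shrinker *s, char *buf, size_t len);
	void *private_data;
};

/* Hypothetical subsystem state: small handles that pin large buffers. */
struct demo_buf_cache {
	unsigned long long handles;      /* objects visible in the slab cache */
	unsigned long long handle_bytes; /* size of one handle */
	unsigned long long pinned_bytes; /* memory held outside the slab */
};

/* Subsystem-owned callout: only the owner knows about pinned_bytes. */
int demo_buf_cache_to_text(const struct demo_shrinker *s, char *buf, size_t len)
{
	const struct demo_buf_cache *c = s->private_data;

	return snprintf(buf, len,
			"%s: handles=%llu slab_bytes=%llu pinned_bytes=%llu\n",
			s->name, c->handles,
			c->handles * c->handle_bytes, c->pinned_bytes);
}

/* Stand-in for the infrastructure side: ask each shrinker for its report. */
void demo_report_shrinkers(const struct demo_shrinker *const *list, size_t n,
			   FILE *out)
{
	char buf[256];

	for (size_t i = 0; i < n; i++) {
		if (!list[i]->to_text)
			continue; /* shrinker provides no report */
		list[i]->to_text(list[i], buf, sizeof(buf));
		fputs(buf, out);
	}
}
```

Filling demo_buf_cache with the illustrative figures quoted above
(500k handles, roughly 300 bytes each, 15GB pinned) would make the
report show about 150MB of slab next to 15GB of pinned memory, which
is the gap the quoted xfs_buf example describes.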
Yes, the code may not be perfect and, yes, it may not be useful to
mm developers, but for the people who have to debug shrinker related
problems in production systems we need all the help we can get. We
certainly don't care if it isn't perfect, just having something we
can partially tailor to our individual needs is far, far better than
the current situation of nothing at all...

-Dave.
-- 
Dave Chinner
david@fromorbit.com