Date: Wed, 29 Nov 2023 10:14:54 +0100
From: Michal Hocko
To: Roman Gushchin
Cc: Qi Zheng, Kent Overstreet, Muchun Song, Linux-MM, linux-kernel@vger.kernel.org, Andrew Morton, Dave Chinner
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
References: <20231122232515.177833-1-kent.overstreet@linux.dev> <20231122232515.177833-3-kent.overstreet@linux.dev> <20231123212411.s6r5ekvkklvhwfra@moria.home.lan> <4caadff7-1df0-45cc-9d43-e616f9e4ddb3@bytedance.com> <20231125003009.tbaxuquny43uwei3@moria.home.lan> <76A1EE85-B62C-49B3-889C-80F9A2A88040@linux.dev> <20231128035345.5c7yc7jnautjpfoc@moria.home.lan>

On Tue 28-11-23 16:34:35, Roman Gushchin wrote:
> On Tue, Nov 28, 2023 at 02:23:36PM +0800, Qi Zheng wrote:
[...]
> > Now I think adding this method might not be a good idea. If we allow
> > shrinkers to report their own private information, OOM logs may become
> > cluttered. Most people only care about some general information when
> > troubleshooting OOM problem, but not the private information of a
> > shrinker.
>
> I agree with that.
>
> It seems that the feature is mostly useful for kernel developers and it's easily
> achievable by attaching a bpf program to the oom handler. If it requires a bit
> of work on the bpf side, we can do that instead, but probably not. And this
> solution can potentially provide way more information in a more flexible way.
>
> So I'm not convinced it's a good idea to make the generic oom handling code
> more complicated and fragile for everybody, as well as making oom reports differ
> more between kernel versions and configurations.

Completely agreed! From my many years of experience analysing oom
reports from production systems I would conclude the following
categories:
  - clear runaways (and/or memory leaks)
    - userspace consumers - either shmem or anonymous memory
      predominantly consumes the memory, swap is either depleted or not
      configured. The OOM report is usually useful to pinpoint those as
      we have the required counters available.
    - kernel memory consumers - if we are lucky they are using the slab
      allocator and unreclaimable slab is a huge part of the memory
      consumption. If this is a page allocator user, the oom report
      only helps to deduce the fact by looking at how much user + slab
      + page tables etc. account for. But identifying the root cause is
      close to impossible without something like page_owner or a crash
      dump.
  - misbehaving memory reclaim
    - a minority of issues, and the oom report is usually insufficient
      to drill down to the root cause. If the problem is reproducible
      then collecting vmstat data can give a much better clue.
    - a high number of slab reclaimable objects or free swap are good
      indicators. Shrinkers data could be potentially helpful in the
      slab case but I really have a hard time remembering any such
      situation.

On non-production systems the situation is quite different. I can see
how it could be very beneficial to add very specific debugging data for
a subsystem/shrinker which is being developed and could cause the OOM.
For that purpose the proposed scheme is rather inflexible AFAICS.

-- 
Michal Hocko
SUSE Labs
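
To illustrate the "user + slab + page table" accounting mentioned for the
page allocator case, here is a minimal sketch. It is only an approximation
under stated assumptions: the counter names mirror fields commonly printed
in the Mem-Info block of an OOM report, the function name `account` and the
key groupings are hypothetical, and all numbers are made up for the example.
A large "unaccounted" remainder merely hints at a direct page allocator
user; it does not identify one.

```python
# Rough accounting of OOM report page counters (all values in pages).
# Counter names and groupings are assumptions made for illustration.

USER_KEYS = ("active_anon", "inactive_anon", "active_file",
             "inactive_file", "unevictable")
KERNEL_KEYS = ("slab_reclaimable", "slab_unreclaimable", "pagetables")


def account(total_pages, counters):
    """Split total memory into user, tracked kernel, free and the rest."""
    user = sum(counters.get(k, 0) for k in USER_KEYS)
    kernel = sum(counters.get(k, 0) for k in KERNEL_KEYS)
    free = counters.get("free", 0)
    # Whatever is left is not explained by the counters above.
    unaccounted = total_pages - free - user - kernel
    return {"user": user, "kernel": kernel,
            "free": free, "unaccounted": unaccounted}


# Hypothetical numbers, not taken from any real report.
report = account(1_000_000, {
    "active_anon": 50_000, "inactive_anon": 30_000,
    "active_file": 10_000, "inactive_file": 10_000, "unevictable": 0,
    "slab_reclaimable": 20_000, "slab_unreclaimable": 40_000,
    "pagetables": 5_000, "free": 15_000,
})
print(report)
```

In this made-up case most of the memory is unaccounted for, which is the
situation where something like page_owner or a crash dump is needed to go
further.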