From: Roman Gushchin
To: linux-mm@kvack.org
Cc: Andrew Morton, Dave Chinner, linux-kernel@vger.kernel.org, Johannes Weiner, Michal Hocko, Shakeel Butt, Yang Shi, Roman Gushchin
Subject: [PATCH rfc 0/5] mm: introduce shrinker sysfs interface
Date: Fri, 15 Apr 2022 17:27:51 -0700
Message-Id: <20220416002756.4087977-1-roman.gushchin@linux.dev>

There are 50+ different shrinkers in the kernel, many with their own bells and
whistles. Under memory pressure the kernel applies some pressure on each of
them, in the order in which they were created/registered in the system. Some of
them may contain only a few objects, some can be quite large. Some can be
effective at reclaiming memory, some not.

The only existing debugging mechanism is a couple of tracepoints in
do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They don't
cover everything, though: shrinkers which report 0 objects never show up, and
there is no support for memcg-aware shrinkers. Shrinkers are identified by
their scan function, which is not always enough (e.g. it is hard to guess which
super block's shrinker it is given only "super_cache_scan"). They are also a
passive mechanism: there is no way to call into counting and scanning of an
individual shrinker and profile it.

To provide better visibility and debugging options for memory shrinkers, this
patchset introduces a /sys/kernel/shrinker interface, to some extent similar to
/sys/kernel/slab.

For each shrinker registered in the system a directory is created. The
directory contains "count" and "scan" files, which allow triggering the
count_objects() and scan_objects() callbacks. For memcg-aware and numa-aware
shrinkers, count_memcg, scan_memcg, count_node, scan_node, count_memcg_node and
scan_memcg_node are additionally provided. They make it possible to get
per-memcg and/or per-node object counts and to shrink only a specific
memcg/node.

To make debugging more pleasant, the patchset also names all shrinkers, so that
sysfs entries can have more meaningful names.
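
For scripting against the layout described above, a userspace reader could look
roughly like the sketch below. It is only an illustration: the function name
read_shrinker_counts() is made up here, the base path is the one proposed in
this RFC, and the only assumption about the file contents is that "count"
holds a single integer.

```python
from pathlib import Path


def read_shrinker_counts(base="/sys/kernel/shrinker"):
    """Return {shrinker_directory_name: object_count}.

    Hypothetical helper: assumes one directory per shrinker under `base`,
    each holding a "count" file with a single integer, as described in
    this cover letter. Returns an empty dict if `base` does not exist.
    """
    base_path = Path(base)
    if not base_path.is_dir():
        return {}

    counts = {}
    for entry in sorted(base_path.iterdir()):
        count_file = entry / "count"
        if not count_file.is_file():
            continue  # not a shrinker directory, or no "count" file
        try:
            # Reading "count" is what triggers the count_objects() callback.
            counts[entry.name] = int(count_file.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable file or unexpected contents: skip it
    return counts
```

Sorting the result by value, e.g. sorted(counts.items(), key=lambda kv: kv[1]),
gives a quick view of which shrinkers hold the most objects.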

Usage examples:

1) List registered shrinkers:
  $ cd /sys/kernel/shrinker/
  $ ls
  dqcache-16          sb-cgroup2-30    sb-hugetlbfs-33  sb-proc-41       sb-selinuxfs-22  sb-tmpfs-40    sb-zsmalloc-19
  kfree_rcu-0         sb-configfs-23   sb-iomem-12      sb-proc-44       sb-sockfs-8      sb-tmpfs-42    shadow-18
  sb-aio-20           sb-dax-11        sb-mqueue-21     sb-proc-45       sb-sysfs-26      sb-tmpfs-43    thp_deferred_split-10
  sb-anon_inodefs-15  sb-debugfs-7     sb-nsfs-4        sb-proc-47       sb-tmpfs-1       sb-tmpfs-46    thp_zero-9
  sb-bdev-3           sb-devpts-28     sb-pipefs-14     sb-pstore-31     sb-tmpfs-27      sb-tmpfs-49    xfs_buf-37
  sb-bpf-32           sb-devtmpfs-5    sb-proc-25       sb-rootfs-2      sb-tmpfs-29      sb-tracefs-13  xfs_inodegc-38
  sb-btrfs-24         sb-hugetlbfs-17  sb-proc-39       sb-securityfs-6  sb-tmpfs-35      sb-xfs-36      zspool-34

2) Get information about a specific shrinker:
  $ cd sb-btrfs-24/
  $ ls
  count  count_memcg  count_memcg_node  count_node  scan  scan_memcg  scan_memcg_node  scan_node

3) Count objects on the system/root cgroup level
  $ cat count
  212

4) Count objects on the system/root cgroup level per numa node (on a 2-node machine)
  $ cat count_node
  209 3

5) Count objects for each memcg (output format: cgroup inode, count)
  $ cat count_memcg
  1 212
  20 96
  53 817
  2297 2
  218 13
  581 30
  911 124

6) Same but with a per-node output
  $ cat count_memcg_node
  1 209 3
  20 96 0
  53 810 7
  2297 2 0
  218 13 0
  581 30 0
  911 124 0

7) Don't display cgroups with less than 500 attached objects
  $ echo 500 > count_memcg
  $ cat count_memcg
  53 817
  1868 886
  2396 799
  2462 861

8) Don't display cgroups with less than 500 attached objects (sum over all nodes)
  $ echo "500" > count_memcg_node
  $ cat count_memcg_node
  53 810 7
  1868 886 0
  2396 799 0
  2462 861 0

9) Scan system/root shrinker
  $ cat count
  212
  $ echo 100 > scan
  $ cat scan
  97
  $ cat count
  115

10) Scan individual memcg
  $ echo "1868 500" > scan_memcg
  $ cat scan_memcg
  193

11) Scan individual node
  $ echo "1 200" > scan_node
  $ cat scan_node
  2

12) Scan individual memcg and node
  $ echo "1868 0 500" > scan_memcg_node
  $ cat scan_memcg_node
  435

If the output doesn't fit into a single page, "...\n" is printed at the end of
the output.

Roman Gushchin (5):
  mm: introduce sysfs interface for debugging kernel shrinker
  mm: memcontrol: introduce mem_cgroup_ino() and mem_cgroup_get_from_ino()
  mm: introduce memcg interfaces for shrinker sysfs
  mm: introduce numa interfaces for shrinker sysfs
  mm: provide shrinkers with names

 arch/x86/kvm/mmu/mmu.c                       |   2 +-
 drivers/android/binder_alloc.c               |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c |   3 +-
 drivers/gpu/drm/msm/msm_gem_shrinker.c       |   2 +-
 .../gpu/drm/panfrost/panfrost_gem_shrinker.c |   2 +-
 drivers/gpu/drm/ttm/ttm_pool.c               |   2 +-
 drivers/md/bcache/btree.c                    |   2 +-
 drivers/md/dm-bufio.c                        |   2 +-
 drivers/md/dm-zoned-metadata.c               |   2 +-
 drivers/md/raid5.c                           |   2 +-
 drivers/misc/vmw_balloon.c                   |   2 +-
 drivers/virtio/virtio_balloon.c              |   2 +-
 drivers/xen/xenbus/xenbus_probe_backend.c    |   2 +-
 fs/erofs/utils.c                             |   2 +-
 fs/ext4/extents_status.c                     |   3 +-
 fs/f2fs/super.c                              |   2 +-
 fs/gfs2/glock.c                              |   2 +-
 fs/gfs2/main.c                               |   2 +-
 fs/jbd2/journal.c                            |   2 +-
 fs/mbcache.c                                 |   2 +-
 fs/nfs/nfs42xattr.c                          |   7 +-
 fs/nfs/super.c                               |   2 +-
 fs/nfsd/filecache.c                          |   2 +-
 fs/nfsd/nfscache.c                           |   2 +-
 fs/quota/dquot.c                             |   2 +-
 fs/super.c                                   |   2 +-
 fs/ubifs/super.c                             |   2 +-
 fs/xfs/xfs_buf.c                             |   2 +-
 fs/xfs/xfs_icache.c                          |   2 +-
 fs/xfs/xfs_qm.c                              |   2 +-
 include/linux/memcontrol.h                   |   9 +
 include/linux/shrinker.h                     |  25 +-
 kernel/rcu/tree.c                            |   2 +-
 lib/Kconfig.debug                            |   9 +
 mm/Makefile                                  |   1 +
 mm/huge_memory.c                             |   4 +-
 mm/memcontrol.c                              |  23 +
 mm/shrinker_debug.c                          | 792 ++++++++++++++++++
 mm/vmscan.c                                  |  66 +-
 mm/workingset.c                              |   2 +-
 mm/zsmalloc.c                                |   2 +-
 net/sunrpc/auth.c                            |   2 +-
 42 files changed, 957 insertions(+), 47 deletions(-)
 create mode 100644 mm/shrinker_debug.c

-- 
2.35.1