Date: Mon, 21 Mar 2022 20:51:01 -0400
From: Kent Overstreet
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Subject: [LSF/MM TOPIC] Improving OOM debugging
Message-ID: <20220322005101.actefn6nttzeo2qr@moria.home.lan>

Frustration when debugging OOMs, memory usage, and memory reclaim behaviour
is a topic I think a lot of us can relate to. I think it might be worth
having a talk to collectively air our frustrations and collect ideas for
improvements.

To start with: on memory allocation failure or OOM, we currently don't have
a lot to go on. We get information about the allocation that failed, and
only very coarse-grained information about how memory is being tied up:
page-granular information (i.e. show_mem()) is nigh useless in most
situations, and slab-granular information is only slightly better.

I have a couple of ideas I want to float:

 - An old idea I've had and mentioned to some people before is to steal
   dynamic debug's trick of statically allocating tracking structs in a
   special ELF section, and use it to wrap kmalloc(), alloc_pages() etc.
   calls so that memory allocations are tracked _per call site_, with the
   results available in debugfs broken out by file and line number.

   This would be cheap enough that it could be always on in production,
   unlike doing the same sort of thing with tracepoints. The cost would be
   another pointer of overhead for each allocation. For page allocations
   we've got CONFIG_PAGE_OWNER, which does something like this (in a much
   more expensive fashion), and the pointer it uses could be repurposed.
   For slub/slab I think something analogous exists, but last I looked it'd
   probably need help from those developers (in both cases, really; mm code
   is hairy).

 - In bcachefs, I've been evolving a 'printbuf' thingy: heap-allocated
   strings that you can pass around and append to. They make it really
   convenient to write pretty-printers for lots of things and pass them
   around, which in turn has made my life considerably easier in the
   debugging realm.

   I think that could be useful here: on a typical system, shrinkers own a
   significant fraction of non-pagecache kernel memory, and each shrinker
   has internal state of its own that's relevant to how much memory is
   currently freeable (dirtiness, locking issues). Imagine if shrinkers all
   had .to_text() methods: on memory allocation failure we could call those
   and print them for the top 10 shrinkers by memory owned, in addition to
   sticking the output in sysfs or debugfs.
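
To make the per-call-site tracking idea concrete, here is a rough userspace
sketch of the ELF-section trick. It is not kernel code and not an existing
API: tracked_malloc(), tagged_alloc(), tagged_free(), struct alloc_tag and
the "alloc_tags" section name are all made up for illustration, and the
header placed in front of each object stands in for the extra pointer of
per-allocation overhead. It assumes GCC or Clang on an ELF target, where the
linker provides __start_/__stop_ symbols for the section.

```c
/* Userspace sketch for GCC/Clang on ELF targets; every name is hypothetical. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One statically allocated record per allocation call site. */
struct alloc_tag {
	const char *file;
	int line;
	size_t bytes;	/* bytes currently live from this call site */
	size_t calls;	/* allocations ever made from this call site */
};

/* Stored in front of each object so the free path can find the tag again. */
struct alloc_hdr {
	struct alloc_tag *tag;
	size_t size;
};

/*
 * GNU ld and lld define __start_<name>/__stop_<name> for a section whose
 * name is a valid C identifier, so the section can be walked as an array.
 */
extern struct alloc_tag __start_alloc_tags[];
extern struct alloc_tag __stop_alloc_tags[];

static void *tagged_alloc(struct alloc_tag *tag, size_t size)
{
	struct alloc_hdr *h = malloc(sizeof(*h) + size);

	if (!h)
		return NULL;
	h->tag = tag;
	h->size = size;
	tag->bytes += size;
	tag->calls++;
	return h + 1;
}

static void tagged_free(void *p)
{
	struct alloc_hdr *h;

	if (!p)
		return;
	h = (struct alloc_hdr *)p - 1;
	assert(h->tag->bytes >= h->size);
	h->tag->bytes -= h->size;
	free(h);
}

/*
 * Each expansion emits its own static tag into the "alloc_tags" section.
 * The explicit alignment keeps the compiler from padding between entries.
 */
#define tracked_malloc(size) ({						\
	static struct alloc_tag _tag __attribute__((used,		\
		section("alloc_tags"),					\
		aligned(__alignof__(struct alloc_tag)))) =		\
		{ .file = __FILE__, .line = __LINE__ };			\
	tagged_alloc(&_tag, (size));					\
})

/* Sum of live bytes over every call site in the program. */
static size_t total_tracked_bytes(void)
{
	size_t sum = 0;

	for (struct alloc_tag *t = __start_alloc_tags; t < __stop_alloc_tags; t++)
		sum += t->bytes;
	return sum;
}

/* Per-call-site report, broken out by file and line number. */
static void report(FILE *f)
{
	for (struct alloc_tag *t = __start_alloc_tags; t < __stop_alloc_tags; t++)
		fprintf(f, "%s:%d: %zu bytes live, %zu allocations\n",
			t->file, t->line, t->bytes, t->calls);
}
```

A caller would use tracked_malloc(n) in place of malloc(n), release with
tagged_free(), and call report(stderr) when an allocation fails. The program
needs at least one tracked_malloc() call site, otherwise the section does
not exist and the __start_/__stop_ symbols fail to link. The counters here
are not thread safe; a real version would presumably want per-CPU or atomic
counters.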
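
The printbuf and .to_text() idea can be sketched in userspace as well. This
is only an illustration under assumptions: it is not the bcachefs printbuf
code, and struct printbuf, pb_printf(), struct demo_shrinker and
report_top() are hypothetical names. Each demo shrinker supplies a callback
that appends its private state to a growable heap string, and the report
prints the top N shrinkers by memory owned.

```c
/* Userspace sketch; none of these names are real kernel or bcachefs APIs. */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Heap-allocated string that grows as text is appended. */
struct printbuf {
	char *buf;
	size_t len;	/* bytes used, not counting the trailing NUL */
	size_t cap;	/* bytes allocated */
};

static void pb_printf(struct printbuf *pb, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap);	/* measure before writing */
	va_end(ap);
	if (n < 0)
		return;

	if (pb->len + (size_t)n + 1 > pb->cap) {
		size_t cap = (pb->len + (size_t)n + 1) * 2;
		char *buf = realloc(pb->buf, cap);

		if (!buf)
			return;	/* drop this text rather than fail the caller */
		pb->buf = buf;
		pb->cap = cap;
	}

	va_start(ap, fmt);
	vsnprintf(pb->buf + pb->len, pb->cap - pb->len, fmt, ap);
	va_end(ap);
	pb->len += (size_t)n;
	assert(pb->len < pb->cap);
}

static void pb_free(struct printbuf *pb)
{
	free(pb->buf);
	memset(pb, 0, sizeof(*pb));
}

/* Hypothetical shrinker: owned memory plus an optional pretty-printer. */
struct demo_shrinker {
	const char *name;
	size_t bytes_owned;
	size_t nr_dirty;	/* example of shrinker-private state */
	void (*to_text)(struct printbuf *out, const struct demo_shrinker *s);
};

/* Example callback: report state that affects how much is freeable. */
static void cache_to_text(struct printbuf *out, const struct demo_shrinker *s)
{
	pb_printf(out, "  dirty objects: %zu\n", s->nr_dirty);
}

/* qsort comparator: largest bytes_owned first. */
static int cmp_owned_desc(const void *l, const void *r)
{
	const struct demo_shrinker *a = l, *b = r;

	return (a->bytes_owned < b->bytes_owned) -
	       (a->bytes_owned > b->bytes_owned);
}

/* Sorts the array in place, then appends the 'top' largest entries to out. */
static void report_top(struct printbuf *out, struct demo_shrinker *s,
		       size_t nr, size_t top)
{
	qsort(s, nr, sizeof(*s), cmp_owned_desc);

	for (size_t i = 0; i < nr && i < top; i++) {
		pb_printf(out, "%s: %zu bytes owned\n",
			  s[i].name, s[i].bytes_owned);
		if (s[i].to_text)
			s[i].to_text(out, &s[i]);
	}
}
```

On an allocation failure, the caller would build a zeroed struct printbuf,
call report_top() with a top of 10, print the resulting buffer, and release
it with pb_free(). The same callbacks could feed a sysfs or debugfs file. A
kernel version would need care about allocating the string itself while
memory is short, which this sketch ignores.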