From: Suren Baghdasaryan
Date: Fri, 5 Apr 2024 07:14:13 -0700
Subject: Re: [PATCH v6 00/37] Memory allocation profiling
To: Klara Modin
Cc: akpm@linux-foundation.org, kent.overstreet@linux.dev, mhocko@suse.com,
 vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev,
 mgorman@suse.de, dave@stgolabs.net, willy@infradead.org,
 liam.howlett@oracle.com, penguin-kernel@i-love.sakura.ne.jp, corbet@lwn.net,
 void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com,
 catalin.marinas@arm.com, will@kernel.org, arnd@arndb.de, tglx@linutronix.de,
 mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org,
 peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org,
 masahiroy@kernel.org, nathan@kernel.org, dennis@kernel.org,
 jhubbard@nvidia.com, tj@kernel.org, muchun.song@linux.dev, rppt@kernel.org,
 paulmck@kernel.org, pasha.tatashin@soleen.com, yosryahmed@google.com,
 yuzhao@google.com, dhowells@redhat.com, hughd@google.com,
 andreyknvl@gmail.com, keescook@chromium.org, ndesaulniers@google.com,
 vvvvvv@google.com, gregkh@linuxfoundation.org, ebiggers@google.com,
 ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
 rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
 vschneid@redhat.com, cl@linux.com, penberg@kernel.org,
 iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com,
 elver@google.com, dvyukov@google.com, songmuchun@bytedance.com,
 jbaron@akamai.com, aliceryhl@google.com, rientjes@google.com,
 minchan@google.com, kaleshsingh@google.com, kernel-team@android.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 iommu@lists.linux.dev, linux-arch@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-modules@vger.kernel.org, kasan-dev@googlegroups.com,
 cgroups@vger.kernel.org
References: <20240321163705.3067592-1-surenb@google.com>

On Fri, Apr 5, 2024 at 6:37 AM Klara Modin wrote:
>
> Hi,
>
> On 2024-03-21 17:36, Suren Baghdasaryan wrote:
> > Overview:
> > Low overhead [1] per-callsite memory allocation profiling. Not just for
> > debug kernels, overhead low enough to be deployed in production.
> >
> > Example output:
> >    root@moria-kvm:~# sort -rn /proc/allocinfo
> >     127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
> >      56373248     4737 mm/slub.c:2259 func:alloc_slab_page
> >      14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
> >      14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
> >      13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
> >      11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
> >       9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
> >       4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
> >       4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
> >       3940352      962 mm/memory.c:4214 func:alloc_anon_folio
> >       2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
> >       ...
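[Editorial sketch, not part of the original thread.] The example output above can be post-processed in user space. The following is a minimal Python sketch, assuming each record has the shape shown in the excerpt (two integers, a `file:line` location, an optional `[module]` tag, then `func:<name>`). The excerpt does not label the two numeric columns; treating the first as bytes and the second as call count is an assumption here. The names `AllocSite`, `parse_allocinfo_line` and `top_sites` are hypothetical helpers, not kernel or library APIs.

```python
import re
from typing import List, NamedTuple, Optional


class AllocSite(NamedTuple):
    bytes_allocated: int    # first column (assumed to be bytes)
    calls: int              # second column (assumed to be call count)
    location: str           # "file:line" of the allocation call site
    module: Optional[str]   # e.g. "ctagmod"; None when no [module] tag is present
    function: str


# One record: <int> <int> <file:line> [optional module] func:<name>
_LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+:\d+)\s+(?:\[([^\]]+)\]\s+)?func:(\S+)\s*$"
)


def parse_allocinfo_line(line: str) -> Optional[AllocSite]:
    """Parse one record; return None for lines that do not match."""
    m = _LINE_RE.match(line)
    if m is None:
        return None
    size, calls, location, module, function = m.groups()
    return AllocSite(int(size), int(calls), location, module, function)


def top_sites(text: str, limit: int = 5) -> List[AllocSite]:
    """Return the `limit` largest call sites by the first column."""
    sites = [s for s in map(parse_allocinfo_line, text.splitlines()) if s]
    return sorted(sites, key=lambda s: s.bytes_allocated, reverse=True)[:limit]


# Three records copied from the example output above, deliberately unsorted.
SAMPLE = """\
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
"""

if __name__ == "__main__":
    for site in top_sites(SAMPLE):
        print(site)
```

On a kernel with profiling enabled, the same function could be fed the contents of /proc/allocinfo instead of `SAMPLE`; lines that do not match the record shape are skipped rather than raising.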
> >
> > Since v5 [2]:
> > - Added Reviewed-by and Acked-by, per Vlastimil Babka and Miguel Ojeda
> > - Changed pgalloc_tag_{add|sub} to use number of pages instead of order, per Matthew Wilcox
> > - Changed pgalloc_tag_sub_bytes to pgalloc_tag_sub_pages and adjusted the usage, per Matthew Wilcox
> > - Moved static key check before prepare_slab_obj_exts_hook(), per Vlastimil Babka
> > - Fixed RUST helper, per Miguel Ojeda
> > - Fixed documentation, per Randy Dunlap
> > - Rebased over mm-unstable
> >
> > Usage:
> > kconfig options:
> >   - CONFIG_MEM_ALLOC_PROFILING
> >   - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> >   - CONFIG_MEM_ALLOC_PROFILING_DEBUG
> >     adds warnings for allocations that weren't accounted because of a
> >     missing annotation
> >
> > sysctl:
> >    /proc/sys/vm/mem_profiling
> >
> > Runtime info:
> >    /proc/allocinfo
> >
> > Notes:
> >
> > [1]: Overhead
> > To measure the overhead we are comparing the following configurations:
> > (1) Baseline with CONFIG_MEMCG_KMEM=n
> > (2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> >     CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
> > (3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> >     CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
> > (4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
> >     CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
> > (5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
> > (6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> >     CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y
> > (7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> >     CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y
> >
> > Performance overhead:
> > To evaluate performance we implemented an in-kernel test executing
> > multiple get_free_page/free_page and kmalloc/kfree calls with allocation
> > sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
> > affinity set to a specific CPU to minimize the noise. Below are results
> > from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
> > 56 core Intel Xeon:
> >
> >                         kmalloc                 pgalloc
> > (1 baseline)            6.764s                  16.902s
> > (2 default disabled)    6.793s (+0.43%)         17.007s (+0.62%)
> > (3 default enabled)     7.197s (+6.40%)         23.666s (+40.02%)
> > (4 runtime enabled)     7.405s (+9.48%)         23.901s (+41.41%)
> > (5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
> > (6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
> > (7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)
> >
> > Memory overhead:
> > Kernel size:
> >
> >     text        data        bss         dec         diff
> > (1) 26515311    18890222    17018880    62424413
> > (2) 26524728    19423818    16740352    62688898    264485
> > (3) 26524724    19423818    16740352    62688894    264481
> > (4) 26524728    19423818    16740352    62688898    264485
> > (5) 26541782    18964374    16957440    62463596    39183
> >
> > Memory consumption on a 56 core Intel CPU with 125GB of memory:
> > Code tags:           192 kB
> > PageExts:         262144 kB (256MB)
> > SlabExts:           9876 kB (9.6MB)
> > PcpuExts:            512 kB (0.5MB)
> >
> > Total overhead is 0.2% of total memory.
> >
> > Benchmarks:
> >
> > Hackbench tests run 100 times:
> > hackbench -s 512 -l 200 -g 15 -f 25 -P
> >       baseline       disabled profiling           enabled profiling
> > avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
> > stdev 0.0137         0.0188                       0.0077
> >
> >
> > hackbench -l 10000
> >       baseline       disabled profiling           enabled profiling
> > avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
> > stdev 0.0933         0.0286                       0.0489
> >
> > stress-ng tests:
> > stress-ng --class memory --seq 4 -t 60
> > stress-ng --class cpu --seq 4 -t 60
> > Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/
> >
> > [2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/
>
> If I enable this, I consistently get percpu allocation failures. I can
> occasionally reproduce it in qemu. I've attached the logs and my config,
> please let me know if there's anything else that could be relevant.

Thanks for the report!
In debug_alloc_profiling.log I see:

[    7.445127] percpu: limit reached, disable warning

That's probably the reason. I'll take a closer look at the cause of
that and how we can fix it.

In qemu-alloc3.log I see a couple of warnings:

[    1.111620] alloc_tag was not set
[    1.111880] WARNING: CPU: 0 PID: 164 at include/linux/alloc_tag.h:118 kfree (./include/linux/alloc_tag.h:118 (discriminator 1) ./include/linux/alloc_tag.h:161 (discriminator 1) mm/slub.c:2043 ...

[    1.161710] alloc_tag was not cleared (got tag for fs/squashfs/cache.c:413)
[    1.162289] WARNING: CPU: 0 PID: 195 at include/linux/alloc_tag.h:109 kmalloc_trace_noprof (./include/linux/alloc_tag.h:109 (discriminator 1) ./include/linux/alloc_tag.h:149 (discriminator 1) ...

Which means we failed to instrument some allocation. Can you please
check if disabling CONFIG_MEM_ALLOC_PROFILING_DEBUG fixes the QEMU case?
In the meantime I'll try to reproduce and fix this.
Thanks,
Suren.

>
> Kind regards,
> Klara Modin
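
[Editorial sketch, not part of the original thread.] The triage described in the reply, checking a boot log for the percpu limit message and for the two alloc_tag warnings, can be approximated with a small script. This is a minimal Python sketch under stated assumptions: the message fragments are copied from the excerpts quoted in this thread, while `PATTERNS`, `scan_log` and `SAMPLE_LOG` are hypothetical names, and the last sample line is made-up filler to show that unrelated lines are ignored.

```python
import re
from typing import Dict, List

# Message fragments copied from the log excerpts quoted in this thread.
PATTERNS = {
    "percpu_limit": re.compile(r"percpu: limit reached, disable warning"),
    "tag_not_set": re.compile(r"alloc_tag was not set"),
    "tag_not_cleared": re.compile(r"alloc_tag was not cleared \(got tag for \S+\)"),
}


def scan_log(text: str) -> Dict[str, List[str]]:
    """Group log lines by which of the known messages they contain."""
    hits: Dict[str, List[str]] = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


# The first three lines are taken from the thread; the last is invented filler.
SAMPLE_LOG = """\
[    7.445127] percpu: limit reached, disable warning
[    1.111620] alloc_tag was not set
[    1.161710] alloc_tag was not cleared (got tag for fs/squashfs/cache.c:413)
[    0.000000] unrelated filler line (hypothetical)
"""

if __name__ == "__main__":
    for name, lines in scan_log(SAMPLE_LOG).items():
        print(f"{name}: {len(lines)} hit(s)")
        for entry in lines:
            print(f"  {entry}")
```

A log with hits only in the `tag_not_set` or `tag_not_cleared` groups would point at the missing-instrumentation warnings that CONFIG_MEM_ALLOC_PROFILING_DEBUG adds, whereas a `percpu_limit` hit corresponds to the percpu message called out separately in the reply.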