From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC953D18159 for ; Tue, 15 Oct 2024 00:23:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3ECC26B0082; Mon, 14 Oct 2024 20:23:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 39C676B0083; Mon, 14 Oct 2024 20:23:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 23CF16B0085; Mon, 14 Oct 2024 20:23:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 029756B0082 for ; Mon, 14 Oct 2024 20:23:32 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 5DA56AC26A for ; Tue, 15 Oct 2024 00:23:16 +0000 (UTC) X-FDA: 82673937696.30.069B7F0 Received: from fout-a2-smtp.messagingengine.com (fout-a2-smtp.messagingengine.com [103.168.172.145]) by imf25.hostedemail.com (Postfix) with ESMTP id EAA31A000A for ; Tue, 15 Oct 2024 00:23:25 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=dxuuu.xyz header.s=fm1 header.b=JccNm6SA; dkim=pass header.d=messagingengine.com header.s=fm2 header.b="h Vn0E9s"; spf=pass (imf25.hostedemail.com: domain of dxu@dxuuu.xyz designates 103.168.172.145 as permitted sender) smtp.mailfrom=dxu@dxuuu.xyz; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1728951738; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=K9W7oqJeeKfbnZ6btzb5m3NnYSN+RzoG8W7iPIj6DY0=; b=HZalURVzdWy7qopAzXL5otwQC7UaWyEn+Qy5wpUNmfIXOqrv8DV7n709nzX5cghnk4WAjq ucR9EFvyLLHb+Y3Z8dcfRbdGmM5RHqIgIa4EFoL0hl4sRK43gX6CS9DoTO78Ztrxlv+gcf 7zDbPUR2mpNis2fc14Q+Au06GuD0T7I= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=dxuuu.xyz header.s=fm1 header.b=JccNm6SA; dkim=pass header.d=messagingengine.com header.s=fm2 header.b="h Vn0E9s"; spf=pass (imf25.hostedemail.com: domain of dxu@dxuuu.xyz designates 103.168.172.145 as permitted sender) smtp.mailfrom=dxu@dxuuu.xyz; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1728951738; a=rsa-sha256; cv=none; b=TsazkzXmCAe61YscaAZea//FDDnQyoUHgw6MRlpbcF9S8iwgcclkRZcE239/fpCNz+Nlsz BMbjZkjrVbfLfUV4tHqRZQcMg/9uvrpNEIhveYBVR/heiYblPOqcqJPZR91mW7/j2IqHaQ Isg+n6CbgKPQsmdHf/a8uXikT3dryeQ= Received: from phl-compute-01.internal (phl-compute-01.phl.internal [10.202.2.41]) by mailfout.phl.internal (Postfix) with ESMTP id D06DC1380A24; Mon, 14 Oct 2024 20:23:29 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-01.internal (MEProxy); Mon, 14 Oct 2024 20:23:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dxuuu.xyz; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1728951809; x=1729038209; bh=K9W7oqJeeKfbnZ6btzb5m3NnYSN+RzoG8W7iPIj6DY0=; b= JccNm6SAPEc+aE0VymV2qgK4+K9z9jtbQTFdHZBixsQANoexLXQuymkbKvxCLUE+ 53Ac4/ZOtjqTXFczpRA5SG8hqzg4GYnRVktWCkAxpmKIeo3x7d6R3lMmPgPAiHwq RivK+/3/rfDxRXqAArKOdvVolmZ3whdfy9WMR7RTMzjegRgTurG+l9xco+qziB1w Rt+em/zm10ok45un/Q6meYQUFDQotLb0uAcrZlzoBLYaJqnuJdN9a9Y+P52TuFp7 pszkfPa3Go30D43rLR9ovk5SUyqQNZlnS3WsmRlvmmTJf5hI/sXYKpqRRvWKpxD8 o6DYIHXcAGY+9C2pMYlkVQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728951809; x= 1729038209; bh=K9W7oqJeeKfbnZ6btzb5m3NnYSN+RzoG8W7iPIj6DY0=; b=h Vn0E9s0ov4tkRFfLkn1PK+uqaQSG4q+a+fnBo43W8pkdDRWpUDVmgdmrSe9tZrcD GzTXTq4Ve+qCoabdoTeTwKqGWfUz0fpu5JXeIFGa0wj4JBq2Lx4XGenry60dvDZs yZWWeMHd13V08HH80i/4tUKTWGWzP1w3pBFCSpfncYeWF/nfVhsvoyTT0mSg7zoL ZKD7/md3Ium9nKFREgX+LXdwHtp7cdTVAhRcX8xlcDk42vkkYCXHdhFdPJopDMO/ K0k0vbvnyQLwIuG4WHXz1Ioq+ElUWp2Y2u3p2hXDnka+suvYwQ7oCEUWYvziYJkk nnkpLbWhorQ+y10fC8msg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdegiedgfeegucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh htshculddquddttddmnegfrhhlucfvnfffucdlfeehmdenucfjughrpeffhffvvefukfhf gggtugfgjgestheksfdttddtjeenucfhrhhomhepffgrnhhivghlucgiuhcuoegugihuse gugihuuhhurdighiiiqeenucggtffrrghtthgvrhhnpeffffeggeekjedvjeegheetkedu hffgfeegveeklefhgeeuleejhfeljedtkeevffenucffohhmrghinhepghhithhhuhgsrd gtohhmnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhep ugiguhesugiguhhuuhdrgiihiidpnhgspghrtghpthhtohepudehpdhmohguvgepshhmth hpohhuthdprhgtphhtthhopeihohhsrhihrghhmhgvugesghhoohhglhgvrdgtohhmpdhr tghpthhtohepshhhrghkvggvlhdrsghuthhtsehlihhnuhigrdguvghvpdhrtghpthhtoh eprhhoshhtvgguthesghhoohgumhhishdrohhrghdprhgtphhtthhopegrkhhpmheslhhi nhhugidqfhhouhhnuggrthhiohhnrdhorhhgpdhrtghpthhtohephhgrnhhnvghssegtmh hpgigthhhgrdhorhhgpdhrtghpthhtohepmhhhohgtkhhosehkvghrnhgvlhdrohhrghdp rhgtphhtthhopehrohhmrghnrdhguhhshhgthhhinheslhhinhhugidruggvvhdprhgtph htthhopehmuhgthhhunhdrshhonhhgsehlihhnuhigrdguvghvpdhrtghpthhtohepihhn figrrhguvhgvshhsvghlsehgmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: i6a694271:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 14 Oct 2024 20:23:27 -0400 (EDT) Date: Mon, 14 Oct 2024 18:23:26 -0600 From: Daniel Xu To: Yosry Ahmed Cc: Shakeel Butt , Steven Rostedt , Andrew Morton , Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , JP Kobryn , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team , bpf@vger.kernel.org, Martin KaFai Lau Subject: Re: [PATCH] memcg: add tracing for memcg stat updates Message-ID: References: <20241010003550.3695245-1-shakeel.butt@linux.dev> <20241009210848.43adb0c3@gandalf.local.home> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspamd-Server: rspam03 X-Rspam-User: X-Rspamd-Queue-Id: EAA31A000A X-Stat-Signature: 453e1kjiupjfdhjbnk8wwwiui8cw8tpf X-HE-Tag: 1728951805-651360 X-HE-Meta: U2FsdGVkX19zqGEzpwRYALAUYvIBuxPyukXBtK4blX3rgRm46x+FguDkw+8+5ML7d817HXFRwsl8l5edHzJly10wN+PEZJ7np2Jyr7SzBaiS3bXfdAvYlK6mu0+vEVY65kFbUpMrxOaE4zh+5ica3zEnEvTKs1bW57Z0WrrI1+86sJzzZaad9XAE9pZ1J69H0u3CnWPzSztkpGvQHgRdFl3bDF/n2/pqXGDY4gUq+d4P4axk0izz7JLfAXZ+bkStXobe2MjaT+Msy8Yq0BZFJjh37akjEBbb0jsZM4bwoKx2oJuwiG9HmWn2kDEQAd7Xj6fq5QJv+KY03vN+D3v5phQx6oa51EkWz2UdUYlCOID82grYkXfM1VQyxZaqgpNkKCWra5MroBus0BxqQSgBSq9M/jLYgBAsdfdVqXFY8lzidbBudu2Rx+ONwaTjt+yV7L/9J7cstVkpfGYrkst1RAI6iEs73UeoB8iZipe0DtwaSJbo7I2amgRgcXItGEFYr+bFO99xXlbpBsL8dw2E0QF7L6/fH2Oto29qk5YFDypBb4YLtnfKFHbCBtRFcCpH/w1qejrFe353ixsMHC0bmwGjVgjPc4yhP2X5yOsdGI00Vqf+c03BbJMO3J7KyB7pugcvazsvc2llJ+Skt/5lwNN9Lt8LsoiTG0hiUM5xOzHP1OlOINGoOpVk/vWmaNqr6rOHbu7tYJeQ/5ldSKMIQd8QWUmNYbBaj83tuiKcIrxZD7VgDLE10apGO0lJhOlpzuX2qFYYkQU8keaCdyHqUJ1AqeBOEIucXzmKnfps/uoSSfHHUMjOZ+KBSBdB8s5NMI/vPOapamKlu7PmgkW2WTdDC7ZBtygP4/K4bH7L5IC8OqSK7bcsPaRq3qvu/Y1GccHMrvHA5Dv0yL1VxB8tW1pSz1bdTPz5fGUh4QOl3m52cwYsxVk17Of/chy/f0qLcBTOuIY15PrMaCCn7po aMiEAdM8 gUirgmXxLLGujKIEhGoF5OD4uYNdbqYkB7o+nU1ZchGTKG5bz2GhhLzGFaejyhwXZOhSxsRQ/x+EonlFxNRPd7LFM8iQNJG2ELyik3TR2+tOLFHIGUcao/JeZR1TSi+uC2FIcqbvWrJlNtmola5sbzh9Fe/K0fISFaHAtSQo2DaNYZYE0x7KufIOLi74j+OzK3N8R4dajKNywCWlQVcCKzlF/qBaK1Pbl1jrGxgnr+F3YGnYL5b55G+jOy8IqC5WNq+uaZDQYjaMr/wRhGyfpjefZI9/WRrWeWAgYsJzibpnYTg5Jl3n0zkX5S8Yfp0PdvlTyFSPS20WKrKQErX58MHVPaDjUfb8FBQmOHCDMHWj5BWzXJDlioz5RBtOQNEeiMzhj8yW53DnZI4g3uoMymX/PiFyGc6bZKYiWAQ7f4+b9KeOtLbkCMT/7s3XIdok9DklGe6cRAVtpd1EXtSMkFIXVBklwgte/Exe4f/5x9lCDrDNVCvP+FzmROl0jNwtEdeGTzaLeW4aaV0PZU66s1JF3p/rbQwWL59EFfsxYpkQc61Rx61/qdCEw8kO9rAdncAFgc6/xsud2ZrKRmRpjIYBJrahfZZGB5twc X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi Yosry, On Mon, Oct 14, 2024 at 05:15:39PM GMT, Yosry Ahmed wrote: > On Thu, Oct 10, 2024 at 10:26 AM Shakeel Butt wrote: > > > > On Wed, Oct 09, 2024 at 06:24:55PM GMT, Yosry Ahmed wrote: > > > On Wed, Oct 9, 2024 at 6:08 PM Steven Rostedt wrote: > > > > > > > > On Wed, 9 Oct 2024 17:46:22 -0700 > > > > Yosry Ahmed wrote: > > > > > > > > > > +++ b/mm/memcontrol.c > > > > > > @@ -71,6 +71,10 @@ > > > > > > > > > > > > #include > > > > > > > > > > > > +#define CREATE_TRACE_POINTS > > > > > > +#include > > > > > > +#undef CREATE_TRACE_POINTS > > > > > > + > > > > > > #include > > > > > > > > > > > > struct cgroup_subsys memory_cgrp_subsys __read_mostly; > > > > > > @@ -682,7 +686,9 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx, > > > > > > return; > > > > > > > > > > > > __this_cpu_add(memcg->vmstats_percpu->state[i], val); > > > > > > - memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val)); > > > > > > + val = memcg_state_val_in_pages(idx, val); > > > > > > + memcg_rstat_updated(memcg, val); > > > > > > + trace_mod_memcg_state(memcg, idx, val); > > > > > > > > > > Is it too unreasonable to include the stat name? > > > > > > > > > > The index has to be correlated with the kernel config and perhaps even > > > > > version. It's not a big deal, but if performance is not a concern when > > > > > tracing is enabled anyway, maybe we can lookup the name here (or in > > > > > TP_fast_assign()). > > > > > > > > What name? Is it looked up from idx? If so, you can do it on the reading of > > > > Does reading side mean the one reading /sys/kernel/tracing/trace will do > > the translation from enums to string? > > > > > > the trace event where performance is not an issue. See the __print_symbolic() > > > > and friends in samples/trace_events/trace-events-sample.h > > > > > > Yeah they can be found using idx. Thanks for referring us to > > > __print_symbolic(), I suppose for this to work we need to construct an > > > array of {idx, name}. I think we can replace the existing memory_stats > > > and memcg1_stats/memcg1_stat_names arrays with something that we can > > > reuse for tracing, so we wouldn't need to consume extra space. > > > > > > Shakeel, what do you think? > > > > Cc Daniel & Martin > > > > I was planning to use bpftrace which can use dwarf/btf to convert the > > raw int to its enum string. Martin provided the following command to > > extract the translation from the kernel. > > > > $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep -A10 node_stat_item > > [2264] ENUM 'node_stat_item' encoding=UNSIGNED size=4 vlen=46 > > 'NR_LRU_BASE' val=0 > > 'NR_INACTIVE_ANON' val=0 > > 'NR_ACTIVE_ANON' val=1 > > 'NR_INACTIVE_FILE' val=2 > > 'NR_ACTIVE_FILE' val=3 > > 'NR_UNEVICTABLE' val=4 > > 'NR_SLAB_RECLAIMABLE_B' val=5 > > 'NR_SLAB_UNRECLAIMABLE_B' val=6 > > 'NR_ISOLATED_ANON' val=7 > > 'NR_ISOLATED_FILE' val=8 > > ... > > > > My point is userspace tools can use existing infra to extract this > > information. > > > > However I am not against adding __print_symbolic() (but without any > > duplication), so users reading /sys/kernel/tracing/trace directly can > > see more useful information as well. Please post a follow up patch after > > this one. > > I briefly looked into this and I think it would be annoying to have > this, unfortunately. Even if we rework the existing arrays with memcg > stat names to be in a format conforming to tracing, we would need to > move them out to a separate header to avoid a circular dependency. > > Additionally, for __count_memcg_events() things will be more > complicated because the names are not in an array in memcontrol.c, but > we use vm_event_name() and the relevant names are part of a larger > array, vmstat_text, which we would need to rework similarly. > > I think this would be easier to implement if we can somehow provide a > callback that returns the name based on the index, rather than an > array. But even then, we would need to specify a different callback > for each event, so it won't be as simple as specifying the callback in > the event class. > > All in all, unless we realize there is something that is fundamentally > more difficult to do in userspace, I think it's not worth adding this > unfortunately :/ Turned out to be quite straightforward to do in userspace: https://github.com/bpftrace/bpftrace/pull/3515 . A nice property is the resolution occurs out of line and saves the kernel some cycles in the fast path. Thanks, Daniel