From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AACA7E66887 for ; Sat, 23 Nov 2024 21:14:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C5A776B0082; Sat, 23 Nov 2024 16:14:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C0AD06B0083; Sat, 23 Nov 2024 16:14:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AF9556B0085; Sat, 23 Nov 2024 16:14:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 889516B0082 for ; Sat, 23 Nov 2024 16:14:54 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 04A19AD979 for ; Sat, 23 Nov 2024 21:14:53 +0000 (UTC) X-FDA: 82818614466.18.887A13D Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) by imf19.hostedemail.com (Postfix) with ESMTP id 6E8DD1A0002 for ; Sat, 23 Nov 2024 21:14:50 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=jwP+eSFq; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf19.hostedemail.com: domain of shakeel.butt@linux.dev designates 91.218.175.171 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1732396491; a=rsa-sha256; cv=none; b=UXvuiPCTk4+O4tCl+GFCE/4E1SSFn8JdcDf4wNHdmvV6ZuH4t+4Mu7U3xuj1JqcgXuoVVW /uRfT6Hh7bSK89BzU09K6kYagRyHyfpznJYbrw0YxhXXvjXCKBEGcqO9O0WplwR9cqpNYX NrzIATyXg7qVOZNLVwkY5ei6WkyWOEc= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=jwP+eSFq; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf19.hostedemail.com: domain of shakeel.butt@linux.dev designates 91.218.175.171 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1732396491; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=a/5pPJyT3fsLdVu0JPf1v9z8ZI+Cm/OsIvTfB5/40Lc=; b=YOph/K12NF4TtjnWb5MIIHoRYft93xsL7Gmru7pCtGftHdCi1FzRCDC309hqIZsUqF+rF6 Oce5I+31415XyYXDxwajLhxP5bqQzOzK0Ne4Qpw5ZHCwbnkKTv1wOjadYTbS/Na+7ZDCo2 6gAo0k81/0DqOH2N3IJzrUcztThq3qU= Date: Sat, 23 Nov 2024 13:14:40 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1732396486; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=a/5pPJyT3fsLdVu0JPf1v9z8ZI+Cm/OsIvTfB5/40Lc=; b=jwP+eSFqul9j4mmk23hgc0/Bok6RfuUVwSjiOvGWh6eP1iQG4rN3pDMambhlk9tRjlmTNd XgivcX9YrlwNyuD6zibB/hKFfTw1PrpbrnkU2yE1UFzwxYq7SpB6/LyJ1OVCI5TcJghz2A LVKqpUtZ9SwgQnYG0Dn39xmXJlDk2zU= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Shakeel Butt To: Yosry Ahmed Cc: Andrew Morton , Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Axel Rasmussen , Steven Rostedt , Suren Baghdasaryan , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: Re: [PATCH] mm: mmap_lock: optimize mmap_lock tracepoints Message-ID: References: <20241123060939.169978-1-shakeel.butt@linux.dev> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Migadu-Flow: FLOW_OUT X-Rspam-User: X-Rspamd-Queue-Id: 6E8DD1A0002 X-Rspamd-Server: rspam01 X-Stat-Signature: m144wj7u9879m7c5gkstbkbp36inb3mm X-HE-Tag: 1732396490-365105 X-HE-Meta: U2FsdGVkX1/K9Z7xmhbqt0p2Ogk4Jhq1yfIEgJXEevdLNXh5Lz6mDAX+EMwRFeVU0/r1qL6qsioIxI1VOcseuIUICI/21tQ73PXil7HbLZg3yhLgMAz8iHz7rKd773O9O54ICF+mcF6BChKLlpPPMO/9eirGKHhRvii8GTuy5KrE4IE4PFc4m/42xERQJ/EA4ThHngQIakWpP/JRWrYOPg/cpJ4hJeeVBEcOTf0wf9K8MDmpneCpw3VUO62f+qYlHncYTdtLD9buqsdVJY2lJFTbXZHm/cQPQhXCeDTXEn/IppDTczhDj+yEgf4dnQfiVUmSB3Td3lOAD85Q37ym+TThXCM/sSeMC/EpDAsZFPq8Wu4UjyONCPSdj03BL4mCckaT+naB1fkBKh4g4GgzgI0WQrCOBUaYd18GMMmAlFdTSnMMmRYztvqH7FFq7Gf7Nk1FCyioSwZeG/Ncoe499rmW7NcV2h1T1rB24b6YsVgR3MOurrMdhkU0qtBhhVw7/l/lGrDG9uEXAg8iLiQp+rL7KugBmZiD9Z/ccq2XF/TktY8YEjmpYsZDNo2JFKACwPj61nd1ZDszxSlqXM0uWcR+Ucs2f303FWIYhNqAPxKoslklytqQfEJZi6qfXSybEGNzf0SsdmW1UIXgzE9uIpZzDdU294961vHaKFXiMnQXfDy17+YmNa59ODdHXTMzSUXYPYe2CZmxZqiqrb03Q4iKh6PRohjeNdrg9U/VdbR0vvkUa/FMFJO2/VaBpaDvU3CNBnErjH6mVQCGDbVEhSNBvZc0ldDTd18O70/oiYdd5jnidaZhYv5LytHUkWy0qIoxoJoZfb5oDzXOEB/uGy9xFmWfSD7eBCw2A8JjwghFuyabYpk0YHBumRCHHG3rNHMX9vrhqa9SfqujwCmh81b6nFJriiKm/hDn1qHQ+0gu+TWxzRaija4r+NUIe5uq4J5imL2A8y9OSGoKfE6 9X19C9L5 a0l9gouFd1iAoUxbhSrnT1PwHuZS1hv2BQM0uqfVChQphS8eNPzgXIgGq0D6dBkHh3czsAKb+Mz/Wid+tUm24UOeiMkZ0Tot5B5DklCSlZzZxkeU8KiuoW8Jjgla9X+S2ZS6uScp3I61sYoOrf+oD4mpUpoaDnLlSeOpwmraO9AsFO5YArn3dYYW9T6sT5kSxhZZV+jKoe5Ttq7IxdMWNG9Lpa34BBwvd0cHX X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Nov 22, 2024 at 10:46:53PM -0800, Yosry Ahmed wrote: > On Fri, Nov 22, 2024 at 10:10 PM Shakeel Butt wrote: > > > > We are starting to deploy mmap_lock tracepoint monitoring across our > > fleet and the early results showed that these tracepoints are consuming > > significant amount of CPUs in kernfs_path_from_node when enabled. > > > > It seems like the kernel is trying to resolved the cgroup path in the > > s/resolved/resolve > > > fast path of the locking code path when the tracepoints are enabled. In > > addition for some application their metrics are regressing when > > monitoring is enabled. > > > > The cgroup path resolution can be slow and should not be done in the > > fast path. Most userspace tools, like bpftrace, provides functionality > > to get the cgroup path from cgroup id, so let's just trace the cgroup > > id and the users can use better tools to get the path in the slow path. > > > > Signed-off-by: Shakeel Butt > > --- > > include/linux/memcontrol.h | 18 ++++++++++++ > > include/trace/events/mmap_lock.h | 32 ++++++++++---------- > > mm/mmap_lock.c | 50 ++------------------------------ > > 3 files changed, 36 insertions(+), 64 deletions(-) > > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > > index 5502aa8e138e..d82f08cd70cd 100644 > > --- a/include/linux/memcontrol.h > > +++ b/include/linux/memcontrol.h > > @@ -1046,6 +1046,19 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm, > > > > void split_page_memcg(struct page *head, int old_order, int new_order); > > > > +static inline u64 memcg_id_from_mm(struct mm_struct *mm) > > The usage of memcg_id here and throughout the patch is a bit confusing > because we have a member called 'id' in struct mem_cgroup, but this > isn't it. This is the cgroup_id of the memcg. I admit it's hard to > distinguish them during naming, but when I first saw the function I > thought it was returning memcg->id. > > Maybe just cgroup_id_from_mm()? In cgroup v2, the cgroup id is the > same regardless of the controller anyway, in cgroup v1, it's kinda > natural that we return the cgroup id of the memcg. > > I don't feel strongly, but I prefer that we use clearer naming, and > either way a comment may help clarify things. > Ack, I will change to cgroup_id_from_mm() but I will keep memcg_id in the tracepoints. > > +{ > > + struct mem_cgroup *memcg; > > + u64 id = 0; > > + > > + rcu_read_lock(); > > + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); > > + if (likely(memcg)) > > + id = cgroup_id(memcg->css.cgroup); > > We return 0 if the memcg is NULL here, shouldn't we return the cgroup > id of the root memcg instead? This is more consistent with > get_mem_cgroup_from_mm(), and makes sure we always return the id of a > valid cgroup. Good point and I need to add a mem_cgroup_disabled() check as well. Will do in v2.