From: Yosry Ahmed
Date: Tue, 15 Oct 2024 01:07:30 -0700
Subject: Re: [PATCH] memcg: add tracing for memcg stat updates
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
    Muchun Song, Steven Rostedt, JP Kobryn, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    Meta kernel team
References: <20241010003550.3695245-1-shakeel.butt@linux.dev>
In-Reply-To: <20241010003550.3695245-1-shakeel.butt@linux.dev>

On Wed, Oct 9, 2024 at 5:36 PM Shakeel Butt wrote:
>
> The memcg stats are maintained in rstat infrastructure which provides
> very fast updates side and reasonable read side. However memcg added
> plethora of stats and made the read side, which is cgroup rstat flush,
> very slow. To solve that, threshold was added in the memcg stats read
> side i.e. no need to flush the stats if updates are within the
> threshold.
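
For readers less familiar with the thresholding described above, here is
a minimal userspace sketch of the idea. It is not the kernel code: the
struct stat_group type and the stat_update()/stat_read() helpers are
hypothetical names made up for illustration, and the real logic in
mm/memcontrol.c is more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct stat_group {
	long unflushed_delta;	/* cheap updates not yet folded into the total */
	long flushed_total;	/* value visible to readers after a flush */
	long pending;		/* magnitude of updates since the last flush */
	long threshold;		/* flush only once pending exceeds this */
	unsigned long flushes;	/* number of (expensive) flushes performed */
};

/* Update side: only touches the cheap counters, never flushes. */
static void stat_update(struct stat_group *g, long val)
{
	g->unflushed_delta += val;
	g->pending += labs(val);
}

/* Read side: skip the flush while pending updates are within the threshold. */
static long stat_read(struct stat_group *g)
{
	assert(g->threshold >= 0);

	if (g->pending > g->threshold) {
		g->flushed_total += g->unflushed_delta;
		g->unflushed_delta = 0;
		g->pending = 0;
		g->flushes++;
	}
	return g->flushed_total;
}
```

With a threshold of 64, an update of 10 followed by a read does not
flush. A further update of 100 brings pending to 110, so the next read
flushes and returns 110. The more stats feed the pending count, the
sooner the threshold is crossed on each read.
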
>
> This threshold based improvement worked for sometime but more stats were
> added to memcg and also the read codepath was getting triggered in the
> performance sensitive paths which made threshold based ratelimiting
> ineffective. We need more visibility into the hot and cold stats i.e.
> stats with a lot of updates. Let's add trace to get that visibility.
>
> Signed-off-by: Shakeel Butt

One question below, otherwise:

Reviewed-by: Yosry Ahmed

> ---
>  include/trace/events/memcg.h | 59 ++++++++++++++++++++++++++++++++++++
>  mm/memcontrol.c              | 13 ++++++--
>  2 files changed, 70 insertions(+), 2 deletions(-)
>  create mode 100644 include/trace/events/memcg.h
>
> diff --git a/include/trace/events/memcg.h b/include/trace/events/memcg.h
> new file mode 100644
> index 000000000000..913db9aba580
> --- /dev/null
> +++ b/include/trace/events/memcg.h
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM memcg
> +
> +#if !defined(_TRACE_MEMCG_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_MEMCG_H
> +
> +#include
> +#include
> +
> +
> +DECLARE_EVENT_CLASS(memcg_rstat,
> +
> +	TP_PROTO(struct mem_cgroup *memcg, int item, int val),
> +
> +	TP_ARGS(memcg, item, val),
> +
> +	TP_STRUCT__entry(
> +		__field(u64, id)
> +		__field(int, item)
> +		__field(int, val)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->id = cgroup_id(memcg->css.cgroup);
> +		__entry->item = item;
> +		__entry->val = val;
> +	),
> +
> +	TP_printk("memcg_id=%llu item=%d val=%d",
> +		  __entry->id, __entry->item, __entry->val)
> +);
> +
> +DEFINE_EVENT(memcg_rstat, mod_memcg_state,
> +
> +	TP_PROTO(struct mem_cgroup *memcg, int item, int val),
> +
> +	TP_ARGS(memcg, item, val)
> +);
> +
> +DEFINE_EVENT(memcg_rstat, mod_memcg_lruvec_state,
> +
> +	TP_PROTO(struct mem_cgroup *memcg, int item, int val),
> +
> +	TP_ARGS(memcg, item, val)
> +);
> +
> +DEFINE_EVENT(memcg_rstat, count_memcg_events,
> +
> +	TP_PROTO(struct mem_cgroup *memcg, int item, int val),
> +
> +	TP_ARGS(memcg, item, val)
> +);
> +
> +
> +#endif /* _TRACE_MEMCG_H */
> +
> +/* This part must be outside protection */
> +#include
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c098fd7f5c5e..17af08367c68 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -71,6 +71,10 @@
>
>  #include
>
> +#define CREATE_TRACE_POINTS
> +#include
> +#undef CREATE_TRACE_POINTS
> +
>  #include
>
>  struct cgroup_subsys memory_cgrp_subsys __read_mostly;
> @@ -682,7 +686,9 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
>  		return;
>
>  	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
> -	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
> +	val = memcg_state_val_in_pages(idx, val);
> +	memcg_rstat_updated(memcg, val);
> +	trace_mod_memcg_state(memcg, idx, val);
>  }
>
>  /* idx can be of type enum memcg_stat_item or node_stat_item. */
> @@ -741,7 +747,9 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
>  	/* Update lruvec */
>  	__this_cpu_add(pn->lruvec_stats_percpu->state[i], val);
>
> -	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
> +	val = memcg_state_val_in_pages(idx, val);
> +	memcg_rstat_updated(memcg, val);
> +	trace_mod_memcg_lruvec_state(memcg, idx, val);
>  	memcg_stats_unlock();
>  }
>
> @@ -832,6 +840,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
>  	memcg_stats_lock();
>  	__this_cpu_add(memcg->vmstats_percpu->events[i], count);
>  	memcg_rstat_updated(memcg, count);
> +	trace_count_memcg_events(memcg, idx, count);

count here is an unsigned long, and we are casting it to int, right?

Would it be slightly better if the tracepoint uses a long instead of
int? It's still not ideal but probably better than int.

>  	memcg_stats_unlock();
>  }
>
> --
> 2.43.5
>
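
To make the int vs. long question on trace_count_memcg_events() concrete,
here is a small userspace sketch. The record_as_int(), record_as_long(),
fits_int() and fits_long() helpers are hypothetical stand-ins for a
tracepoint field of each type; they are not part of the patch. The sketch
only shows which unsigned long values each field type can hold without
loss.

```c
#include <assert.h>
#include <limits.h>

/* long is never narrower than int, so a long field cannot be worse. */
static_assert(sizeof(long) >= sizeof(int), "long is at least as wide as int");

/* Hypothetical stand-in for a tracepoint field declared as int. */
static int record_as_int(unsigned long count)
{
	/* Values above INT_MAX convert in an implementation-defined way. */
	return (int)count;
}

/* Hypothetical stand-in for a tracepoint field declared as long. */
static long record_as_long(unsigned long count)
{
	/* Still lossy above LONG_MAX, which is the "not ideal" part. */
	return (long)count;
}

/* Nonzero if count is representable in an int field. */
static int fits_int(unsigned long count)
{
	return count <= (unsigned long)INT_MAX;
}

/* Nonzero if count is representable in a long field. */
static int fits_long(unsigned long count)
{
	return count <= (unsigned long)LONG_MAX;
}
```

On an LP64 target, a count of (unsigned long)INT_MAX + 1 no longer fits
the int field but still fits the long field. Only counts above LONG_MAX
would need an unsigned long field to be recorded exactly.
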