From: Dmitry Rokosov <ddrokosov@salutedevices.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: <rostedt@goodmis.org>, <mhiramat@kernel.org>,
<hannes@cmpxchg.org>, <mhocko@kernel.org>,
<roman.gushchin@linux.dev>, <muchun.song@linux.dev>,
<mhocko@suse.com>, <akpm@linux-foundation.org>,
<kernel@sberdevices.ru>, <rockosov@gmail.com>,
<cgroups@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <bpf@vger.kernel.org>
Subject: Re: [PATCH v3 2/2] mm: memcg: introduce new event to trace shrink_memcg
Date: Sat, 25 Nov 2023 11:01:37 +0300
Message-ID: <20231125080137.2fhmi4374yxqjyix@CAB-WSD-L081021>
In-Reply-To: <20231125063616.dex3kh3ea43ceyu3@google.com>

On Sat, Nov 25, 2023 at 06:36:16AM +0000, Shakeel Butt wrote:
> On Thu, Nov 23, 2023 at 10:39:37PM +0300, Dmitry Rokosov wrote:
> > The shrink_memcg flow plays a crucial role in memcg reclamation.
> > Currently, it is not possible to trace this point from non-direct
> > reclaim paths. However, direct reclaim has its own tracepoint, so there
> > is no issue there. In certain cases, when debugging memcg pressure,
> > developers may need to identify all potential requests for memcg
> > reclamation including kswapd(). The patchset introduces the tracepoints
> > mm_vmscan_memcg_shrink_{begin|end}() to address this problem.
> >
> > Example of output in the kswapd context (non-direct reclaim):
> > kswapd0-39 [001] ..... 240.356378: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.356396: mm_vmscan_memcg_shrink_end: nr_reclaimed=0 memcg=16
> > kswapd0-39 [001] ..... 240.356420: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.356454: mm_vmscan_memcg_shrink_end: nr_reclaimed=1 memcg=16
> > kswapd0-39 [001] ..... 240.356479: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.356506: mm_vmscan_memcg_shrink_end: nr_reclaimed=4 memcg=16
> > kswapd0-39 [001] ..... 240.356525: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.356593: mm_vmscan_memcg_shrink_end: nr_reclaimed=11 memcg=16
> > kswapd0-39 [001] ..... 240.356614: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.356738: mm_vmscan_memcg_shrink_end: nr_reclaimed=25 memcg=16
> > kswapd0-39 [001] ..... 240.356790: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=16
> > kswapd0-39 [001] ..... 240.357125: mm_vmscan_memcg_shrink_end: nr_reclaimed=53 memcg=16
> >
> > Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
> > ---
> > include/trace/events/vmscan.h | 22 ++++++++++++++++++++++
> > mm/vmscan.c | 7 +++++++
> > 2 files changed, 29 insertions(+)
> >
> > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> > index e9093fa1c924..a4686afe571d 100644
> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -180,6 +180,17 @@ DEFINE_EVENT(mm_vmscan_memcg_reclaim_begin_template, mm_vmscan_memcg_softlimit_r
> > TP_ARGS(order, gfp_flags, memcg)
> > );
> >
> > +DEFINE_EVENT(mm_vmscan_memcg_reclaim_begin_template, mm_vmscan_memcg_shrink_begin,
> > +
> > + TP_PROTO(int order, gfp_t gfp_flags, const struct mem_cgroup *memcg),
> > +
> > + TP_ARGS(order, gfp_flags, memcg)
> > +);
> > +
> > +#else
> > +
> > +#define trace_mm_vmscan_memcg_shrink_begin(...)
> > +
> > #endif /* CONFIG_MEMCG */
> >
> > DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
> > @@ -243,6 +254,17 @@ DEFINE_EVENT(mm_vmscan_memcg_reclaim_end_template, mm_vmscan_memcg_softlimit_rec
> > TP_ARGS(nr_reclaimed, memcg)
> > );
> >
> > +DEFINE_EVENT(mm_vmscan_memcg_reclaim_end_template, mm_vmscan_memcg_shrink_end,
> > +
> > + TP_PROTO(unsigned long nr_reclaimed, const struct mem_cgroup *memcg),
> > +
> > + TP_ARGS(nr_reclaimed, memcg)
> > +);
> > +
> > +#else
> > +
> > +#define trace_mm_vmscan_memcg_shrink_end(...)
> > +
> > #endif /* CONFIG_MEMCG */
> >
> > TRACE_EVENT(mm_shrink_slab_start,
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 45780952f4b5..f7e3ddc5a7ad 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -6461,6 +6461,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> > */
> > cond_resched();
> >
> > + trace_mm_vmscan_memcg_shrink_begin(sc->order,
> > + sc->gfp_mask,
> > + memcg);
> > +
>
> If you place the start of the trace here, you may have only the begin
> trace for memcgs whose usage is below their min or low limits. Is that
> fine? Otherwise you can put it just before the shrink_lruvec() call.
>
From my point of view, it's fine. In the situation you describe, where
only the begin() tracepoint fires without a matching end(), we can tell
that reclaim was requested for the memcg but was skipped because of
conditions within the memcg itself (such as its min or low limits).
The disadvantage of this approach is that these unmatched begin()
events add noise to the trace pipe.

How important do you think it is to be able to observe such situations?
Or would you rather move the begin() tracepoint after the memcg limit
checks and not report the skipped memcgs at all?
> > mem_cgroup_calculate_protection(target_memcg, memcg);
> >
> > if (mem_cgroup_below_min(target_memcg, memcg)) {
> > @@ -6491,6 +6495,9 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> > shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
> > sc->priority);
> >
> > + trace_mm_vmscan_memcg_shrink_end(sc->nr_reclaimed - reclaimed,
> > + memcg);
> > +
> > /* Record the group's reclaim efficiency */
> > if (!sc->proactive)
> > vmpressure(sc->gfp_mask, memcg, false,
> > --
> > 2.36.0
> >
--
Thank you,
Dmitry