From: Joel Fernandes <joel@joelfernandes.org>
To: Daniel Colascione <dancol@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Tim Murray <timmurray@google.com>,
Carmen Jackson <carmenjackson@google.com>,
Mayank Gupta <mayankgupta@google.com>,
Minchan Kim <minchan@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
kernel-team <kernel-team@android.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Dan Williams <dan.j.williams@intel.com>,
Jerome Glisse <jglisse@redhat.com>, linux-mm <linux-mm@kvack.org>,
Matthew Wilcox <willy@infradead.org>,
Ralph Campbell <rcampbell@nvidia.com>,
Vlastimil Babka <vbabka@suse.cz>,
Tom Zanussi <zanussi@kernel.org>
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Date: Thu, 5 Sep 2019 23:01:42 -0400
Message-ID: <20190906030142.GA29926@google.com>
In-Reply-To: <CAKOZuevJyfZRFz3M5myLy+XpS=mAxYCf+oQ2csxCHh7VO-OrKw@mail.gmail.com>
On Thu, Sep 05, 2019 at 06:15:43PM -0700, Daniel Colascione wrote:
[snip]
> > > > > > > The bigger improvement with the threshold is the number of trace records are
> > > > > > > almost halved by using a threshold. The number of records went from 4.6K to
> > > > > > > 2.6K.
> > > > > >
> > > > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > > > >
> > > > > I might misunderstand this but is the issue here actually throttling
> > > > > of the sheer number of trace records or tracing large enough changes
> > > > > to RSS that user might care about? Small changes happen all the time
> > > > > but we are likely not interested in those. Surely we could postprocess
> > > > > the traces to extract changes large enough to be interesting but why
> > > > > capture uninteresting information in the first place? IOW the
> > > > > throttling here should be based not on the time between traces but on
> > > > > the amount of change of the traced signal. Maybe a generic facility
> > > > > like that would be a good idea?
> > > >
> > > > You mean like add a trigger (or filter) that only traces if a field has
> > > > changed since the last time the trace was hit? Hmm, I think we could
> > > > possibly do that. Perhaps even now with histogram triggers?
> > >
> > > I was thinking along the same lines. The histogram subsystem seems
> > > like a very good fit here. Histogram triggers already let users talk
> > > about specific fields of trace events, aggregate them in configurable
> > > ways, and (importantly, IMHO) create synthetic new trace events that
> > > the kernel emits under configurable conditions.
> >
> > Hmm, I think this tracing feature will be a good idea. But in order not to
> > gate this patch, can we agree on keeping a temporary threshold for this
> > patch? Once such idea is implemented in trace subsystem, then we can remove
> > the temporary filter.
> >
> > As Tim said, we don't want our traces flooded and this is a very useful
> > tracepoint as proven in our internal usage at Android. The threshold filter
> > is just few lines of code.
>
> I'm not sure the threshold filtering code you've added does the right
> thing: we don't keep state, so if a counter constantly flips between
> one "side" of the TRACE_MM_COUNTER_THRESHOLD and the other, we'll emit
> ftrace events at high frequency. More generally, this filtering
> couples the rate of counter logging to the *value* of the counter ---
> that is, we log ftrace events at different times depending on how much
> memory we happen to have used --- and that's not ideal from a
> predictability POV.
>
> All things being equal, I'd prefer that we get things upstream as fast
> as possible. But in this case, I'd rather wait for a general-purpose
> filtering facility (whether that facility is based on histogram, eBPF,
> or something else) rather than hardcode one particular fixed filtering
> strategy (which might be suboptimal) for one particular kind of event.
> Is there some special urgency here?
>
> How about we instead add non-filtered tracepoints for the mm counters?
> These tracepoints will still be free when turned off.
>
> Having added the basic tracepoints, we can discuss separately how to
> do the rate limiting. Maybe instead of providing direct support for
> the algorithm that I described above, we can just use a BPF program as
> a yes/no predicate for whether to log to ftrace. That'd get us to the
> same place as this patch, but more flexibly, right?
I chatted with Daniel offline, and we agreed to remove the threshold, which
is also what Michal wants.
So I'll resubmit this patch with the threshold removed, and we'll look into
doing the filtering through more generic mechanisms such as BPF.
thanks all!
- Joel
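
For illustration, the oscillation concern Daniel raises above (a counter
flipping between the two sides of a threshold) can be modelled in a few lines
of Python. This is a toy sketch and not the patch's code: the threshold value
of 128, the function and class names, and the "log only after the value has
moved by a threshold since the last logged value" alternative are all
assumptions made up for this example.

```python
THRESHOLD = 128  # hypothetical threshold, in pages


def crosses_boundary(old, new, threshold=THRESHOLD):
    """Stateless check: did this update cross a multiple of the threshold?"""
    return old // threshold != new // threshold


class DeltaFilter:
    """Stateful check: log only once the counter has moved by at least
    `threshold` away from the value that was last logged."""

    def __init__(self, initial, threshold=THRESHOLD):
        self.last_logged = initial
        self.threshold = threshold

    def should_log(self, new):
        if abs(new - self.last_logged) < self.threshold:
            return False
        self.last_logged = new
        return True


def simulate(values, initial):
    """Return (stateless_events, stateful_events) for a counter sequence."""
    delta_filter = DeltaFilter(initial)
    stateless = stateful = 0
    prev = initial
    for value in values:
        stateless += crosses_boundary(prev, value)
        stateful += delta_filter.should_log(value)
        prev = value
    return stateless, stateful


# A counter that keeps flipping between 127 and 128 pages.
flipping = [128 if i % 2 == 0 else 127 for i in range(1000)]
print(simulate(flipping, initial=127))  # (1000, 0)
```

In this model the stateless check emits an event on every one of the 1000
updates, while the stateful one emits none, because the value never moves far
from the last logged point. Whether such state belongs in the tracepoint, in a
histogram trigger, or in a BPF predicate is the open question in this thread.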