Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Joel Fernandes <joel@joelfernandes.org>
To: Daniel Colascione <dancol@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Tim Murray <timmurray@google.com>,
	Carmen Jackson <carmenjackson@google.com>,
	Mayank Gupta <mayankgupta@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	kernel-team <kernel-team@android.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Jerome Glisse <jglisse@redhat.com>, linux-mm <linux-mm@kvack.org>,
	Matthew Wilcox <willy@infradead.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Tom Zanussi <zanussi@kernel.org>
Subject: Re: [PATCH v2] mm: emit tracepoint when RSS changes by threshold
Date: Thu, 5 Sep 2019 23:01:42 -0400	[thread overview]
Message-ID: <20190906030142.GA29926@google.com> (raw)
In-Reply-To: <CAKOZuevJyfZRFz3M5myLy+XpS=mAxYCf+oQ2csxCHh7VO-OrKw@mail.gmail.com>

On Thu, Sep 05, 2019 at 06:15:43PM -0700, Daniel Colascione wrote:
[snip]
> > > > > > > The bigger improvement with the threshold is the number of trace records are
> > > > > > > almost halved by using a threshold. The number of records went from 4.6K to
> > > > > > > 2.6K.
> > > > > >
> > > > > > Steven, would it be feasible to add a generic tracepoint throttling?
> > > > >
> > > > > I might misunderstand this but is the issue here actually throttling
> > > > > of the sheer number of trace records or tracing large enough changes
> > > > > to RSS that user might care about? Small changes happen all the time
> > > > > but we are likely not interested in those. Surely we could postprocess
> > > > > the traces to extract changes large enough to be interesting but why
> > > > > capture uninteresting information in the first place? IOW the
> > > > > throttling here should be based not on the time between traces but on
> > > > > the amount of change of the traced signal. Maybe a generic facility
> > > > > like that would be a good idea?
> > > >
> > > > You mean like add a trigger (or filter) that only traces if a field has
> > > > changed since the last time the trace was hit? Hmm, I think we could
> > > > possibly do that. Perhaps even now with histogram triggers?
> > >
> > > I was thinking along the same lines. The histogram subsystem seems
> > > like a very good fit here. Histogram triggers already let users talk
> > > about specific fields of trace events, aggregate them in configurable
> > > ways, and (importantly, IMHO) create synthetic new trace events that
> > > the kernel emits under configurable conditions.
> >
> > Hmm, I think this tracing feature will be a good idea. But in order not to
> > gate this patch, can we agree on keeping a temporary threshold for this
> > patch? Once such idea is implemented in trace subsystem, then we can remove
> > the temporary filter.
> >
> > As Tim said, we don't want our traces flooded and this is a very useful
> > tracepoint as proven in our internal usage at Android. The threshold filter
> > is just few lines of code.
> 
> I'm not sure the threshold filtering code you've added does the right
> thing: we don't keep state, so if a counter constantly flips between
> one "side" of the TRACE_MM_COUNTER_THRESHOLD and the other, we'll emit
> ftrace events at high frequency. More generally, this filtering
> couples the rate of counter logging to the *value* of the counter ---
> that is, we log ftrace events at different times depending on how much
> memory we happen to have used --- and that's not ideal from a
> predictability POV.
> 
> All things being equal, I'd prefer that we get things upstream as fast
> as possible. But in this case, I'd rather wait for a general-purpose
> filtering facility (whether that facility is based on histogram, eBPF,
> or something else) rather than hardcode one particular fixed filtering
> strategy (which might be suboptimal) for one particular kind of event.
> Is there some special urgency here?
> 
> How about we instead add non-filtered tracepoints for the mm counters?
> These tracepoints will still be free when turned off.
> 
> Having added the basic tracepoints, we can discuss separately how to
> do the rate limiting. Maybe instead of providing direct support for
> the algorithm that I described above, we can just use a BPF program as
> a yes/no predicate for whether to log to ftrace. That'd get us to the
> same place as this patch, but more flexibly, right?

Chatted with Daniel offline, we agreed on removing the threshold -- which
Michal also wants to be that way.

So I'll be resubmitting this patch with the threshold removed; and we'll work
on seeing to use filtering through other generic ways like BPF.

thanks all!

 - Joel

next prev parent reply	other threads:[~2019-09-06  3:01 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-03 20:09 Joel Fernandes (Google)
2019-09-04  4:44 ` Suren Baghdasaryan
2019-09-04  4:51   ` Daniel Colascione
2019-09-04  5:15     ` Joel Fernandes
2019-09-04  5:42       ` Daniel Colascione
2019-09-04 14:59         ` Joel Fernandes
2019-09-04 17:15           ` Daniel Colascione
2019-09-04 23:59             ` sspatil
2019-09-04  5:02   ` Joel Fernandes
2019-09-04  5:38     ` Suren Baghdasaryan
2019-09-04  8:45 ` Michal Hocko
2019-09-04 15:32   ` Joel Fernandes
2019-09-04 15:37     ` Michal Hocko
2019-09-04 16:28       ` Joel Fernandes
2019-09-05 10:54         ` Michal Hocko
2019-09-05 14:14           ` Joel Fernandes
2019-09-05 14:20             ` Michal Hocko
2019-09-05 14:23               ` Joel Fernandes
2019-09-05 14:43         ` Michal Hocko
2019-09-05 16:03           ` Suren Baghdasaryan
2019-09-05 17:35             ` Steven Rostedt
2019-09-05 17:39               ` Suren Baghdasaryan
2019-09-05 17:43               ` Tim Murray
2019-09-05 17:47               ` Joel Fernandes
2019-09-05 17:51                 ` Joel Fernandes
2019-09-05 19:56                   ` Tom Zanussi
2019-09-05 20:24                     ` Daniel Colascione
2019-09-05 20:32                       ` Tom Zanussi
2019-09-05 21:14                       ` Tom Zanussi
2019-09-05 22:12                         ` Daniel Colascione
2019-09-05 22:51                           ` Daniel Colascione
2019-09-05 17:50               ` Daniel Colascione
2019-09-06  0:59                 ` Joel Fernandes
2019-09-06  1:15                   ` Daniel Colascione
2019-09-06  3:01                     ` Joel Fernandes [this message]
2019-09-04 17:17       ` Daniel Colascione

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190906030142.GA29926@google.com \
    --to=joel@joelfernandes.org \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=carmenjackson@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dancol@google.com \
    --cc=jglisse@redhat.com \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mayankgupta@google.com \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=rcampbell@nvidia.com \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=timmurray@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=zanussi@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox