ksummit.lists.linux.dev archive mirror
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: ksummit <ksummit-discuss@lists.linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Julien Desfossez <jdesfossez@efficios.com>,
	daolivei <daolivei@redhat.com>, bristot <bristot@redhat.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [Ksummit-discuss] [TECH TOPIC] Pulling away from the tracing ABI quicksands
Date: Thu, 29 Jun 2017 19:58:54 -0700
Message-ID: <20170630025852.xjoif3aai6rny5a2@ast-mbp>
In-Reply-To: <CA+55aFxFLvX62SyOC9qyVwEQXH8J224Fe03tvy624AUx0U2fRQ@mail.gmail.com>

On Thu, Jun 29, 2017 at 07:34:53PM -0700, Linus Torvalds wrote:
> On Thu, Jun 29, 2017 at 7:12 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Well, I don't want to put words in his mouth, but as he's probably
> > currently putting mush in a baby's mouth, so I'll do it anyway. ;-) We
> > were talking about making the static tracepoints more "dynamic". I'm not
> > sure he's ever used eBPF with tracing.
> 
> I don't know how else you would make them dynamic, though.
> Realistically, ebpf seems to be working really well for the networking
> people, and seems to be the obvious solution.
> 
> Now, the networking people have obviously *made* it work for them. So
> it's not like it's some kind of "ebpf automatically solves all
> problems" thing. ebpf needs some infrastructure too, to be able to get
> to the interesting data sanely (and safely).
> 
> > eBPF is still very limited in tracing. Currently it is only implemented
> > for perf. Although, it has been on my todo list to get it working for
> > ftrace as well, and implementing eBPF for ftrace can also be on the
> > agenda.
> 
> Oh, I thought it worked outside of perf already. My bad. I'm actually
> surprised it doesn't interact with ftrace, since it seems like the
> perfect use case.

If Steven has a use case for bpf in ftrace, I don't mind, but I don't see
the value yet. Everything we wanted to see inside the kernel we can already
do with perf_events and bpf scripts.
Here is one scheduler-related script:
https://github.com/iovisor/bcc/blob/master/tools/runqlen.py#L79
It builds a histogram of task->se.cfs_rq->nr_running, like:
# ./runqlat 
Tracing run queue latency... Hit Ctrl-C to end.
^C
     usecs               : count     distribution
         0 -> 1          : 233      |***********                             |
         2 -> 3          : 742      |************************************    |
         4 -> 7          : 203      |**********                              |
         8 -> 15         : 173      |********                                |
        16 -> 31         : 24       |*                                       |
        32 -> 63         : 0        |                                        |
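
[Editor's sketch, not part of the original message.] The power-of-two bucketing visible in the output above can be illustrated in plain Python. This is a minimal sketch that only mimics the layout of the rows (0 -> 1, 2 -> 3, 4 -> 7, ...); it is not the code the bcc tools use, and the function names and sample values are hypothetical.

```python
from collections import Counter


def log2_slot(value):
    """Return the power-of-two bucket index for a non-negative integer.

    Slot 0 holds 0..1, slot 1 holds 2..3, slot 2 holds 4..7, and so on.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return max(value.bit_length() - 1, 0)


def slot_range(slot):
    """Return the inclusive (low, high) range covered by a slot."""
    if slot == 0:
        return (0, 1)
    return (1 << slot, (1 << (slot + 1)) - 1)


def build_hist(samples):
    """Count how many samples fall into each slot."""
    return Counter(log2_slot(v) for v in samples)


def format_hist(hist, label="value", width=40):
    """Render one text row per slot, from slot 0 up to the highest slot seen."""
    if not hist:
        return ""
    peak = max(hist.values())
    rows = ["%-20s : count     distribution" % label]
    for slot in range(max(hist) + 1):
        low, high = slot_range(slot)
        count = hist.get(slot, 0)
        # Scale the bar so the fullest slot spans the whole width.
        stars = "*" * (count * width // peak)
        rows.append("%10d -> %-10d : %-8d |%-*s|" % (low, high, count, width, stars))
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical samples, chosen only to populate a few slots.
    print(format_hist(build_hist([0, 1, 2, 3, 3, 5, 9, 20]), label="usecs"))
```

In a real tool the bucketing happens inside the kernel-side BPF program and only the per-slot counters are read from user space; the sketch keeps everything in user space for simplicity.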

This particular script just samples the whole system at a given frequency
and uses probe_read() to walk kernel internal data structures.
It's obviously an unstable interface and scripts break from time to time.
The worst offender over the last few years was the constantly changing
internals of 'struct request' on the block side, so the biosnoop.py bpf script
has ugly code like this:
#ifdef REQ_WRITE
    data.rwflag = !!(req->cmd_flags & REQ_WRITE);
#elif defined(REQ_OP_SHIFT)
    data.rwflag = !!((req->cmd_flags >> REQ_OP_SHIFT) == REQ_OP_WRITE);
#else
    data.rwflag = !!((req->cmd_flags & REQ_OP_MASK) == REQ_OP_WRITE);
#endif
to be able to run with different kernel versions, but that's fine.
When one wants to look inside kernel structures they need to
be ready to change their scripts with every kernel version
and that is well understood.
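
[Editor's sketch, not part of the original message.] The three preprocessor branches above correspond to three different ways the write indication has been encoded in cmd_flags. The Python model below shows why one decoding rule cannot serve all of them. The layout names and every constant are illustrative assumptions; the real values come from the kernel headers of each release.

```python
# Illustrative constants only (assumptions, not taken from any kernel header).
REQ_WRITE_FLAG = 1 << 0   # layout "flag": write is a single flag bit
OP_WRITE = 1              # layouts "shift" and "mask": write is an operation code
OP_SHIFT = 61             # layout "shift": op code kept in the top bits of a 64-bit field
OP_MASK = (1 << 8) - 1    # layout "mask": op code kept in the low 8 bits


def is_write(cmd_flags, layout):
    """Return 1 if cmd_flags describes a write under the given layout, else 0."""
    if layout == "flag":
        # Mirrors: !!(req->cmd_flags & REQ_WRITE)
        return int(bool(cmd_flags & REQ_WRITE_FLAG))
    if layout == "shift":
        # Mirrors: (req->cmd_flags >> REQ_OP_SHIFT) == REQ_OP_WRITE
        return int((cmd_flags >> OP_SHIFT) == OP_WRITE)
    if layout == "mask":
        # Mirrors: (req->cmd_flags & REQ_OP_MASK) == REQ_OP_WRITE
        return int((cmd_flags & OP_MASK) == OP_WRITE)
    raise ValueError("unknown layout: %r" % (layout,))


if __name__ == "__main__":
    # The same raw value decodes differently depending on the layout assumed.
    raw = 0x101
    for layout in ("flag", "shift", "mask"):
        print(layout, is_write(raw, layout))
```

The point of the sketch is that the caller has to know which layout the running kernel uses, which is exactly what the #ifdef chain selects at compile time.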

Re-reading Mathieu's original email I don't really understand
what he's trying to solve that is not solved already.

Also, I'm not planning to fly to Prague just for a tracing discussion.
There is netdev2.2 right after in Seoul,
and the tracing microconf at plumbers in September, which is imo better
suited to discussing tracing-related topics.

