* [PATCH 0/5] ioctl()-based API to query VMAs from /proc/<pid>/maps
@ 2024-05-04  0:30 Andrii Nakryiko
From: Andrii Nakryiko @ 2024-05-04  0:30 UTC
  To: linux-fsdevel, brauner, viro, akpm
  Cc: linux-kernel, bpf, gregkh, linux-mm, Andrii Nakryiko

Implement a binary ioctl()-based interface to the /proc/<pid>/maps file to
allow applications to query VMA information more efficiently than through
textual processing of /proc/<pid>/maps contents. See patch #2 for the context,
justification, and nuances of the API design.

Patch #1 is a refactoring to keep the VMA name determination logic in one place.
Patch #2 is the meat of the kernel-side API.
Patch #3 just syncs the UAPI header (linux/fs.h) into tools/include.
Patch #4 adjusts the BPF selftests logic that currently parses /proc/<pid>/maps
to optionally use this new ioctl()-based API, if supported.
Patch #5 implements a simple C tool that demonstrates the intended efficient
use of both the textual and binary interfaces and allows benchmarking them. The
patch itself also includes performance numbers from a test based on one of the
medium-sized internal applications taken from production.

This patch set is based on top of the next-20240503 tag in the linux-next tree.
I'm not sure what the target tree for this should be; I'd appreciate any
guidance, thank you!

Andrii Nakryiko (5):
  fs/procfs: extract logic for getting VMA name constituents
  fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps
  tools: sync uapi/linux/fs.h header into tools subdir
  selftests/bpf: make use of PROCFS_PROCMAP_QUERY ioctl, if available
  selftests/bpf: a simple benchmark tool for /proc/<pid>/maps APIs

 fs/proc/task_mmu.c                            | 290 +++++++++++---
 include/uapi/linux/fs.h                       |  32 ++
 .../perf/trace/beauty/include/uapi/linux/fs.h |  32 ++
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/procfs_query.c    | 366 ++++++++++++++++++
 tools/testing/selftests/bpf/test_progs.c      |   3 +
 tools/testing/selftests/bpf/test_progs.h      |   2 +
 tools/testing/selftests/bpf/trace_helpers.c   | 105 ++++-
 9 files changed, 763 insertions(+), 70 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/procfs_query.c

-- 
2.43.0



* Re: [PATCH 2/5] fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps
@ 2024-05-04 18:37 Alexey Dobriyan
From: Alexey Dobriyan @ 2024-05-04 18:37 UTC
  To: gregkh
  Cc: andrii, linux-fsdevel, brauner, viro, akpm, gregkh, linux-mm,
	linux-kernel

Hi, Greg.

We've discussed this earlier.

Breaking news: /proc is slow, /sys too. Always have been.

Each /sys file is kind of fast, but there are so many files that
lookups eat all the runtime.

/proc files are bigger and thus slower. There is no way to filter
information.

If someone posted /proc today and said "it is 20-50-100 times slower
than existing interfaces" (which is true), linux-kernel would
not even laugh at him/her.

> slow in what way?

open/read/close is slow compared to an equivalent that does not involve
file descriptors and textual processing.

> Text apis are good as everyone can handle them,

Text APIs provoke inefficient software:

Any noob can write

	for name in name_list:
	    with open(f'/sys/kernel/slab/{name}/order') as f:
	        slab_order = int(f.read().split()[0])

See the problem? It's inefficient.
No open("/sys", O_DIRECTORY|O_PATH);
No openat(sys_fd, "kernel/slab", O_DIRECTORY|O_PATH);
No openat(sys_kernel_slab, buf, O_RDONLY);

buf is probably allocated dynamically many times, it's Python after all.
buf is longer than necessary. The pathname buf won't be reused for the result.

.split() conses a list, only to discard everything but first element.

Internally, sysfs allocates 1 page, instead of putting 1 byte somewhere
in userspace memory. /proc too.

Lookup is done every time, and it is even slower if sysfs doesn't cache
dentries in the dcache (I don't think it does, but I may be mistaken).

Multiply that by how many times monitoring daemons run this (potentially
disturbing other tasks).
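
As a rough illustration of the openat() pattern listed above, the loop can
stay in Python and still resolve the directory only once. This is a minimal
sketch: read_slab_orders and its slab_dir parameter are hypothetical names
made up for the example, and it addresses only the repeated path walk, not
the page allocation or the text parsing.

```python
import os


def read_slab_orders(names, slab_dir="/sys/kernel/slab"):
    """Read <slab_dir>/<name>/order for each name (hypothetical helper).

    The directory is opened once and every per-cache file is opened
    relative to that descriptor, so the leading path components are
    not walked again on each iteration.
    """
    orders = {}
    # O_PATH yields a descriptor that only serves as an anchor for
    # openat()-style relative lookups (Linux-specific flag).
    dir_fd = os.open(slab_dir, os.O_PATH | os.O_DIRECTORY)
    try:
        for name in names:
            # Equivalent of openat(dir_fd, "<name>/order", O_RDONLY).
            fd = os.open(f"{name}/order", os.O_RDONLY, dir_fd=dir_fd)
            try:
                data = os.read(fd, 32)  # the value is a short decimal number
            finally:
                os.close(fd)
            # Parse the first whitespace-separated token straight from bytes.
            orders[name] = int(data.split(None, 1)[0])
    finally:
        os.close(dir_fd)
    return orders
```

Called as read_slab_orders(name_list), it returns a dict mapping each cache
name to its order. Each value still costs one open/read/close, which is the
part a binary query interface would avoid.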

> ioctls are harder for obvious reasons.

What? ioctls are hard now?

Text APIs are garbage. If it's some crap in debugfs then no one cares.
But /proc/*/maps is not in debugfs.

Specifically on /proc/*/maps:

* _very_ well written software knows that unescaping needs to be done on
  the pathname

* (deleted) and (unreachable) junk. readlink and /proc/*/maps don't have
  space for flags for unambiguous deleted/unreachable status which
  doesn't eat into the pathname -- whoops
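
To make the two points above concrete, here is a minimal sketch of what a
careful text parser has to do for one /proc/*/maps line. parse_maps_line is
a hypothetical helper. The sketch assumes the kernel writes escaped bytes as
three-digit octal sequences (for example \012 for a newline), and it can
only report that a trailing " (deleted)" is present, not whether it is a
status marker or part of the real file name.

```python
import re

# Assumption: escaped bytes appear as a backslash plus three octal digits.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")
_DELETED_SUFFIX = " (deleted)"


def parse_maps_line(line):
    """Split one maps line into address range, permissions and pathname."""
    # The pathname is the sixth field and may itself contain spaces,
    # so split at most five times.
    fields = line.rstrip("\n").split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    raw_name = fields[5] if len(fields) > 5 else ""

    # Ambiguous by construction: a file literally named "x (deleted)"
    # looks exactly like a deleted file named "x".
    maybe_deleted = raw_name.endswith(_DELETED_SUFFIX)
    if maybe_deleted:
        raw_name = raw_name[: -len(_DELETED_SUFFIX)]

    # Undo the octal escaping to recover the original characters.
    name = _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), raw_name)
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "name": name,
        "maybe_deleted": maybe_deleted,
    }
```

The maybe_deleted flag is the best a text consumer can do; a binary
interface could carry that status in a separate field instead of in the
pathname.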


> I don't understand, is this a bug in the current files?  If so, why not
> just fix that up?

open/read DO NOT accept file-specific flags, they are dumb like that.

In theory /proc/*/maps _could_ accept

	pread(fd, buf, sizeof(buf), addr);

and return data for the VMA containing "addr", but it can't because "addr"
is an offset into the textual file. Such an offset is not interesting at all.
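
For contrast, this is roughly what a text consumer has to do today to
answer "which VMA contains addr": read the text and scan it line by line.
A minimal sketch; find_vma is a hypothetical helper, and the early exit
assumes the entries are listed in ascending address order.

```python
def find_vma(maps_text, addr):
    """Return (start, end, line) for the mapping containing addr, or None."""
    # Split on "\n" only, so unusual characters in a pathname cannot be
    # mistaken for line boundaries.
    for line in maps_text.split("\n"):
        if not line:
            continue
        address_range = line.split(None, 1)[0]
        start_text, end_text = address_range.split("-")
        start, end = int(start_text, 16), int(end_text, 16)
        if start <= addr < end:
            return start, end, line
        if addr < start:
            # Assumption: entries are sorted, so addr sits in a gap.
            break
    return None
```

In practice maps_text would come from reading /proc/<pid>/maps in full,
so the cost grows with the number of mappings that precede the one being
looked up.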

> And again "efficient" need to be quantified.

	* roll eyes *

> Some people find text easier to handle for programmatic use :)

Some people should be barred from writing software by Programming Supreme Court
or something like that.




Thread overview: 46+ messages
2024-05-04  0:30 [PATCH 0/5] ioctl()-based API to query VMAs from /proc/<pid>/maps Andrii Nakryiko
2024-05-04  0:30 ` [PATCH 1/5] fs/procfs: extract logic for getting VMA name constituents Andrii Nakryiko
2024-05-04  0:30 ` [PATCH 2/5] fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps Andrii Nakryiko
2024-05-04 15:28   ` Greg KH
2024-05-04 21:50     ` Andrii Nakryiko
2024-05-06 13:58       ` Arnaldo Carvalho de Melo
2024-05-06 18:05         ` Namhyung Kim
2024-05-06 18:51           ` Andrii Nakryiko
2024-05-06 18:53           ` Arnaldo Carvalho de Melo
2024-05-06 19:16             ` Arnaldo Carvalho de Melo
2024-05-07 21:55               ` Namhyung Kim
2024-05-06 18:41         ` Andrii Nakryiko
2024-05-06 20:35           ` Arnaldo Carvalho de Melo
2024-05-07 16:36             ` Andrii Nakryiko
2024-05-04 23:36   ` kernel test robot
2024-05-07 18:10   ` Liam R. Howlett
2024-05-07 18:52     ` Andrii Nakryiko
2024-05-04  0:30 ` [PATCH 3/5] tools: sync uapi/linux/fs.h header into tools subdir Andrii Nakryiko
2024-05-04  0:30 ` [PATCH 4/5] selftests/bpf: make use of PROCFS_PROCMAP_QUERY ioctl, if available Andrii Nakryiko
2024-05-04  0:30 ` [PATCH 5/5] selftests/bpf: a simple benchmark tool for /proc/<pid>/maps APIs Andrii Nakryiko
2024-05-04 15:29   ` Greg KH
2024-05-04 21:57     ` Andrii Nakryiko
2024-05-05  5:09       ` Ian Rogers
2024-05-06 18:32         ` Andrii Nakryiko
2024-05-06 18:43           ` Ian Rogers
2024-05-07  5:06             ` Andrii Nakryiko
2024-05-07 17:29               ` Andrii Nakryiko
2024-05-07 22:27                 ` Namhyung Kim
2024-05-07 22:56                   ` Andrii Nakryiko
2024-05-08  0:36                     ` Arnaldo Carvalho de Melo
2024-05-04 15:32   ` Greg KH
2024-05-04 22:13     ` Andrii Nakryiko
2024-05-07 15:48       ` Liam R. Howlett
2024-05-07 16:10         ` Matthew Wilcox
2024-05-07 16:18           ` Liam R. Howlett
2024-05-07 16:27         ` Andrii Nakryiko
2024-05-07 18:06           ` Liam R. Howlett
2024-05-07 19:00             ` Andrii Nakryiko
2024-05-08  1:20               ` Liam R. Howlett
2024-05-04 11:24 ` [PATCH 0/5] ioctl()-based API to query VMAs from /proc/<pid>/maps Christian Brauner
2024-05-04 15:33   ` Greg KH
2024-05-04 21:50     ` Andrii Nakryiko
2024-05-04 21:50   ` Andrii Nakryiko
2024-05-05  5:26 ` Ian Rogers
2024-05-06 18:58   ` Andrii Nakryiko
2024-05-04 18:37 [PATCH 2/5] fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps Alexey Dobriyan
